- SELECT * is bad if you don’t know how large the amount of data can be you get back.
- … doing a while loop over an unknown amount of data from said SELECT * is even worse.
- … doing this while some end user is waiting for the application to respond while the application happily runs in a loop over an unknown amount of data and calls a remote system (an waits for answer) for each data set is kinda suicidal. For your system and also probably the remote system.
- … tricking the application into NOT deleting data that was already transmitted to the remote system and trying to send it all again on the next run is what you might call a self regulating system. Your application will get slower from call to call and eventually is gets so slow that people stop using the feature in question. Mission accomplished, system saved.
There’s also the usual variation like: end user waits for a page to load while the application tries to load an unknown amount of data from a remote system BEFORE delivering the page. End user thinks page does not load, clicks again and again and triggers a new data fetching call on each click.
Lessons to be learned:
- Unless you know what you get back, don’t use SELECT *, only fetch from each row of data what you need. If there’s the chance you get too much add a LIMIT before you choke.
- Don’t do loops without adding an emergency brake, bail out if a maximum execution time or some maximum count of cycles is reached.
- Had I only had a unit test in the right place 😉
- Add some fail counter on to be transmitted data, if the remote application does not like it after n retries forget trying to send it, it only lessens the chance to get the valid stuff transmitted.
I learned some equivalent of that lesson long ago. But somehow that’s a kind of booby trap every developer somehow has to step into at some point. Read, remember and step around that trap 😉