M1ke

Find Failure Faster

Computers are powerful tools, and learning how to harness them is a useful and maybe even essential skill. Unfortunately, whilst a computer should be pretty much infallible, even the best of developers can't guarantee that they've thought of every possibility, accounted for every way a user could make mistakes, or even just used the right variable or function names. When things go wrong we tend to call it a bug (a term dating back way before computers, although there is a story of an actual bug - a moth - being found in a Harvard computer in 1947) but this trivialises the issue.

Whilst we accept that bugs are a fact of life in development we should be looking for every possible way to eliminate them - and fortunately the last decade and the development of agile programming has significantly improved things. It's a misconception, of course, that software projects end up over time or budget because of bugs - even a junior developer can account for time spent fixing things. Similarly the idea that producing code with bugs is bad is ridiculous - if you're a good developer you should be running code often enough that it's constantly breaking and you're constantly fixing it. In fact one of the main methodologies of agile that I've only been exposed to in the last 8 months is Test Driven Development.

TDD is something I wish I'd known about back when I learned how to code. There's nothing wrong with being self-taught, but somehow almost every tutorial I read, from learning simple principles back in 2005 right up to developing complex data systems in 2011, managed not to mention this wonderful concept. TDD takes the idea from the last paragraph, that you have to break something before you can make it work, and turns it into a hard and fast rule. In TDD you write tests for how your code should behave before you even write the code; this helps in writing the code but also means that once it is written you have a way of proving that it works.

Get a bug report? If your tests are passing then there's a situation you didn't anticipate, but it's OK because you can write a test for it and guarantee that it will never go wrong again. If you know someone who's coding with any intention of actually making a product (as opposed to random hacking) then insist that they learn TDD before they go any further - I'm slowly building tests for old projects but it's going to take a while! That said, just having tests won't solve everything; twice already I've mentioned bug reports that come through not because of bad code but because of unforeseen actions or scenarios. How can we solve problems we don't even know about?

Because of the stigma attached to "buggy code" it can be quite easy as a developer to fall into the trap of hiding the bugs and error messages away. Whilst there are also security benefits to having a web server configured to hide errors, users presented with a blank screen or a button that does nothing when clicked are more likely just to go somewhere else than to provide you with useful debugging information. In the case of back-end (e.g. PHP) errors we can read the server logs afterwards, and whilst we don't quite have the context, the error message is often enough to help us. The errors that always seem to get through for me are those in JavaScript.

With JS reacting to user actions in a browser it can be quite common for the unexpected to happen - they'll press a key which has a certain action in their browser, or an event that you don't expect fires on an action you didn't predict. Unfortunately these errors often have the most subtle ways of getting past the user - an undeclared property on a class that's been accessed by an object that should have been hidden will throw a nice error in the console, but the user just sees a loading bar that goes on forever, and blames their internet connection or your server instead of helping you identify the problem. jQuery in particular has a high level of risk for errors - often if a variable doesn't exist the return type of a method changes, leaving the page broken but with no obvious reason in the code as to why.

Recently the majority of bug reports I was getting from a system were of this sort, and as there's a way to go before my tests for it will be finished I decided to take a bold step. Having learned about JavaScript's window.onerror handler (incidentally I've also stopped using function name() declarations in JS, switching everything to window methods to remind me what I'm actually writing) I wrote a function to pop up a box (a subtle fade-in box in a corner) each time an error was detected, along with a link to email me the error. For the first day or so I had a huge rush of these errors, but despite knowing that the clients must have been confused, and despite the vague hit to my pride of seeing these problems, I was impressed by how many I could fix in one day. A couple of days later the system was still under regular use but the reports had already stopped - a true win for making failure obvious and reportable rather than hiding it away.
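This isn't the code from that system, but a minimal sketch of the same idea might look like the following. The report address, the function names and the styling are all placeholders I've assumed for the example; only `window.onerror` and its arguments (message, source, line, column) come from the browser itself.

```javascript
// Placeholder address for the example; swap in a real inbox.
const REPORT_ADDRESS = 'bugs@example.com';

// Build a mailto: link with the error details pre-filled in the email.
function buildReportLink(message, source, line, column) {
  const subject = 'Error report: ' + message;
  const body = [
    'Message: ' + message,
    'Script: ' + source,
    'Position: ' + line + ':' + column,
  ].join('\n');
  return 'mailto:' + REPORT_ADDRESS +
    '?subject=' + encodeURIComponent(subject) +
    '&body=' + encodeURIComponent(body);
}

// Show a small box in the corner that fades in and links to the email.
function showErrorBox(link) {
  const box = document.createElement('div');
  box.style.cssText = 'position:fixed;right:1em;bottom:1em;padding:1em;' +
    'background:#fee;border:1px solid #c00;opacity:0;transition:opacity 0.5s';
  box.textContent = 'Something went wrong on this page. ';
  const anchor = document.createElement('a');
  anchor.href = link;
  anchor.textContent = 'Email the details to the developer';
  box.appendChild(anchor);
  document.body.appendChild(box);
  // Change opacity on the next frame so the CSS transition runs.
  requestAnimationFrame(function () { box.style.opacity = '1'; });
}

// Only install the handler when running in a browser.
if (typeof window !== 'undefined') {
  window.onerror = function (message, source, line, column) {
    showErrorBox(buildReportLink(message, source, line, column));
    return false; // false keeps the browser's normal console logging
  };
}
```

Keeping the link-building separate from the box means the part most likely to go wrong (the encoding) can be tested without a browser.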

Whether you're in a team developing for a client or working in-house for a company, you could do a lot worse than make your site fail with bells and whistles rather than sneaking out the back. It may cause an initial loss of confidence or confuse a few visitors, but in a short amount of time you'll have a system that works better and involves your users in testing, rather than waiting for them to get annoyed and leave as a loading bar taunts them for five minutes.
