Programming, as much as it is about machines, is conducted by humans, and is as susceptible to human error as any process. These bugs can be frustrating, particularly when found by a client, but they can give an insight into underlying problems, and sometimes suggest solutions – or at least ways of catching errors early. This article looks at four different classes of bugs: mechanical, mental, social and environmental, their causes, and some preventative steps.
Before discussing these, it is worth noting a couple of points. Bugs and errors are problems where the result is not what the programmer had intended - where if you asked the developer whether they expected X to happen, they would say no. This is quite different from the, equally common, problem of the functionality not being what the client desired, or missing some unspoken assumption - in both those cases the wrong thing was built right, rather than the right thing being built wrong.
It's also worth mentioning that bugs will be much more likely if a programmer is tired, hungry, unmotivated, distracted, or stressed, either from the project or in their personal life. These problems don't have a technical solution, but addressing them might yield better dividends than all the rest of the techniques put together.
Mechanical bugs are the most straightforward of errors, what we might call slips of the finger or lapses in concentration. These include typos, where an operator ends up typed as & for example. They also include slips of mind, often where one very familiar sequence overrides another and 'types itself'. If you spend all day typing $this->loadData() and the particular object you're working on needs function_load_data(), you're going to put the wrong one in a few times.
So why do they happen? These kinds of errors are execution failures or skill-based errors, lapses in something that we are otherwise good at. The typo is an example of an attention lapse - we hit the wrong keys all the time, but normally quickly spot and correct it. The 'slip of the mind' is often a recognition or memory lapse - we either incorrectly recall the item we're trying to add, or fail to recognise the situation properly. We apply a mistaken mental model to the problem, leading to an equally mistaken solution.
In general, programmers attempt to match each small development hurdle to one they've experienced before, and a more common problem will often take precedence over a novel or unfamiliar one - for example, it took me a couple of tries in that sentence to remember the word 'precedence', as I kept recalling the more commonly used (for me at least) word 'preference'. This is called attentional capture, and in various forms accounts for a huge number of everyday mistakes.
So what can be done about it? There are a number of good defenses that can help pick up these kinds of errors, prevalent as they are:
Syntax Checkers - Easily the biggest, and often simplest, aid is syntax highlighting and parsing. Most IDEs and programmers' editors will offer some syntax checking. Often the colours will show a missing quote mark a long time before you would have spotted it just by looking at the code. The more advanced editors can show a lot of helpful extras, such as indicating variables that have been used before being assigned. If you're not using a PHP-specific editor, it's worth checking for a parser plugin that allows for these more detailed checks.
Most people will have a preferred tool, but if you've not settled on a PHP editor or are looking for a change, there are many good choices.
Code Sniffers - Coding standards have a number of benefits, not least that once you're using one you have the opportunity to employ a code sniffer. A code sniffer checks for adherence to a set of conditions, and can usually be hooked to run automatically on commits. Time spent setting up something like the excellent PEAR PHP_CodeSniffer to use your standards is well spent, but can be saved on new projects by following previously published standards like Zend's or PEAR's own.
Peer Review - It's a given that, when stuck on a tricky bug, another coder will spend three seconds looking at your work and point out the error. It can be beneficial to formalize this process. Peer review benefits you by providing a second set of eyeballs to give the code another look, and simply knowing that your code will be reviewed helps focus attention on quality and readability.
This could be a traditional review – another developer looking over and examining the code - but it could as easily be pair programming, with the driver benefiting from the observer's contributions and observations in real time. Both can work well with remote teams, if given the right tools.
Mental errors are primarily errors of logic, the code not really reflecting the developer's intentions. These include boundary conditions and edge cases, such as a classic divide by zero error. They're usually syntactically correct - well written pieces of code that don't do the right thing.
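As a small illustration of that kind of boundary condition, here is a sketch of a divide by zero guard. The function name and the choice to return zero for an empty set of orders are assumptions made for the example, not a rule; the right answer depends on what the caller expects. Without the guard, PHP 8 throws a DivisionByZeroError on the division, while earlier versions raise a warning.

```php
<?php
// Hypothetical helper: average order value for a reporting page.
function averageOrderValue(float $totalRevenue, int $orderCount): float
{
    // Boundary condition: with no orders there is nothing to divide by.
    // Returning 0.0 here is an assumption; throwing may suit other callers.
    if ($orderCount === 0) {
        return 0.0;
    }

    return $totalRevenue / $orderCount;
}

echo averageOrderValue(250.0, 4), PHP_EOL;
echo averageOrderValue(0.0, 0), PHP_EOL;
```

The code without the guard is syntactically fine and works for every normal day of trading, which is exactly why this sort of error survives until the unusual case turns up.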
Most of these errors would be classified as mistakes or rule-based errors, a misapplication of techniques during our mental planning. This can be using a good mental rule in the wrong situation, or using a model from experience that is incomplete.
This incompleteness can include knowledge of the language, such as understanding the operator precedence rules in PHP, or a more informal misunderstanding or bias. For example, if you always use positive operations in your conditions, such as if($a && b() || c()), then when editing someone else's code that makes use of negations, such as if(!$a || b() || c()), you might parse the results of the conditional in your head incorrectly, and with that wrong model in place add bugs in new code.
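One way to check a mental model of a conditional is to enumerate it rather than trust it. The sketch below evaluates both of the conditions above for every combination of inputs. It swaps the b() and c() calls for plain booleans, which is an assumption made to keep the example self-contained, and the function name is made up.

```php
<?php
// Enumerate every combination of three booleans and evaluate both conditions.
function truthTable(): array
{
    $rows = [];
    foreach ([true, false] as $a) {
        foreach ([true, false] as $b) {
            foreach ([true, false] as $c) {
                $rows[] = [
                    'a' => $a,
                    'b' => $b,
                    'c' => $c,
                    // && binds tighter than ||, so this is ($a && $b) || $c.
                    'positive' => $a && $b || $c,
                    // ! binds tighter still, so this is (!$a) || $b || $c.
                    'negated' => !$a || $b || $c,
                ];
            }
        }
    }

    return $rows;
}

foreach (truthTable() as $row) {
    printf(
        "a=%d b=%d c=%d | positive=%d negated=%d\n",
        $row['a'], $row['b'], $row['c'], $row['positive'], $row['negated']
    );
}
```

The two conditions only disagree when both $a and $c are false, which is the sort of narrow case that is easy to get wrong when reading the negated form in your head.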
These errors tend to be harder to defend against than mechanical errors as they often appear correct, until that unusual state is encountered, or unexpected input given. This isn't to say there aren't tools and methods to highlight and prevent them:
Unit Testing - Unit testing is the process of testing small, discrete parts of the code for correctness, usually in a way that lets the tests be easily and regularly run. This has two benefits for these kinds of errors. Firstly, when writing tests the developer might encounter edge cases they hadn't considered when writing the code. Secondly, and probably more significantly, when a bug is found a test can be added for it, which should ensure the bug does not reappear in later development.
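To make that concrete, here is a minimal sketch of both benefits using nothing but plain PHP. The pageCount() function, the check() helper and the bug report behind the last test are all hypothetical; a real project would more likely use a framework such as PHPUnit than a hand-rolled runner like this.

```php
<?php
// Hypothetical function under test: pages needed to list $totalItems.
function pageCount(int $totalItems, int $perPage): int
{
    if ($perPage < 1) {
        throw new InvalidArgumentException('perPage must be at least 1');
    }

    return (int) ceil($totalItems / $perPage);
}

// Tiny stand-in for a test framework: prints and returns the result.
function check(string $label, bool $passed): bool
{
    echo ($passed ? 'PASS' : 'FAIL'), ': ', $label, PHP_EOL;

    return $passed;
}

function runPageCountTests(): bool
{
    $results = [
        // Edge cases that tend to surface while writing the tests.
        check('no items means no pages', pageCount(0, 10) === 0),
        check('exact multiple does not add a page', pageCount(20, 10) === 2),
        // Regression test, added after an imagined off-by-one bug report.
        check('one item over the boundary adds a page', pageCount(21, 10) === 3),
    ];

    return !in_array(false, $results, true);
}

runPageCountTests();
```

The last check is the valuable one over time: it stays in the suite and fails loudly if a later change reintroduces the same mistake.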
The downside is that it's rather tricky to add unit testing to a project that has been written without it - it can be hard to effectively test without the code being written with testing in mind. In that case, it often involves a messy refactoring to whip the project into shape and generate the tests before a good baseline can be established.
Once unit tests are in a project though, a Continuous Integration system such as PHPUnderControl or Xinc can be used to automatically run the tests and alert developers to failures they may have missed.
Code Formatting - One of the more interesting ways of stopping these kinds of problems is clear and consistent code formatting. While this will catch only a percentage of logic errors, formatting code the same way means that noticing some bugs becomes a visual pattern matching problem, something humans are extremely good at. Simply spotting that a function 'looks wrong' is often a big clue that a factor hasn't been taken into account.
With many of these errors the developer will format the code along the lines of what they thought they were writing instead of what they actually were, so automatic formatting tools such as the ones built in to Eclipse or Zend Studio can be a helpful check.
Small Code Blocks - Breaking code into smaller functions has two advantages. One is that the code block is visually smaller, enough so the whole function can be seen at once, which enhances many of the pattern recognition benefits of consistent code formatting. The other is that discrete units are naturally less complicated, can be analyzed independently, and combined more reliably, particularly if clear naming conventions are used.
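As a rough sketch of what that looks like in practice, the example below splits an order total into small, separately named steps. The function names, the array shape and the rates are all invented for illustration.

```php
<?php
// Sum of price * quantity over the order lines.
function subtotal(array $lines): float
{
    $sum = 0.0;
    foreach ($lines as $line) {
        $sum += $line['price'] * $line['quantity'];
    }

    return $sum;
}

// $discountRate is a fraction, so 0.1 means 10% off.
function applyDiscount(float $amount, float $discountRate): float
{
    return $amount * (1.0 - $discountRate);
}

// $taxRate is a fraction, so 0.2 means 20% tax.
function addTax(float $amount, float $taxRate): float
{
    return $amount * (1.0 + $taxRate);
}

// The combined calculation reads as a list of the steps it performs.
function orderTotal(array $lines, float $discountRate, float $taxRate): float
{
    $discounted = applyDiscount(subtotal($lines), $discountRate);

    return round(addTax($discounted, $taxRate), 2);
}

$lines = [
    ['price' => 10.0, 'quantity' => 2],
    ['price' => 5.0, 'quantity' => 1],
];
echo orderTotal($lines, 0.1, 0.2), PHP_EOL;
```

Each function fits on screen, can be tested on its own, and has a name that says what it is for, so a mistake in the order of discount and tax is visible in one short line rather than buried in a long calculation.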
Social errors are the problems with integration, either in the macro level of web services, databases and so on, or the micro in terms of the usage of functions and APIs internally.
These errors have their basis in two areas, cognitive bias on the part of the developers, which results in reading of documentation and other examples in a way that fits their expectations, and incomplete or incorrect documentation in the first place.
The result is usually a latent or dormant failure, where the system is correct until triggered by something outside the norms the programmer was working under. In many ways these are similar to the logic errors above, but are the result of multiple developers' mental processes, and clashes between them, rather than problems within the work of a single developer.
The defences against these kinds of errors are slightly different from those for mental errors as well, with testing being secondary to making intentions clear, and communicating them better.
Test Driven Development/Behavior Driven Development - These methodologies both involve functionally documenting what code is supposed to do before actually writing it. While using one of these methodologies doesn't solve the problem per se, it does at least make crystal clear what was intended. It also forces the developer to spend time thinking about the interface they are creating before thinking about how to solve the technical problem it poses.
Test Driven Development (TDD), in a nutshell, is writing unit tests before the functionality itself, and doesn't need anything more than a regular unit test harness. BDD is a refinement on this concept that focuses more on writing a specification for the behavior the software should have, which is then automatically tested. The main framework to support this in PHP is PHPSpec.
Being reliant on unit testing, both ideas work best when the entire project is being executed with them, rather than in maintenance or extension of an old project.
PHPDoc Comments - PHPDoc comments are probably the overall best way of documenting PHP APIs, whether for internal or external consumption. In many editors those comments are immediately accessible when using the documented functions, and sometimes even enhance the autocomplete facilities when return type and argument types are documented.
They also have the benefit of prompting developers to consider people other than themselves using the functionality, which can be something that easily slips the mind in the rush of development.
One warning is that there is a risk in writing empty documentation, where there are lines of comments with very little actual information, or even incorrect documentation that no longer reflects the code it pertains to. The whole team has to be dedicated to fixing incorrect documentation, even if it's in another developer's code, and to spending the time needed to write accurate, but not necessarily lengthy, documentation.
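For anyone who hasn't used them, a PHPDoc block might look something like the sketch below. The shippingCost() function and its flat rates are made up for the example; the point is that each tag tells the caller something the signature alone does not, such as units, valid ranges and failure cases.

```php
<?php
/**
 * Calculates the shipping cost for a parcel.
 *
 * @param float  $weightKg    Parcel weight in kilograms; must be greater than zero.
 * @param string $countryCode Two-letter destination country code, e.g. "GB".
 *
 * @return float Shipping cost in the shop's base currency.
 *
 * @throws InvalidArgumentException If the weight is not positive.
 */
function shippingCost(float $weightKg, string $countryCode): float
{
    if ($weightKg <= 0) {
        throw new InvalidArgumentException('Weight must be greater than zero.');
    }

    // Illustrative flat rates only; not real pricing.
    $ratePerKg = ($countryCode === 'GB') ? 2.0 : 5.0;

    return $weightKg * $ratePerKg;
}

echo shippingCost(2.0, 'GB'), PHP_EOL;
```

A comment that only said "Calculates shipping cost" and repeated the parameter names would be the empty documentation warned about above.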
Big Design Up Front / Team Planning - Social errors happen much less if everyone's expectations are of the same thing, so at least for internal conflicts they can be addressed in the planning process.
If development is being managed in a traditional manner, then making sure that the specification is sufficiently detailed at the planning stage can alleviate many of these issues. Keeping the documentation 'live' via a wiki or online documents means that when changes do occur they can be reflected in the specs.
If development is following an agile methodology, then involving the whole team in the planning should allow the communication between the developers to iron out differences in expectations.
In both cases, a clear map of the touch points between modules is a valuable resource if things should change, and it also gives testers and QA staff a good idea of the kind of areas that are most likely to have problems.
Environmental errors are problems that arise outside the code - issues at the level of PHP itself, the OS, the servers, or the network. Environmental bugs are particularly difficult, being another case of latent or dormant error. They are created by a failure at some earlier point in design or development, and exposed by a trigger factor, such as high load.
Generally these should be regarded as failures in the process rather than with one particular developer, and often involve those outside the traditional development team - system administrators and network engineers for example.
One of the big root causes of this kind of problem is simply a lack of experience, perhaps in systems or database administration, which lends itself to mistakes. A programmer works from mental models, their experience, but when that knowledge is not sufficient they move to working 'online', consciously processing the sequences of actions they need to take to solve a problem. The difference between the amount of time spent working from models and consciously processing is basically the difference between a beginner and an expert – experts have a wide range of experience which they can match to different challenges, with a much higher success rate than conscious thought.
Though these are probably the most difficult kind of error to stop, there are some ideas that can help mitigate or expose them early.
Development and preproduction - One of the best ways to address, if not prevent, these kinds of problems is to ensure that the development and staging environments resemble production as closely as possible. Of course, it's not usually reasonable to request a duplicate set of systems, but using virtual machines it's possible to create a similar setup of servers with much less physical hardware.
At the minimum, making sure the development environment uses the same versions of PHP, database and webserver will remove a large number of variables, and shouldn't involve too much of an upset to an existing setup.
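A quick way to catch drift on the PHP side is a script that compares the running environment against what production is known to use. This is only a sketch: the expected version and extension list below are placeholder assumptions to be replaced with real values, the function name is invented, and it says nothing about the database or webserver versions, which would need their own checks.

```php
<?php
// Placeholder expectations, to be copied by hand from the production server.
$expected = [
    'php_version' => '8.1',          // major.minor expected in production
    'extensions'  => ['pdo', 'json'],
];

// Returns a list of human-readable differences; empty means no mismatch found.
function environmentMismatches(array $expected): array
{
    $problems = [];

    $running = PHP_MAJOR_VERSION . '.' . PHP_MINOR_VERSION;
    if ($running !== $expected['php_version']) {
        $problems[] = "PHP $running is running, production uses {$expected['php_version']}";
    }

    foreach ($expected['extensions'] as $extension) {
        if (!extension_loaded($extension)) {
            $problems[] = "Extension '$extension' is not loaded";
        }
    }

    return $problems;
}

foreach (environmentMismatches($expected) as $problem) {
    echo 'WARNING: ', $problem, PHP_EOL;
}
```

Run on a developer's machine or as part of a deployment check, this turns "works on my machine" differences into an explicit warning rather than a surprise in production.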
Load testing - There aren't that many ways to test load handling without throwing traffic at the real application on the real system. At the most basic level a tool such as apache bench (ab) or siege can make a number of connections to a server, while more advanced tools such as JMeter can simulate user journeys. There are also hosted services, which while not free have a good range of reporting and allow testing from outside the network.
The most important thing when using the advanced tools is to simulate realistic conditions. Just hitting the front page of a site if there are expected to be a large number of comment posts isn't going to test for a realistic write load. Additionally, assuming a 10 second wait time between pages might not make sense if the page will make new connections automatically with AJAX once it loads.
If you don't have a chance to test the real environment, then testing a preproduction or even a development system regularly will still show benefits. By comparing previous results you can pick up changes in relative performance, and so notice if the code suddenly gets a lot slower, or generates a lot more errors.
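The comparison itself can be very simple. The sketch below flags a run whose mean response time is noticeably worse than the previous one; the function names, the 20% threshold and the timings are all made up for illustration, and in practice the numbers would come from the output of whichever load testing tool is in use.

```php
<?php
// Mean of a list of response times in milliseconds.
function meanMs(array $timingsMs): float
{
    if (count($timingsMs) === 0) {
        throw new InvalidArgumentException('At least one timing is required.');
    }

    return array_sum($timingsMs) / count($timingsMs);
}

// True when the current run is slower than the previous one by more than
// $tolerance (0.2 = 20%). The threshold is an arbitrary illustrative choice.
function hasRegressed(array $previousMs, array $currentMs, float $tolerance = 0.2): bool
{
    return meanMs($currentMs) > meanMs($previousMs) * (1.0 + $tolerance);
}

// Made-up timings standing in for two load test runs.
$lastWeek = [100, 110, 90];
$today = [150, 160, 140];

echo hasRegressed($lastWeek, $today) ? 'Slower than last run' : 'Within tolerance', PHP_EOL;
```

A mean hides a lot, so a fuller version might compare error counts or slowest requests as well, but even this much is enough to notice when a change makes the code suddenly a lot slower.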
Cross Functional Teams - If it's feasible, involving everyone on whom the success of the project depends in its development can be a big aid in avoiding these kinds of errors. Having system administrators integrated into the process adds a whole new layer of knowledge to the mix, and will expose to them at least a hint of the problems and trials that occur during development. Some of these may have a solution at the system or hardware level which would never occur to a programmer deep in the weeds of development.
Looking at what kinds of bugs and errors occur might benefit the project in the long term just as much as fixing the bug itself. Storing the root cause of the fault with the bug report is achievable in many task tracking applications, and is worth considering adding to any development or maintenance process. Identifying the kinds of bugs that are most common in your projects gives a guide to which areas to focus on, for future development, for training, and for quality assurance.
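Once root causes are being recorded, finding the most common ones is a few lines of work. The sketch below assumes the labels have been exported from the tracker as a simple list using the four classes from this article; the data and the function name are invented.

```php
<?php
// Hypothetical export from a bug tracker: one root-cause label per closed bug.
$rootCauses = [
    'mechanical', 'mental', 'mechanical', 'environmental',
    'social', 'mechanical', 'mental',
];

// Counts each label and sorts so the most frequent cause comes first.
function rankRootCauses(array $rootCauses): array
{
    $counts = array_count_values($rootCauses);
    arsort($counts); // sort by count, descending, keeping the labels as keys

    return $counts;
}

foreach (rankRootCauses($rootCauses) as $cause => $count) {
    echo str_pad($cause, 15), $count, PHP_EOL;
}
```

If mechanical errors dominate, that points towards tooling such as syntax checking and code sniffers; a cluster of social errors suggests time is better spent on documentation and planning.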
Along the lines of the non-technical solutions mentioned in the introduction, there is likely to be a non-technical problem here as well. If programmers view the code defensively and possessively, or are stigmatised for bugs and errors, then they are less likely to be forthcoming when it comes to the causes, or even the existence, of those bugs. With regards to experience, there is simply no way of avoiding beginner bugs until those people aren't beginners anymore, and the best result is often to fail early and loudly. This means creating a working relationship where the bugs are not about blame, but about improvement.
For a real psychological study of mistakes, from which this article borrows rather imprecisely, the best reference is Human Error by James Reason, which covers the Generic Error Modeling System and contains a wealth of information on how and why people get things wrong.