In February and March we ran a bug hunt contest. We asked people to download a small application and tell us any issue they found. We promoted the contest at PHPBenelux and PHP UK conferences and it was open to all residents of the EU. We got a large set of responses and, besides being able to determine a winner, we were able to get some interesting analytic data from the contest entries. I will share those later in this post. But first things first…

The Winners

We had 3 categories:

  • Juniors (0-3 years of PHP experience)
  • Experienced developers (3-5 years) and
  • Seniors (5+ years)

Points are awarded for bugs found and/or fixed, where the amount of points ranges from 0.5 to 2 depending on how difficult it is to find something. In each category, the highest scoring contestant is the category winner and in the three categories, we’ve randomly selected (using a dice) who would get the Ipad and who would get the Dutch PHP Conference tickets.

The top 3 in each category:

Seniors

Position Name Score
1 Jani Hartikainen (Finland) 37
2 Joshua Thijssen (Netherlands) 28.5
3 Jeroen Keppens (Belgium) 28

Experienced

Position Name Score
1 Dominic Hull (United Kingdom) 26
2 David Winterbottom (United Kingdom) 22.5
3 Attilla Nyeste (Hungary) 21.5

Juniors

Position Name Score
1 Matt Haynes (United Kingdom) 20.5
2 Tom van Looy (Belgium) 17
3 Anthony van de Gejuchte (Belgium) 16.5

The dice determined that the winner of the Ipad is…. Jani Hartikainen! Congratulations Jani, your score indicates you are a clear winner; have fun with the Ipad! And Dominic and Matt, congratulations and have fun at the Dutch PHP Conference!

The Analysis

The contest gave some interesting insights. One was the clear difference between junior, mid-level and senior developers as can be seen in the results above. If we look further than the top three, then we can see that there are seniors that score lower than mid-level developers, and mid-level developers that score lower than junior developers, but on average, the quantity of bugs found reflected the skill level. This is why we had categorised the contest, to give less experienced developers an equal chance of winning.

Looking over the results, it is also clear that senior developers paid more attention to non-syntax related things, such as decoupling, mixed responsibilities, architecture, and consistency in error checking, where as the less experienced developers focused mostly on the syntactical or logical errors. This was expected, but it’s nice that the results confirm this. Interestingly some seniors tended to ignore some of the more detailed code bugs. Senior developers often look at the code from a higher level but it’s interesting to see that this sometimes means that code errors are overlooked.

A final interesting conclusion is that the contest demonstrates that code reviews work. None of the contestants found all bugs that we’ve hidden in the application (the maximum score to be achieved was 54, and even Jani only scored 37). Similarly, there was no bug that was caught unanimously. Every bug was missed by at least one reviewer. On the other hand, all of the contestants combined did find all of the bugs, and managed to find 5 bugs that we did not intentionally put in! This proves the old open source adagio “Given enough eyeballs, all bugs are shallow”.

The Bugs

Below is a list of all the bugs that were hidden in the application. I’ve also added a few comments where we received interesting responses. The percentages between parentheses indicate how many of the contestants found that particular bug. If you don’t remember the code, you can still download it.

HTML
The HTML in the application contained numerous bugs. There was no doctype/encoding (41%), the semantics were wrong since it used strong instead of h1 and a mix of strong/b tags (21%); also HTML and XHTML were mixed (38%). Some people argued that it would be better to use POST instead of GET (20%) for various reasons (Security was mentioned, performance (?), cleaner urls). Considering W3′s recommendations, GET is the better choice here. We however did award half a point to people with a good reason why it should be POST. Another thing people mentioned was that PHP and HTML code should be separated (40%) or even that PHP should be kept out of the document root (1 contestant). Finally the </html> tag was missing (50%).

Security
The security issues were rather obvious. There was no input validation (25%) and no output escaping (58%). There was little scope to abuse the input which explains the lower percentage there, still I had hoped more people would have mentioned output escaping.

Syntax errors/typos
There was a syntax error in one of the array definitions: “newbie”, “noob” instead of “newbie”=>”noob” (70%). The date format used “H:m:s” to display time; m is for month, i is for minutes. (“Ymd” was used in the same format string, still only 33% caught the bug). Line 61 contained an ‘assignment instead of comparison’ (= was used in stead of ==, 79%). Sneakily, the same line also had missing quotes around the constant ‘s’ (45%). In decryptString, 2 was encoded to ‘k’ which should’ve been ‘z’ (67%). Also in decryptString, the first bit of the function operated on the $string variable while the second bit used the $result variable, ignoring any changes made to $string (54%). Leetchar had a long list of if-statements checking the $result variable, but one of them used $return instead (42%).

There was one really cleverly hidden syntax error. One of the lines of code was:

$word = ($last !="s" and $pluralize ? $word."z" : $word);

One person mentioned that ‘and looks like an error but is valid php so this is a trick question’. While this is indeed valid php, there is another problem. ‘and’ and ‘&&’ have a different operator precedence. In fact, ‘and’ has lower precedence than the ‘?’ operator, causing $word to be assigned a boolean value instead of a string! I was glad to see however that 45% caught this error.

Finally, 4% of the contestants reported the missing ‘?>’ at the end of the file as a bug. This is however not a bug, and some coding standards even promote omitting the PHP end tag in plain code files.

Code Structure
This is an area where the seniors shone (which is why the percentages are a bit lower). The include_once in index.php should be require_once, because the code after it cannot function without the included file (12.5%). Two people mentioned that () should not be used because include is not a function but a statement. Although they are correct that the parentheses are not required, it’s not a bug if they are used and I personally prefer to use them for readability. There were various erroneous isset combinations (e.g. a variable was read before isset was called, some issets were used in places where the code couldn’t have reached if it wouldn’t have been set) (20-30%).

There was code duplication (“Don’t Repeat Yourself” wasn’t practiced fully) which a few people noticed (13%). There was also an inconsistent mix of error checking methods (magic return values, exceptions and booleans; 50%). Decrypt and encrypt functions used different methods to accomplish their goals (arrays versus function calls, local variables vs static variables) (29%). The LeetCrypt class has mixed responsibilities (10%). The few exceptions that were thrown were generally not caught properly (13%). One method returned -1 or -2 as an error, but since it was evaluated as a character (“-”) or a boolean, this doesn’t work properly. (38%).

Two people argued that the Leetify method should be removed because it was not used. In such a utility class it is however logical to assume that in a later version or in a different scenario, the method might be used, so it’s not a 100% clear case that the method should be removed. In general, unused code only obfuscates the codebase however, so the argument is valid.

Object Oriented Issues
Private and public methods were mixed randomly (8%), Private and Protected methods should follow a naming convention such as starting with an underscore (17%). Much debate revolved around whether or not certain things should be static or not. Some argued everything should be static (1 contestant), some argued nothing should be static (12%), some had some more nuance in their opinion. In general, you can say that if a method is independent of an instance, it should be static. For properties, if they are shared across instances and not change value between instances, they should be static. Someone mentioned ‘this method shouldn’t be static because it’s only called by instances’. This is only partly correct. Even if only instances call the method, the method might still be instance independent and can still be declared static for good reasons. Remember also that everything that is not static consumes space in memory as every instance repeats it.

A few people mentioned that protected should be used instead of private unless there is a good reason (8%). While this is convenient if you are certain that your methods will be extended some point, there are equal reasons to use private; encapsulation is a good reason, sometimes you don’t want your derived classes to care about the internals of the class they extend. And if it’s not library code, you can easily make a private function protected later on if you need it.

Someone mentioned that a constructor was lacking. This is however not a bug, a constructor is not a requirement. Opinions on whether or not you should have one anyway vary.

Functional Errors
There were some bugs in the code that were not caused by typos or syntax errors but that are simply incorrect behavior. The str_replace in encryptString caused substrings to be replaced as well (29%), sentences with punctuation were not processed properly (20%). The algorithm isn’t symmetric: “2 beer” would be encrypted to “2 b33r” but then a decrypt would lead to “z beer”. (21%). Several contestants mentioned the code was not compatible with Unicode strings (8%). Only one person mentioned that LeetCrypting is not a viable encryption algorithm, which is of course indeed the case.

One interesting functional error had to do with the ascii range check of the character ($char > 128). Many people discovered that there was a missing ord() call around $char (62%), only very few people noticed that 128 itself is higher ascii, so it should’ve been >= 128 at least (8%). Many people debated that this check was pointless (40%) which is a valid argument, it did nothing to enhance the functionality of this application.

In the index.php, hardly anybody noticed that in the decrypt input box, the unencrypted string was displayed instead of the encrypted one (13%). Finally, the strstr check near the end was supposed to check for ‘crypt’, but strstr matches ‘decrypt’ as well, so this check did not work. (62%)

Best Practices
There are a few things that can be said about the application that are not necessarily bugs, but violations of commonly applied best practices. For starters, the inline documentation was abysmal (46%). Also, coding standards were not consistently applied: indenting was 2 or 4 spaces, character padding varied, braces were sometimes on the same, sometimes on the next line, method camelCasing was inconsistent etc. (50%).

Performance
Although you cannot easily classify performance issues as bugs if you don’t know an application’s performance requirements, we have awarded points for the various optimisation suggestions we received. These ranged from calling strlen() inside a loop (8%) via some refactoring that made the code go from O(n) to O(1) complexity (2 contestants), to microoptimizations such as using echo ‘a’, ‘b’ instead of echo ‘a’.’b’ (7%).

Errors in the Unit Tests
Most contestants argued that the testcases did not have proper coverage (71%). This was very clear, because the app contains many bugs, even though all provided testcases pass. Some people mentioned that it’s a bad idea to test 2 features in a single test (29%). One person had such a keen eye that he noticed that $actual and $expected parameters were mixed up (this was unintended :)).

Bonus Points
And that covers most of the bugs that were in there! We’ve awarded some bonus points for people that had not only identified bugs, but also provided fixes. Some people even provided completely rewritten, highly optimized, super quality LeetCrypt implementations.

The Next Challenge

We had a lot of fun organizing this challenge, so we decided to do a few more this year. Keep an eye on http://www.ibuildings.com/challenge, the next one is closer than you think …