We recently wrapped up the Ibuildings Elephpant Challenge, a contest where a PHP Elephpant traveled around the world visiting a given set of landmarks. Contestants had to write a script that calculated the shortest route for the Elephpant. The contest has several side goals: entries were not not only validated against the given landmarks and a second secret set of landmarks, they also were scored based on their performance, code complexity and code size.
In this post, the winners will be announced and we will make some observations based on the contest.
The shortest route based on the given CSV file is 41799 kilometers and 231 meters, calculated with the Haversine Formula. The route is: Big Ben; Area 51; Zendcon; Golden Gate Bridge; Sydney Opera; Harbour, Tokyo; Mount Everest; Taj Mahal, Eiffeltower; Elephpant Heaven.
The secret CSV file we used can be found here. The shortest route now is 48701 km and 353 meters (Sheffield; php|tek; Mexico City; Apple Cupertino; Vancouver; Hawaii; Chinese Wall; Cape Town; Tower of Pisa; Manneke Pis).
Let us have a look at the winners. We had 3 categories; Junior developers, Medium developers and Senior developers. The contest was apparently harder than we anticipated because none of the junior candidates managed to find the correct distance. Two candidates came really close (Elias Sorensen and Matthew Haynes at 42089 and 42090 km on the initial CSV respectively).
In the medium category, there were a few candidates that had the initial CSV right, but only one candidate that also had the correct solution to the secret CSV file. Since we only score on performance, complexity and code size among the candidates with the right solution (see further down for a few observations on this), this person automatically became the winner:
Category Winner Medium: Andy Thompson
Andy’s average performance on 10 landmarks was 333.79 seconds and he used 38 lines of code.
In the senior category, we had many more correct solutions and a variety of algorithms. The top 5 entries with the correct routes:
|Contestant||Performance / Complexity / Lines of Code||Total score|
|1. Michiel Brandenburg||50.25 seconds / 14 / 46||26|
|2. Remi Woler||11.05 seconds / 19 / 125||23|
|3. Joris van de Sande||69.91 seconds / 14 / 48||22|
|4. Ben Spencer||19.98 seconds / 28 / 61||20|
|5. Bastian Gorke||68.99 seconds / 30 / 164||14|
Among the 2 category winners we raffled an iPad and a ticket to the Dutch PHP Conference. Michiel won the iPad, Andy won the DPC ticket.
We made a few observations when evaluating the contest and we thought we’d share them.
Adhering to specifications
We automated the test so that we could easily produce results. To do this we had very specific requirements; the script should be called ‘contest.php’, accept 1 command line input parameter indicating the input CSV file, and required the output to be a list of landmark names, one on each line.
It is amusing to see that we received scripts called index.php or ibuildings.php, scripts that had CSV output, and we even received two scripts that given a CSV file of 10 landmarks, would produce a route that would only visit 9 of them. True; elephpants are picky when it comes to choosing places to visit, but the elephpant was required to visit them all. This should not be a problem for an elephpant, especially with landmarks such as Elephpant Heaven in the list; interesting that nobody wondered what Elephpant Heaven is. You will see what we meant when you’re around; it’s not far from the Dutch PHP Conference venue.
We noted that Remi was almost twice as fast as Ben, and 5 times faster than Michiel, but complexity and code size were just as important and since Michiel had top score in both categories, he became the overall winner. When I look at the various entries, it’s striking that almost everybody chose to optimize performance as if that were the main goal; I guess it’s more fun to focus on performance. Some candidates used really really fast algorithms (producing results at around 0.24 seconds for both CSV files!), but those algorithms did not guarantee accurate results (ranging from ‘close’ at 42090km to ‘very wrong’ at 76536km. In a real world situation those algorithms might be the best choice if performance is essential and accuracy is not required. In our case however we mentioned that winners must produce the correct result; a requirement like that should be key in selecting an approach to any software problem.
The reason we used a secret CSV file to validate the entries was to prevent optimizations that were aware of the route; this proved to be useful since there were indeed contestants that had a very fast correct solution but only because they had knowledge about the landmarks and had for example ‘started their elephpant in a certain direction’. These contestants fell through on the secret CSV file, where they were still performing very fast but not producing the shortest route.
The Traveling Salesman
Most contestants had recognized immediately that this problem is essentially the ‘Traveling Salesman Problem’, which is used in many computer science courses to teach algorithms and optimization.
We had entries from every country in Europe, still it’s intersting to note that the top 3 were all Dutch candidates. Maybe the Dutch have more traveling salesmen?
Running a contest can be hard!
One observation was that we completely underestimated the time it would take to get the results. While some had written very efficient algorithms, others had chosen brute force attacks that ran for quite a while. Since we wanted to get accurate timing results, we ran scripts multiple times and calculated averages. After the first run had run for 3 days and still only had 25% of the entries processed, I changed the script to not run multiple times if the initial result wasn’t under 3 minutes. It eventually still took my test server a few days to go over all the entries.
Although we did not state how many landmarks there would be in the secret CSV file, a larger file could have made the contest virtually impossible. To illustrate how brute force solutions are very difficult to scale: a script that would run for 3 minutes on 10 landmarks could run for 3 hours with just 2 landmarks more.
Another difficulty was the lines of code count and the complexity numbers. We used proper tools to generate the numbers, but the lines of code counter we used would ignore files called .inc and the complexity calculator would ignore code not in a function and so on. So we had to fiddle quite significantly to get accurate numbers there. A lesson for future contests.
For us as organizers, this was a fun challenge, and seeing the buzz on Twitter surrounding the contest it certainly seems the contestants had fun as well.
Let me finish up by saying: Winners: congratulations! And everybody else, thanks for participating, I hope you had fun!
In the meantime the next challenge, dubbed the Test Driven Challenge, has been launched; develop a class that satisfies a given set of testcases. This new challenge will run until June 30 and the prize is once more an iPad. This one is slightly easier than the previous challenge, to give the junior and medium developers another shot at the iPad!