Creating an ORM for PHP is not an everyday task, but writing one is a good way to improve your PHP skills, especially if you use some of the additional features PHP 5.3 adds to the language. There are many excellent ORMs (Object Relational Mappings) already in existence and for a real-world project it would probably be better to use one of these, but this tutorial uses the task of creating an ORM as a way to look at applications for some PHP 5.3 features.

Getting started

We will start out by creating a database schema which we want to access with the ORM. To keep things simple we use the MySQL sample database Sakila, which can be downloaded from the MySQL website. The Sakila database is designed to represent a DVD rental store and we will use the film and actor tables for this tutorial. More information about the Sakila database, including installation instructions, can be found in its documentation. For this exercise you will need to install both the schema and the data from the Sakila database.

The next thing we need is a way to access our database. Because we are going to concentrate on writing an ORM and not a complete database layer we will use something that might be already familiar to most people, Zend Framework. Zend Framework already ships with its own ORM, Zend_Db_Table, but we ignore that for now and simply make use of the Zend_Db adapters to access the database.

Because we want to concentrate on the ORM and nothing else we are not going to create a full-fledged MVC application. Instead we will create a simple PHP script that can be run on the command-line or in the browser which interacts with our ORM. I recommend creating the following directory structure:

The Zend directory should contain a copy of Zend Framework (1.9.6 was used to write this tutorial, but somewhat older versions should work just as well). The ORM directory will contain the sources for our ORM. The tutorial.php file will be used to run some of the example code given in this tutorial. Before we begin we will initialize tutorial.php with the following content:

Defining our data layer

First we need to think of a way to communicate with the database. As mentioned earlier we are going to use Zend_Db to access the database; however, we can still add a little abstraction which makes it easier to switch to another database layer or maybe even a web-service in the future. We will start out by defining an interface for data retrieval:

Our interface so far consists of two methods, find and count, which accept an object name (in the case of a database this will be the table name) and optional search criteria. The find method also has optional parameters to specify the order in which the data should be sorted and to limit the amount of data returned, starting at a given offset.
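A minimal sketch of what such an interface could look like is given below. Only the namespace and the method names find and count come from the text; the parameter names and default values are assumptions made for illustration.

```php
<?php
namespace ORM\DataSource;

/**
 * Read-only access to a data source (sketch; parameter names are assumed).
 */
interface DataSource
{
    /**
     * Return the rows of $name that match $criteria, as an array of
     * associative arrays. $order, $limit and $offset are optional.
     */
    public function find($name, array $criteria = array(), $order = null, $limit = null, $offset = 0);

    /**
     * Return the number of rows of $name that match $criteria.
     */
    public function count($name, array $criteria = array());
}
```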

In this tutorial we keep the criteria a simple key/value array, which means we can only perform exact matches and, with a little help, some simple wildcard searches. This is not exactly ideal but makes a good example for this tutorial; if you were really going to develop your own ORM you would probably want to use a more complex structure to specify your criteria. Some ORMs (like Zend_Db_Table) let you specify the criteria using where clauses and bind parameters; the downside to this approach is that you tie your code to the database platform.
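To make the idea of key/value criteria concrete, here is a small sketch of how a single row could be tested against them. The helper rowMatches is hypothetical, and treating "%" as a wildcard (as in SQL LIKE) is an assumption; the row is sample data in the shape of the Sakila film table.

```php
<?php
/**
 * Check one row against key/value criteria (hypothetical helper).
 * Values containing "%" are treated like a SQL LIKE pattern (an assumption);
 * all other values must match exactly.
 */
function rowMatches(array $row, array $criteria)
{
    foreach ($criteria as $column => $expected) {
        if (!array_key_exists($column, $row)) {
            return false;
        }
        $expected = (string) $expected;
        $actual = (string) $row[$column];
        if (strpos($expected, '%') !== false) {
            // Escape regex characters, then turn each % into "any characters".
            $pattern = '/^' . str_replace('%', '.*', preg_quote($expected, '/')) . '$/i';
            if (!preg_match($pattern, $actual)) {
                return false;
            }
        } elseif ($actual !== $expected) {
            return false;
        }
    }
    return true;
}

$film = array('film_id' => 1, 'title' => 'ACADEMY DINOSAUR', 'rating' => 'PG');

var_dump(rowMatches($film, array('rating' => 'PG')));       // bool(true)
var_dump(rowMatches($film, array('title' => 'ACADEMY%')));  // bool(true)
var_dump(rowMatches($film, array('title' => 'ACADEMY')));   // bool(false)
```

A database-backed data source would translate the same array into a WHERE clause instead of filtering in PHP.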

If you are new to PHP 5.3, you might have noticed something new in the code example above. We added a namespace declaration for the namespace ORMDataSource at the top, which helps us avoid name collisions. Our interface DataSource now cannot conflict with other classes with the same name since these are not part of the same namespace, something that is far less likely than if we had defined it in the global namespace. In PHP namespaces need not map 1:1 to the filesystem but, just like having one class per file for autoloading, it does make things more clear. We will save the interface inside library/ORM/Datasource/DataSource.php inside our application structure. In this tutorial you can assume a similar mapping for other interfaces and classes we are going to develop, the file locations will not be repeated each time.

Now that we have defined a way to retrieve data from the database, what about writing data to the database? For this we define another interface:

First you might wonder why these methods were not added directly to the DataSource interface. Keeping them separate makes us more flexible, because now we can also support read-only data sources. The methods in this interface are probably self-explanatory by their name and parameters, but you might wonder why we have defined the $data parameter as a reference for the add and update methods. By doing so the data source can communicate changes to the data back to our ORM. For example, if our data source uses a database and we insert a new row into a table with an auto-increment column, we can communicate the new primary key value back to the ORM. For the update method this can come in handy for timestamp columns or columns whose values might be changed by triggers.
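The effect of passing $data by reference can be sketched as follows. The interface name Writable, the delete method, the method signatures and the array-backed ArrayStore class are all assumptions for illustration; only the add and update method names and the by-reference $data parameter come from the text.

```php
<?php
namespace ORM\DataSource;

/** Write access (sketch; the interface name and delete() are assumptions). */
interface Writable
{
    public function add($name, array &$data);
    public function update($name, array $criteria, array &$data);
    public function delete($name, array $criteria);
}

/** Toy array-backed implementation, only to show the by-reference effect. */
class ArrayStore implements Writable
{
    private $rows = array();
    private $nextId = 1;

    public function add($name, array &$data)
    {
        // Simulate an auto-increment column and report it back to the caller.
        $data['id'] = $this->nextId++;
        $this->rows[$name][$data['id']] = $data;
    }

    public function update($name, array $criteria, array &$data)
    {
        // Simulate a timestamp column that the database maintains itself.
        $data['last_update'] = date('Y-m-d H:i:s');
        $this->rows[$name][$criteria['id']] = $data;
    }

    public function delete($name, array $criteria)
    {
        unset($this->rows[$name][$criteria['id']]);
    }
}

$store = new ArrayStore();
$actor = array('first_name' => 'JANE', 'last_name' => 'DOE');
$store->add('actor', $actor);

echo $actor['id'], "\n"; // 1, written back through the reference
```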

Finally we declare one more interface:

As you might have guessed already, the purpose of this interface is to support transactions in our ORM. For a database this will be a database transaction; a web-service may not support transactions at all. By defining it as a separate interface a data source is not required to implement it and we can simply check for it in our ORM.
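A sketch of such an optional interface, together with the check the ORM can perform, is shown below. The interface name Transactional, its method names, the two example classes and the supportsTransactions helper are assumptions made for illustration.

```php
<?php
namespace ORM\DataSource;

/** Optional transaction support (sketch; all names are assumptions). */
interface Transactional
{
    public function beginTransaction();
    public function commit();
    public function rollBack();
}

/** Hypothetical data source without transaction support. */
class WebService
{
}

/** Hypothetical data source with transaction support (empty stubs). */
class Database implements Transactional
{
    public function beginTransaction() {}
    public function commit() {}
    public function rollBack() {}
}

/** The ORM only needs an instanceof check to find out what is possible. */
function supportsTransactions($dataSource)
{
    return $dataSource instanceof Transactional;
}

var_dump(supportsTransactions(new Database()));   // bool(true)
var_dump(supportsTransactions(new WebService())); // bool(false)
```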

We have created all the interfaces we need, and now we move on to create an implementation of a data source (which uses Zend_Db) because we are going to need it for the more interesting parts of this application that follow:

We reference Zend_Db_Adapter_Abstract as \Zend_Db_Adapter_Abstract, with a leading backslash, because it is a class in the global namespace and our Db class is defined in the separate ORM\DataSource namespace. Beyond that notable feature, hopefully the rest of the code made sense as you read through it; we will now use this as the basis for the rest of the application.
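The leading backslash can be tried out without Zend Framework. In the sketch below the built-in \ArrayObject class stands in for \Zend_Db_Adapter_Abstract (a substitution made only so the example runs on its own), and the Db class is reduced to its constructor.

```php
<?php
namespace ORM\DataSource;

class Db
{
    protected $_adapter;

    // \ArrayObject stands in for \Zend_Db_Adapter_Abstract here. Without the
    // leading backslash PHP would look for ORM\DataSource\ArrayObject.
    public function __construct(\ArrayObject $adapter)
    {
        $this->_adapter = $adapter;
    }
}

$db = new Db(new \ArrayObject());

var_dump(class_exists('ORM\DataSource\ArrayObject')); // bool(false)
var_dump(class_exists('ArrayObject'));                // bool(true)
```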

Creating the base class for our objects

Now that we have a way to retrieve and store data we can move on to defining our object base class. We start small by creating only the methods for setting and retrieving the data source for our object. Later on we will keep adding methods to this class until it is complete. Let us think for a minute about a good name for our base class… We are building an Object Relational Mapping, so we can simply call our base class Object. With the proper use of namespaces such a generic name should not cause a problem (the ORM namespace could be lacking distinction but for a tutorial it suits our needs).

The code above has a few elements that bear further explanation. First, you might have noticed the line use ORM\DataSource\DataSource just after the namespace declaration. We could also have written use ORM\DataSource\DataSource as DataSource here. This is called namespace aliasing or importing; in other words, we say that everywhere you read DataSource we actually mean ORM\DataSource\DataSource. We could also have used a different alias for the DataSource class, for example to prevent a naming collision with another class of the same name in the current namespace. This is an important feature of namespaces: we can refer to different classes with the same name using different aliases in the same block of code, and we no longer need to use the fully qualified name of our class everywhere (e.g. \ORM\DataSource\DataSource). An example of this can be seen in the declaration of the setDataSource method. Without aliasing, this method would have been declared as public static function setDataSource(\ORM\DataSource\DataSource $dataSource).
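The sketch below shows the import in a form that runs as a single file. It is not the tutorial's original listing: the base class is called Record instead of Object because object is a reserved word as of PHP 7.2, NullSource is a hypothetical stand-in for a real data source, and both namespaces share one file only to keep the example self-contained.

```php
<?php
namespace ORM\DataSource;

interface DataSource
{
}

/** Hypothetical stand-in for a real data source. */
class NullSource implements DataSource
{
}

namespace ORM;

// Import: from here on "DataSource" means ORM\DataSource\DataSource.
use ORM\DataSource\DataSource;

abstract class Record
{
    protected static $_dataSource;

    public static function setDataSource(DataSource $dataSource)
    {
        static::$_dataSource = $dataSource;
    }

    public static function getDataSource()
    {
        return static::$_dataSource;
    }
}

Record::setDataSource(new \ORM\DataSource\NullSource());
var_dump(Record::getDataSource() instanceof DataSource); // bool(true)
```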

A second thing which may be unfamiliar is the use of the keyword static inside the method bodies of the getDataSource and setDataSource methods. Prior to PHP 5.3 only the self keyword was available to access static class properties; PHP 5.3 adds the static keyword for this purpose, a feature known as late static binding. Beware though, there is an important difference between the two keywords! The existing self keyword always resolves to the class in which the code was originally declared, whereas the new static keyword resolves to the class the method was actually called on, regardless of whether the code was declared there or inherited from some parent. To get a feel for how these two keywords differ, try changing them around in your working example to observe their behavior first-hand.
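The difference between the two keywords is easiest to see side by side. The classes and method names in this sketch are made up for the comparison and are not part of the ORM.

```php
<?php
class Record
{
    protected static $_name = 'record';

    public static function nameViaSelf()
    {
        return self::$_name;     // always Record::$_name
    }

    public static function nameViaStatic()
    {
        return static::$_name;   // $_name of the class the call was made on
    }
}

class Film extends Record
{
    protected static $_name = 'film';
}

echo Film::nameViaSelf(), "\n";   // record
echo Film::nameViaStatic(), "\n"; // film
```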

We can add the following lines to tutorial.php to set the default data source for our objects:

We will also need to know which table or resource our data is coming from, so we will add some extra methods to the Object class to make this possible. In the next code examples I will illustrate changes to the Object class by redeclaring it and only defining the new methods; in your own code you can simply add these methods to the existing class.

Inside the _getName method we are again using the static keyword to make sure we refer to the class on which the method is called. To set the resource name for an ORM class you can simply redeclare the static $_name property and set the resource name; the _getName method will automatically pick it up. If you don’t set a value for this property, we fall back to the get_called_class function. This function uses late static binding to see which class the method was called upon and returns the class name as a string. So if we don’t define a custom $_name property we simply assume our resource has the same name as our class.
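A sketch of this fallback is shown below. The base class is called Record here because object is a reserved word as of PHP 7.2; the public resourceName wrapper and the Performer class are hypothetical and exist only so that the result can be printed.

```php
<?php
abstract class Record
{
    protected static $_name = null;

    protected static function _getName()
    {
        if (static::$_name !== null) {
            return static::$_name;   // explicitly configured resource name
        }
        return get_called_class();   // fall back to the name of the called class
    }

    // Hypothetical public wrapper, only so the result can be shown here.
    public static function resourceName()
    {
        return static::_getName();
    }
}

class Film extends Record
{
}

class Performer extends Record
{
    protected static $_name = 'actor';
}

echo Film::resourceName(), "\n";      // Film
echo Performer::resourceName(), "\n"; // actor
```

Note that get_called_class returns the class name exactly as declared, including any namespace, so a real implementation may want to normalise it before using it as a table name.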

Defining our read API

Now that we have defined the basics for our Object class we can start thinking about its read API. Reading is something we are probably going to do a lot (most applications do much more reading than writing), so we will need a good API. Something that would allow for the following code would be nice:

Implementing the find methods is not too hard since the most difficult part is already taken care of by the data source class. In this example however we are also referencing columns as object properties, so we need to implement that as well. And what kind of object is $film actually? We will now take another look at the code:

We implement some of the magic methods in PHP (__get, __isset) in order to make the internal data available as object properties. The findFirst method acts as a wrapper method around the find method which simply adds a limit and only returns the first found object. The find method is probably the most interesting method. The actual search takes place inside the data source instance. The results are converted to objects that are instances of our Object class. Or, to be more correct, instances of our Object subclass. Again, thanks to late static binding, we can simply instantiate objects of the class these methods are called upon.
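The pieces described above can be sketched as follows. This is not the tutorial's original listing: the base class is called Record (object is a reserved word as of PHP 7.2), ArraySource is a hypothetical in-memory stand-in for the Zend_Db data source, and the two rows are sample data in the shape of the Sakila film table.

```php
<?php
/** Toy read-only source holding rows in memory (hypothetical stand-in). */
class ArraySource
{
    private $tables;

    public function __construct(array $tables)
    {
        $this->tables = $tables;
    }

    public function find($name, array $criteria = array(), $order = null, $limit = null, $offset = 0)
    {
        $rows = array();
        foreach ($this->tables[$name] as $row) {
            // Exact-match criteria only; $order is ignored in this sketch.
            if (array_intersect_assoc($criteria, $row) == $criteria) {
                $rows[] = $row;
            }
        }
        return array_slice($rows, $offset, $limit);
    }
}

abstract class Record
{
    protected static $_dataSource;
    protected static $_name = null;
    protected $_data = array();

    public function __construct(array $data = array())
    {
        $this->_data = $data;
    }

    public static function setDataSource($dataSource)
    {
        static::$_dataSource = $dataSource;
    }

    public static function find(array $criteria = array(), $order = null, $limit = null, $offset = 0)
    {
        $name = static::$_name !== null ? static::$_name : get_called_class();
        $objects = array();
        foreach (static::$_dataSource->find($name, $criteria, $order, $limit, $offset) as $row) {
            $objects[] = new static($row); // late static binding: the called subclass
        }
        return $objects;
    }

    public static function findFirst(array $criteria = array(), $order = null)
    {
        $objects = static::find($criteria, $order, 1);
        return $objects ? $objects[0] : null;
    }

    public function __get($name)
    {
        return isset($this->_data[$name]) ? $this->_data[$name] : null;
    }

    public function __isset($name)
    {
        return isset($this->_data[$name]);
    }
}

class Film extends Record
{
    protected static $_name = 'film';
}

Record::setDataSource(new ArraySource(array('film' => array(
    array('film_id' => 1, 'title' => 'ACADEMY DINOSAUR'),
    array('film_id' => 2, 'title' => 'ACE GOLDFINGER'),
))));

$film = Film::findFirst(array('title' => 'ACE GOLDFINGER'));
echo get_class($film), ' #', $film->film_id, "\n"; // Film #2
```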

Defining our write API

Creating new rows can be as simple as instantiating our Object subclass. We already do this for the rows we find with the find methods, so nothing new there. To support the altering of values we can simply implement some additional PHP magic methods (__set and __unset). We will also want a way to save the changes in the row. It would be nice if we could write something like this:

Our Object class needs updating before the code shown above will work, because it cannot yet save the changed values of an object. The new code should be something like this:

First of all, a new property, $_isNew, has been added, in which we track whether an object already exists in the data source or is new. Both the constructor and the find method have been modified to support this. Rows that are instantiated by creating a new instance of our Object subclass are always considered new; rows found using the find methods are always considered not new. We implemented the __set and __unset methods to support modifying the object’s properties and we implemented a save method which uses the data source to either add or update the row. Also noteworthy is that we check if the data source supports writing and throw an exception otherwise, and that we keep track of the data source at the instance level. The latter allows us to change the data source at the class level after we have retrieved some rows, but still save the changes to those rows in the data source they originate from.
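A compact sketch of this write path is given below. As before, the base class is called Record rather than Object (a reserved word as of PHP 7.2), and the Writable interface, the ArrayStore class, the public $dataSource property and the primary key column called id are assumptions made to keep the example self-contained.

```php
<?php
interface Writable
{
    public function add($name, array &$data);
    public function update($name, array $criteria, array &$data);
}

/** Toy writable store (hypothetical), keyed by an auto-increment "id". */
class ArrayStore implements Writable
{
    public $rows = array();

    public function add($name, array &$data)
    {
        $data['id'] = count($this->rows) + 1;
        $this->rows[$data['id']] = $data;
    }

    public function update($name, array $criteria, array &$data)
    {
        $this->rows[$criteria['id']] = $data;
    }
}

abstract class Record
{
    public static $dataSource;   // class-level default (public for brevity)
    protected $_dataSource;       // the source this instance belongs to
    protected $_data = array();
    protected $_isNew;

    public function __construct(array $data = array(), $isNew = true)
    {
        $this->_data = $data;
        $this->_isNew = $isNew;
        $this->_dataSource = static::$dataSource;
    }

    public function __get($name)
    {
        return isset($this->_data[$name]) ? $this->_data[$name] : null;
    }

    public function __set($name, $value)
    {
        $this->_data[$name] = $value;
    }

    public function __unset($name)
    {
        unset($this->_data[$name]);
    }

    public function save()
    {
        if (!$this->_dataSource instanceof Writable) {
            throw new RuntimeException('Data source is read-only');
        }
        if ($this->_isNew) {
            $this->_dataSource->add(get_class($this), $this->_data);
            $this->_isNew = false;
        } else {
            // Assumes a primary key column called "id".
            $criteria = array('id' => $this->_data['id']);
            $this->_dataSource->update(get_class($this), $criteria, $this->_data);
        }
    }
}

class Actor extends Record
{
}

Record::$dataSource = $store = new ArrayStore();

$actor = new Actor(array('first_name' => 'JANE'));
$actor->save();              // new object: add()
$actor->last_name = 'DOE';
$actor->save();              // existing object: update()

echo $actor->id, ' ', $store->rows[1]['last_name'], "\n"; // 1 DOE
```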

Spice things up

To spice things up there are two more things I want to add to this system. First of all we should revisit our read API. Those find methods are fit for purpose; however, it would be nice to have an alternative to passing an array with key/value pairs for the criteria. It would be nice if we could simply do this:

Supporting this new notation is not too complicated when we make use of more magic methods. PHP 5 gave us __call, PHP 5.3 gives us __callStatic; combine this with a little regex “magic” and you get the following code:
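One way to combine __callStatic with a regular expression is sketched below. The method naming scheme (findBy... and findFirstBy...) and the conversion from FirstName to first_name are assumptions, and find is reduced to a stub that returns what it was asked for so that the dispatch can be inspected without a data source.

```php
<?php
abstract class Record
{
    public static function find(array $criteria = array(), $order = null, $limit = null, $offset = 0)
    {
        // Stub: a real implementation would query the data source.
        return array('class' => get_called_class(), 'criteria' => $criteria, 'limit' => $limit);
    }

    public static function findFirst(array $criteria = array(), $order = null)
    {
        return static::find($criteria, $order, 1);
    }

    public static function __callStatic($method, $arguments)
    {
        if (!preg_match('/^(find|findFirst)By([A-Z]\w*)$/', $method, $matches)) {
            throw new BadMethodCallException("Unknown method $method");
        }
        // FirstName -> first_name (assumed naming convention for columns).
        $column = strtolower(preg_replace('/(?<!^)[A-Z]/', '_$0', $matches[2]));
        // The first argument is the value to search for; any remaining
        // arguments (e.g. $order) are passed through unchanged.
        $criteria = array($column => array_shift($arguments));
        array_unshift($arguments, $criteria);
        return call_user_func_array(array(get_called_class(), $matches[1]), $arguments);
    }
}

class Actor extends Record
{
}

print_r(Actor::findFirstByLastName('GUINESS'));
```

The call is forwarded to Actor::findFirst with the criteria array('last_name' => 'GUINESS'), so the stub reports the class Actor and a limit of 1.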

As a final feature for our system I would like to demonstrate an implementation of transaction support. All we need to do is introduce a single new method called transaction:

The transaction method takes one parameter, $callback, which can either be an anonymous function or some other kind of callback. If the data source doesn’t support transactions we simply execute the callback and return. If the data source does support transactions we start a transaction, call the callback inside a try/catch statement and commit the transaction afterwards. In case of an exception we roll back the transaction and re-throw the exception. As you can see, we invoke the callback using the call_user_func function. We could also have written $callback() here, but by using call_user_func we also keep the ability to support other types of callbacks, e.g. array($object, 'method'). Here is an example of our new transaction method in action:
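The flow described above can be sketched without a database. The Transactional interface, its method names, the snapshot-based ArrayStore, the public $dataSource property and the decision to return the callback's result are assumptions; the base class is called Record because object is a reserved word as of PHP 7.2.

```php
<?php
interface Transactional
{
    public function beginTransaction();
    public function commit();
    public function rollBack();
}

/** Toy transactional store (hypothetical): rollBack restores a snapshot. */
class ArrayStore implements Transactional
{
    public $rows = array();
    private $snapshot = array();

    public function beginTransaction() { $this->snapshot = $this->rows; }
    public function commit() { $this->snapshot = array(); }
    public function rollBack() { $this->rows = $this->snapshot; }
}

abstract class Record
{
    public static $dataSource;

    public static function transaction($callback)
    {
        $dataSource = static::$dataSource;
        if (!$dataSource instanceof Transactional) {
            // No transaction support: just run the callback.
            return call_user_func($callback);
        }
        $dataSource->beginTransaction();
        try {
            $result = call_user_func($callback);
            $dataSource->commit();
            return $result;
        } catch (Exception $e) {
            $dataSource->rollBack();
            throw $e;
        }
    }
}

Record::$dataSource = $store = new ArrayStore();

try {
    Record::transaction(function () use ($store) {
        $store->rows[] = array('first_name' => 'JANE');
        throw new RuntimeException('constraint violation');
    });
} catch (RuntimeException $e) {
    echo count($store->rows), "\n"; // 0: the insert was rolled back
}
```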

This code creates a new actor row which it tries to add to the actors list for the film “RANDOM”. This film doesn’t exist in the database so we are actually assigning null to $link->film_id. This causes a database constraint violation which will result in an exception. Because this code gets executed wrapped in a transaction both the actor and the film_actor rows will be rolled back. Without the transaction logic, the actor row would still exist in the database after our block of code has finished, but without the link to the film.

Some final thoughts

In this tutorial we created our own ORM to look at some of the new language features PHP 5.3 introduced. These features allow us to explore new kinds of APIs that were previously very difficult or impossible. Having all these features available doesn’t necessarily mean you should use them all but understanding their applications is useful.

As mentioned at the very start of the article, if you need an ORM then there are plenty to choose from, and the one outlined here certainly has some limitations. A more fully featured ORM would also include support for things like:

  • Validations; we don’t have any validations at the moment, we don’t check for mandatory properties, don’t check for uniqueness, in fact we don’t even check if the properties we are referring to actually exist.
  • Relationships; we don’t support any type of relationships. A good ORM should support one-to-many, many-to-one, one-to-one and many-to-many relationships.
  • Performance; our search API fetches all rows at once, in an ideal situation the rows would be fetched one by one on demand. Or better yet, support both.
  • Criteria; our API only supports simple search criteria, far from sufficient for real-life situations. The challenge here is to support complex criteria without exposing too much of the database layer.

Having said all that, writing your own ORM can be a great learning experience, especially if you take the opportunity to get to know some of the features in the language that aren’t currently part of your day-to-day use of it. A project such as building an ORM can be a lot of fun… so don’t be surprised if one day you notice the release of Yet Another PHP ORM, unless of course you release it first!

The code for this tutorial can be downloaded here. Please note that Zend Framework is not included in the download – simply add it to the library directory of the project.