In this world of sharing data, increasing numbers of sites and applications are making information available over web services. Whether we are building a service as a feature of our own development, or pulling in the information published by others, we will need to understand the different service types and how to work with them in PHP. This article aims to give you the tools to do just that.

Starting at the Beginning: HTTP

HTTP (Hypertext Transfer Protocol) is the language of the web, the communication channel over which we send our data. As web developers, we often don’t need to notice it is there, but for web services there are some bits and pieces we should be aware of, so let us take a few moments to refresh our memories.

When we surf the Internet, we make a series of requests, and receive responses in return containing the data we asked for. There is nothing special about web services; they work in exactly the same way except that the data in the response is usually marked up for machine consumption rather than for a browser. When we use a browser, we are aware of the URL that we are requesting and we know that forms send additional POST data. We also observe the HTML response, by viewing the source of the rendered page in the browser (or using cURL, if you’re really hardcore!).

Most web developers are also aware that there are other headers available, controlling cookies, caching and so on, but for web services these form an integral part of the information that is getting sent to and fro between client and server. We can observe the headers in a number of ways; probably the simplest is to use cURL from the command line to watch the traffic. Here’s an example of a request to the Google homepage:
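Once a raw response head is in hand, pulling it apart from PHP is straightforward. The following is a minimal sketch only: the response text is a hand-written illustration rather than captured output, and parseResponseHead() is a hypothetical helper name, not part of PHP or cURL.

```php
<?php
// Illustrative response head only; not captured from a real server.
$rawHead = "HTTP/1.1 200 OK\r\n"
    . "Content-Type: text/html; charset=UTF-8\r\n"
    . "Set-Cookie: session=abc123; path=/\r\n"
    . "Cache-Control: private, max-age=0\r\n";

/**
 * Split a raw response head into its status line parts and a header map.
 * Header names are lower-cased because HTTP header names are case-insensitive;
 * each name maps to a list of values because some headers may repeat.
 */
function parseResponseHead(string $rawHead): array
{
    $lines = preg_split('/\r\n|\n/', trim($rawHead));
    $statusLine = array_shift($lines);

    // "HTTP/1.1 200 OK" -> protocol version, numeric code, reason phrase
    $parts = explode(' ', $statusLine, 3);

    $headers = [];
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not "Name: value"
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))][] = trim($value);
    }

    return [
        'version' => $parts[0],
        'code'    => (int) ($parts[1] ?? 0),
        'reason'  => $parts[2] ?? '',
        'headers' => $headers,
    ];
}

$head = parseResponseHead($rawHead);
echo $head['code'], ' ', $head['reason'], "\n";
echo $head['headers']['content-type'][0], "\n";
```

Keeping the status code and headers in a structure like this makes the checks described in the rest of this section easy to write.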

Some particular headers are key when working with web services. Here is an outline of some useful ones:

Cookie (request) and Set-Cookie (response). These control the cookies that the server gives to the client, and which the client sends back on every subsequent request. Most PHP sessions work via cookies, and these are handled invisibly by our browsers. When we consume web services from our own code, we may have to receive, store and send these ourselves when parsing responses and making requests. Many web services are stateless, and so don’t use cookies, but some do and it is useful to be aware of them. Consuming the web from PHP is a bit of a conceptual shift, since usually we’re serving it, but all the same principles apply.
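To make the receive, store and send cycle concrete, here is a simplified sketch. The helper names storeCookies() and cookieHeader() are hypothetical, the Set-Cookie values are made up for illustration, and cookie attributes such as Path or Max-Age are deliberately ignored.

```php
<?php
/**
 * Collect name=value pairs from a list of Set-Cookie header values.
 * Attributes (Path, Max-Age, HttpOnly, ...) are ignored in this sketch.
 */
function storeCookies(array $setCookieValues, array $jar = []): array
{
    foreach ($setCookieValues as $value) {
        // Everything before the first ";" is the name=value pair
        $pair = trim(explode(';', $value, 2)[0]);
        if (strpos($pair, '=') === false) {
            continue;
        }
        [$name, $cookieValue] = explode('=', $pair, 2);
        $jar[trim($name)] = trim($cookieValue);
    }
    return $jar;
}

/** Build the value of a Cookie request header from the stored jar. */
function cookieHeader(array $jar): string
{
    $pairs = [];
    foreach ($jar as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Illustrative Set-Cookie values, as a server might send them
$jar = storeCookies([
    'PHPSESSID=abc123; path=/; HttpOnly',
    'lang=en; Max-Age=3600',
]);

echo 'Cookie: ', cookieHeader($jar), "\n";
```

In practice the curl extension can do this bookkeeping for us through its CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE options; the sketch simply shows what is going on underneath.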

Status Codes (response). Take a look at the headers shown in the response above: the first line shows “HTTP/1.1 200 OK”. This tells us the version of HTTP that is in use (usually 1.1 although 1.0 does exist and is still used) and also the status of the response. The status code is possibly the most important piece of information in the HTTP response header and this is something that will get a mention again and again in the course of this article. The one shown here, 200, is good news and means that everything operated as expected. The response codes consist of three digits and are separated into classes by their first digit. Anything starting with a 2 means everything was fine. Something starting with a 3 indicates a redirect of some kind, a 4 means there was a client error, and a 5 means there was a server error. Some common status codes:

Code   Meaning
200    OK
301    Moved Permanently
302    Found
401    Unauthorized
403    Forbidden
404    Not Found
500    Internal Server Error

For a full list of status codes, there is a great reference on Wikipedia showing the various ones (some quite frivolous; in particular, look for HTTP 418).
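Because the class of a status code is carried by its first digit, a consumer can decide how to react before looking at anything else. A small sketch, with statusClass() as a hypothetical helper name (the 1xx “informational” class is not covered in the list above but is included for completeness):

```php
<?php
/** Map a three-digit HTTP status code to its class, based on the first digit. */
function statusClass(int $code): string
{
    switch (intdiv($code, 100)) {
        case 1:
            return 'informational';
        case 2:
            return 'success';
        case 3:
            return 'redirect';
        case 4:
            return 'client error';
        case 5:
            return 'server error';
        default:
            return 'unknown';
    }
}

foreach ([200, 301, 404, 500] as $code) {
    echo $code, ': ', statusClass($code), "\n";
}
```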

Accept (request) and Content-Type (response). These headers control the kind of content that is included in the response. The client, when making the request, can indicate what format it would like using the Accept header. When responding, the server should set the Content-Type in the response header so that the client will know how to interpret the response.

Having the server include metadata about the response type is really useful and also means that we can use this to check that we got the response we expected. When I run into problems with web services it can often be because in the event of an error, the service returns some default web output as text/html, whereas I have PHP code trying to decode it as XML or JSON … and of course my code then fails horribly because the response format isn’t what I was expecting! Checking the content type of the response before decoding can help to warn us that there has been a problem that we may need to handle.
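A sketch of that check might look like the following. It assumes the Content-Type value and body have already been extracted from the response; decodeBody() is a hypothetical helper, and the set of media types it accepts is just an example.

```php
<?php
/**
 * Decode a response body according to its Content-Type header value.
 * Throws instead of guessing when the type is not one we expect.
 */
function decodeBody(string $contentType, string $body)
{
    // Strip parameters such as "; charset=utf-8" and normalise case
    $mediaType = strtolower(trim(explode(';', $contentType, 2)[0]));

    switch ($mediaType) {
        case 'application/json':
            $data = json_decode($body, true);
            if (json_last_error() !== JSON_ERROR_NONE) {
                throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
            }
            return $data;

        case 'application/xml':
        case 'text/xml':
            libxml_use_internal_errors(true); // collect parse errors quietly
            $xml = simplexml_load_string($body);
            if ($xml === false) {
                throw new RuntimeException('Invalid XML in response body');
            }
            return $xml;

        default:
            throw new UnexpectedValueException("Unexpected content type: $mediaType");
    }
}

// The happy path: the service answers in JSON as promised
$data = decodeBody('application/json; charset=utf-8', '{"status":"ok"}');
echo $data['status'], "\n";

// The failure described above: an HTML error page instead of data
try {
    decodeBody('text/html', '<html><body>Service unavailable</body></html>');
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), "\n";
}
```

Throwing a distinct exception for the unexpected type gives the calling code a clear place to handle “the service is misbehaving” separately from “the data was malformed”.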

The Power of HTTP

In the example mentioned above, where checking the content type of a response before decoding can alert us to the problem, we nicely illustrated some of the main advantages of using the functionality given to us by HTTP. If we don’t take the time to set the headers correctly, or we don’t inspect them, we may not be aware that there is a problem or in the worst case our code may just not be able to handle what happened elegantly. Setting these headers correctly in a service, and checking them appropriately when receiving a response, can really help us to write robust services and consumers. If we were to rely purely on information being in the body, as a request parameter or perhaps by decoding the response and then checking for a status code within it, we have lost some of the built-in functionality of HTTP. It is, by design, an envelope protocol to help us understand what is inside without necessarily having to do the work to decode it, or to try to decode it and then fail. Seeing an error status code, a zero content length header, or an unexpected content type will all help us to understand what we have received before we even try to unpack it, which is fantastic!

RPC Services

One of the most common, and in my opinion, easiest service types to relate to is the RPC (Remote Procedure Call). These web services actually feel more like distributed libraries than any kind of new-fangled service-oriented technology … and that is because that is all they are! We make a request to a given endpoint, passing a method name and some arguments, and we get the response back. The easiest way to illustrate what this looks like in PHP is to look at an example. Flickr has a nice RPC service, so let us take a look at an example making a call to their API.

We’re going to call their search method, and look for all photos tagged with “ibuildings”. Flickr has a fairly prescriptive request/response format, so read their docs if you are interested in understanding how this lot goes together. Working with the flickr API is probably a whole article in itself so I’ll try to keep that aspect to a minimum in this post.

Here’s the code, using the PHP curl extension to handle the requests for us.
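A minimal sketch of such a call is shown below, under some loudly stated assumptions: it uses Flickr’s REST-style endpoint with the flickr.photos.search method rather than building an XML-RPC envelope, YOUR_API_KEY is a placeholder, the helper function names are hypothetical, and the sample XML is hand-written in the shape of a Flickr response so that the parsing can be tried without a network connection.

```php
<?php
// Assumption: Flickr's REST-style endpoint; check their docs for current details.
const FLICKR_ENDPOINT = 'https://api.flickr.com/services/rest/';

/** Build the request URL: a method name plus its arguments. */
function buildSearchUrl(string $apiKey, string $tag): string
{
    return FLICKR_ENDPOINT . '?' . http_build_query([
        'method'  => 'flickr.photos.search',
        'api_key' => $apiKey,
        'tags'    => $tag,
    ]);
}

/** Fetch a URL with the curl extension and return the response body. */
function fetch(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body, don't print it
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

/** Pull the photo titles out of a search response using SimpleXML. */
function photoTitles(string $responseXml): array
{
    $xml = simplexml_load_string($responseXml);
    if ($xml === false || (string) $xml['stat'] !== 'ok') {
        throw new RuntimeException('Flickr returned an error or invalid XML');
    }
    $titles = [];
    foreach ($xml->photos->photo as $photo) {
        $titles[] = (string) $photo['title'];
    }
    return $titles;
}

// Live call (needs a real API key and network access):
// $response_xml = fetch(buildSearchUrl('YOUR_API_KEY', 'ibuildings'));

// Hand-written sample for illustration, not real Flickr data
$response_xml = '<rsp stat="ok">
  <photos page="1" pages="1" perpage="100" total="2">
    <photo id="1" owner="1@N00" title="Office party" />
    <photo id="2" owner="2@N00" title="Conference booth" />
  </photos>
</rsp>';

print_r(photoTitles($response_xml));
```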

Now if we inspect $response_xml we find a list of photo elements, one for each matching result (if you’re interested, the content of this is the same as this Flickr search), and we could use these results to display on another site or as part of an application.

RPC services in general are a nice way to start working with web services – they are a good fit for most applications, and they are easy for us to understand since we already know very well how to work with functions and arguments. They may not be the “coolest” technology around but they are simple and personally I think that counts for a lot! Using RPC services also means that we can easily work with our existing code, in the event that we aren’t always developing shiny new applications from scratch. RPC makes it easy to publish existing libraries as a remotely-accessible service for another system, and also to switch out an existing local library class for a remote one, since the shape of their interfaces is so similar.

The example I showed here with flickr was an XML-RPC service but it is also very common to see and use JSON-RPC and indeed other formats such as serialised PHP arrays/objects are equally valid in this setting. Choosing the right format depends entirely on your application and each has its advantages and disadvantages. Time for a quick tangent to look at these!

Data Formats

When we think of web services, we commonly think of SOAP and other types of XML. In fact these are really useful with web services and are relatively common as a result; we showed an XML example earlier in this post, which handled the XML using PHP’s SimpleXML extension. XML is ideal for communication between machines because it is quite verbose, which means it is precise and leaves little room for interpretation or ambiguity. XML isn’t terribly easy to read without help from machines, which can make debugging a bit harder than working with a simpler format. It is also fairly large in size terms for the data it represents, which isn’t important in most applications but at some edge cases this could make it a less appealing option.

Another very common format in use in web services is JSON. While this format is named after its origins (JavaScript Object Notation), don’t be misled by the title. JSON is in fact easily read and written from almost all programming languages, and for something like JavaScript or an iPhone, where relatively limited string handling is available, it is an excellent choice. It is lightweight, small for the data it represents and easy to parse, although the small size does mean that it is less comprehensive than some of the other formats. In particular JSON carries only a handful of basic types (strings, numbers, booleans, null, arrays and objects), so it can be unclear what type the data had in the original language when it was encoded. Particularly exasperating for PHP developers is that it doesn’t record whether something was an object or an associative array; of course we can work around this but we must be aware of it.
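The object-versus-array ambiguity is easy to see with PHP’s own json_encode() and json_decode(). The data here is made up purely for illustration:

```php
<?php
$json = '{"name":"techportal","tags":["php","api"]}';

// The same JSON object can come back as either PHP type; the caller chooses
$asObject = json_decode($json);        // stdClass instance
$asArray  = json_decode($json, true);  // associative array

echo get_class($asObject), "\n";
echo gettype($asArray), "\n";
echo $asObject->name, ' / ', $asArray['name'], "\n";

// Going the other way, an associative array and an object encode identically,
// so the receiver cannot tell which one the sender started with
$object = new stdClass();
$object->name = 'techportal';

$fromArray  = json_encode(['name' => 'techportal']);
$fromObject = json_encode($object);

var_dump($fromArray === $fromObject);
```

The workaround is simply to agree on one convention in the consuming code, for example always passing true as the second argument to json_decode().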

Other possibilities include a serialised PHP format, which is a great choice if you know that client and server will both always run PHP. This format has a lot of the advantages of JSON but does include some type information with the data, which can make it more useful.
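As a quick sketch of what that type information buys us, the example below round-trips a made-up array through serialize() and unserialize(). Passing allowed_classes => false is a precaution I have added here, since unserialising data from a remote party can otherwise instantiate arbitrary classes.

```php
<?php
// Illustrative data containing several distinct PHP types
$data = [
    'id'     => 42,
    'price'  => 9.99,
    'active' => true,
    'tags'   => ['php', 'api'],
];

// The serialised string records the type of every value
$wire = serialize($data);
echo $wire, "\n";

// Refuse to instantiate objects when decoding data we did not create ourselves
$restored = unserialize($wire, ['allowed_classes' => false]);

// Strict comparison: same keys, same values, same types
var_dump($restored === $data);
```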

SOAP: A Special Case of XML-RPC

With the adoption of PHP rising in the enterprise, it is increasingly common for PHP applications to come into contact with other systems within an organisation written in languages such as Java and .NET. In fact I am seeing more and more situations where PHP is being used as the “glue” to integrate between disparate systems written in these and other “enterprise” languages. In terms of API access, these types of applications will commonly support SOAP through preference, and indeed even within the PHP ecosystem, we see SOAP implementations from various projects aimed more at business than the casual user. In particular, both SugarCRM and Magento provide remote access via SOAP APIs.

Since PHP 5, a SOAP extension has been bundled with PHP (although it may need to be enabled when PHP is built or installed), and this makes working with SOAP services, especially those with a WSDL, trivial. As an example, take a look at this code for pulling information from Magento Ecommerce:
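A minimal sketch of such a call follows. It assumes the soap extension is enabled and that the Magento V2 API exposes login and catalogProductList operations in its WSDL; the WSDL URL, credentials and the function name listMagentoProducts() are placeholders of my own, so check them against the Magento documentation before relying on them.

```php
<?php
/**
 * Log in to a Magento V2 SOAP API and return its product list.
 * The WSDL URL and credentials are supplied by the caller.
 */
function listMagentoProducts(string $wsdl, string $apiUser, string $apiKey)
{
    try {
        // SoapClient reads the WSDL and exposes its operations as methods
        $client = new SoapClient($wsdl);

        // Assumed operation: login returns a session ID for later calls
        $sessionId = $client->login($apiUser, $apiKey);

        // Assumed operation: list products, passing the session ID first
        return $client->catalogProductList($sessionId);
    } catch (SoapFault $fault) {
        // Surface SOAP faults rather than letting them pass unnoticed
        throw new RuntimeException('SOAP call failed: ' . $fault->getMessage(), 0, $fault);
    }
}

// Usage (placeholder host and credentials; needs a reachable Magento install):
// $products = listMagentoProducts('http://magento.example/api/v2_soap/?wsdl', 'apiUser', 'apiKey');
// print_r($products);
```

Catching SoapFault explicitly is worth the extra few lines, since unhandled faults are one of the things that make SOAP hard to debug.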

I did say it was trivial! All we need to do is instantiate a SoapClient object, giving the WSDL. We can then call the methods outlined in the WSDL, passing the arguments as we would for a function.

Note: This is an example of the Magento V2 API. For more information about Magento’s API, visit their wiki. This is another topic that would make a whole article by itself.

Strategies for Debugging

As I mentioned earlier, the XML-based formats in general, and SOAP in particular, can be tricky to work with if you need to actually debug them. Part of this is that if your code doesn’t handle SOAP faults or something unexpected goes wrong, it can be really hard to understand what is happening. There is very little more annoying than spending all morning destroying your own code to find the error, only to discover the server has stopped responding correctly! To help with this, I have a few strategies that I would like to share.

First of all, our good friend curl. I worked on one project where I was producing a web service that was consumed by an application my colleagues were building. I told them that I would only accept bug reports against my code if they sent me a replication case which used curl. This was not welcome news to them, but they confessed later that about half of the “bugs” they found turned out to be in their own code when they looked closer. Eliminating the consuming code as a source of problems is really important when working with remote services.

Another key approach, and something I use frequently myself alongside my development, is to watch the traffic itself with a proxy or packet analyser. Personally I use Wireshark (strictly a packet analyser rather than a proxy) to show me what is happening “on the wire” while I work with web services. I have also heard strong recommendations from industry peers for Charles, a debugging proxy which is non-free but very inexpensive and seems to be well-loved by its users. Using tools like these, you can check exactly what is getting sent and received without making any alterations to your own code or changing the calls you make, which can very quickly help you diagnose errors.

RESTful Services

Perhaps the purest form of web services over HTTP is REST. If you are not familiar with RESTful principles then there is a great overview on Wikipedia that explains it really well. Many of the sites publishing information over web services will provide RESTful interfaces and these are easy to understand and simple to consume.

Good examples of these are Flickr, who have a great RESTful API alongside all their other offerings, and Twitter, who use a similar scheme for the URL design throughout their site as well as across their service. This is really interesting, because it means you can use their services almost seamlessly alongside their site. Consider this URL:

http://search.twitter.com/search?q=techportal

Opening this in the browser will show the search results for “techportal” on Twitter. Requesting these from PHP in a JSON format is almost as simple, as this example shows:
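A hedged sketch of the idea: it assumes a search.json variant of the URL above, which Twitter’s search API offered at the time of writing and which may no longer be available. The helper names are hypothetical, and the sample JSON, including its field names, is hand-written for illustration so the decoding step can be tried offline.

```php
<?php
/** Build the search URL for a given term and representation format. */
function searchUrl(string $term, string $format = 'json'): string
{
    // Assumption: the format is selected by the extension on the resource name
    return 'http://search.twitter.com/search.' . $format
        . '?' . http_build_query(['q' => $term]);
}

/** Fetch a URL with the curl extension and return the response body. */
function fetchBody(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body, don't print it
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Request failed: ' . curl_error($ch));
    }
    return $body;
}

// Live call (needs network access and a service that still answers):
// $json = fetchBody(searchUrl('techportal'));

// Hand-written sample for illustration, not real Twitter data
$json = '{"query":"techportal","results":['
    . '{"from_user":"alice","text":"Reading techportal"},'
    . '{"from_user":"bob","text":"New techportal article on web services"}'
    . ']}';

$data = json_decode($json, true);
foreach ($data['results'] as $tweet) {
    echo $tweet['from_user'], ': ', $tweet['text'], "\n";
}
```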

We can inspect $json and see that we have the same data in JSON format. The twitter API documentation is rather good so if you’re looking for a good project to practise with, then this could be an ideal place to start!

Web Services

I hope this article has given a good overview of what web services are and how they look when they come in to form part of our code. There is so much more on all of these various topics, and I am sure some of you will want to read more. There are some links for you in the resources section below, and I hope that you will also add your own stories and resources in the comments – thanks for reading!

Resources

flickr API
twitter API
magento API
sugarcrm API
curl – command line http tool
wireshark and charles – tools for inspecting HTTP traffic
