Introduction

Cloud computing refers to the use of shared, elastic resources and processing power accessed via the Internet. In some ways, it harks back to the golden age of time-sharing, but with significant improvements to the distribution philosophies underlying the delivery infrastructure. Analogously, we now have the shared wonders of Hyde Park, where anyone and everyone can chill on a bench or throw a Frisbee, instead of having to pool money to buy a private park to shoot rabbits in.


What is a cloud?

By developing on the cloud, developers today can stop worrying about scalability and availability should their site turn into the next big thing. Traditionally, a typical web application stack looks like the one on the left below:

[Figure 1: a typical web application stack (left) versus its cloud-based counterpart (right)]

Cloud-based development involves, in a sense, outsourcing various parts of the application out of the server and into the cloud. Instead of storing images, videos and other objects in the file system, they are stored on the cloud. Instead of using a local database, a cloud-based database is used. Batch processing and other functionality are performed on the cloud as well. In other words, developers using the cloud move most or all of the components of a web application into the cloud. The most significant benefit, of course, is that the cloud's capacity is theoretically limitless compared to that of a local server, removing the need to frantically add hardware, or to worry at all, if traffic explodes.

Underlying the immense expanse of the Amazon cloud is, presumably, a complex layer of virtualised, clustered computers. The good thing is that, in most cases, developers do not have to care about it at all. To most developers, the cloud is the always-on entity where their web application lives, and where well-defined web services are provided so that they can control the web application and interact with the cloud.


The Amazon Cloud

It has never been cheaper, faster or easier to set up a scalable, on-demand, geographically optimised web application environment. Amazon's cloud is one of the forerunners that have made these advantages possible.

Among Amazon's cloud-related offerings are EC2, S3 and CloudFront. EC2 (Elastic Compute Cloud) allows developers to launch server instances from Amazon Machine Images (AMIs) and control them via a web-service interface. S3 provides storage on the cloud. Geographically optimised distribution of S3 objects is easily achieved via CloudFront.

A lot can be done with these cloud services. One can fire off MapReduce processing with parallel Hadoop instances on EC2, or use it to run scripts and applications that interact with the end-user. Similarly, S3 can be used as file storage for disk backups or as public image or video storage. The commodity pricing is a good deal, and the natural growth in computing abundance puts a healthy downward pressure on resource pricing.


PictureMe: The cloud application

In this tutorial, we will build a simple picture manager to showcase some of the functionality available on Amazon's cloud and how to use it. The application allows the end-user to upload, list, view and delete pictures. The entire application, including storage, lives on the cloud. The application is broken into two files: index.php, which handles the presentational part, and PictureManager.php, which contains a class that manages the pictures' meta-information as well as storage and interaction with the Amazon cloud.
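
To keep the snippets later in this tutorial grounded, here is a skeleton sketch of the PictureManager class; the constant values and bucket name are illustrative, and the full class ships with the source download at the end:

    <?php
    // PictureManager.php: skeleton sketch only; names and values here are
    // illustrative, the complete class is in the source download.
    require_once 'S3.php';

    class PictureManager
    {
        const BUCKET       = 'pictureme-pictures'; // S3 bucket holding the pictures
        const MAX_PICTURES = 10;                   // cap on stored/listed pictures

        private $_storage;  // composed S3 instance (Schonknecht's class)
        private $_pictures; // cached list of objects in the bucket

        public function __construct($awsAccessKey, $awsSecretKey)
        {
            $this->_storage = new S3($awsAccessKey, $awsSecretKey);
            $this->_refreshList();
        }

        // putPicture(), deletePicture(), getPictures(), _refreshList() ...
    }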


Prerequisites

There are a few things you need to follow this tutorial. They also happen to be great tools for cloud development with PHP in general:

    1. Amazon AWS account with access to EC2, S3 and CloudFront
    2. Donovan Schonknecht's Amazon S3 PHP class
      This is an updated implementation of an Amazon S3 (REST) client. Very useful for interacting with S3.
    3. cURL
    4. S3Fox
      A useful Firefox plug-in to visually manipulate S3 buckets and objects.
    5. Amazon EC2 API Tools


Storage with S3

We start by creating an S3 bucket in which to place our pictures. The bucket forms a globally unique namespace in which to locate our pictures. Since we only need a single bucket to hold all our pictures in this tutorial, rather than creating buckets on-the-fly as users use the application, we'll use S3Fox. Just start S3Fox with the appropriate credentials and click on the Create Directory icon (somehow, the creators of S3Fox decided to use that term). Give it a good name that has not already been taken, and we have our bucket.
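
If you do need to create buckets programmatically, the S3 PHP class introduced below can do that too; a minimal sketch, with an illustrative bucket name:

    <?php
    require_once 'S3.php';

    $s3 = new S3($awsAccessKey, $awsSecretKey);

    // Create a private bucket; putBucket() returns true on success.
    $s3->putBucket('pictureme-pictures', S3::ACL_PRIVATE);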

Donovan Schonknecht's S3 class is an excellent PHP tool for doing all sorts of things with the S3 cloud. For example, to place pictures into the bucket, our PictureManager class makes use of a composed S3 instance ($this->_storage), along the lines of the following sketch (the method body and messages are illustrative; the real code is in the source download):
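
    // PictureManager.php (excerpt, sketch): place an uploaded file into the bucket.
    public function putPicture($tmpFile, $name)
    {
        // putObjectFile(source file, bucket, object name, ACL)
        if ($this->_storage->putObjectFile($tmpFile, self::BUCKET, $name, S3::ACL_PUBLIC_READ)) {
            $this->_refreshList();
            return "Picture '$name' uploaded.";
        }
        return "Upload of '$name' failed.";
    }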

When calling the putObjectFile() method, we have to specify the file to be placed, the bucket, and the name associated with the placed object. S3::ACL_PUBLIC_READ specifies that the object is publicly available. Objects can be made private (the default, by the way), publicly readable, or even publicly readable and writable.

When successfully executed, a message is returned, which is subsequently displayed by the presentational counterpart, index.php. The _refreshList() method looks like the following sketch (assuming the getBucket() signature from current versions of the S3 class):
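
    // PictureManager.php (excerpt, sketch): synchronise the cached picture
    // list with the actual contents of the bucket.
    private function _refreshList()
    {
        // getBucket(bucket, prefix, marker, maxKeys, delimiter) returns an
        // array of object metadata keyed by object name.
        $this->_pictures = $this->_storage->getBucket(
            self::BUCKET, null, null, self::MAX_PICTURES, null
        );
    }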

The method is called whenever the state of the bucket changes (pictures added or deleted) to synchronise the list of pictures between PictureManager and the actual S3 bucket. The MAX_PICTURES constant limits the number of results returned. Since PictureManager caps the number of pictures that can be placed in the bucket with the same constant (in the putPicture() method), this is more of a safety check. Although there is no need for it here, the null parameters can be filled with a prefix and a delimiter, which are useful in situations where searching or hierarchy is required.

Other examples of interacting with S3 are shown in the source code, mostly similar in usage but different in purpose. The PHP S3 class encapsulates almost everything a typical web application requires when using the S3 cloud. In the background, it generates the appropriate REST HTTP requests to Amazon S3 and processes the responses. Hence, if further detail is required on any parameter of the PHP S3 methods beyond what is already commented within the class, we can check the corresponding request documentation in Amazon's S3 Developer Guide. For example, the Common List Request Parameters page details the parameters for the getBucket() method.

This tutorial stores pictures of up to 500 KB in size for the sake of economy and simplicity. Amazon S3 can store objects of up to 5 GB, but the S3 PHP class can only handle objects of up to 2 GB on 32-bit systems due to PHP's signed integer limitation. The smallest object that can be uploaded to S3 is 1 byte, but in practice it is more cost-efficient to store larger objects because of the way Amazon charges for PUT requests. Data transfer is also relatively faster with larger objects because of per-request communication overheads.
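
Within putPicture(), such a cap could be enforced before the file ever leaves the server; a sketch:

    // Sketch: reject over-sized uploads before calling putObjectFile().
    if (filesize($tmpFile) > 500 * 1024) {
        return "Picture '$name' exceeds the 500 KB limit.";
    }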


Geographical optimization with CloudFront

When you upload an object to S3, you are able to specify the geographic location of the server that stores the object (with the optional location_constraint parameter). While this location may be suitable in certain cases, most applications with international users can benefit from the object being within closer geographic proximity. CloudFront allows the developer to do just that. It is essentially a content delivery network (CDN) tailored for S3 objects.

With CloudFront, a user from Asia accessing an S3 object in Europe for the first time triggers a one-time transfer of a copy of the object from the European server to one of Amazon's servers residing in Asia. The copy of the object is then stored on the Asian server. The next time the object is requested by a user in Asia, the duplicate residing on the Asian server is delivered instead, speeding up delivery as the object traverses fewer nodes.

It is really easy to enable CloudFront because of its close compatibility with S3. Remember the bucket we created with S3Fox for the PictureMe application? Open S3Fox again and, under the Remote View pane, right-click on the bucket we created earlier. Choose Manage Distribution, then click on Create Distribution within the dialog box. You should see the InProgress status within the distribution list for the bucket. Refresh your view after a while and the status should change to Deployed.

Now notice the Domain Name information within the dialog box. The domain name forms part of the URL for accessing your S3 objects via CloudFront. In PictureMe, for example, a picture object named london.jpg can be publicly accessed via http://[DOMAIN_NAME]/london.jpg. This is how pictures are shown within PictureMe: the URL is used within the src attribute of an IMG tag. Pictures accessed via the CloudFront domain name are automatically delivered from the best geographically-located server.
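
In index.php, the listing loop might emit those tags along these lines (getPictures() and the CLOUDFRONT_DOMAIN constant are illustrative names, not necessarily those in the downloadable source):

    <?php foreach ($manager->getPictures() as $name => $meta): ?>
      <img src="http://<?php echo CLOUDFRONT_DOMAIN; ?>/<?php echo rawurlencode($name); ?>"
           alt="<?php echo htmlspecialchars($name); ?>" />
    <?php endforeach; ?>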

CloudFront, unfortunately, is not a default service, but another one that you have to activate and subscribe to. It is also billed separately, on top of the storage charges already incurred for S3. The good news is that it is rather cheap relative to the enhanced end-user experience it delivers, especially for high-traffic international sites.


Amazon’s Elastic Compute Cloud

Fancy as it may sound, the Elastic Compute Cloud really does place a lot of computing power in the hands of the public at a decently reasonable price. The art with EC2 lies in machine creation and management, and the tools available for these are growing rapidly.

EC2 is in essence a web service that allows developers to start virtual machines, operating systems and all, via the Internet. These machines can be used for all kinds of things, such as batch processing, video encoding and streaming, as well as more conventional tasks like web or database hosting. Depending on what you will be doing, there are hundreds of AMIs (Amazon Machine Images) available that are specialised for specific tasks. For example, one can use the Hadoop AMI for batch processing with MapReduce.

PictureMe, for example, can be placed on an instance of a standard Linux-based AMI with Apache and PHP. One such publicly available AMI is ami-2e5fba47. The EC2 documentation describes the typical process of starting an instance and controlling it, as well as generating the private key required to authenticate access to the instance. To try PictureMe on EC2, start an instance of ami-2e5fba47. To obtain information about the started instance, use the EC2 API Tools:
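
    ec2-describe-instances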

which should return something similar to the following (the values here are purely illustrative):
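
    RESERVATION  r-1a2b3c4d  111122223333  default
    INSTANCE     i-1a2b3c4d  ami-2e5fba47  ec2-67-202-55-66.compute-1.amazonaws.com  ip-10-251-27-5.ec2.internal  running  my-keypair  0  m1.small  2009-04-20T01:23:45+0000  us-east-1a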

Some of the notable elements include the instance ID, the AMI, the hostname and the state of the instance, which should be running if the instance started successfully. We then place the PictureMe files onto the instance, for example with scp, authenticating with the private key generated earlier ([HOSTNAME] stands for the public DNS name returned above; the key file and web root below are illustrative):
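
    scp -i id_rsa-my-keypair index.php PictureManager.php S3.php root@[HOSTNAME]:/var/www/html/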

Once the files are placed, the PictureMe application can be accessed publicly via the Internet at:
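
    http://[HOSTNAME]/index.php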

EC2 is a very powerful tool, and good utilization of its strengths relies heavily on the developer's creativity in manipulating the cloud, i.e. automating and balancing the creation of new instances, designating master instances, and building other forms of hierarchy. The Cloud Architectures whitepaper details some ideas on cloud utilization for large resource requirements.


Conclusion

[Figure 2: the PictureMe ecosystem on Amazon's cloud: the end-user, the application on EC2, picture storage on S3, and CloudFront edge copies]

The diagram above sums up the PictureMe web application ecosystem based around Amazon's cloud. The end-user interacts with the application hosted on EC2. Pictures are stored on S3, while CloudFront stores copies of the pictures in locations closer to the end-user. Theoretically, almost everything a website or web application needs can be done on the cloud, often in more innovative ways.

Download the source code of PictureMe