HTTP Made Simple, Part 4: Representation And Content Negotiation
Here's what we've learned so far:
- In part 1, we said that HTTP views the Internet as a big key-value store.
- In part 2, we established that GET, PUT, and DELETE were the main methods, with POST acting as a fallback for things that don't fit the key-value store model.
- In part 3, we discussed URLs, the principle of opacity, use of parameters to dynamically construct URLs, and discovery to obtain an initial set of URLs.
In this installment, we're going to look at how we go from a resource to a representation, which is actual data we can use.
Resources and Representations
Let's start with a couple of examples to make the whole idea of a resource and a representation more concrete.
Suppose you write a book. You want to make that book available to potential readers in whatever format is most convenient for them. You might have a PDF version, an HTML version, and, for those with e-readers, ePub and Kindle versions. It's the same book, obviously. So it's the same resource. But there are four formats, or representations, of that resource.
Suppose you are running a video sharing site. You want to make it easy for people to watch the video in their browsers. With HTML5, this is no problem. However, because different browsers use different formats, we have to provide two versions of every video: one in WebM and one in MP4. Of course, over time, new formats may be introduced, and we'll want versions of each video in the new formats, too. Again, conceptually speaking, each format is really the same video. So it makes sense to think of it as the same resource, and the different formats are the representations.<small>So why not just use the word format instead of representation? One reason is that the word format is often less specific than a representation. For example, WebM is a video format, but WebM using the Vorbis codec is a representation.</small>
Distinguishing resources from their representations makes it possible to leave the choice of format to the client. Alice may share a resource with Bob via a URL without dictating to Bob how to consume it. Yet, at the same time, the resource/representation split doesn't preclude having a separate resource for each format in cases where that choice doesn't matter.
Example: Videos (Again)
In our video sharing example, we want people to be able to share videos without having to worry about which format to use. Or, similarly, we want search engines to be able to display links to our videos in their search results, rather than having to display one result for each format. That way, people can use the same link, regardless of the software they use to watch the video.
In practice, we often see an extension (e.g., .jpg) at the end of a URL to identify the format. From the standpoint of HTTP, these are separate resources (because the URLs are different). They just happen to have only one representation (the one indicated by the extension). But, again, HTTP doesn't prevent us from doing this. It just gives us the option to defer the choice of format. If we have many different formats, or we anticipate having many different formats, it's easier to reference a resource with a single URL, rather than one for each format.
If we don't somehow encode the format in the URL itself, how can a client tell the server which format it wants? This brings us to the topic of HTTP headers. Headers are how HTTP enriches a message with metadata, that is, data about a request or response, as opposed to the resource data itself. The basic idea is for a message (a request or response) to be self-describing. This, in turn, minimizes the knowledge that clients and servers must have about each other.
Headers Contain Message Metadata
We'll go into more detail about that later on. For now, suffice it to say that one of these bits of knowledge concerns the format of a resource. HTTP handles this with two metadata properties: one that allows a client to tell the server the format (or formats) it prefers, and another that allows either the client or the server to describe the format of the representation included in a message.
Probably the easiest way to get a sense of these headers is to look at a couple of examples. Here's how a client can ask for a video in either the WebM or the MP4 format:
GET /videos/cute-kitten
Accept: video/webm;codecs="vp8, vorbis", video/mp4;codecs="avc1.42E01E, mp4a.40.2"
Here's how the server might respond (omitting the actual content of the video, as well as some other headers)<small>Notice how the Accept header is much more flexible than simply appending a file extension to the URL. We don't always need that much flexibility, but it's nice that it's there when we do.</small>:
HTTP/1.1 200 OK
Content-Type: video/webm;codecs="vp8, vorbis"
The client has asked for a video in either WebM or MP4 format, and the server has responded using the WebM format. This is known as content negotiation, but that's just a fancy way of saying that the client tells the server which formats it prefers and the server picks the best one it can support.
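To make the server's side of that exchange a little more concrete, here is a minimal sketch in Python. The function names (`split_outside_quotes`, `parse_accept`, `negotiate`) are made up for illustration, and the sketch makes simplifying assumptions: it matches only on the bare media type, ignores parameters such as `codecs`, and doesn't handle wildcards like `video/*`. A real server or framework would do considerably more.

```python
def split_outside_quotes(text, sep):
    """Split text on sep, ignoring separators inside double-quoted strings."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == sep and not in_quotes:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, most preferred first."""
    prefs = []
    for entry in split_outside_quotes(header, ","):
        fields = split_outside_quotes(entry, ";")
        media_type = fields[0].lower()
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(accept_header, available):
    """Pick the first client-preferred media type the server can produce."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in available:
            return media_type
    return None  # a real server would typically answer 406 Not Acceptable


accept = 'video/webm;codecs="vp8, vorbis", video/mp4;codecs="avc1.42E01E, mp4a.40.2"'
chosen = negotiate(accept, {"video/webm", "video/mp4"})
```

With the `Accept` header from the example above and a server that has both formats, `chosen` ends up as `video/webm`. If the server only had an MP4 version, the same call would fall through to `video/mp4`.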
The same approach works when uploading the video. Here's the client making a PUT request (again, omitting the actual video itself and a few other headers):
PUT /videos/cute-kitten
Content-Type: video/webm;codecs="vp8, vorbis"
Basically, we're just including in the message itself information about the representation of the resource. Since HTTP allows separating the resource from its potentially myriad representations, it also must allow us to describe the representation we're using (or want to use) in the messages.
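From the client's side, this might look something like the following sketch, which uses Python's standard `urllib.request.Request` to build (but not send) the two kinds of requests. The host `example.com` and the placeholder video bytes are assumptions for illustration only.

```python
from urllib.request import Request

# Hypothetical host; the path comes from the examples above.
VIDEO_URL = "http://example.com/videos/cute-kitten"

# Asking for the resource: Accept lists the representations we can handle.
get_request = Request(
    VIDEO_URL,
    headers={"Accept": 'video/webm;codecs="vp8, vorbis", video/mp4'},
    method="GET",
)

# Uploading the resource: Content-Type describes the representation we send.
video_bytes = b"placeholder, not real video data"
put_request = Request(
    VIDEO_URL,
    data=video_bytes,
    headers={"Content-Type": 'video/webm;codecs="vp8, vorbis"'},
    method="PUT",
)

# Nothing is sent here; urllib.request.urlopen(put_request) would do that.
```

Notice that both requests use the same URL. Only the headers say anything about the format.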
Even when we use an extension in the URL, what's really happening behind the scenes is the client and server are using the extension to determine which headers to add to a message. For example, if I use curl to get an image of the Golden Gate Bridge:
curl -v http://upload.wikimedia.org/wikipedia/commons/0/0c/GoldenGateBridge-001.jpg > /dev/null
the server's response still includes a Content-Type header that names the format (image/jpeg for a JPEG image), even though, of course, the .jpg extension was in the URL.
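One plausible way a server could go from an extension to a header is a simple lookup table. The sketch below uses Python's standard `mimetypes` module for that; the function name `content_type_for` and the fallback value are illustrative assumptions, not a description of how any particular server does it.

```python
import mimetypes


def content_type_for(url_path):
    """Guess a Content-Type value from the extension at the end of a URL path."""
    media_type, _encoding = mimetypes.guess_type(url_path)
    # With no recognizable extension, fall back to a generic binary type.
    return media_type or "application/octet-stream"


jpeg_type = content_type_for("/wikipedia/commons/0/0c/GoldenGateBridge-001.jpg")
unknown_type = content_type_for("/videos/cute-kitten")
```

For the `.jpg` path, the lookup yields `image/jpeg`. For `/videos/cute-kitten`, the URL gives no hint at all, which is exactly the case where the headers have to carry the information.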
Until Next Time…
So far, we've established that HTTP sees the world as a distributed key-value store, manipulated using GET, PUT, and DELETE, with opaque URLs as keys, and the choice of representation left up to the client and described by message metadata. We've discussed how well this basic architecture scales. But is it fast? In our next installment, we'll see how HTTP deals with performance challenges.