Tue 9 Dec 2008
Implementing HTTP Services With Django
Posted at 14:20 +1100 (last edited: 9 Dec 2008, 16:06)
Last week, Kris Jordan posted an article entitled Towards RESTful PHP - 5 Basic Tips. It's one of a number of recent articles by Kris on software design and PHP implementation that have been interesting reading. That post, in particular, had a really good, simple idea: Take a number of things you need to do to implement a RESTful (or even REST-like) service and show how to do them in PHP.
The idea is so good, I'm going to blatantly steal it. Then change it slightly and twist the meaning beyond all recognition (well, okay, not the last part). Let's call this piece "part 1" of a series on RESTful Django implementations.
My particular interpretation will be on Django-oriented implementations, with an emphasis on the implementation portion: how to write actual code that achieves the aim. The particular points are not REST-specific. Rather, they are cases of using HTTP as designed and documented in the specification (and, in future posts, in some auxiliary specs).
HTTP Methods and Content Types
Historically, Django was written to deal with the web through web browsers. Fortunately, it wasn't constrained to just that, but some of the HTTP-oriented API has assumptions that data will be coming from forms submitted through browsers. All this means for our purposes is that you shouldn't expect the automated file handling in Django to work for arbitrary HTTP requests. And don't go looking for the equivalent of request.POST, since that has an implicit assumption that the input is a form submission. It tries to handle the entity body as a form post, which is only appropriate if it actually is a form post, and that won't be true for a generic XML submission or an image upload.
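As a minimal sketch of working with the raw entity body instead of request.POST: the helper below parses an Atom entry from bytes. It assumes the bytes come from the request's raw body attribute (request.body in current Django releases, request.raw_post_data in the 1.0-era API). The name entry_title is hypothetical, and a real service would also want to handle malformed XML.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom documents (RFC 4287).
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def entry_title(raw_body):
    """Return the title of an Atom entry given the raw request body (bytes).

    Returns None if the entry has no title element. Hypothetical helper,
    shown only to illustrate reading the body directly.
    """
    root = ET.fromstring(raw_body)
    title = root.find(ATOM_NS + "title")
    return title.text if title is not None else None
```

Inside a view this might be called as entry_title(request.body), with no involvement from request.POST at all.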
If you want to write more HTTP-oriented (less browser-oriented) services, particularly things designed to be accessed from other programs and machines, you are not, and shouldn't be, restricted only to form submission. You have access to a full set of HTTP methods, not only GET and POST. You can also handle a full set of content types. (I'm going to stop writing "not only" for comparison, since I'm not intending to write a piece about limitations of the web browser. Web browsers have a set of constraints, but they work very well for what they're designed for. I don't want to give the impression I think they're dysfunctional pieces of software.)
Handling Incoming Requests
Django will dispatch the incoming HTTP request to a view function regardless of the payload and HTTP method used. So your view handler needs to dispatch further processing based on examining that information. This isn't a bad design. It means, for example, that you don't have to change your code layout to extend to full-on HTTP-enabled applications.
Concretely, your code will want to use the following pieces of information (assuming request is the variable holding the django.http.HttpRequest instance that is passed to your view):
| Name | Description |
|---|---|
| request.method | A string that is the name of the HTTP method. "GET", "POST", "HEAD", "PUT", etc. |
| request.META["CONTENT_TYPE"] | The content type of the submission (a string). Something like "application/atomcat+xml". |
| request.META["CONTENT_LENGTH"] | The length of the incoming entity. |
There might be other HTTP headers that you're interested in. All headers are available in the request.META dictionary. For everything except CONTENT_TYPE and CONTENT_LENGTH, mentally convert the header name to all upper-case, replace any hyphens with underscores and put an HTTP_ prefix on the string and you'll have the key name in request.META. This, and more, is documented in the Django documentation and worth reading when you get to this point.
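The naming rule above can be sketched as a tiny function. The name meta_key is hypothetical (it is not part of Django); it only mirrors the mental conversion described in the previous paragraph.

```python
def meta_key(header_name):
    """Map an HTTP header name to the key it would have in request.META.

    Upper-case the name, swap hyphens for underscores, and add the HTTP_
    prefix, except for the two headers that are stored without a prefix.
    """
    key = header_name.upper().replace("-", "_")
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key
```

So a header such as If-None-Match would be looked up as request.META[meta_key("If-None-Match")].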
If you want to split up your functions based on HTTP method type, you might like to use an approach similar to one I wrote about last year. Obviously not the only approach; merely a starting point for thought if you're stuck at the first hurdle. It turns out that I don't use that particular style much, although I do split out distinct verb handling into separate functions as a rule.
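One possible shape for that kind of splitting, as a sketch rather than the approach from the earlier post: a small dispatcher that picks a handler based on request.method. Both dispatch_by_method and MethodNotAllowed are hypothetical names, and the code only assumes the request object has a method attribute, as django.http.HttpRequest does.

```python
class MethodNotAllowed(Exception):
    """Hypothetical exception carrying the methods a resource does support."""

    def __init__(self, allowed):
        super().__init__("method not allowed")
        self.allowed = sorted(allowed)


def dispatch_by_method(request, handlers, *args, **kwargs):
    """Call the handler registered for request.method.

    `handlers` maps method names ("GET", "PUT", ...) to callables that
    take the request plus any extra view arguments.
    """
    handler = handlers.get(request.method)
    if handler is None:
        raise MethodNotAllowed(handlers)
    return handler(request, *args, **kwargs)
```

In a real view, the exception could be caught and turned into django.http.HttpResponseNotAllowed(exc.allowed), which produces a 405 response with an Allow header.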
Interesting Headers
Why might you care about random header information? Maybe you won't and the short answer to "which headers should I worry about?" is "whichever headers are relevant to your protocol." However, here's a quick dip into the pool of potentially useful information that might be available.
The content type is often a discriminator that is used to determine further processing. For example, in the Atom Publishing Protocol, an Atom Entry submission is treated differently from a media upload (an image, for example), but they could well be POSTed to the same URL initially.
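A rough sketch of using the content type as a discriminator, assuming the simplest possible rule: anything declared as application/atom+xml is treated as an entry and everything else as media. The function names are hypothetical, and a real Atom Publishing implementation would need more careful rules (and would reject a missing content type rather than guess).

```python
def media_type(content_type):
    """Return the bare media type, lower-cased and without parameters.

    "Application/Atom+XML; type=entry" becomes "application/atom+xml".
    """
    return content_type.split(";", 1)[0].strip().lower()


def classify_submission(meta):
    """Decide how to process a POST based on its CONTENT_TYPE entry.

    `meta` is a dictionary shaped like request.META. Simplifying
    assumption: anything that is not an Atom document is a media upload.
    """
    if media_type(meta.get("CONTENT_TYPE", "")) == "application/atom+xml":
        return "entry"
    return "media"
```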
I'm going to look at HTTP authentication in more detail in a later post. For now, I'll just point out that authentication information is submitted via the Authorization header (and accessible through request.META["HTTP_AUTHORIZATION"]). Similarly, ETags are vitally important and something I'll put in their own article. They arrive in a header.
Some protocols submit meta-information through headers. Using Atom Pub as an example, again, the Slug header is used to suggest to the server a string to use in a portion of the newly created URL for a post. Some protocols might even allow working around HTTP proxies that don't transmit anything other than POST and GET requests: the client sends a POST in lieu of the real PUT or DELETE and names the real method in an HTTP header.
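A sketch of the proxy workaround, under stated assumptions: the header name X-HTTP-Method-Override is a common convention rather than anything the text above prescribes, and effective_method is a hypothetical helper. The override is honoured only on POST requests and only for PUT and DELETE.

```python
# Assumed header: X-HTTP-Method-Override, as it would appear in request.META.
OVERRIDE_KEY = "HTTP_X_HTTP_METHOD_OVERRIDE"
ALLOWED_OVERRIDES = {"PUT", "DELETE"}


def effective_method(method, meta):
    """Return the method the client really intended.

    `method` is request.method and `meta` is a dictionary shaped like
    request.META. Anything other than a POST carrying a recognised
    override value is returned unchanged.
    """
    if method != "POST":
        return method
    override = meta.get(OVERRIDE_KEY, "").strip().upper()
    return override if override in ALLOWED_OVERRIDES else method
```

Restricting the override to POST matters: letting a GET be upgraded to a DELETE would undermine the safety guarantees that intermediaries assume for GET.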
Anyway, know the headers that your protocol cares about and know that you can access them through request.META.
Specifying Response Status Codes, Content Types And Headers
Just as requests can contain various content types and headers, so can responses. You might also need to return a particular HTTP status code. All of this is set up on the HttpResponse object. You pass in the content, content type and status code when creating the instance and set up individual headers by treating the HttpResponse object as a map (dictionary) and setting keys. For example:
    from django.http import HttpResponse

    def submit_entry(request, ...):
        # ...
        final_content, uri, etag = process_submission(...)
        # 201 Created: the new resource's address goes in the Location header.
        response = HttpResponse(final_content,
                                content_type="application/atom+xml",
                                status=201)
        response["Location"] = uri
        response["ETag"] = etag
        return response
Into The Future
Today was the basics of HTTP-level interaction: reading and writing protocol-level items. There are a couple of posts on slightly higher level issues to come. In particular,
- HTTP Authentication, particularly HTTP Auth handling (a seriously hairy area).
- Server-side session elimination and achieving legitimate statelessness.
- Output formats.
- ETag usage, both for server-side optimisation and to avoid "lost update" issues.
If there's some particular implementation aspect you'd like to see mentioned, maybe ping me on twitter (I check the "replies" tab about once a day), or drop me an email. I won't promise to meet each request, but there may be some bullet point I've forgotten that is both necessary and fiddly to actually write down in code.
Would You Like To Know More?
(Seriously, how can you read that heading and not think of Starship Troopers?)
As I'm primarily focusing on concrete implementation items here, there's no theory that isn't directly driving towards that goal. Similarly, I'm not discussing any significant design considerations. If you do want to fill in some of the theoretical background or see some of the intermediate steps, such as moving from the theoretical to a protocol to an API, here's a quick sampler:
- There are some links from Kris's post to a couple of REST introductions if you want the elevator pitch.
- The RestWiki is a good online collection of the theoretical concepts. As I write this, it seems to be timing out, so hopefully that's only a temporary thing.
- For a very practical (in the hands-on sense) approach to RESTful design, it's difficult to go past the Richardson & Ruby book, although that's obviously not the only choice.
- Jacob Kaplan-Moss also wrote down some thoughts about designing interfaces.
- Roy Fielding chimes in now and again on design practices (get it wrong and he gets annoyed).
Topics: software/django/tutorials, technology/web