Django 1.2 E-commerce: Data Integration

0
98
7 min read

(Read more interesting articles on Django 1.2 e-commerce here.)

We will be using a variety of tools, many builtin to Django. These are all relatively stable and mature, but as with all open source technology, new versions could change their usage at any time.

Exposing data and APIs

One of the biggest elements of the web applications developed in the last decade has been the adoption of so-called Web 2.0 features. These come in a variety of flavors, but one thing that has been persistent amongst them all is a data-centric view of the world. Modern web applications work with data, usually stored in a database, in ways that are more modular and flexible than ever before. As a result, many web-based companies are choosing to share parts of their data with the world in hopes of generating “buzz”, or so that interested developers might create a clever “mash-up” (a combination of third-party application software with data exposed via an API or other source).

These mash-ups take a variety of forms. Some simply allow external data to be integrated or imported into a desktop or web-based application. For example, loading Amazon’s vast product catalog into a niche website on movie reviews. Others actually deploy software written in web-based languages into their own application. This software is usually provided by the service that is exposing their data in the form of a code library or web-accessible API.

Larger web services that want to provide users with programmatic access to their data will produce code libraries written in one or more of the popular web-development languages. Increasingly, this includes Python, though not always, and typically also includes PHP, Java, or Perl. Often when an official data library exists in another language, an enterprising developer has ported the code to Python.

Increasingly, however, full-on code libraries are eschewed in favor of open, standards-based, web-accessible APIs. These came into existence on the Web in the form of remote procedure call tools. These mapped functions in a local application written in a programming language that supports XML-RPC to functions on a server that exposed a specific, well-documented interface. XML and network transport protocols were used “under the hood” to make the connection and “call” the function.

Other similar technologies also achieved a lot of use. For example, many web-services provide Simple Object Access Protocol (SOAP) interface, which is the successor to XML-RPC and built on a very similar foundation. Other standards, sometimes with proprietary implementations, also exist, but many new web-services are now building APIs using REST-style architecture.

REST stands for Representational State Transfer and is a lightweight and open technique for transmitting data across the Web in both server-to-server and client-to-server situations. It has become extremely popular in the Web 2.0 and open source world due to its ease of use and its reliance on standard web protocols such as HTTP, though it is not limited to any one particular protocol.

A full discussion of REST web services is beyond the scope of this article. Despite their simplicity, there can arise many complicated technical details. Our implementation in this article will focus on a very straightforward, yet powerful design.

REST focuses on defining our data as a resource that when used with HTTP can map to a URL. Access to data in this scheme is simply a matter of specifying a URL and, if supported, any look-up, filter, or other operational parameters. A fully featured REST web service that uses the HTTP protocol will attempt to define as many operations as possible using the basic HTTP access methods. These include the usual GET and POST methods, but also PUT and DELETE, which can be used for replacement, updating, or deletion of resources.

There is no standard implementation of a REST-based web service and as such the design and use can vary widely from application to application. Still, REST is lightweight enough and relies on a well known set of basic architectures that a developer can learn a new REST-based web service in a very short period of time. This gives it a degree of advantage over competing SOAP or XML-RPC web services. Of course, there are many people who would dispute this claim. For our purposes, however, REST will work very well and we will begin by implementing a REST-based view of our data using Django.

Writing our own REST service in Django would be very straightforward, partly because URL mapping schemes are very easy to design in the urls.py file. A very quick and dirty data API could be created using the following super-simple URL patterns:

(r'^api/(?P<obj_model>w*)/$', 'project.views.api')
(r'^api/(?P<obj_model>w*)/(?P<id>d*)/$', 'project.views.api')

And this view:

from django.core import serializers

def api(request, obj_model, obj_id=None):
model = get_model(obj_model.split("."))
if model is None:
raise Http404
if obj_id is not None:
results = model.objects.get(id=obj_id)
else:
results = model.objects.all()
json_data = serializers.serialize('json', results)
return HttpResponse(json_data, mimetype='application/json'))

This approach as it is written above is not recommended, but it shows an example of one of the simplest possible data APIs. The API view returns the full set of model objects requested in JSON form. JSON is a simple, lightweight data format that resembles JavaScript syntax. It is quickly becoming the preferred method of data transfer for web applications.

To request a list of all products, for example, we only need to access the following URL path on our site: /api/products.Product/. This uses Django’s app.model syntax to refer to the model we want to retrieve. The view uses get_model to obtain a reference to the Product model and then we can work with it as needed. A specific model can be retrieved by including an object ID in the URL path: /api/products.Product/123/ would retrieve the Product whose ID is 123.

After obtaining the results data, it must be encoded to JSON format. Django provides serializers for several data formats, including JSON. These are all located in the django.code.serializers module. In our case, we simply pass the results QuerySet to the serialize function, which returns our JSON data. We can limit the fields to be serialized by including a field’s keyword argument in the call to serialize:

json_data = serializers.serialize('json', results,
fields=('name','price'))

We can also use the built-in serializers to generate XML. We could modify the above view to include a format flag to allow the generation of JSON or XML:

def api(request, obj_model, obj_id=None, format='json'):
model = get_model(*obj_model.split())
If model is None:
raise Http404
if obj_id is not None:
results = model.objects.get(id=obj_id)
else:
results = model.objects.all()
serialized_data = serializers.serialize(format, results)
return HttpResponse(serialized_data,
mimetype='application/' + format)

Format could be passed directly on the URL or better yet, we could define two distinct URL patterns and use Django’s keyword dictionary:

(r'^api/(?P<obj_model>w*)/$', 'project.views.api'),
(r'^api/(?P<obj_model>w*)/xml/$', 'project.views.api',
{'format': 'xml'}),
(r'^api/(?P<obj_model>w*)/yaml/$', 'project.views.api',
{'format': 'yaml'}),
(r'^api/(?P<obj_model>w*)/python/$', 'project.views.api',
{'format': 'python'}),

By default our serializer will generate JSON data, but we’ve got to provide alternative API URLs that support XML, YAML, and Python formats. These are the four built-in formats supported by Django’s serializers module. Note that Django’s support for YAML as a serialization format requires installation of the third-party PyYAML module.

Building our own API is in some ways both easy and difficult. Clearly we have a good start with the above code, but there are many problems. For example, this is exposing all of our Django model information to the world, including our User objects. This is why we do not recommend this approach. The views could be password protected or require a login (which would make programmatic access from code more difficult) or we could look for another solution.

LEAVE A REPLY

Please enter your comment!
Please enter your name here