7 min read

Using Python Built-in Object Types to Hold Structured Data

The most common way to hold structured data in Python is to use built-in object types such as lists, list comprehensions, tuples, and dictionaries. In particular, you may find the above Python types useful when dealing with database data.

Dictionaries can be of particular interest to you when you need to represent data stored in a database table that has a primary key column. Turning back to the posts table created and populated with data as discussed in the first article of this series, you might create the following dictionary to represent the records stored in this table:

posts = {}
posts["Layouts in Ext JS"] = {"guid":"http://www.packtpub.com/article/layouts-in-ext-
			      js","pubDate":"Fri, 28 Nov 2008 10:31:03 +0000"}

posts[“WordPress Plug-in Development (Beginner’s Guide)”] = {“guid”:”http://www.packtpub.com/
wordpress-plug-in-development”,”pubDate”:”Fri, 28 Nov 2008 00:00:00 +0000″}

For clarity, you manually set up the dictionary here. In reality, though, you would most likely populate it with the data obtained from the database or the Web.

In the above example, the posts dictionary uses the values of the title column in the posts table as the dictionary’s keys. The dictionary’s values are also dictionaries each of which represents the rest of a record, containing the guid and pubDate fields.

Since the keys within a dictionary cannot be repeated, the above approach guarantees uniqueness of the title field in the posts’ records represented in the dictionary. Now to obtain a certain record, you can use its key like this:

rec = posts["Layouts in Ext JS"]
print rec

This should produce the following output:

{'guid':'http://www.packtpub.com/article/layouts-in-
                      ext-js','pubDate':'Fri, 28 Nov 2008 10:31:03 +0000'}

If you need to get to a certain field in the obtained record, you could use the following syntax:

guid = posts["Layouts in Ext JS"]["guid"]
print guid

The above should give you the following:

http://www.packtpub.com/article/layouts-in-ext-js

To iterate over all of the records in the posts dictionary, you could use a for loop. Here is how you could iterate over the guid field, for example:

for post in posts.items():
       print post[1]['guid']

Note that the value of the first index of the post variable representing a key/value pair of the dictionary is set to 1, meaning you’re interested in the value part of the pair. The above should generate the following output:

http://www.packtpub.com/article/layouts-in-ext-js
http://www.packtpub.com/wordpress-plug-in-development

If you want to iterate over the dictionary keys, you could use the following code:

for post in posts.items():
      print post[0]

This should give you the following lines:

Layouts in Ext JS
WordPress Plug-in Development (Beginner's Guide)

Now that you have an idea of how database data can be represented in Python, let’s look at an example of how you might persist it to a database. Here is a quick example that illustrates how you might persist the posts dictionary to the posts database table.

import MySQLdb
db=MySQLdb.connect(host="localhost",user="usrsample",passwd="pswd",db="dbsample")
c=db.cursor()
for post in posts.items():
      c.execute("""INSERT INTO posts (title, guid, pubDate) VALUES
                          (%s,%s,%s)""", (post[0], post[1]['guid'], post[1]['pubDate']))
db.commit()
db.close()

Assuming you have populated the posts dictionary with data as discussed at the beginning of the article, the above code should insert two records into the posts table.

Fetching Database Records

You typically persist data to the database in order to retrieve it from there later. How can you retrieve data from the posts table? The following script answers this question:

import MySQLdb
db=MySQLdb.connect(host="localhost",user="usrsample",passwd="pswd",db="dbsample")
c=db.cursor()
c.execute("SELECT * FROM posts")
c.fetchall()

The cursor’s fetchall method in the above code fetches all the rows retrieved by the query, making them available as a list of tuples. To iterate over this list, you could use the following loop:

for row in c:
   print row
db.close()

This should produce the following output:

('Layouts in Ext JS','http://www.packtpub.com/article/layouts-in-
                                         ext-js','Fri, 28 Nov 2008 10:31:03 +0000')
("WordPress Plug-in Development (Beginner's Guide)",'http://www.packtpub.com/wordpress-plug-
          in-development','Fri, 28 Nov 2008 00:00:00 +0000')

As you can see, each line in the above output represents a tuple rather than a dictionary. To have a dictionary instead, you will need to obtain the column names along with the rows being fetched. You can do this with the help of the cursor’s description read-only attribute, as illustrated in the updated script below:

import MySQLdb
db=MySQLdb.connect(host="localhost",user="usrsample",passwd="pswd",db="dbsample")
c=db.cursor()
c.execute("SELECT * FROM posts")
heads = [d[0] for d in c.description]
c.fetchall()
for row in c:
   print dict(zip(heads,row))
db.close()

The output should give you a set of dictionaries each of which represents a record in the posts table. But how can you get the data so that it is structured as it were in the posts dictionary discussed at the beginning of the article? To do this, you could revise the above script as follows:

import MySQLdb
db=MySQLdb.connect(host="localhost",user="usrsample",passwd="pswd",db="dbsample")
c=db.cursor()
c.execute("SELECT * FROM posts")
heads = [d[0] for index, d in enumerate(c.description) if index > 0]
c.fetchall()
print heads
posts={}
for row in c:
   posts[row[0]]= dict(zip(heads,[r for index, r in enumerate(row) if index > 0]))
print posts
db.close()

Notice the use of comprehension lists in the above code. First time, you use it to exclude the first column head from the heads list. Then, you use a similar technique to exclude the first field from each row when iterating fetched rows in the loop. As a result, you should have the same posts dictionary as you saw at the beginning of the article.

Customizing Built-in Types to Simulate Trigger Functionality

In the world of relational databases, triggers are programs stored inside the database, which run implicitly in response to a certain event. For example, you can define a BEFORE INSERT trigger on a certain table, so that it fires just before a new record is inserted into that table. It is interesting to note that triggers can be used in MySQL starting with version 5.0. If you have an older MySQL version, you won’t be able to take advantage of triggers. In that case, though, you still can simulate trigger functionality on the Python side of your application.

So, you want to define triggers on the data structures implemented in Python, much like you would do that in the underlying database. To achieve this, you could for example subclass the Python’s dict built-in type, overriding the __setitem__ method so that it takes the appropriate action implicitly whenever a new item is added. Next, you could use this customized dict’s subclass instead of dict.

Consider the following example. Suppose you want to implement the BEFORE INSERT trigger functionality on the posts dictionary, so that it restricts inserting new items to those that represent an article from the Packt Article Network. To achieve this, you will need to override the dict’s __setitem__ method so that it checks to see whether the value of the item’s guid includes the following fragment: http://www.packtpub.com/article/. Below, you create the dict’s subclass called artdict, and then use this subclass to create the posts dictionary, populating it with the same data you used at the beginning of the article:

class artdict(dict):
  def __setitem__(self, key, value):
      x = 'http://www.packtpub.com/article/'
      if (x in value['guid']):
        super(artdict, self).__setitem__(key, value)

posts=artdict()

posts[“Layouts in Ext JS”] = {“guid”:”http://www.packtpub.com/article/layouts-in-ext-
js”,”pubDate”:”Fri, 28 Nov 2008 10:31:03 +0000″}

posts[“WordPress Plug-in Development (Beginner’s Guide)”] = {“guid”:”http://www.packtpub.com/
wordpress-plug-in-development”,”pubDate”:”Fri, 28 Nov 2008 00:00:00 +0000″}

print posts

Although you have tried to insert two records into the posts dictionary, only the first insertion should have succeeded. So the print should generate the following output:

{"Layouts in Ext JS":{"guid":"http://www.packtpub.com/article/layouts-in-
                                     ext-js","pubDate":"Fri, 28 Nov 2008 10:31:03 +0000"}}

The second item, whose title is WordPress Plug-in Development (Beginner’s Guide), was excluded because its guid does not include substring http://www.packtpub.com/article/. This is because this item is not associated with an article but the book.

Summary

As you learned in this article, Python language provides a wide variety of useful tools to deal with structured data. You can utilize lists, tuples, list comprehensions, and dictionaries when it comes accessing and manipulating data stored in the underlying database. You can even customize the above built-in types to meet the requirements of your application.

LEAVE A REPLY

Please enter your comment!
Please enter your name here