This article mini-series by Matt Butcher looks at the Python application programming interface (API) for the LDAP libraries. Using this API, we will connect to our OpenLDAP server and manipulate the directory information tree. More specifically, we will cover the following in this article series:

Installing and configuring the Python-LDAP library.
Binding to an LDAP directory.
Comparing attributes between the client and server.
Performing searches on the directory.
Modifying the directory information tree with add, delete, and modify operations.
Modifying directory passwords.
Working with LDAP schemas.

This first part will deal with installation and configuration of the Python-LDAP library. We will then see how the binding operation is performed.

Installing Python-LDAP

There are a couple of LDAP libraries available for Python, but the most popular is the Python-LDAP module, which (as with the PHP API) uses the OpenLDAP C library as a base for providing network access to an LDAP server.

Like OpenLDAP, the Python-LDAP API is Open Source. It works on Linux, Windows, Mac OS X, BSD, and probably other UNIX operating systems as well (that is, platforms that have both Python and OpenLDAP available). The source code is available at the official Python-LDAP website: http://python-ldap.sourceforge.net. Pre-compiled binaries for many platforms are also available there, but we will install the version in the Ubuntu repository.

Before installing Python-LDAP, you will need to have the Python scripting language installed. Typically, this is installed by default on Ubuntu (and on most flavors of Linux). Installing Python-LDAP requires only one command:

$ sudo apt-get install python-ldap

This will choose the module or modules that match the installed Python version. That is, if you are running Python 2.4 (the stable version, at the time of writing), this will install the python2.4-ldap package.

The library, which consists of several Python packages, will be installed into /usr/lib/python2.4/site-packages/.

In Ubuntu, there is no need to run further configuration in order to make use of the Python-LDAP library. We are ready to dive into the API.

If you install by hand, either from source or from the binary packages, you may need to add the Python-LDAP library to your Python path. See the Python documentation for details.
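If the import fails after a manual install, a quick check along the following lines can help confirm whether Python can see the module at all. This is only a minimal sketch, written for a current Python 3 interpreter; the helper name module_location and the placeholder directory in the comment are illustrative assumptions, not part of Python-LDAP.

```python
import importlib.util


def module_location(name):
    """Return the file Python would load `name` from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# If the library lives in a non-standard directory, that directory can be
# added to the search path before importing, for example with
#     import sys; sys.path.append("/some/placeholder/dir")
# or by setting the PYTHONPATH environment variable.

location = module_location("ldap")
if location is None:
    print("ldap module not found on the Python path")
else:
    print("ldap module found at %s" % location)
```

The same helper works for any top-level module name, so it can also be used to check which copy of a module is picked up when several are installed.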

The Python-LDAP API is well documented. The documentation is available online at the official Python-LDAP website: http://python-ldap.sourceforge.net/docs.shtml. You may find it more convenient to download a copy of the documentation and use it locally.

In previous versions of Ubuntu, the Python-LDAP documentation was available in the python-ldap-doc package, which could be installed with apt-get.

Also, many of the Python-LDAP functions and objects have documentation strings that can be accessed from the Python interpreter like this:

>>> print ldap.initialize.__doc__

  Return LDAPObject instance by opening LDAP connection to
  LDAP host specified by LDAP URL
  
  Parameters:
  uri
        LDAP URL containing at least connection scheme and hostport,
        e.g. ldap://localhost:389
  trace_level
        If non-zero a trace output of LDAP calls is generated.
  trace_file
        File object where to write the trace output to.
        Default is to use stdout.

The documentation string usually contains a brief description of the function or object, and is a useful quick reference.
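Docstrings can also be read programmatically, which is handy when scanning many functions at once. The following is a small sketch using the standard inspect module, written for Python 3; the helper name doc_summary and the example function are made up for illustration.

```python
import inspect


def doc_summary(obj):
    """Return the first line of an object's docstring, or '' if it has none."""
    doc = inspect.getdoc(obj)  # strips indentation; None when there is no docstring
    if not doc:
        return ""
    return doc.splitlines()[0]


def example():
    """Say hello.

    Longer description that doc_summary() leaves out.
    """


print(doc_summary(example))
```

With Python-LDAP installed, the same helper could be pointed at ldap.initialize or at any other function or class in the library.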

A Quick Overview of the Python LDAP API

Now that the package is installed, let's take a quick look at what was installed. The Python-LDAP package comes with nine different modules:

ldap: This is the main LDAP module. It contains the functions necessary for performing LDAP operations, such as binding, searching, adding, and modifying.
ldap.async: Python-LDAP can perform both synchronous and asynchronous operations. This module provides utilities that are useful when performing asynchronous operations.
ldap.cidict: This contains the cidict class, which is a case-insensitive dictionary. Because LDAP is case-insensitive when it comes to attribute names, it is often necessary to perform case-insensitive operations on dictionary keys.
ldap.modlist: Utility functions for creating modification records (for performing the LDAP modify operation) are in this package.
ldap.filter: This module provides a couple of utility functions for creating LDAP search filters.
ldap.sasl: Python-LDAP's SASL support is partially contained in this package. It is not documented in the online documentation, but there are plenty of notes in the doc strings in this module.
ldap.schema: This module contains classes that describe the subschema subentry records. It can be used to access schema information.
ldapurl: This module provides a class for generating and parsing LDAP URLs.
ldif: This module is used to parse or write LDIF-formatted LDAP records.

Most of the commonly used LDAP features are in the ldap module, and we will focus mainly on using that. Since many of the submodules have only a couple of functions, we will use them in passing but will not treat them as separate objects of discussion.
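To make the idea behind a case-insensitive dictionary concrete, here is a rough sketch of how such a mapping can behave. It is not the implementation of ldap.cidict; the class name CaseInsensitiveDict and the sample entry are assumptions made for this illustration, and the code is written for Python 3.

```python
from collections.abc import MutableMapping


class CaseInsensitiveDict(MutableMapping):
    """Mapping whose string keys compare case-insensitively (illustration only)."""

    def __init__(self, initial=None):
        self._data = {}  # lower-cased key -> (original key, value)
        for key, value in (initial or {}).items():
            self[key] = value

    def __setitem__(self, key, value):
        self._data[key.lower()] = (key, value)

    def __getitem__(self, key):
        return self._data[key.lower()][1]

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        # Iterate over the keys as they were originally written.
        return (original for original, _ in self._data.values())

    def __len__(self):
        return len(self._data)


# Attribute names as a server might return them, looked up with other casing.
entry = CaseInsensitiveDict({"givenName": ["Matt"], "sn": ["Butcher"]})
print(entry["givenname"])
print("SN" in entry)
```

Storing the original spelling next to the value keeps the keys readable when the entry is printed or written back out, while lookups ignore case.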

A Note on the Python Examples

The Python interpreter (python) can be run interactively. Running Python in an interactive mode can be very useful for discovery and debugging. Further, since it prints useful information directly to the console, it can be useful for demonstration purposes.

In many of the examples below, the code is shown as it would be entered in the interactive shell. Here is an example:

>>> h = "Hello World"
>>> h
'Hello World'
>>> print h
Hello World

Lines that begin with >>> and ... are interpreter prompts (similar to $ in shell). Examples with the >>> prompt are run in the interpreter interactively. Some code, however, will be typed into a file (as usual). This code will not have lines beginning with the interpreter prompt. It tends to look more like this:

h = "Hello World"
print h

Most of the time, features are introduced using the interpreter, but lengthier examples are done in the form of a Python script.

Where it might be confusing, I will explicitly say in the text which of the two methods I am using.

Connecting and Binding to the Directory

Now that we have the library installed, we are ready to use the API. The Python-LDAP API connects and binds in two stages. Initializing the LDAP system is done with the ldap.initialize() function. The initialize() function returns an LDAPObject object, which contains methods for performing LDAP operations and retrieving information about the LDAP connection and transactions. A basic initialization is done like this:

>>> import ldap
>>> con = ldap.initialize('ldap://localhost')

The first line of this example imports the ldap module, which contains the initialize() function as well as the LDAPObject class that we will make frequent use of.

The second line initializes the LDAP code, and returns an LDAPObject that we will use to connect to the server. The initialize() function takes a simple LDAP URL (protocol://host:port) as a parameter.

Sometimes, you may prefer to pass in simply host and port information. This can be done with the ldap.open(host, port) function, which also returns an LDAPObject object. In addition, if you need to check or set any LDAP options, you should use the get_option() and set_option() functions before binding. For instance, we can set the connection to require a TLS certificate by setting the OPT_X_TLS_DEMAND option:

>>> con.get_option(ldap.OPT_X_TLS_DEMAND)
0
>>> con.set_option(ldap.OPT_X_TLS_DEMAND, True)
>>> con.get_option(ldap.OPT_X_TLS_DEMAND)
1

A Safe Connection

In most production environments, security is a major concern. As we have seen in previous chapters, one major component of security in network-based LDAP services is the use of SSL/TLS-based connections.

There are two ways to get transport-layer security with the Python-LDAP module. The first is to connect to the LDAPS (LDAP over SSL) port. This is done by passing the correct parameter to the initialize() function. Instead of using the ldap:// protocol, which will make an unverified, unencrypted connection to port 389, use the ldaps:// protocol, which will make an SSL connection to port 636 (you can specify an alternate port by appending a colon (:) and then the port number to the end of the URL).
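The relationship between the URL scheme and the default port can be sketched with the standard library alone. The snippet below is an illustration written for Python 3, not how Python-LDAP parses URLs internally (the library's own ldapurl module is the fuller tool); the helper name host_and_port and the example hosts are assumptions.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed above.
DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def host_and_port(uri):
    """Return (scheme, host, port) for a simple LDAP URL, filling in the default port."""
    parts = urlsplit(uri)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("not an ldap:// or ldaps:// URL: %r" % uri)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port


print(host_and_port("ldap://localhost"))
print(host_and_port("ldaps://localhost"))
print(host_and_port("ldaps://ldap.example.com:1636"))
```

A check like this can be useful in a script that accepts a server URL from a user and wants to warn when the unencrypted scheme is chosen.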

Or, instead of using LDAPS, you can perform a Start TLS operation before binding to the server:

>>> import ldap
>>> con = ldap.initialize('ldap://localhost')
>>> con.start_tls_s()

Note that while the call to ldap.initialize() does not actually open a connection, the call to con.start_tls_s() does.

Exceptions

Connecting to an LDAP server may result in the raising of an exception, so in production code, it is best to wrap the connection attempt inside of a try/except block. Here is a fragment of a script:

#!/usr/bin/env python

import ldap, sys

server = 'ldap://localhost'
l = ldap.initialize(server)
try:
    l.start_tls_s()
except ldap.LDAPError, e:
    if type(e.message) == dict and e.message.has_key('desc'):
        print e.message['desc']
    else:
        print e
    sys.exit()

In the case above, if the start_tls_s() method results in an error, it will be caught. The except clause checks if the returned message is a dict (which it should always be), and also checks if it has the description ('desc') field. If so, it prints the description. Otherwise, it prints the entire message.

There are a few dozen exceptions that the Python-LDAP library might raise, but all of them are subclasses of the LDAPError class, and can be caught by the line:

except ldap.LDAPError, e:

Within an LDAPError object, there is a dictionary, called message, which contains the 'info' and 'desc' fields. The 'info' field contains the information returned from the server, and the 'desc' field contains a description of the error.

In general, it is best to use try/except blocks around LDAP operations in order to catch any errors that might occur during processing.
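Because the same dictionary check tends to be repeated in every except clause, it can be convenient to move it into a small helper. The sketch below is written for Python 3 and assumes that the dictionary is carried as the exception's first argument; the function name describe_error is hypothetical, and the demonstration uses a plain Exception as a stand-in so that it runs without a server.

```python
def describe_error(exc):
    """Build a one-line description from an LDAPError-style exception."""
    # Assumption: the 'desc'/'info' dictionary is the first exception argument.
    payload = exc.args[0] if exc.args else None
    if isinstance(payload, dict):
        parts = [payload.get("desc"), payload.get("info")]
        parts = [text for text in parts if text]
        if parts:
            return ": ".join(parts)
    # Fall back to the default string form for anything unexpected.
    return str(exc)


# Stand-in exception shaped like the error dictionaries described above.
fake = Exception({"desc": "Invalid credentials", "info": "SASL(-13): user not found"})
print(describe_error(fake))
```

In a real script the helper would be called from inside an except ldap.LDAPError block, keeping the error-handling code short and consistent.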

Binding

Once we have an LDAPObject instance, we can bind to the LDAP directory. The Python-LDAP API supports both simple and SASL binding methods, and there are five different bind methods:

bind(): Takes three required parameters: a DN, a password (or credential, for SASL), and a constant indicating what type of bind method to use. Currently, only ldap.AUTH_SIMPLE is supported. This is asynchronous. Example: con.bind(dn, pw, ldap.AUTH_SIMPLE)
bind_s(): This is the same as above, but it is synchronous, and returns information about the status of the bind.
simple_bind(): This performs a simple bind. This has two optional parameters: DN and password. If no parameter is specified, this will bind as anonymous. This is asynchronous.
simple_bind_s(): This is the synchronous version of the above.
sasl_interactive_bind_s(): This performs an SASL bind, and it takes two parameters: a DN string (usually left empty) and an sasl object that carries the SASL authentication information.

A note on naming: for many Python-LDAP functions, including almost all of the LDAP operations, there are both synchronous and asynchronous versions. Synchronous versions, which will block until the server returns a result, have method names that end with _s.

The other operations – those that do not end with _s – are asynchronous. An asynchronous method will begin an operation, and then return control to the program. The operation will continue in the background. It is the responsibility of the program to periodically check on the operation to see if it has been completed.

Since they wait to return any results until the operation has been completed, synchronous methods will often have different return values than their asynchronous counterparts. Synchronous methods may return the results obtained from the server, or they may have void returns. Asynchronous methods, on the other hand, will always return a message identifier. This identifier can be used to access the results of the operation.

Here's an example of the different results for the two different forms of simple bind. First, the synchronous bind:

>>> dn = "uid=matt,ou=users,dc=example,dc=com"
>>> pw = "secret"
>>> con.simple_bind_s( dn, pw ) 
(97, [])
>>>

Notice that this method returns a tuple. Now, look at the asynchronous version:

>>> con.simple_bind( dn, pw )
8
>>> con.result(8)
(97, [])

In this case, the simple_bind() method returned 8 – the message identification number for the operation. We can use the result() method to fetch the resulting information. The result() method returns a two-item tuple, where the first item is the result type (97 is the value of ldap.RES_BIND, which marks a bind response), and the second is a list of messages from the server. In this case, the list is empty.

Notes on Getting Results
There are two noteworthy caveats about fetching results. First, a particular result can only be fetched once. You cannot call result() with the same message ID multiple times. Second, you can execute multiple asynchronous operations without checking the results. The consequence of doing this is that all of the results will be stored until they are fetched. This consumes memory, and can lead to confusing results if result() or result( ldap.RES_ANY ) is called.
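One way to handle an asynchronous operation is to poll for its result at intervals, fetching it exactly once when it is ready. The following sketch, written for Python 3, assumes that result(msgid, all, timeout) called with a timeout of 0 returns (None, None) while the operation is still pending. The helper name wait_for_result and the FakeConnection class are hypothetical; the fake object stands in for an LDAPObject so that the example runs without a server.

```python
import time


def wait_for_result(con, msgid, poll_interval=0.1, max_wait=5.0):
    """Poll con.result() until the operation identified by msgid has finished."""
    deadline = time.time() + max_wait
    while True:
        # timeout=0 asks result() to poll instead of blocking (assumed behaviour).
        result_type, result_data = con.result(msgid, 1, 0)
        if result_type is not None:
            return result_type, result_data
        if time.time() >= deadline:
            raise RuntimeError("no result for message %r within %s seconds" % (msgid, max_wait))
        time.sleep(poll_interval)  # the program could do other work here instead


class FakeConnection(object):
    """Stand-in for an LDAPObject: reports 'pending' a few times, then a result."""

    def __init__(self, polls_before_ready):
        self._remaining = polls_before_ready

    def result(self, msgid, all=1, timeout=None):
        if self._remaining > 0:
            self._remaining -= 1
            return (None, None)
        return (97, [])


print(wait_for_result(FakeConnection(2), 8, poll_interval=0.01))
```

Wrapping the polling in one function also makes it harder to fetch the same message ID twice by accident, since the result is returned to the caller in a single place.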

Later in this chapter, we will see more sophisticated uses of synchronous and asynchronous methods, but for now we will continue looking at methods of binding.

The bind() and bind_s() methods work the same way, but they require a third parameter, specifying which sort of authentication mechanism to use. Unfortunately, at the time of this writing, only the AUTH_SIMPLE form of binding (plain old simple bind) is supported by this mechanism:

>>> con.bind_s( dn, pw, ldap.AUTH_SIMPLE ) 
(97, [])

This performs a simple bind to the server.

Exceptions

A bind can fail for a number of reasons, the most common being that the connection failed (the CONNECT_ERROR exception) or authentication failed (INVALID_CREDENTIALS). In production code, it is a good idea to check for these exceptions using try/except blocks. By checking for them separately, you can distinguish between, say, authentication failures and other, more serious failures:

l = ldap.initialize(server)
try: 
    #l.start_tls_s()
    l.bind_s(user_dn, user_pw)
except ldap.INVALID_CREDENTIALS:
    print "Your username or password is incorrect."
    sys.exit()
except ldap.LDAPError, e:
    if type(e.message) == dict and e.message.has_key('desc'):
        print e.message['desc']
    else: 
        print e
    sys.exit()

In this case, if the failure is due to the user entering the wrong DN or password, a message to that effect is printed. Otherwise, the error description provided by the LDAP library is printed.
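The order of the except clauses matters here: because every library exception is a subclass of LDAPError, the specific classes have to be listed before the general one. The sketch below, written for Python 3, shows that ordering with stand-in exception classes so that it runs without Python-LDAP installed. The three classes mirror the names used above but are defined locally, and bind_outcome and raiser are hypothetical helpers.

```python
class LDAPError(Exception):
    """Stand-in for ldap.LDAPError."""


class INVALID_CREDENTIALS(LDAPError):
    """Stand-in for ldap.INVALID_CREDENTIALS."""


class CONNECT_ERROR(LDAPError):
    """Stand-in for ldap.CONNECT_ERROR."""


def bind_outcome(bind_call):
    """Run bind_call() and classify what happened as a short string."""
    try:
        bind_call()
    except INVALID_CREDENTIALS:
        # Most specific class first: the user can fix this by retrying.
        return "bad credentials"
    except CONNECT_ERROR:
        return "cannot connect"
    except LDAPError:
        # Catch-all for every other library error; it must come last.
        return "other LDAP error"
    return "bound"


def raiser(exc):
    """Return a callable that raises exc, to simulate a failing bind."""
    def call():
        raise exc
    return call


print(bind_outcome(lambda: None))
print(bind_outcome(raiser(INVALID_CREDENTIALS())))
print(bind_outcome(raiser(LDAPError())))
```

With the real library, the stand-in classes would be replaced by the ones in the ldap module, and bind_call would wrap a call such as con.simple_bind_s(dn, pw).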

SASL Interactive Binds

SASL is a robust authentication mechanism, but the flexibility and adaptability of SASL come at the cost of additional complexity.

This additional complexity is evident in the Python-LDAP module. SASL binding is implemented differently than the other bind methods. First, there is no asynchronous version of the SASL bind method (not all thread safety issues have been worked out in this module, yet).

Since the SASL code is not as stable as the rest of the API, you may want to stick to simple binding (with SSL/TLS protection) rather than rely upon SASL support.

There is only one SASL binding method, sasl_interactive_bind_s(). This method takes two arguments. The first is a DN string. It is almost always left blank, since with SASL, we usually authenticate with some other identifier. The second argument is an sasl object (or a subclass of an sasl object).

The sasl object contains a dictionary of information that the SASL subsystem uses to perform authentication. Each different SASL mechanism is implemented as a class that is a subclass of the sasl object. There are a handful of different subclasses that come with the Python-LDAP module, though you can create your own if you need support for a different mechanism.

cram_md5: This class implements the CRAM-MD5 SASL mechanism. A new cram_md5 object can be created with a constructor that passes in the authentication ID, a password, and an optional authorization ID.
digest_md5: This implements the DIGEST-MD5 SASL mechanism. Like cram_md5(), this object can be constructed with an authentication ID, a password, and an optional authorization ID.
gssapi: This implements the GSSAPI mechanism, and its constructor takes only the optional authorization ID. It is used to perform Kerberos V authentication.
external: This implements the EXTERNAL SASL mechanism, which uses an underlying transport security mechanism (like SSL/TLS). Its constructor only takes the optional authorization ID.

Our LDAP server is configured to allow DIGEST-MD5 SASL connections, so we will walk through an example of performing this sort of SASL authentication.

>>> import ldap
>>> import ldap.sasl
>>> user_name = "matt"
>>> pw = "secret"
>>> 
>>> con = ldap.initialize("ldap://localhost")
>>> auth_tokens = ldap.sasl.digest_md5( user_name, pw )
>>> 
>>> con.sasl_interactive_bind_s( "", auth_tokens )
0

To begin with, we import the ldap and ldap.sasl packages, and we store the user name and password information in a couple of variables.

After initializing a connection, we need to create a new sasl object – one that will contain the information necessary to perform DIGEST-MD5 authentication. We do this by constructing a new digest_md5 object:

>>> auth_tokens = ldap.sasl.digest_md5( user_name, pw )

Now, auth_tokens points to our new SASL object. Next, we need to bind. This is done with the sasl_interactive_bind_s() method of the LDAPObject:

>>> con.sasl_interactive_bind_s( "", auth_tokens )

If a SASL interactive bind is successful, then this method will return an integer. Otherwise, an INVALID_CREDENTIALS exception will be raised:

>>> auth_tokens = ldap.sasl.digest_md5( "foo", pw )
>>> try:
...   con.sasl_interactive_bind_s( "", auth_tokens )
... except ldap.INVALID_CREDENTIALS, e :
...   print e
... 
{'info': 'SASL(-13): user not found: no secret in database', 'desc': 'Invalid credentials'}

In this case, the user foo was not found in the SASL DB, and the SASL subsystem returned an error.