7 min read

If you were to create an API for Python, you should write it using Cython to create a more type-safe Python API. Or, you could take the C types from Cython to implement the same algorithms in your Python code, and they will be faster because you’re specifying the types and you avoid a lot of the type conversion required.

Consider you are implementing a fresh project in C. There are a few issues we always come across in starting fresh; for example, choosing the logging or configuration system we will use or implement.

With Cython, we can reuse the Python logging system as well as the ConfigParser standard libraries from Python in our C code to get a head start. If this doesn’t prove to be the correct solution, we can chop and change easily. We can even extend and get Python to handle all usage. Since the Python API is very powerful, we might as well make Python do as much as it can to get us off the ground. Another question is do we want Python be our “driver” (main entry function) or do we want to handle this from our C code?

Cython cdef

In the next two examples, I will demonstrate how we can reuse the Python logging and Python ConfigParser modules directly from C code. But there are a few formalities to get over first, namely the Python initialization API and the link load model for fully embedded Python applications for using the shared library method.

It’s very simple to embed Python within a C/C++ application; you will require the following boilerplate:

#include <Python.h>
int main (int argc, char ** argv)
{
Py_SetProgramName (argv [0]);
Py_Initialize ();

/* Do all your stuff in side here...*/
Py_Finalize ();
return 0;
}

Make sure you always put the Python.h header at the very beginning of each C file, because Python contains a lot of headers defined for system headers to turn things on and off to make things behave correctly on your system.

Later, I will introduce some important concepts about the GIL that you should know and the relevant Python API code you will need to use from time to time. But for now, these few calls will be enough for you to get off the ground.

Linking models

Linking models are extremely important when considering how we can extend or embed things in native applications. There are two main linking models for Cython: fully embedded Python and code, which looks like the following figure:

This demonstrates a fully embedded Python application where the Python runtime is linked into the final binary. This means we already have the Python runtime, whereas before we had to run the Python interpreter to call into our Cython module. There is also a Python shared object module as shown in the following figure:

We have now fully modularized Python. This would be a more Pythonic approach to Cython, and if your code base is mostly Python, this is the approach you should take if you simply want to have a native module to call into some native code, as this lends your code to be more dynamic and reusable.

The public keyword

Moving on from linking models, we should next look at the public keyword, which allows Cython to generate a C/C++ header file that we can include with the prototypes to call directly into Python code from C.

The main caveat if you’re going to call Python public declarations directly from C is if your link model is fully embedded and linked against libpython.so; you need to use the boilerplate code as shown in the previous section. And before calling anything with the function, you need to initialize the Python module example if you have a cythonfile.pyx file and compile it with public declarations such as the following:

cdef public void cythonFunction ():
print "inside cython function!!!"

You will not only get a cythonfile.c file but also cythonfile.h; this declares a function called extern void initcythonfile (void). So, before calling anything to do with the Cython code, use the following:

/* Boiler plate init Python */
Py_SetProgramName (argv [0]);
Py_Initialize ();
/* Init our config module into Python memory */

initpublicTest ();
cythonFunction ();

/* cleanup python before exit ... */
Py_Finalize ();

Calling initcythonfile can be considered as the following in Python:

import cythonfile

Just like the previous examples, this only affects you if you’re generating a fully embedded Python binary.

Logging into Python

A good example of Cython’s abilities in my opinion is reusing the Python logging module directly from C. So, for example, we want a few macros we can rely on, such as info (…) that can handle VA_ARGS and feels as if we are calling a simple printf method.

I think that after this example, you should start to see how things might work when mixing C and Python now that the cdef and public keywords start to bring things to life:

import logging
cdef public void initLogging (char * logfile):
logging.basicConfig (filename = logfile,
level = logging.DEBUG,
format = '%(levelname)s %(asctime)s:
%(message)s',
datefmt = '%m/%d/%Y %I:%M:%S')
cdef public void pyinfo (char * message):
logging.info (message)
cdef public void pydebug (char * message):
logging.debug (message)
cdef public void pyerror (char * message):
logging.error (message)

This could serve as a simple wrapper for calling directly into the Python logger, but we can make this even more awesome in our C code with C99 __VA_ARGS__ and an attribute that is similar to GCC printf. This will make it look and work just like any function that is similar to printf. We can define some headers to wrap our calls to this in C as follows:

#ifndef __MAIN_H__
#define __MAIN_H__
#include <Python.h>
#include <stdio.h>
#include <stdarg.h>
#define printflike
__attribute__ ((format (printf, 3, 4)))

extern void printflike cinfo (const char *, unsigned, const char *,
...);
extern void printflike cdebug (const char *, unsigned, const char *,
...);
extern void printflike cerror (const char *, unsigned, const char *,
...);
#define info(...)
cinfo (__FILE__, __LINE__, __VA_ARGS__)
#define error(...)
cerror (__FILE__, __LINE__, __VA_ARGS__)
#define debug(...)
cdebug (__FILE__, __LINE__, __VA_ARGS__)
#include "logger.h" // remember to import our cython public's
#endif //__MAIN_H__

Now we have these macros calling cinfo and the rest, and we can see the file and line number where we call these logging functions:

void cdebug (const char * file, unsigned line,
const char * fmt, ...)
{
char buffer [256];
va_list args;

va_start (args, fmt);
vsprintf (buffer, fmt, args);
va_end (args);
char buf [512];
snprintf (buf, sizeof (buf), "%s-%i -> %s",
file, line, buffer);
pydebug (buf);

}

On calling debug (“debug message”), we see the following output:

Philips-MacBook:cpy-logging redbrain$ ./example log
Philips-MacBook:cpy-logging redbrain$ cat log
INFO 05/06/2013 12:28:24: main.c-62 -> info message
DEBUG 05/06/2013 12:28:24: main.c-63 -> debug message
ERROR 05/06/2013 12:28:24: main.c-64 -> error message

Also, you should note that we import and do everything we would do in Python as we would in here, so don’t be afraid to make lists or classes and use these to help out. Remember if you had a Cython module with public declarations calling into the logging module, this integrates your applications as if it were one.

More importantly, you only need all of this boilerplate when you fully embed Python, not when you compile your module to a shared library.

Python ConfigParser

Another useful case is to make Python’s ConfigParser accessible in some way from C; ideally, all we really want is to have a function to which we pass the path to a config file to receive a STATUS OK/FAIL message and a filled buffer of the configuration that we need:

from ConfigParser import SafeConfigParser, NoSectionError
cdef extern from "main.h":
struct config:
char * path
int number
cdef config myconfig

Here, we’ve Cythoned our struct and declared an instance on the stack for easier management:

cdef public config * parseConfig (char * cfg):
# initialize the global stack variable for our config...

myconfig.path = NULL
myconfig.number = 0
# buffers for assigning python types into C types
cdef char * path = NULL
cdef number = 0
parser = SafeConfigParser ()
try:
parser.readfp (open (cfg))
pynumber = int (parser.get ("example", "number"))
pypath = parser.get ("example", "path")
except NoSectionError:
print "No section named example"
return NULL
except IOError:
print "no such file ", cfg
return NULL
finally:
myconfig.number = pynumber
myconfig.path = pypath
return &myconfig

This is a fairly trivial piece of Cython code that will return NULL on error as well as the pointer to the struct containing the configuration:

Philips-MacBook:cpy-configparser redbrain$ ./example sample.cfg
cfg->path = some/path/to/something
cfg-number = 15

As you can see, we easily parsed a config file without using any C code. I always found figuring out how I was going to parse config files in C to be a nightmare. I usually ended up writing my own mini domain-specific language using Flex and Bison as a parser as well as my own middle-end, which is just too involved.

LEAVE A REPLY

Please enter your comment!
Please enter your name here