The strange relationship between objects, functions, generators and coroutines

In this article, I’d like to investigate some relationships between functions, objects, generators and coroutines in Python. At a theoretical level, these are very different concepts, but because of Python’s dynamic nature, many of them can appear to be used interchangeably. I discuss useful applications of all of these in my book, Python 3 Object-oriented Programming – Second Edition. In this essay, we’ll examine their relationship in a more whimsical light; most of the code examples below are ridiculous and should not be attempted in a production setting!

Let’s start with functions, which are the simplest. A function is an object that can be executed. When executed, the function is entered at exactly one place, accepting a group of (possibly zero) objects as arguments. The function exits at exactly one place and always returns a single object.

Already we see some complications; that last sentence is true, but you might have several counter-questions:

  • What if a function has multiple return statements?
    Only one of them will be executed in any one call to the function.
  • What if the function doesn’t return anything?
    Then it will implicitly return the None object.
  • Can’t you return multiple objects separated by a comma?
    Yes, but the returned object is actually a single tuple (see the sketch just below this list).
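
Here’s a minimal sketch of that last point, using a hypothetical min_max function:

def min_max(sequence):
    # this looks like two return values, but it is really one tuple
    return min(sequence), max(sequence)

result = min_max([3, 5, 8, 7])
print(result)        # (3, 8) -- a single tuple object
low, high = result   # which can be unpacked at the call site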

Here’s a look at a function:

def average(sequence):
    avg = sum(sequence) / len(sequence)
    return avg

print(average([3, 5, 8, 7]))

which outputs:

5.75

That’s probably nothing you haven’t seen before. Similarly, you probably know what an object and a class are in Python. I define an object as a collection of data and associated behaviors. A class represents the “template” for an object. Usually the data is represented as a set of attributes and the behavior is represented as a collection of method functions, but this doesn’t have to be true.

Here’s a basic object:

class Statistics:
    def __init__(self, sequence):
        self.sequence = sequence

    def calculate_average(self):
        return sum(self.sequence) / len(self.sequence)

    def calculate_median(self):
        length = len(self.sequence)
        is_middle = int(not length % 2)
        return (
            self.sequence[length // 2 - is_middle] + 
            self.sequence[-length // 2]) / 2

statistics = Statistics([5, 2, 3])
print(statistics.calculate_average())

which outputs:

3.3333333333333335

This object has one piece of data attached to it: the sequence. It also has two methods besides the initializer. Only one of those methods is used in this particular example, but as Jack Diederich said in his famous Stop Writing Classes talk, a class with only one method besides the initializer should just be a function. So I included a second one to make it look like a useful class. (It’s not. The statistics module introduced in Python 3.4 should be used instead; never define for yourself that which has been defined, debugged, and tested by someone else.)
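
For the record, here’s what that standard library version looks like (available since Python 3.4):

import statistics

print(statistics.mean([5, 2, 3]))    # 3.3333333333333335
print(statistics.median([5, 2, 3]))  # 3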

Classes like this are also things you’ve seen before, but with this background in place, we can now look at some bizarre things you might not expect (or indeed, want) to be able to do with a function.

For example, did you know that functions are objects? In fact, anything that you can interact with in Python is defined in the source code for the CPython interpreter as a PyObject structure. This includes functions, objects, basic primitives, containers, classes, modules, you name it.

This means we can attach attributes to a function just as with any standard object. Ah, but if functions are objects, can you attach functions to functions? Don’t try this at home (and especially don’t do it at work):

def statistics(sequence):
    statistics.sequence = sequence
    return statistics

def calculate_average():
    return sum(statistics.sequence) / len(statistics.sequence)

statistics.calculate_average = calculate_average

print(statistics([1, 5, 8, 4]).calculate_average())

which outputs:

4.5

This is a pretty crazy example (but we’re just getting started). The statistics function is being set up as an object that has two attributes: sequence is a list and calculate_average is another function object. For fun, the function returns itself so that the print function can call the calculate_average function all in one line.

Note that the statistics function here is an object, not a class. Rather than emulating the Statistics class in the previous example, it is more similar to the statistics instance of that class.

It is hard to imagine any reason that you would want to write code like this in real life. Perhaps it could be used to implement the Singleton (anti-)pattern popular with some other languages. Since there can only ever be one statistics function, it is not possible to create two distinct instances with two distinct sequence attributes the way we can with the Statistics class. There is generally little need for such code in Python, though, because of its ‘consenting adults’ nature.

We can more closely simulate a class by using a function like a constructor:

def Statistics(sequence):
    def self():
        return self.average()

    self.sequence = sequence

    def average():
        return sum(self.sequence) / len(self.sequence)

    self.average = average
    return self

statistics = Statistics([2, 1, 1])
print(Statistics([1, 4, 6, 2]).average())
print(statistics())

which outputs:

3.25
1.3333333333333333

That looks an awful lot like JavaScript, doesn’t it? The Statistics function acts like a constructor that returns an object (which happens to be a function, named self). That function object has had a couple of attributes attached to it, so our function is now an object with both data and behavior. The last three lines show that we can instantiate two separate Statistics “objects” just as if it were a class. Finally, since the statistics object in the last line really is a function, we can even call it directly. It proxies the call through to the average function (or is it a method at this point? I can’t tell anymore) defined on itself.

Before we go on, note that this simulated overlapping of functionality does not mean that we are getting exactly the same behavior out of the Python interpreter. While functions are objects, not all objects are functions. The underlying implementation is different, and if you try to do this in production code, you’ll quickly find confusing anomalies. In normal code, the fact that functions can have attributes attached to them is rarely useful. I’ve seen it used for interesting diagnostics or testing, but it’s generally just a hack.
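
To show the kind of thing I mean, here’s a minimal (and purely hypothetical) diagnostic counter stored as a function attribute:

def greet(name):
    # a quick testing hack: count calls on the function object itself
    greet.call_count += 1
    return "Hello, {}".format(name)

greet.call_count = 0
greet('Ada')
greet('Grace')
print(greet.call_count)  # 2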

However, knowing that functions are objects allows us to pass them around to be called at a later time. Consider this basic partial implementation of an observer pattern:

class Observers(list):
    register_observer = list.append

    def notify(self):
        for observer in self:
            observer()

observers = Observers()

def observer_one():
    print('one was called')

def observer_two():
    print('two was called')

observers.register_observer(observer_one)
observers.register_observer(observer_two)
observers.notify()

which outputs:

one was called
two was called

In the second line of this class, I’ve intentionally reduced the comprehensibility of the code to conform to my initial ‘most of the examples in this article are ridiculous’ thesis. That line creates a new class attribute named register_observer that points to the list.append function. Since the Observers class inherits from the list class, the line essentially creates a shortcut to a method that would look like this:

def register_observer(self, item):
    self.append(item)

And this is how you should do it in your code. Nobody’s going to understand what’s going on if you follow my version. The part of this code that you might want to use in real life is the way the callback functions are passed into the registration function in the two register_observer calls near the end of the example. Passing functions around like this is quite common in Python. The alternative, if functions were not objects, would be to create a bunch of classes that each have a single method with an uninformative name like execute and pass instances around instead.
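
For contrast, here is a sketch of that clunkier world, with hypothetical command classes standing in for plain functions:

class PrintOneCommand:
    def execute(self):
        print('one was called')

class PrintTwoCommand:
    def execute(self):
        print('two was called')

# every call site must now know to invoke .execute()
for command in [PrintOneCommand(), PrintTwoCommand()]:
    command.execute()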

Observers are a bit too useful, though, so in the spirit of keeping things ridiculous, let’s make a silly function that returns a function object:

def silly():
    print("silly")
    return silly

silly()()()()()()

which outputs:

silly
silly
silly
silly
silly
silly

Since we’ve seen some ways that functions can (sort of) imitate objects, let’s now make an object that can behave like a function:

class Function:
    def __init__(self, message):
        self.message = message

    def __call__(self, name):
        return "{} says '{}'".format(name, self.message)

function = Function("I am a function")
print(function('Cheap imitation function'))

which outputs:

Cheap imitation function says 'I am a function'

I don’t use this feature often, but it can be useful in a few situations. For example, if you write a function and call it from many different places, but later discover that it needs to maintain some state, you can change the function to an object and implement the __call__ method without changing all the call sites. Or if you have a callback implementation that normally passes functions around, you can use a callable object when you need to store more complicated state. I’ve also seen Python decorators made out of objects when additional state or behavior is required.
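
As a sketch of that first scenario, suppose the average function from the start of this article suddenly needed to remember every result it produced. A hypothetical RunningAverage callable keeps the call sites unchanged:

class RunningAverage:
    def __init__(self):
        self.history = []  # state a plain function could not easily keep

    def __call__(self, sequence):
        avg = sum(sequence) / len(sequence)
        self.history.append(avg)
        return avg

average = RunningAverage()
print(average([1, 2, 3]))  # the call site still looks like a function call
print(average([4, 5, 6]))
print(average.history)     # [2.0, 5.0]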

Now, let’s talk about generators. As you might expect by now, we’ll start with the silliest way to implement generation code. We can use the idea of a function that returns an object to create a rudimentary generatorish object that calculates the Fibonacci sequence:

def FibFunction():
    a = b = 1
    def next():
        nonlocal a, b
        a, b = b, a + b
        return b
    return next

fib = FibFunction()
for i in range(8):
    print(fib(), end=' ')

which outputs:

2 3 5 8 13 21 34 55

This is a pretty wacky thing to do, but the point is that it is possible to build functions that maintain state between calls. The state is stored in the enclosing closure; we can rebind those variables with Python 3’s nonlocal keyword. It is kind of like global, except that it accesses the state of the surrounding function rather than the global namespace.
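
Here’s a tiny contrast sketch (with a hypothetical counter) showing nonlocal rebinding a closure variable:

def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebinds the enclosing function's variable, not a global
        count += 1
        return count
    return increment

counter = make_counter()
print(counter(), counter())  # 1 2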

We can, of course, build a similar construct using classic (or classy) object notation:

class FibClass():
    def __init__(self):
        self.a = self.b = 1

    def __call__(self):
        self.a, self.b = self.b, self.a + self.b
        return self.b

fib = FibClass()
for i in range(8):
    print(fib(), end=' ')

which outputs:

2 3 5 8 13 21 34 55

Of course, neither of these obeys the iterator protocol. No matter how I wrangled it, I was not able to get FibFunction to work with Python’s builtin next() function, even after looking through the CPython source code for a couple of hours. As I mentioned earlier, using the function syntax to build pseudo-objects quickly leads to frustration. However, it’s easy to tweak the object-based FibClass to fulfill the iterator protocol:

class FibIterator():
    def __init__(self):
        self.a = self.b = 1

    def __next__(self):
        self.a, self.b = self.b, self.a + self.b
        return self.b

    def __iter__(self):
        return self

fib = FibIterator()
for i in range(8):
    print(next(fib), end=' ')

which outputs:

2 3 5 8 13 21 34 55

This class is a standard implementation of the iterator pattern. But it’s kind of ugly and verbose. Luckily, we can get the same effect in Python with a function that includes a yield statement to construct a generator. Here’s the Fibonacci sequence as a generator:

def FibGenerator():
    a = b = 1
    while True:
        a, b = b, a + b
        yield b

fib = FibGenerator()
for i in range(8):
    print(next(fib), end=' ')
print('\n', fib)

which outputs:

2 3 5 8 13 21 34 55
 <generator object FibGenerator at 0x...>

The generator version is a bit more readable than the other two implementations. The thing to pay attention to here is that a generator is not a function. The FibGenerator function returns an object as illustrated by the words “generator object” in the last line of output above.

Unlike a normal function, a generator function does not execute any of its code when we call it. Instead, it constructs a generator object and returns that. You could think of this as an implicit Python decorator: the interpreter sees the yield keyword and wraps the function in something that returns an object instead. To start the function’s code executing, we have to use the next function (either explicitly, as in the examples, or implicitly by using a for loop or yield from).
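
A tiny (hypothetical) example makes this deferred execution visible:

def noisy():
    print('body started')
    yield 42

gen = noisy()     # nothing is printed here; we only get a generator object
print(next(gen))  # now 'body started' appears, followed by 42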

While a generator is technically an object, it is often convenient to think of the function that creates it as a function that can have data passed in at one place and can return values multiple times. It’s sort of like a generic version of a function (which can have data passed in at one place and return a value at only one place). It is easy to make a generator that behaves not completely unlike a function, by yielding only one value:

def average(sequence):
    yield sum(sequence) / len(sequence)

print(next(average([1, 2, 3])))

which outputs:

2.0

Unfortunately, the call site in the last line is less readable than a normal function call, since we have to throw that pesky next() in there. The obvious way around this would be to add a __call__ method to the generator, but that fails whether we try attribute assignment or inheritance; the C-level optimizations that make generators run quickly also prevent us from assigning attributes to them. We can, however, wrap the generator in a function-like object using a ludicrous decorator:

def gen_func(func):
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        return next(gen)
    return wrapper

@gen_func
def average(sequence):
    yield sum(sequence) / len(sequence)

print(average([1, 6, 3, 4]))

which outputs:

3.5

Of course this is an absurd thing to do. I mean, just write a normal function for pity’s sake! But taking this idea a little further, it could be tempting to create a slightly different wrapper:

def callable_gen(func):
    class CallableGen:
        def __init__(self, *args, **kwargs):
            self.gen = func(*args, **kwargs)

        def __next__(self):
            return self.gen.__next__()

        def __iter__(self):
            return self

        def __call__(self):
            return next(self)
    return CallableGen

@callable_gen
def FibGenerator():
    a = b = 1
    while True:
        a, b = b, a + b
        yield b

fib = FibGenerator()
for i in range(8):
    print(fib(), end=' ')

which outputs:

2 3 5 8 13 21 34 55

To completely wrap the generator, we’d need to proxy a few other methods through to the underlying generator, including send, close, and throw. This generator wrapper can be used to call a generator any number of times without calling the next function. I’ve been tempted to do this to make my code look cleaner when there are a lot of next calls in it, but I recommend not yielding to this temptation. Coders reading your code, including yourself, will go berserk trying to figure out what that “function call” is doing. Just get used to the next function and ignore this decorator business.
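
If you were determined to complete the wrapper anyway, the proxies might look like this sketch (callable_gen_complete is a hypothetical name; the method names come from the standard generator protocol):

def callable_gen_complete(func):
    class CallableGen:
        def __init__(self, *args, **kwargs):
            self.gen = func(*args, **kwargs)

        def __next__(self):
            return next(self.gen)

        def __iter__(self):
            return self

        def __call__(self):
            return next(self)

        # proxy the rest of the generator protocol to the wrapped generator
        def send(self, value):
            return self.gen.send(value)

        def throw(self, *exc_args):
            return self.gen.throw(*exc_args)

        def close(self):
            return self.gen.close()
    return CallableGen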

So we’ve drawn some parallels between generators, objects and functions. Let’s talk now about one of the more confusing concepts in Python: coroutines. In Python, coroutines are usually defined as “generators that you can send values into”. At an implementation level, this is probably the most sensible definition. In the theoretical sense, however, it is more accurate to define coroutines as constructs that can accept values at one or more locations and return values at one or more locations.

Therefore, while in Python it is easy to think of a generator as a special type of function that contains yield statements, and a coroutine as a special type of generator that we can send data into at different points, a better taxonomy is to treat the coroutine, which can accept and return values at multiple locations, as the general case, with generators and functions as special types of coroutines that are restricted in where they can accept or return values.

So let’s see a coroutine:

def LineInserter(lines):
    out = []
    for line in lines:
        to_append = yield line
        out.append(line)
        if to_append is not None:
            out.append(to_append)
    return out

emily = """I died for beauty, but was scarce
Adjusted in the tomb,
When one who died for truth was lain
In an adjoining room.
He questioned softly why I failed?
“For beauty,” I replied.
“And I for truth,—the two are one;
We brethren are,” he said.
And so, as kinsmen met a night,
We talked between the rooms,
Until the moss had reached our lips,
And covered up our names.
"""

inserter = LineInserter(iter(emily.splitlines()))
count = 1
try:
    line = next(inserter)
    while True:
        line = next(inserter) if count % 4 else inserter.send('-------')
        count += 1
except StopIteration as ex:
    print('\n' + '\n'.join(ex.value))

which outputs:

I died for beauty, but was scarce
Adjusted in the tomb,
When one who died for truth was lain
In an adjoining room.
-------
He questioned softly why I failed?
“For beauty,” I replied.
“And I for truth,—the two are one;
We brethren are,” he said.
-------
And so, as kinsmen met a night,
We talked between the rooms,
Until the moss had reached our lips,
And covered up our names.
-------

LineInserter is called a coroutine rather than a generator only because the yield keyword appears on the right side of an assignment. Now, whenever we yield a line, the coroutine stores whatever value was sent back in (if any) in the to_append variable.

As you can see in the driver code, we can send a value back in using inserter.send. If you instead just use next, the to_append variable gets a value of None. Don’t ask me why next is a function and send is a method when they do nearly the same thing! (In fact, inserter.send(None) is equivalent to next(inserter).)

In this example, we use the send call to insert a ruler every four lines to separate stanzas in Emily Dickinson’s famous poem. But I used the exact same coroutine in a program that parses the source file for this article. It checks if any line contains the string !#python, and if so, it executes the subsequent code block and inserts the output (see the ‘which outputs’ lines throughout this article) into the article. Coroutines can provide that little extra something when normal ‘one way’ iteration doesn’t quite cut it.

The coroutine in the last example is really nice and elegant, but I find the driver code a bit annoying. It’s probably just me, but something about the indentation of a try…except statement always frustrates me. Recently, I’ve been emulating Python 3.4’s contextlib.suppress context manager to replace except clauses with a callback. For example:

def LineInserter(lines):
    out = []
    for line in lines:
        to_append = yield line
        out.append(line)
        if to_append is not None:
            out.append(to_append)
    return out

from contextlib import contextmanager
@contextmanager
def generator_stop(callback):
    try:
        yield
    except StopIteration as ex:
        callback(ex.value)

def lines_complete(all_lines):
    print('\n' + '\n'.join(all_lines))


emily = """I died for beauty, but was scarce
Adjusted in the tomb,
When one who died for truth was lain
In an adjoining room.
He questioned softly why I failed?
“For beauty,” I replied.
“And I for truth,—the two are one;
We brethren are,” he said.
And so, as kinsmen met a night,
We talked between the rooms,
Until the moss had reached our lips,
And covered up our names.
"""

inserter = LineInserter(iter(emily.splitlines()))
count = 1
with generator_stop(lines_complete):
    line = next(inserter)
    while True:
        line = next(inserter) if count % 4 else inserter.send('-------')
        count += 1

which outputs:

I died for beauty, but was scarce
Adjusted in the tomb,
When one who died for truth was lain
In an adjoining room.
-------
He questioned softly why I failed?
“For beauty,” I replied.
“And I for truth,—the two are one;
We brethren are,” he said.
-------
And so, as kinsmen met a night,
We talked between the rooms,
Until the moss had reached our lips,
And covered up our names.
-------

The generator_stop context manager now encapsulates all the ugliness, and it can be reused in a variety of situations where StopIteration needs to be handled.
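
As a quick (hypothetical) demonstration of that reuse, the same context manager can capture the return value of any generator:

def one_shot():
    yield 1
    return 'done'  # becomes the value carried by StopIteration

gen = one_shot()
with generator_stop(print):  # reuses generator_stop from the example above
    while True:
        next(gen)  # the second call raises StopIteration; print receives 'done'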

Since coroutines are, at the implementation level, indistinguishable from generators, they can emulate functions just as generators can. We can even call into the same coroutine multiple times as if it were a function:

def IncrementBy(increment):
    sequence = yield
    while True:
        sequence = yield [i + increment for i in sequence]

sequence = [10, 20, 30]
increment_by_5 = IncrementBy(5)
increment_by_8 = IncrementBy(8)
next(increment_by_5)
next(increment_by_8)
print(increment_by_5.send(sequence))
print(increment_by_8.send(sequence))
print(increment_by_5.send(sequence))

which outputs:

[15, 25, 35]
[18, 28, 38]
[15, 25, 35]

Note the two bare calls to next just after the two IncrementBy calls. These effectively “prime” the generator by advancing it to the first yield statement. Then each call to send essentially looks like a single call to a function. The driver code for this coroutine doesn’t look anything like calling a function, but with some evil decorator magic, we can make it look less disturbing:

def evil_coroutine(func):
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)
        def gen_caller(arg=None):
            return gen.send(arg)
        return gen_caller
    return wrapper

@evil_coroutine
def IncrementBy(increment):
    sequence = yield
    while True:
        sequence = yield [i + increment for i in sequence]

sequence = [10, 20, 30]
increment_by_5 = IncrementBy(5)
increment_by_8 = IncrementBy(8)
print(increment_by_5(sequence))
print(increment_by_8(sequence))
print(increment_by_5(sequence))

which outputs:

[15, 25, 35]
[18, 28, 38]
[15, 25, 35]

The decorator accepts a function and returns a new wrapper function that gets assigned to the IncrementBy variable. Whenever this new IncrementBy is called, it constructs a generator using the original function, and advances it to the first yield statement using next (the priming action from before). It returns a new function that calls send on the generator each time it is called. This function makes the argument default to None so that it can also work if we call next instead of send.

The new driver code is definitely more readable, but once again, I would not recommend using this coding style to make coroutines behave like hybrid object/functions. The argument that other coders aren’t going to understand what is going through your head still stands. Plus, since send can only accept one argument, the callable is quite restricted.

Before we leave our discussion of the bizarre relationships between these concepts, let’s look at how the stanza processing code could look without an explicit coroutine, generator, function, or object:

emily = """I died for beauty, but was scarce
Adjusted in the tomb,
When one who died for truth was lain
In an adjoining room.
He questioned softly why I failed?
“For beauty,” I replied.
“And I for truth,—the two are one;
We brethren are,” he said.
And so, as kinsmen met a night,
We talked between the rooms,
Until the moss had reached our lips,
And covered up our names.
"""
for index, line in enumerate(emily.splitlines(), start=1):
    print(line)
    if not index % 4:
        print('------')

which outputs:

I died for beauty, but was scarce
Adjusted in the tomb,
When one who died for truth was lain
In an adjoining room.
------
He questioned softly why I failed?
“For beauty,” I replied.
“And I for truth,—the two are one;
We brethren are,” he said.
------
And so, as kinsmen met a night,
We talked between the rooms,
Until the moss had reached our lips,
And covered up our names.
------

This code is so simple and elegant! This happens to me nearly every time I try to use coroutines. I keep refactoring and simplifying it until I discover that coroutines are making my code less, not more, readable. Unless I am explicitly modeling a state-transition system or trying to do asynchronous work using the terrific asyncio library (which wraps all the possible craziness with StopIteration, cascading exceptions, etc), I rarely find that coroutines are the right tool for the job. That doesn’t stop me from attempting them though, because they are fun.

For the record, the LineInserter coroutine actually is useful in the markdown code executor I use to parse this article’s source file. There I need to keep track of more transitions between states (Am I currently looking for a code block? Am I in a code block? Do I need to execute the code block and record the output?) than in the stanza-marking example used here.

So, it has become clear that in Python, there is more than one way to do a lot of things. Luckily, most of these ways are not very obvious, and there is usually “one, and preferably only one, obvious way to do it,” to quote The Zen of Python. I hope that by this point you are more confused about the relationship between functions, objects, generators and coroutines than ever before. I hope you now know how to write a great deal of code that should never be written. But mostly I hope you’ve enjoyed your exploration of these topics.

If you’d like to see more useful applications of these and other Python concepts, grab a copy of Python 3 Object-oriented Programming, Second Edition.
