Python Multimedia: Working with Audios


So let’s get on with it!

Installation prerequisites

Since we are going to use an external multimedia framework, it is necessary to install the packages mentioned in this section.


GStreamer

GStreamer is a popular open source multimedia framework that supports audio/video manipulation of a wide range of multimedia formats. It is written in the C programming language and provides bindings for other programming languages, including Python. Several open source projects use the GStreamer framework to develop their own multimedia applications. Throughout this article, we will use the GStreamer framework for audio handling. To get this working with Python, we need to install both GStreamer and the Python bindings for GStreamer.

Windows platform

The binary distribution of GStreamer is not provided on the project website, and installing it from source may require considerable effort on the part of Windows users. Fortunately, the GStreamer WinBuilds project provides pre-compiled binary distributions.

The binary distributions of GStreamer and of its Python bindings (for Python 2.6) are available in the Download area of the GStreamer WinBuilds website.

You need to install two packages: first GStreamer, and then the Python bindings for GStreamer. Download and install the GPL distribution of GStreamer available on the GStreamer WinBuilds project website. The name of the GStreamer executable begins with GStreamerWinBuild-, and the version should be 0.10.5 or higher. By default, this installation creates the folder C:\gstreamer on your machine. The bin directory within this folder contains the runtime libraries needed while using GStreamer.

Next, install the Python bindings for GStreamer. The binary distribution is available on the same website. Use the executable whose name begins with Pygst- and that pertains to Python 2.6. The version should be 0.10.15 or higher.

GStreamer WinBuilds appears to be an independent project. It is based on the OSSBuild development suite; see the OSSBuild project for more information. The GStreamer binary built for Python 2.6 may no longer be available on the mentioned website by the time you read this article. In that case, consider contacting the OSSBuild developer community for help.

Alternatively, you can build GStreamer from source on the Windows platform using a Linux-like environment for Windows, such as Cygwin. Under this environment, first install the dependent software packages such as Python 2.6, the gcc compiler, and others. Download the gst-python- package from the GStreamer website, then extract it and install it from source using the Cygwin environment. The INSTALL file within the package contains the installation instructions.

Other platforms

Many Linux distributions provide a GStreamer package. Search the package repository for the appropriate gst-python distribution (for Python 2.6). If such a package is not available, install gst-python from source as discussed earlier in the Windows platform section.

If you are a Mac OS X user, install the package Py26-gst-python, version 0.10.17 or higher, following the download and installation instructions published for it.

Mac OS X 10.5.x (Leopard) comes with the Python 2.5 distribution. If you work with this default version of Python, GStreamer Python bindings built for Python 2.5 are available on the darwinports website.


PyGObject

GLib is a free, multiplatform utility library. It provides data structures such as hash maps and linked lists, and it also supports the creation of threads. The object system of GLib is called GObject. Here, we need to install the Python bindings for GObject, which are available on the PyGTK website.

Windows platform

The binary installer is available on the PyGTK website. Download and install version 2.20 for Python 2.6.

Other platforms

For Linux, the source tarball is available on the PyGTK website; version 2.21 of PyGObject is offered there as a source tarball. A binary distribution may also be available in the package repository of your Linux operating system.

If you are a Mac user and you have Python 2.6 installed, a distribution of PyGObject is available on darwinports. Install version 2.14 or later.

Summary of installation prerequisites

The following summary lists the packages needed for this article.

Package: GStreamer
Version: 0.10.5 or later
Windows platform: Install using the binary distribution available on the GStreamer WinBuilds website. Use GStreamerWinBuild- (or a later version if available).
Linux/Unix/OS X platforms: Linux: use the GStreamer distribution in the package repository. Mac OS X: download and install by following the online installation instructions.

Package: Python bindings for GStreamer
Version: 0.10.15 or later, for Python 2.6
Windows platform: Use the binary provided by the GStreamer WinBuilds project, choosing the build that pertains to Python 2.6.
Linux/Unix/OS X platforms: Linux: use the gst-python distribution in the package repository. Mac OS X: use the Py26-gst-python package (if you are using Python 2.6). Linux/Mac: alternatively, build and install from the source tarball.

Package: Python bindings for GObject (“PyGObject”)
Download location: source distribution on the PyGTK website
Version: 2.14 or later, for Python 2.6
Windows platform: Use the binary package pygobject-2.20.0.win32-py2.6.exe.
Linux/Unix/OS X platforms: Linux: install from source if pygobject is not available in the package repository. Mac: use the package on darwinports (if you are using Python 2.6).

Testing the installation

Ensure that GStreamer and its Python bindings are properly installed. This is simple to test: start Python from the command line and type the following:

>>> import pygst

If there is no error, it means the Python bindings are installed properly.

Next, type the following:

>>> import gst

If this import is successful, we are all set to use GStreamer for processing audio and video!

If import gst fails, it will probably complain that it is unable to find a required DLL or shared object. In this case, check your environment variables and make sure that the PATH variable contains the correct path to the gstreamer/bin directory. The following lines of code in a Python interpreter show the typical location of the pygst and gst modules on the Windows platform.

>>> import pygst
>>> pygst
<module 'pygst' from 'C:\Python26\lib\site-packages\pygst.pyc'>
>>> pygst.require('0.10')
>>> import gst
>>> gst
<module 'gst' from 'C:\Python26\lib\site-packages\gst-0.10\gst\__init__.pyc'>
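If the import fails because the runtime libraries cannot be found, a quick look at PATH can help. The following is a minimal sketch that uses only the standard library. The helper name find_on_path and the search fragment "gstreamer" are assumptions made for illustration; your installation directory may be named differently.

```python
import os


def find_on_path(fragment, path_value=None):
    """Return the PATH entries whose text contains `fragment` (case-insensitive).

    `find_on_path` is a hypothetical helper, not part of GStreamer.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [entry for entry in path_value.split(os.pathsep) if entry]
    wanted = fragment.lower()
    return [entry for entry in entries if wanted in entry.lower()]


matches = find_on_path("gstreamer")
if matches:
    print("PATH entries that mention gstreamer: %s" % matches)
else:
    print("No PATH entry mentions gstreamer; check the gstreamer/bin directory.")
```

An empty result does not prove the installation is broken, since the directory may simply have a different name; it only suggests where to look first.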

Next, test if PyGObject is successfully installed. Start the Python interpreter and try importing the gobject module.

>>> import gobject

If this works, we are all set to proceed!
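The import checks described in this section can also be run in one go. The following is a small sketch, not part of the GStreamer tooling; check_modules is a hypothetical helper name. It only reports which modules can be imported and does not diagnose why an import failed.

```python
def check_modules(names):
    """Try to import each module name and record whether it succeeded.

    `check_modules` is a hypothetical helper written for this article.
    """
    results = {}
    for name in names:
        try:
            __import__(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


# The three modules this article relies on.
status = check_modules(["pygst", "gst", "gobject"])
for name in ("pygst", "gst", "gobject"):
    print("%s: %s" % (name, "ok" if status[name] else "MISSING"))
```

If any module is reported as missing, revisit the corresponding installation step before moving on.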

A primer on GStreamer

In this article, we will be using the GStreamer multimedia framework extensively. Before we move on to the topics that teach us various audio processing techniques, a primer on GStreamer is necessary.

So what is GStreamer? It is a framework on top of which one can develop multimedia applications. The rich set of libraries it provides makes it easier to develop applications with complex audio/video processing capabilities. Fundamental components of GStreamer are briefly explained in the coming sub-sections.

Comprehensive documentation is available on the GStreamer project website, and the GStreamer Application Development Manual is a very good starting point. In this section, we will briefly cover some of the important aspects of GStreamer; for further reading, visit the GStreamer project website.

gst-inspect and gst-launch

We will start by learning two important GStreamer commands. GStreamer can be run from the command line by calling gst-launch-0.10.exe (on Windows) or gst-launch-0.10 (on other platforms). The following command shows a typical execution of GStreamer on Linux. We will see what a pipeline means in the next sub-section.

$ gst-launch-0.10 pipeline_description

GStreamer has a plugin architecture. It supports a huge number of plugins. To see more details about any plugin in your GStreamer installation, use the command gst-inspect-0.10 (gst-inspect-0.10.exe on Windows). We will use this command quite often. Use of this command is illustrated here.

$ gst-inspect-0.10 decodebin

Here, decodebin is a plugin. Upon execution of the preceding command, it prints detailed information about the plugin decodebin.

Elements and pipeline

In GStreamer, the data flows in a pipeline. Various elements are connected together forming a pipeline, such that the output of the previous element is the input to the next one.

A pipeline can be logically represented as follows:

Element1 ! Element2 ! Element3 ! Element4 ! Element5

Here, Element1 through Element5 are the element objects chained together by the symbol !. Each element performs a specific task. One element reads the input data, such as an audio or video file. Another element decodes the data read by the first element, whereas yet another converts this data into some other format and saves the output. As stated earlier, linking these element objects in the proper order creates a pipeline.
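The idea that each element consumes the output of the previous one can be modeled with plain Python generators. This is only a toy sketch and does not use GStreamer at all; every name in it (read_source, decode, convert, sink, link) is hypothetical and chosen for illustration.

```python
def read_source(raw_items):
    """First stage: hands out the raw input, one item at a time."""
    for item in raw_items:
        yield item


def decode(stream):
    """Middle stage: turns raw text into numbers."""
    for item in stream:
        yield int(item)


def convert(stream):
    """Middle stage: halves each value, standing in for a format conversion."""
    for value in stream:
        yield value // 2


def sink(stream):
    """Last stage: consumes everything that reaches the end of the chain."""
    return list(stream)


def link(source, *stages):
    """Chain the stages so that each one reads the output of the previous one."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


result = sink(link(read_source(["10", "20", "30"]), decode, convert))
print(result)
```

As in a GStreamer pipeline, no stage needs to know what comes before or after it; only the order of linking determines how the data flows.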

The concept of a pipeline is similar to the one used in Unix. Following is a Unix example of a pipeline. Here, the vertical separator | defines the pipe.

$ ls -la | more

Here, ls -la lists all the files in a directory. However, this list is sometimes too long to be displayed in the shell window, so adding | more allows the user to page through the output.

Now let’s see a realistic example of running GStreamer from the command prompt.

$ gst-launch-0.10 -v filesrc location=path/to/file.ogg ! decodebin ! audioconvert ! fakesink

For a Windows user, the command name would be gst-launch-0.10.exe. The pipeline is constructed by specifying different elements. The ! symbol links adjacent elements, thereby forming the whole pipeline for the data to flow through. In the Python bindings of GStreamer, the abstract base class for pipeline elements is gst.Element, whereas the gst.Pipeline class can be used to create a pipeline instance. In a pipeline, the data is sent to a separate thread, where it is processed until it reaches the end or a termination signal is sent.
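A pipeline description is just text, so it can be assembled programmatically before it is handed to gst-launch. The sketch below rebuilds the description used in the preceding command. The function name build_pipeline_description is hypothetical, and the sketch assumes property values that contain no spaces or special characters (it does no quoting).

```python
def build_pipeline_description(elements):
    """Join (element_name, properties) pairs into a gst-launch style string.

    `elements` is a list of (name, dict) tuples. Property values are written
    as-is, so values containing spaces would need quoting that this sketch
    does not attempt.
    """
    parts = []
    for name, properties in elements:
        tokens = [name]
        for key, value in sorted(properties.items()):
            tokens.append("%s=%s" % (key, value))
        parts.append(" ".join(tokens))
    return " ! ".join(parts)


description = build_pipeline_description([
    ("filesrc", {"location": "path/to/file.ogg"}),
    ("decodebin", {}),
    ("audioconvert", {}),
    ("fakesink", {}),
])
print(description)
```

The resulting string can be pasted after gst-launch-0.10 on the command line. The 0.10 Python bindings also provide gst.parse_launch(), which accepts the same kind of description.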


Plugins

GStreamer is a plugin-based framework, and several plugins are available. A plugin encapsulates the functionality of one or more GStreamer elements. Thus, we can have a plugin in which multiple elements work together to create the desired output. The plugin itself can then be used as an abstract element in the GStreamer pipeline. An example is decodebin, which we will learn about in the upcoming sections. A comprehensive list of available plugins can be found on the GStreamer website. The decodebin plugin will be used in almost all of the applications to be developed. For audio processing, we will use the functionality provided by plugins such as gnonlin, audioecho, monoscope, interleave, and so on.


Bins

In GStreamer, a bin is a container that manages the element objects added to it. A bin instance can be created using the gst.Bin class. It inherits from gst.Element and can act as an abstract element representing the group of elements within it. The GStreamer plugin decodebin is a good example of a bin. It contains decoder elements and auto-plugs the appropriate decoder to create the decoding pipeline.


Pads

Each element has connection points to handle data input and output. GStreamer refers to them as pads. An element object can have one or more ‘receiver pads’, termed sink pads, that accept data from the previous element in the pipeline. Similarly, there are ‘source pads’ that take the data out of the element as input to the next element (if any) in the pipeline. The following is a very simple example that shows how source and sink pads are used.

> gst-launch-0.10.exe fakesrc num-buffers=1 ! fakesink

The fakesrc is the first element in the pipeline. Therefore, it only has a source pad. It transmits the data to the next linked element, fakesink, which only has a sink pad to accept data. Note that, since these are fakesrc and fakesink, just empty buffers are exchanged. A pad is defined by the class gst.Pad. A pad can be attached to an element object using the gst.Element.add_pad() method.

The following diagram illustrates two GStreamer elements within a pipeline, each having a single pad: a source pad on the first element linked to a sink pad on the second.

[Figure: two GStreamer elements in a pipeline, linked through a source pad and a sink pad]

Now that we know how pads operate, let’s discuss some special types of pads. In the example, we assumed that the pads of an element are always ‘out there’. However, in some situations an element does not have its pads available all the time; it creates the pads it needs at runtime. Such a pad is called a dynamic pad. Another type of pad is the ghost pad. Both types are discussed in this section.


Dynamic pads

Some elements, such as decodebin, do not have pads defined when they are created. Such elements determine the type of pad to use at runtime. For example, decodebin creates a pad depending on the media file being processed. This is often referred to as a dynamic pad, or a ‘sometimes’ pad, because it is not always available on elements such as decodebin.
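The consequence for application code is that a dynamic pad cannot be linked up front; the application registers a callback and links once the pad appears. The following toy model, written in plain Python without GStreamer, sketches that pattern. The class ToyDecoder and its feed() method are hypothetical; only the signal name "pad-added" is borrowed from the name GStreamer elements use to announce a new pad.

```python
class ToyDecoder(object):
    """Toy stand-in for an element whose source pad appears only at runtime."""

    def __init__(self):
        self.src_pads = []      # empty until some input has been inspected
        self._callbacks = []

    def connect(self, signal_name, callback):
        """Register a callback, loosely mimicking a signal connection."""
        if signal_name != "pad-added":
            raise ValueError("this toy only knows the 'pad-added' signal")
        self._callbacks.append(callback)

    def feed(self, filename):
        """Inspect the input, create a matching pad, and notify listeners."""
        audio_suffixes = (".ogg", ".mp3", ".wav")
        kind = "audio" if filename.endswith(audio_suffixes) else "video"
        pad_name = "src_%s" % kind
        self.src_pads.append(pad_name)
        for callback in self._callbacks:
            callback(self, pad_name)


linked = []


def on_pad_added(element, pad_name):
    # A real application would link the new pad to the next element here.
    linked.append(pad_name)


decoder = ToyDecoder()
decoder.connect("pad-added", on_pad_added)
pads_before = list(decoder.src_pads)   # nothing to link yet
decoder.feed("song.ogg")
print(pads_before, linked)
```

The callback is the only place where the pad is known to exist, which is why linking happens there rather than when the pipeline is first assembled.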

Ghost pads

As stated in the Bins section, a bin object can act as an abstract element. How is this achieved? The bin uses ‘ghost pads’, or ‘pseudo link pads’. A ghost pad of a bin is connected to the pad of an appropriate element inside it, so that linking to the bin reaches that element. A ghost pad can be created using the gst.GhostPad class.


Caps

The element objects send and receive data through their pads. The type of media data that an element can handle is determined by its caps (short for capabilities). Caps are a structure that describes the media formats supported by the element. They are defined by the class gst.Caps.
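To make the idea of capabilities concrete, the toy sketch below treats caps as a media type plus a few named fields and checks whether two descriptions could agree. It is a deliberately simplified model in plain Python: the function names are hypothetical, and real caps also support typed values, ranges, and lists, which this sketch ignores.

```python
def parse_simple_caps(caps_string):
    """Split 'media/type, key=value, ...' into (media_type, fields)."""
    parts = [part.strip() for part in caps_string.split(",")]
    media_type = parts[0]
    fields = {}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return media_type, fields


def compatible(caps_a, caps_b):
    """Return True if the media types match and no shared field disagrees."""
    type_a, fields_a = parse_simple_caps(caps_a)
    type_b, fields_b = parse_simple_caps(caps_b)
    if type_a != type_b:
        return False
    shared = set(fields_a) & set(fields_b)
    return all(fields_a[key] == fields_b[key] for key in shared)


source_caps = "audio/x-raw-int, rate=44100, channels=2"
print(parse_simple_caps(source_caps))
print(compatible(source_caps, "audio/x-raw-int, rate=44100"))
print(compatible(source_caps, "audio/x-raw-int, rate=48000"))
```

Two pads can only be linked when their capabilities overlap in this sense, which is why a converter element such as audioconvert often sits between a decoder and a sink.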


Bus

A bus is the object that delivers the messages generated by GStreamer. A message is a gst.Message object that informs the application about an event within the pipeline. A message is put on the bus using the gst.Bus.post() method (gst_bus_post() in the C API). The following code shows an example usage of the bus.

1 bus = pipeline.get_bus()
2 bus.add_signal_watch()
3 bus.connect("message", message_handler)

The first line in the code retrieves the gst.Bus instance of the pipeline. Here, pipeline is an instance of gst.Pipeline. On the next line, we add a signal watch so that the bus emits a signal for every message posted on it. Line 3 connects that signal to a Python callable. In this example, "message" is the signal name and the callable it invokes is message_handler.
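The flow of a message from the pipeline to message_handler can be pictured with a small stand-in class. This is a toy model in plain Python, not the real gst.Bus: ToyBus and its dispatch_pending() method are hypothetical, with dispatch_pending() standing in for the main loop that delivers queued messages in a real application. Messages are plain strings here rather than gst.Message objects.

```python
class ToyBus(object):
    """Toy stand-in that mimics the three bus calls used in this section."""

    def __init__(self):
        self._queue = []
        self._handlers = []
        self._watching = False

    def add_signal_watch(self):
        self._watching = True

    def connect(self, signal_name, handler):
        if signal_name != "message":
            raise ValueError("this toy only knows the 'message' signal")
        self._handlers.append(handler)

    def post(self, message):
        """Queue a message; nothing is delivered yet."""
        self._queue.append(message)

    def dispatch_pending(self):
        """Deliver queued messages to the handlers if a watch was added."""
        if not self._watching:
            return
        while self._queue:
            message = self._queue.pop(0)
            for handler in self._handlers:
                handler(self, message)


received = []


def message_handler(bus, message):
    # Handlers receive the bus and the message, in that order.
    received.append(message)


bus = ToyBus()
bus.add_signal_watch()
bus.connect("message", message_handler)
bus.post("eos")
bus.dispatch_pending()
print(received)
```

Without the call to add_signal_watch(), the toy bus keeps the message queued and the handler is never invoked, which mirrors why that call appears in the article's snippet.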


Playbin/Playbin2

Playbin is a GStreamer plugin that provides a high-level audio/video player. It handles a number of things, such as automatic detection of the input media file format, automatic selection of decoders, audio visualization, and volume control. The following line of code creates a playbin element.

playbin = gst.element_factory_make("playbin")

It defines a property called uri. The URI (Uniform Resource Identifier) should be the absolute location of a file on your computer or on the Web, for example an address beginning with file:// or http://.

According to the GStreamer documentation, Playbin2 is the latest unstable version; once stable, it will replace Playbin.

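A Playbin2 instance can be created the same way as a Playbin instance, by passing "playbin2" as the factory name. The details of the plugin can be inspected with the following command.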

gst-inspect-0.10 playbin2
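Returning to the uri property: a local file name has to be turned into an absolute file:// URI before it is assigned. The following is a minimal sketch using only the standard library; path_to_uri is a hypothetical helper name, and the sketch assumes file names that the standard quote() function can handle.

```python
import os

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def path_to_uri(path):
    """Convert a local path into an absolute file:// URI.

    `path_to_uri` is a hypothetical helper written for this article.
    """
    absolute = os.path.abspath(path).replace(os.sep, "/")
    if not absolute.startswith("/"):
        # Windows paths such as C:/music/a.ogg need a leading slash.
        absolute = "/" + absolute
    # Keep '/' and ':' readable; escape spaces and other special characters.
    return "file://" + quote(absolute, safe="/:")


print(path_to_uri("music/my song.ogg"))
```

The resulting string is the kind of value that could be passed to playbin.set_property("uri", ...) once a playbin element has been created.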

With this basic understanding, let us learn about various audio processing techniques using GStreamer and Python.

