Architecture of FreeSWITCH

13 min read

(For more resources related to this topic, see here.)

A revolution has begun and secrets have been revealed

How and why the telephone works is a mystery to most people. It has been kept secret for years. We just plugged our phones into the wall and they worked, and most people do just that and expect it to work. The telephony revolution has begun, and we have begun to pry its secrets from the clutches of the legacy of the telephony industry. Now, everyday individuals like you and me are able to build phone systems that outperform traditional phone services and offer advanced features for relatively low cost. Some people even use FreeSWITCH to provide telephone services for making a profit. FreeSWITCH has been designed to make all of this easier, so we will go over the architecture to get a better understanding of how it works.

Do not be concerned if some of the concepts we introduce seem unnaturally abstract. Learning telephony takes time, especially VoIP. In fact, we recommend that you read this article more than once. Absorb as much as you can on the first pass. You will be surprised at how much your understanding of VoIP and FreeSWITCH has improved.Give yourself plenty of time to digest all of these strange new concepts, and soon you will find that you are a skilled FreeSWITCH administrator. If you keep at it, you will be rewarded with a meaningful understanding of this strange and wonderful world we call telephony.

Telephones and telephony systems (such as telephone switches and PBXs) are very complicated and have evolved over the years into several varieties. The most popular type of phone in the U.K. and the U.S. is the traditional analog phone, which we affectionately refer to as POTS lines or Plain Old Telephone Service. From the traditional Ma Bell phone up to the long-range cordless phones that most of us have today, one thing has remained the same—the underlying technology. In the last 10-15 years, there has been a convergence of technology between computers and telephones that has produced a pair of affordable alternatives to POTS lines—Mobile phones and VoIP phones (also called Internet Phones).

FreeSWITCH fits into this big tangled mess of various telephone technologies by bridging them together, so that they can communicate despite being otherwise completely incompatible. FreeSWITCH also bridges telephone calls with computer programs that you can write yourself, and controls what happens in ways like never before. FreeSWITCH is software that runs on Windows and several UNIX varieties such as Mac OS X, Linux, Solaris, and BSD. This means you can install FreeSWITCH on your home PC or even a high-end server and use it to process phone calls.

The FreeSWITCH design – modular, scalable, and stable

The design goal of FreeSWITCH is to provide a modular, scalable system around a stable switching core, and to provide a robust interface for developers to add to and control the system. Various elements in FreeSWITCH are independent of each other and do not have much knowledge about how the other parts are working, other than what is provided in what are called exposed functions. The functionality of FreeSWITCH can also be extended with loadable modules, which tie a particular external technology into the core.

FreeSWITCH has many different module types that revolve around the central core, much like satellites orbiting a planet. The list includes:

Module type:



Telephone protocols like SIP/H.323 and POTS lines


Performs a task such as playing audio or setting data



Interface (API)

Exports a function that takes text input and returns text output,

which could be used across modules or from an external


Automated Speech

Recognition (ASR)

Interfaces with speech recognition systems


Bridges and exchanges various chat protocols


Translates between audio formats


Parses the call details and decides where to route the call


Connects directory information services, such as LDAP, to a

common core lookup API

Event handlers

Allows external programs to control FreeSWITCH


Provides an interface to extract and play sound from various

audio file formats


Plays audio files in various formats


Programming language interfaces used for call control


Controls logging to the console, system log, or log files


Strings together audio files in various languages to provide

feedback to say things like phone numbers, time of day, spell

words, and so on

Text-To-Speech (TTS)

Interfaces with text-to-speech engines


POSIX or Linux kernel timing in applications

XML Interfaces

Uses XML for Call Detail Records (CDRs), RADIUS, CURL,

LDAP, RPC, and/or SCGI

The following image shows what the FreeSWITCH architecture looks like and how the modules orbit the core of FreeSWITCH:

By combining the functionality of the various module interfaces, FreeSWITCH can be configured to connect IP phones, POTS lines, and IP-based telephone services. It can also translate audio formats and interfaces with a custom menu system, which you can create by yourself. You can even control a running FreeSWITCH server from another machine. Let’s start by taking a closer look at a pair of important module types.

Important modules – Endpoint and Dialplan

Endpoint modules are critically important and add some of the key features that make FreeSWITCH the powerful platform it is today. The primary role of these modules is to take certain common communication technologies and normalize them into a common abstract entity which we refer to as a session. A session represents a connection between FreeSWITCH and a particular protocol. There are several Endpoint modules that come with FreeSWITCH, which implement several protocols such as SIP, H.323, Jingle (Google Talk), and some others. We will spend some time examining one of the more popular modules named mod_sofia.

Sofia-SIP ( is an open source project sponsored by Nokia, which provides a programming interface for the Session Initiation Protocol (SIP). We use this library in FreeSWITCH in a module we call mod_sofia. This module registers to all the hooks in FreeSWITCH necessary to make an Endpoint module, and translates the native FreeSWITCH constructs into SIP constructs and back again. Configuration information is taken from the central FreeSWITCH configuration files, which allows mod_sofia to load user-defined preferences and connection details. This allows FreeSWITCH to accept registration from SIP phones and devices, register to other SIP Endpoints such as service providers, send notifications, and provide services to the phones such as voicemail.

The SIP protocol is defined by a number of RFC (request for comment) documents. The primary RFC can be found at

When a SIP call is established between FreeSWITCH and another SIP device, it will show up in FreeSWITCH as an active session. If the call is inbound, it can be transferred or bridged to interactive voice response (IVR) menus, hold music, or one or more extensions, though numerous other options are available. Let’s examine a typical scenario where an SIP phone registered as extension 2000 dials extension 2001 with the hope of establishing a call.

First, the SIP phone sends a call setup message to mod_sofia over the network (mod_sofia is listening for such messages). After receiving the message, mod_sofia in turn parses the relevant details and passes the call into the core state machine in FreeSWITCH. The state machine (in the FreeSWITCH core) then sends the call into the ROUTING state.

The next step is to locate the Dialplan module based on the configuration data for the calling Endpoint. The default and most widely used Dialplan module is the XML Dialplan module. This module is designed to look up a list of instructions from the central XML registry within FreeSWITCH. The XML Dialplan module will parse a series of XML extension objects using regular expression pattern-matching.

As we are trying to call 2001, we hope to find an XML extension testing the destination_number field for something that matches 2001 and routes accordingly. The Dialplan is not limited to matching only a single extension. The XML Dialplan module builds a sort of task list for the call. Each extension that matches it will have its actions added to the call’s task list.

Assuming FreeSWITCH finds at least one extension, the XML Dialplan will insert instructions into the session object with the information it needs to try and connect the call to 2001. Once these instructions are in place, the state of the calling session changes from ROUTING to EXECUTE, where the next handler drills down the list and executes the instructions obtained during the ROUTING state. This is where the application interface comes into the picture.

Each instruction is added to the session in the form of an application name and a data argument that will be passed to that application. The one we will use in this example is the bridge application. The purpose of this application is to create another session with an outbound connection, then connect the two sessions for direct audio exchange. The argument we will supply to bridge will be user/2001, which is the easiest way to generate a call to extension 2001. A Dialplan entry for 2001 might look like this:

<extension name="example">
<condition field="destination_number"
<action application="bridge" data="user/2001"/>

The extension is named example, and it has a single condition to match. If the condition is matched, it has a single application to execute. In plain language, the mentioned extension could be expressed like this: If the caller dialed 2001, this establishes a connection between the calling party and the endpoint (that is, telephone) at 2001. Consider how this happens.

Once we have inserted the instructions into the session, the session’s state will change to EXECUTE, and the FreeSWITCH core will use the data collected to perform the desired action. First, the default execute state handler will parse the command to execute bridge on user/2001, then it will look up the bridge application and pass the user/2001 data in. This will cause the FreeSWITCH core to create a new outbound session of the desired type. User 2001 is also a SIP phone, so user/2001 will resolve into a SIP dial string, which will be passed to mod_sofia to ask it to create a new outbound session.

If the setup for that new session is successful, there will now be two sessions in the FreeSWITCH core. The bridge application will take the new session and the original session (the caller’s phone) and call the bridge function on it. This allows the audio to flow in both directions once the person at extension 2001 actually answers the phone. If that user was unable to answer or was busy, a timeout (that is, a failure) would occur and send the corresponding message back to the caller’s phone. If a call is unanswered or an extension is busy, many routing options are possible, including call forwarding or voicemail.

All of this happens from the simple action of picking up the phone handset and dialing 2 0 0 1. FreeSWITCH takes all of the complexity of SIP and reduces it to a common denominator. From there, it reduces the complexity further by allowing us to configure a single instruction in the Dialplan to connect the phone at 2000 to the phone at 2001. If we want to allow the phone at 2001 to be able to call the phone at 2000, we can add another entry in the Dialplan going the other way:

<extension name="example 2">
<condition field="destination_number" expression="^2000$">
<action application="bridge" data="user/2000"/>

In this scenario, the Endpoint module turned SIP into a FreeSWITCH session and the Dialplan module turned XML into an extension. The bridge application turned the complex code of creating an outbound call and connecting the audio into a simple application/data pair. Both the Dialplan module and the application module interface are designed around regular FreeSWITCH sessions. Therefore, not only does the abstraction make life easier for us at the user level, it also simplifies the design of the application and the Dialplan because they can be made agnostic of the actual endpoint technology involved in the call. It is because of this abstraction, when we make up a new Endpoint module tomorrow for something like Skype (there is actually such a thing present, by the way), that we can reuse all the same application and Dialplan modules. The same principle applies to the Say, Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and other such modules.

It is possible that you may want to work with some specific data provided by the Endpoint’s native protocol. In SIP, for instance, there are several arbitrary headers as well as several other bits of interesting data from the SIP packets. We solve this problem by adding variables to the channel. Using channel variables, mod_sofia can set these arbitrary values as they are encountered in the SIP data where you can retrieve them by name from the channel in your Dialplan or application. This way, we share our knowledge of these special variables with the SIP Endpoint. However, the FreeSWITCH core just sees them as arbitrary channel variables that the core can ignore. There are also several special reserved channel variables that can influence the behavior of FreeSWITCH in many interesting ways. If you have ever used a scripting language or configuration engine that uses variables (sometimes called attribute-value pairs or AVPs), you are at an advantage because channel variables are pretty much the same concept. There is simply a variable name and a value that is passed to the channel and the data is set.

There is even an application interface for this, the set application, which lets you set your own variables from the Dialplan:

<extension name="example 3">
<condition field="destination_number" expression="^2000$">
<action application="set" data="foo=bar"/>
<action application="bridge" data="user/2000"/>

This example is almost identical to the previous example, but instead of just placing the call, we first set the variable foo equal to the value bar. This variable will remain set throughout the call and can even be referenced at the end of the call in the detail logs.

The more we build things in small pieces, the more the same underlying resources can be reused, making the system simpler to use. For example, the codec interface knows nothing else about the core, other than its own isolated world of encoding and decoding audio packets. Once a proper codec module has been written, it becomes usable by any Endpoint interface capable of carrying that codec in its audio stream. This means that if we get a Text-To-Speech module working, we can generate synthesized speech on any and all Endpoints that FreeSWITCH supports. It does not matter which one comes first as they have nothing to do with each other. However, the addition of either one instantly adds functionality to the other. The TTS module becomes more useful because it can use more codecs; the codecs have become more useful because we added a new function that can take advantage of them. The same idea applies to applications. If we write a new application module, the existing endpoints will immediately be able to run and use that application.


Please enter your comment!
Please enter your name here