
In this article by Sam Washington and Dr. M. O. Faruque Sarker, authors of the book Learning Python Network Programming, we’re going to use sockets to build network applications. Sockets follow one of the main models of computer networking, that is, the client/server model. We’ll look at this with a focus on structuring server applications. We’ll cover the following topics:

  • Designing a simple protocol
  • Building an echo server and client


The examples in this article are best run on Linux or a Unix operating system. The Windows sockets implementation has some idiosyncrasies, and these can create error conditions which we will not be covering here. Note that Windows does not support the poll interface that we’ll use in one example. If you do use Windows, then you’ll probably need to use ctrl + break to kill these processes in the console, rather than ctrl + c, because Python in a Windows command prompt doesn’t respond to ctrl + c when it’s blocking on a socket send or receive, which will be quite often in this article! (And if, like me, you’re unfortunate enough to be testing these on a Windows laptop without a break key, then be prepared to get very familiar with the Windows Task Manager’s End task button.)

Client and server

The basic setup in the client/server model is one device, the server, which runs a service and patiently waits for clients to connect and make requests to it. A 24-hour grocery shop is a good real-world analogy. The shop waits for customers to come in, and when they do, they request certain products, purchase them, and leave. The shop might advertise itself so people know where to find it, but the actual transactions happen while the customers are visiting the shop.

A typical computing example is a web server. The server listens on a TCP port for clients that need its web pages. When a client, for example a web browser, requires a web page that the server hosts, it connects to the server and then makes a request for that page. The server replies with the content of the page and then the client disconnects. The server advertises itself by having a hostname, which the clients can use to discover the IP address so that they can connect to it.

In both of these situations, it is the client that initiates any interaction – the server is purely responsive to that interaction. So, the needs of the programs that run on the client and server are quite different.

Client programs are typically oriented towards the interface between the user and the service. They retrieve and display the service, and allow the user to interact with it. Server programs are written to stay running for indefinite periods of time, to be stable, to efficiently deliver the service to the clients that are requesting it, and to potentially handle a large number of simultaneous connections with a minimal impact on the experience of any one client.

In this article, we will look at this model by writing a simple echo server and client, which can handle a session with multiple clients. The socket module in Python perfectly suits this task.

An echo protocol

Before we write our first client and server programs, we need to decide how they are going to interact with each other, that is we need to design a protocol for their communication.

Our echo server should listen until a client connects and sends a bytes string, then we want it to echo that string back to the client. We only need a few basic rules for doing this. These rules are as follows:

  1. Communication will take place over TCP.
  2. The client will initiate an echo session by creating a socket connection to the server.
  3. The server will accept the connection and listen for the client to send a bytes string.
  4. The client will send a bytes string to the server.
  5. Once it sends the bytes string, the client will listen for a reply from the server.
  6. When it receives the bytes string from the client, the server will send the bytes string back to the client.
  7. When the client has received the bytes string from the server, it will close its socket to end the session.

These steps are straightforward enough. The missing element here is how the server and the client will know when a complete message has been sent. Remember that an application sees a TCP connection as an endless stream of bytes, so we need to decide what in that byte stream will signal the end of a message.
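We can see this for ourselves by simulating the situation with a connected pair of stream sockets (socket.socketpair() is a convenient, Unix-only stand-in for a real TCP connection): two separate sends can arrive in a single recv() call, with nothing in the byte stream to mark where one message ends and the next begins.

```python
import socket

# A connected pair of stream sockets stands in for a TCP connection
sender, receiver = socket.socketpair()

sender.sendall(b'first')
sender.sendall(b'second')

# Both sends arrive as one undifferentiated run of bytes -- the
# stream preserves byte order, but not message boundaries
data = receiver.recv(4096)
print(data)  # b'firstsecond'

sender.close()
receiver.close()
```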

Framing

This problem is called framing, and there are several approaches that we can take to handle it. The main ones are described here:

  1. Make it a protocol rule that only one message will be sent per connection, and once a message has been sent, the sender will immediately close the socket.
  2. Use fixed length messages. The receiver will read the number of bytes and know that they have the whole message.
  3. Prefix the message with the length of the message. The receiver will read the length of the message from the stream first, then it will read the indicated number of bytes to get the rest of the message.
  4. Use special character delimiters for indicating the end of a message. The receiver will scan the incoming stream for a delimiter, and the message comprises everything up to the delimiter.

Option 1 is a good choice for very simple protocols. It’s easy to implement and it doesn’t require any special handling of the received stream. However, it requires the setting up and tearing down of a socket for every message, and this can impact performance when a server is handling many messages at once.

Option 2 is again simple to implement, but it only makes efficient use of the network when our data comes in neat, fixed-length blocks. In a chat server, for example, the message lengths are variable, so we would have to use a special character, such as the null byte, to pad messages to the block size. This only works where we know for sure that the padding character will never appear in the actual message data. There is also the additional issue of how to handle messages longer than the block length.

Option 3 is usually considered as one of the best approaches. Although it can be more complex to code than the other options, the implementations are still reasonably straightforward, and it makes efficient use of bandwidth. The overhead imposed by including the length of each message is usually minimal as compared to the message length. It also avoids the need for any additional processing of the received data, which may be needed by certain implementations of option 4.
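As a rough sketch of what option 3 might look like (the frame() and unframe() helpers here are our own illustration, not part of the code we’ll write below), we can use the struct module to pack a 4-byte length prefix:

```python
import struct

def frame(payload: bytes) -> bytes:
    # Prefix the payload with its length as a 4-byte big-endian integer
    return struct.pack('>I', len(payload)) + payload

def unframe(stream: bytes):
    # Read the 4-byte length, then slice out exactly that many bytes,
    # returning the message and whatever is left of the stream
    (length,) = struct.unpack('>I', stream[:4])
    return stream[4:4 + length], stream[4 + length:]

framed = frame(b'hello') + frame(b'world')
first, rest = unframe(framed)
second, _ = unframe(rest)
print(first, second)  # b'hello' b'world'
```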

Option 4 is the most bandwidth-efficient option, and is a good choice when we know that only a limited set of characters, such as the ASCII alphanumeric characters, will be used in messages. If this is the case, then we can choose a delimiter character, such as the null byte, which will never appear in the message data, and then the received data can be easily broken into messages as this character is encountered. Implementations are usually simpler than option 3. Although it is possible to employ this method for arbitrary data, that is, where the delimiter could also appear as a valid character in a message, this requires the use of character escaping, which needs an additional round of processing of the data. Hence in these situations, it’s usually simpler to use length-prefixing.

For our echo and chat applications, we’ll be using the UTF-8 character set to send messages. The null byte isn’t used in any character in UTF-8 except for the null byte itself, so it makes a good delimiter. Thus, we’ll be using method 4 with the null byte as the delimiter to frame our messages.

So, our final rule, rule number 8, will be:

Messages will be encoded in the UTF-8 character set for transmission, and they will be terminated by the null byte.
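In code, rule 8 amounts to the following round trip (a minimal illustration; the reusable functions we’ll actually use come in the next section):

```python
# Sender side: encode the string as UTF-8, then append the
# null byte delimiter to mark the end of the message
wire_data = 'Hello, world'.encode('utf-8') + b'\0'

# Receiver side: strip the delimiter off, then decode from UTF-8
message = wire_data.rstrip(b'\0').decode('utf-8')
print(message)  # Hello, world
```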

Now, let’s write our echo programs.

A simple echo server

As we work through this article, we’ll find ourselves reusing several pieces of code, so to save ourselves from repetition, we’ll set up a module with useful functions that we can reuse as we go along. Create a file called tincanchat.py and save the following code in it:

import socket
 
HOST = ''
PORT = 4040
 
def create_listen_socket(host, port):
    """ Setup the sockets our server will receive connection requests on """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(100)
    return sock

def recv_msg(sock):
    """ Wait for data to arrive on the socket, then parse into
        messages using b'\0' as message delimiter """
    data = bytearray()
    msg = ''
    # Repeatedly read 4096 bytes off the socket, storing the bytes
    # in data until we see a delimiter
    while not msg:
        recvd = sock.recv(4096)
        if not recvd:
            # Socket has been closed prematurely
            raise ConnectionError()
        data = data + recvd
        if b'\0' in recvd:
            # We know from our protocol rules that we only send
            # one message per connection, so b'\0' will always be
            # the last character
            msg = data.rstrip(b'\0')
    msg = msg.decode('utf-8')
    return msg

def prep_msg(msg):
    """ Prepare a string to be sent as a message """
    msg += '\0'
    return msg.encode('utf-8')

def send_msg(sock, msg):
    """ Send a string over a socket, preparing it first """
    data = prep_msg(msg)
    sock.sendall(data)

First we define a default interface and a port number to listen on. The empty interface, specified in the HOST variable, tells socket.bind() to listen on all available interfaces. If you want to restrict access to just your machine, then change the value of the HOST variable at the beginning of the code to 127.0.0.1.
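If you’re curious which address a socket actually ends up bound to, socket.getsockname() will tell you. Here is a quick, self-contained check (binding to port 0 here, which asks the OS to pick any free port, so this snippet won’t clash with a running server):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(('127.0.0.1', 0))  # port 0: let the OS choose a free port
sock.listen(100)

# getsockname() reports the address and port we were actually given
host, port = sock.getsockname()
print('Listening on {}:{}'.format(host, port))
sock.close()
```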

We’ll be using create_listen_socket() to set up our server listening connections. This code is the same for several of our server programs, so it makes sense to reuse it.

The recv_msg() function will be used by our echo server and client for receiving messages from a socket. In our echo protocol, there isn’t anything that our programs may need to do while they’re waiting to receive a message, so this function just calls socket.recv() in a loop until it has received the whole message. As per our framing rule, it will check the accumulated data on each iteration to see if it has received a null byte, and if so, then it will return the received data, stripping off the null byte and decoding it from UTF-8.

The send_msg() and prep_msg() functions work together for framing and sending a message. We’ve separated the null byte termination and the UTF-8 encoding into prep_msg() because we will use them in isolation later on.

Handling received data

Note that we’re drawing ourselves a careful line with these send and receive functions as regards string encoding. Python 3 strings are Unicode, while the data that we receive over the network is bytes. The last thing that we want to be doing is handling a mixture of these in the rest of our program code, so we’re going to carefully encode and decode the data at the boundary of our program, where the data enters and leaves the network. This will ensure that any functions in the rest of our code can assume that they’ll be working with Python strings, which will later on make things much easier for us.

Of course, not all the data that we may want to send or receive over a network will be text. For example, images, compressed files, and music can’t be decoded to a Unicode string, so a different kind of handling is needed. Usually this will involve loading the data into a class, such as a Python Imaging Library (PIL) image, if we are going to manipulate the object in some way.

There are basic checks that could be done here on the received data, before performing full processing on it, to quickly flag any problems with the data. Some examples of such checks are as follows:

  • Checking the length of the received data
  • Checking the first few bytes of a file for a magic number to confirm a file type
  • Checking values of higher level protocol headers, such as the Host header in an HTTP request

This kind of checking will allow our application to fail fast if there is an obvious problem.
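As an example of the second check, here is a sketch of a magic-number test for PNG data (the eight-byte PNG signature is a documented constant; looks_like_png() is our own illustrative helper, not part of the article’s code):

```python
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'  # first 8 bytes of every PNG file

def looks_like_png(data: bytes) -> bool:
    # Fail fast: reject anything that doesn't start with the signature
    return data.startswith(PNG_SIGNATURE)

print(looks_like_png(PNG_SIGNATURE + b'...rest of file...'))  # True
print(looks_like_png(b'GIF89a...'))  # False
```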

The server itself

Now, let’s write our echo server. Open a new file called 1.1-echo_server-uni.py and save the following code in it:

import tincanchat
 
HOST = tincanchat.HOST
PORT = tincanchat.PORT
 
def handle_client(sock, addr):
    """ Receive data from the client via sock and echo it back """
    try:
        msg = tincanchat.recv_msg(sock)  # Blocks until received
                                         # complete message
        print('{}: {}'.format(addr, msg))
        tincanchat.send_msg(sock, msg)  # Blocks until sent
    except (ConnectionError, BrokenPipeError):
        print('Socket error')
    finally:
        print('Closed connection to {}'.format(addr))
        sock.close()
 
if __name__ == '__main__':
    listen_sock = tincanchat.create_listen_socket(HOST, PORT)
    addr = listen_sock.getsockname()
    print('Listening on {}'.format(addr))

    while True:
        client_sock, addr = listen_sock.accept()
        print('Connection from {}'.format(addr))
        handle_client(client_sock, addr)

This is about as simple as a server can get! First, we set up our listening socket with the create_listen_socket() call. Second, we enter our main loop, where we listen forever for incoming connections from clients, blocking on listen_sock.accept(). When a client connection comes in, we invoke the handle_client() function, which handles the client as per our protocol. We’ve created a separate function for this code, partly to keep the main loop tidy, and partly because we’ll want to reuse this set of operations in later programs.

That’s our server. Now we just need to make a client to talk to it.

A simple echo client

Create a file called 1.2-echo_client-uni.py and save the following code in it:

import sys, socket
import tincanchat
 
HOST = sys.argv[-1] if len(sys.argv) > 1 else '127.0.0.1'
PORT = tincanchat.PORT
 
if __name__ == '__main__':
    while True:
        try:
            sock = socket.socket(socket.AF_INET,
                                 socket.SOCK_STREAM)
            sock.connect((HOST, PORT))
            print('\nConnected to {}:{}'.format(HOST, PORT))
            print("Type message, enter to send, 'q' to quit")
            msg = input()
            if msg == 'q': break
            tincanchat.send_msg(sock, msg)  # Blocks until sent
            print('Sent message: {}'.format(msg))
            msg = tincanchat.recv_msg(sock)  # Blocks until received
                                             # complete message
            print('Received echo: ' + msg)
        except ConnectionError:
            print('Socket error')
            break
        finally:
            sock.close()
            print('Closed connection to server\n')

If we’re running our server on a different machine from the one on which we are running the client, then we can supply the IP address or the hostname of the server as a command line argument to the client program. If we don’t, then it will default to trying to connect to localhost.

The third and fourth lines of the code check the command line arguments for a server address. Once we’ve determined which server to connect to, we enter our main loop, which loops forever until we kill the client by entering q as a message. Within the main loop, we first create a connection to the server. Second, we prompt the user to enter the message to send and then we send the message using the tincanchat.send_msg() function. We then wait for the server’s reply. Once we get the reply, we print it and then we close the connection as per our protocol.

Give our client and server a try. Run the server in a terminal by using the following command:

$ python 1.1-echo_server-uni.py
Listening on ('0.0.0.0', 4040)

In another terminal, run the client, and note that you will need to specify the server address if you need to connect to another computer, as shown here:

$ python 1.2-echo_client-uni.py 192.168.0.7
Type message, enter to send, 'q' to quit

Running the terminals side by side is a good idea, because you can simultaneously see how the programs behave.

Type a few messages into the client and see how the server picks them up and sends them back. Disconnecting with the client should also prompt a notification on the server.

Concurrent I/O

If you’re adventurous, then you may have tried connecting to our server using more than one client at once. If you tried sending messages from both of them, then you’d have seen that it does not work as we might have hoped. If you haven’t tried this, then give it a go.

A working echo session on the client should look like this:

Type message, enter to send, 'q' to quit
hello world
Sent message: hello world
Received echo: hello world
Closed connection to server

However, when trying to send a message by using a second connected client, we’ll see something like this:

Type message, enter to send, 'q' to quit
hello world
Sent message: hello world

The client will hang when the message is sent, and it won’t get an echo reply. You may also notice that if we send a message by using the first connected client, then the second client will get its response. So, what’s going on here?

The problem is that the server can only listen for the messages from one client at a time. As soon as the first client connects, the server blocks at the socket.recv() call in tincanchat.recv_msg(), waiting for the first client to send a message. The server isn’t able to receive messages from other clients while this is happening and so, when another client sends a message, that client blocks too, waiting for the server to send a reply.

This is a slightly contrived example. The problem in this case could easily be fixed in the client end by asking the user for an input before establishing a connection to the server. However in our full chat service, the client will need to be able to listen for messages from the server while simultaneously waiting for user input. This is not possible in our present procedural setup.

There are two solutions to this problem. We can either use more than one thread or process, or use non-blocking sockets along with an event-driven architecture. We’re going to look at both of these approaches, starting with multithreading.

Multithreading and multiprocessing

Python has APIs that allow us to write both multithreading and multiprocessing applications. The principle behind multithreading and multiprocessing is simply to take copies of our code and run them in additional threads or processes. The operating system automatically schedules the threads and processes across available CPU cores to provide fair processing time allocation to all the threads and processes. This effectively allows a program to simultaneously run multiple operations. In addition, when a thread or process blocks, for example, when waiting for I/O, the thread or process can be de-prioritized by the OS, and the CPU cores can be allocated to other threads or processes that have actual computation to do.

Here is an overview of how threads and processes relate to each other:

(Diagram: two processes, each containing one or more threads)

Threads exist within processes. A process can contain multiple threads but it always contains at least one thread, sometimes called the main thread. Threads within the same process share memory, so data transfer between threads is just a case of referencing the shared objects. Processes do not share memory, so other interfaces, such as files, sockets, or specially allocated areas of shared memory, must be used for transferring data between processes.
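To illustrate the shared memory point (this is our own toy example, separate from the chat code), several threads can append to one list object directly, with no data transfer mechanism needed; a lock keeps the concurrent appends orderly:

```python
import threading

results = []  # shared by every thread in this process
lock = threading.Lock()

def worker(n):
    # All threads reference the very same 'results' list object
    with lock:
        results.append(n * n)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))  # [0, 1, 4, 9, 16]
```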

When threads have operations to execute, they ask the operating system thread scheduler to allocate them some time on a CPU, and the scheduler allocates the waiting threads to CPU cores based on various parameters, which vary from OS to OS. Threads in the same process may run on separate CPU cores at the same time.

Although two processes have been displayed in the preceding diagram, multiprocessing is not going on here, since the processes belong to different applications. The second process is displayed to illustrate a key difference between threading in Python and threading in most other programming languages: the presence of the GIL.

Threading and the GIL

The CPython interpreter (the standard version of Python available for download from www.python.org) contains something called the Global Interpreter Lock (GIL). The GIL exists to ensure that only a single thread in a Python process can run at a time, even if multiple CPU cores are present. The reason for having the GIL is that it makes the underlying C code of the Python interpreter much easier to write and maintain. The drawback of this is that Python programs using multithreading cannot take advantage of multiple cores for parallel computation.

This is a cause of much contention; however, for us this is not so much of a problem. Even with the GIL present, threads that are blocking on I/O are still de-prioritized by the OS and put into the background, so threads that do have computational work to do can run instead. The following figure is a simplified illustration of this:

(Diagram: simplified illustration of threads moving between the running, blocking on I/O, and waiting for GIL states)

The Waiting for GIL state is where a thread has sent or received some data and so is ready to come out of the blocking state, but another thread has the GIL, so the ready thread is forced to wait. In many network applications, including our echo and chat servers, the time spent waiting on I/O is much higher than the time spent processing data. As long as we don’t have a very large number of connections (a situation we’ll discuss later on when we come to event driven architectures), thread contention caused by the GIL is relatively low, and hence threading is still a suitable architecture for these network server applications.

With this in mind, we’re going to use multithreading rather than multiprocessing in our echo server. The shared data model will simplify the code that we’ll need for allowing our chat clients to exchange messages with each other, and because we’re I/O bound, we don’t need processes for parallel computation. Another reason for not using processes in this case is that processes are more “heavyweight” in terms of the OS resources, so creating a new process takes longer than creating a new thread. Processes also use more memory.

One thing to note is that if you need to perform an intensive computation in your network server application (maybe you need to compress a large file before sending it over the network), then you should investigate methods for running this in a separate process. Because of quirks in the implementation of the GIL, having even a single computationally intensive thread in a mainly I/O bound process when multiple CPU cores are available can severely impact the performance of all the I/O bound threads. For more details, go through the David Beazley presentations linked to in the following information box:

Processes and threads are different beasts, and if you’re not clear on the distinctions, it’s worthwhile to read up. A good starting point is the Wikipedia article on threads, which can be found at http://en.wikipedia.org/wiki/Thread_(computing).

A good overview of the topic is given in Chapter 4 of Benjamin Erb’s thesis, which is available at http://berb.github.io/diploma-thesis/community/.

Additional information on the GIL, including the reasoning behind keeping it in Python can be found in the official Python documentation at https://wiki.python.org/moin/GlobalInterpreterLock.

You can also read more on this topic in Nick Coghlan’s Python 3 Q&A, which can be found at http://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html#but-but-surely-fixing-the-gil-is-more-important-than-fixing-unicode.

Finally, David Beazley has done some fascinating research on the performance of the GIL on multi-core systems. Two presentations of importance are available online. They give a good technical background, which is relevant to this article. These can be found at http://pyvideo.org/video/353/pycon-2010--understanding-the-python-gil---82 and at https://www.youtube.com/watch?v=5jbG7UKT1l4.

A multithreaded echo server

A benefit of the multithreading approach is that the OS handles the thread switches for us, which means we can continue to write our program in a procedural style. Hence we only need to make small adjustments to our server program to make it multithreaded, and thus, capable of handling multiple clients simultaneously.

Create a new file called 1.3-echo_server-multi.py and add the following code to it:

import threading
import tincanchat
 
HOST = tincanchat.HOST
PORT = tincanchat.PORT
 
def handle_client(sock, addr):
    """ Receive one message and echo it back to client, then close
        socket """
    try:
        msg = tincanchat.recv_msg(sock)  # Blocks until received
                                         # complete message
        msg = '{}: {}'.format(addr, msg)
        print(msg)
        tincanchat.send_msg(sock, msg)  # Blocks until sent
    except (ConnectionError, BrokenPipeError):
        print('Socket error')
    finally:
        print('Closed connection to {}'.format(addr))
        sock.close()
 
if __name__ == '__main__':
    listen_sock = tincanchat.create_listen_socket(HOST, PORT)
    addr = listen_sock.getsockname()
    print('Listening on {}'.format(addr))

    while True:
        client_sock, addr = listen_sock.accept()
        # Thread will run function handle_client() autonomously
        # and concurrently to this while loop
        thread = threading.Thread(target=handle_client,
                                  args=[client_sock, addr],
                                  daemon=True)
        thread.start()
        print('Connection from {}'.format(addr))

You can see that we’ve just imported an extra module and modified our main loop to run our handle_client() function in separate threads, rather than running it in the main thread. For each client that connects, we create a new thread that just runs the handle_client() function. When the thread blocks on a receive or send, the OS checks the other threads to see if they have come out of a blocking state, and if any have, then it switches to one of them.

Notice that we have set the daemon argument in the thread constructor call to True. This will allow the program to exit if we hit ctrl + c, without us having to explicitly close all of our threads first.

If you try this echo server with multiple clients, then you’ll see that a second client that connects and sends a message will immediately get a response.

Summary

We looked at how to develop network protocols while considering aspects such as the connection sequence, framing of the data on the wire, and the impact these choices will have on the architecture of the client and server programs.

We worked through different architectures for network servers and clients, demonstrating the multithreaded models by writing a simple echo server.
