Using the WebRTC Data API

What is WebRTC?

Web Real-Time Communication (WebRTC) is an open framework for the Web, still under active development, that enables browser-to-browser applications such as audio/video calling, video chat, and peer-to-peer file sharing without any additional third-party software or plugins.

It was open sourced by Google in 2011 and includes the fundamental building components for high-quality communications on the Web. These components, when implemented in a browser, can be accessed through a JavaScript API, enabling developers to build their own rich media web applications. Google, Mozilla, and Opera support WebRTC and are involved in the development process.

The major components of the WebRTC API are as follows:

getUserMedia: This allows a web browser to access the camera and microphone
PeerConnection: This sets up audio/video calls
DataChannels: This allows browsers to share data over a peer-to-peer connection

Benefits of using WebRTC in your business

Reducing costs: It is a free and open source technology, so you don’t need to pay for complex proprietary solutions. IT deployment and support costs can be lower because you no longer need to deploy special client software for your customers.
No plugins: You don’t need them. Previously you had to use Flash, Java applets, or other tricky solutions to build interactive rich media web applications. Customers had to download and install third-party plugins to be able to use your media content. You also had to maintain different solutions/plugins for a variety of operating systems and platforms. With WebRTC you no longer need to.
Peer-to-peer communication: In most cases, communication is established directly between your customers, so you don’t need an intermediate point between them.
Easy to use: You don’t need to be a professional programmer or to have a team of certified developers with specialized knowledge. In a basic case, you can easily integrate WebRTC functionality into your web services/sites by using the open JavaScript API or even a ready-to-go framework.
Single solution for all platforms: You don’t need to develop a special native version of your web service for each platform (iOS, Android, Windows, or any other). WebRTC is designed to be a cross-platform and universal tool.
WebRTC is open source and free: The community can discover new bugs and fix them quickly and effectively. Moreover, it is developed and standardized with the involvement of Mozilla, Google, and Opera, which are global software companies.

Topics

The article covers the following topics:

Developing a WebRTC application: You will learn the basics of the technology and build a complete, real-life audio/video conference web application. We will also talk about SDP (Session Description Protocol), signaling, client-server interoperation, and configuring STUN and TURN servers.
In Data API, you will learn how to build a peer-to-peer, cross-platform file sharing web service using the WebRTC Data API.
Media streaming and screen casting introduces you to streaming prerecorded media content peer-to-peer and to desktop sharing. In this article, you will build a simple application that provides this kind of functionality.
Security and authentication are very important topics, and you definitely don’t want to forget about them while developing your applications. So, in this article, you will learn how to make your WebRTC solutions secure, why authentication might be very important, and how you can implement this functionality in your products.
Mobile platforms are now part of everyday life, so it’s important to make your interactive application work well on mobile devices too. This article will introduce you to aspects that will help you develop great WebRTC products with mobile devices in mind.

Session Description Protocol

SDP is an important part of the WebRTC stack. It is used to negotiate session/media options while a peer connection is being established.

It is a protocol intended for describing multimedia communication sessions for the purposes of session announcement, session invitation, and parameter negotiation. It does not deliver media data itself, but is used by peers to negotiate the media type, format, and all associated properties/options (resolution, encryption, codecs, and so on). The set of properties and parameters is usually called a session profile.

Peers have to exchange SDP data over a signaling channel before they can establish a direct connection.

The following is an example of an SDP offer:

v=0
o=alice 2890844526 2890844526 IN IP4 host.atlanta.example.com
s=
c=IN IP4 host.atlanta.example.com
t=0 0
m=audio 49170 RTP/AVP 0 8 97
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
m=video 51372 RTP/AVP 31 32
a=rtpmap:31 H261/90000
a=rtpmap:32 MPV/90000

Here we can see that this is a video and audio session, and multiple codecs are offered.
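
To make the structure of the offer easier to see, here is a minimal JavaScript sketch that lists the media sections and codecs it describes. The helper name parseSdpMedia is hypothetical (it is not part of the WebRTC API), and it reads only the m= and a=rtpmap lines, so treat it as an illustration rather than a complete SDP parser.

```javascript
// Sketch only: parseSdpMedia is a hypothetical helper, not a WebRTC API call.
// It reads just the m= and a=rtpmap lines and ignores every other SDP field.
function parseSdpMedia(sdp) {
  const sections = [];
  let current = null;
  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <payload type> ...
      const [kind, port, protocol, ...payloadTypes] = line.slice(2).split(/\s+/);
      current = { kind, port: Number(port), protocol, payloadTypes, codecs: {} };
      sections.push(current);
    } else if (line.startsWith("a=rtpmap:") && current !== null) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>
      const [payloadType, encoding] = line.slice("a=rtpmap:".length).split(/\s+/);
      current.codecs[payloadType] = encoding;
    }
  }
  return sections;
}

// The offer shown above.
const offer = [
  "v=0",
  "o=alice 2890844526 2890844526 IN IP4 host.atlanta.example.com",
  "s=",
  "c=IN IP4 host.atlanta.example.com",
  "t=0 0",
  "m=audio 49170 RTP/AVP 0 8 97",
  "a=rtpmap:0 PCMU/8000",
  "a=rtpmap:8 PCMA/8000",
  "a=rtpmap:97 iLBC/8000",
  "m=video 51372 RTP/AVP 31 32",
  "a=rtpmap:31 H261/90000",
  "a=rtpmap:32 MPV/90000",
].join("\r\n");

for (const media of parseSdpMedia(offer)) {
  // Keep the order of the m= line, which expresses the offerer's preference.
  const names = media.payloadTypes.map((pt) => media.codecs[pt]);
  console.log(`${media.kind} on port ${media.port}: ${names.join(", ")}`);
}
// audio on port 49170: PCMU/8000, PCMA/8000, iLBC/8000
// video on port 51372: H261/90000, MPV/90000
```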

The following is an example of an SDP answer:

v=0
o=bob 2808844564 2808844564 IN IP4 host.biloxi.example.com
s=
c=IN IP4 host.biloxi.example.com
t=0 0
m=audio 49174 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49170 RTP/AVP 32
a=rtpmap:32 MPV/90000

Here we can see that only one codec per media stream (PCMU for audio and MPV for video) is accepted in reply to the offer above.
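
The comparison between the offer and the answer can also be sketched in code. The following JavaScript is an illustration under simple assumptions: the helper names codecsByMedia and summarizeNegotiation are hypothetical, only a=rtpmap lines are compared, and the SDP strings are trimmed to the media lines of the two examples above.

```javascript
// Hypothetical helper: map each media kind ("audio", "video") to the codec
// names listed in its a=rtpmap lines.
function codecsByMedia(sdp) {
  const result = {};
  let kind = null;
  for (const line of sdp.split(/\r?\n/).map((l) => l.trim())) {
    if (line.startsWith("m=")) {
      kind = line.slice(2).split(/\s+/)[0];
      result[kind] = [];
    } else if (line.startsWith("a=rtpmap:") && kind !== null) {
      // "a=rtpmap:0 PCMU/8000" -> "PCMU/8000"
      result[kind].push(line.split(/\s+/)[1]);
    }
  }
  return result;
}

// Hypothetical helper: for each offered media kind, list which offered codecs
// appear in the answer (accepted) and which do not (dropped).
function summarizeNegotiation(offerSdp, answerSdp) {
  const offered = codecsByMedia(offerSdp);
  const answered = codecsByMedia(answerSdp);
  const summary = {};
  for (const kind of Object.keys(offered)) {
    const accepted = (answered[kind] || []).filter((c) => offered[kind].includes(c));
    const dropped = offered[kind].filter((c) => !accepted.includes(c));
    summary[kind] = { accepted, dropped };
  }
  return summary;
}

// Media lines taken from the offer and answer examples above.
const offerMedia = [
  "m=audio 49170 RTP/AVP 0 8 97",
  "a=rtpmap:0 PCMU/8000",
  "a=rtpmap:8 PCMA/8000",
  "a=rtpmap:97 iLBC/8000",
  "m=video 51372 RTP/AVP 31 32",
  "a=rtpmap:31 H261/90000",
  "a=rtpmap:32 MPV/90000",
].join("\r\n");

const answerMedia = [
  "m=audio 49174 RTP/AVP 0",
  "a=rtpmap:0 PCMU/8000",
  "m=video 49170 RTP/AVP 32",
  "a=rtpmap:32 MPV/90000",
].join("\r\n");

console.log(JSON.stringify(summarizeNegotiation(offerMedia, answerMedia)));
// {"audio":{"accepted":["PCMU/8000"],"dropped":["PCMA/8000","iLBC/8000"]},"video":{"accepted":["MPV/90000"],"dropped":["H261/90000"]}}
```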

You can find more SDP session examples at https://www.rfc-editor.org/rfc/rfc4317.txt.

You can also find in-depth details on SDP in the appropriate RFC at http://tools.ietf.org/html/rfc4566.

Configuring and installing your own STUN server

As you already know, it is important to have access to a STUN/TURN server to work with peers located behind a NAT or firewall. In this article, while developing our application, we used public STUN servers (actually, they are public Google servers accessible from other networks).

Nevertheless, if you plan to build your own service, you should install your own STUN/TURN server. This way your application will not depend on a server you can’t control. Today public STUN servers from Google are available; tomorrow they may be switched off. So, the right way is to have your own STUN/TURN server.

In this section, you will be introduced to installing a STUN server as the simpler case. There are several implementations of STUN servers that can be found on the Internet. You can take one from http://www.stunprotocol.org.

It is cross-platform and can be used under Windows, Mac OS X, or Linux.

To start the STUN server, use the following command line:

stunserver --mode full --primaryinterface x1.x1.x1.x1 --altinterface x2.x2.x2.x2

Please note that you need two IP addresses on your machine to run the STUN server. This is mandatory for the STUN protocol to work correctly. The machine can have only one physical network interface, but it should then have a network alias with an IP address different from the one used on the main network interface.

WebSocket

WebSocket is a protocol that provides full-duplex communication channels over a single TCP connection. This is a relatively young protocol, but today all major web browsers, including Chrome, Internet Explorer, Opera, Firefox, and Safari, support it. WebSocket serves as a replacement for long polling for two-way communication between browser and server.

In this article, we will use WebSocket as a transport channel to develop a signaling server for our videoconference service. Using it, our peers will communicate with the signaling server.

Two important benefits of WebSocket are that it works over a secure channel (TLS, as used by HTTPS) and that it can be used through a web proxy (nevertheless, some proxies can block the WebSocket protocol).
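
As an illustration of what travels over such a signaling channel, here is a small JavaScript sketch of encoding and decoding signaling messages as JSON text. The message shape ({ type, payload }), the list of types, and the helper names encodeSignal and decodeSignal are assumptions made for this example; the actual protocol of the signaling server is not shown in this section.

```javascript
// Assumed message types for this sketch; a real service may define others.
const SIGNAL_TYPES = ["offer", "answer", "candidate"];

// Serialize a signaling message to the text that would be sent over the socket.
function encodeSignal(type, payload) {
  if (!SIGNAL_TYPES.includes(type)) {
    throw new Error(`Unknown signal type: ${type}`);
  }
  return JSON.stringify({ type, payload });
}

// Parse incoming text; return null for anything that is not a known message.
function decodeSignal(text) {
  let message;
  try {
    message = JSON.parse(text);
  } catch (err) {
    return null;
  }
  if (message === null || typeof message !== "object") {
    return null;
  }
  return SIGNAL_TYPES.includes(message.type) ? message : null;
}

// In a browser, these helpers would sit around a WebSocket object, for example:
//   socket.send(encodeSignal("offer", { sdp: localDescription.sdp }));
//   socket.onmessage = (event) => handle(decodeSignal(event.data));
// (socket, localDescription, and handle are placeholders, not defined here.)
const wireText = encodeSignal("offer", { sdp: "v=0" });
console.log(wireText); // {"type":"offer","payload":{"sdp":"v=0"}}
console.log(decodeSignal(wireText).type); // offer
console.log(decodeSignal("not json")); // null
```

Rejecting unknown or malformed messages at the boundary keeps the rest of the client code free of defensive checks.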

NAT traversal

WebRTC has a built-in mechanism for using NAT traversal options such as STUN and TURN servers.

In this article, we used public STUN (Session Traversal Utilities for NAT) servers, but in real life you should install and configure your own STUN or TURN (Traversal Using Relays around NAT) server.

In most cases, you will use a STUN server. It helps with NAT/firewall traversal and with establishing a direct connection between peers. In other words, the STUN server is used only during the connection establishment stage. After the connection has been established, peers transfer media data directly between them.

In some cases (unfortunately, they are not so rare), a STUN server won’t help you get through a firewall or NAT, and establishing a direct connection between peers will be impossible, for example, if both peers are behind symmetric NAT. In this case, a TURN server can help.

A TURN server works as a relay between peers. When a TURN server is used, all media data between peers is transmitted through it.
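
One way to see the difference between the direct and the relayed case is to look at the candidate lines that the browser produces during connection setup, where the word after "typ" tells how the address was obtained. The following JavaScript sketch classifies such lines. The helper name describeCandidate and the sample candidate strings (with example addresses) are made up for illustration.

```javascript
// Candidate types and what they mean for the media path.
const CANDIDATE_ROUTES = new Map([
  ["host", "direct, using a local address"],
  ["srflx", "direct, using the public address discovered through STUN"],
  ["prflx", "direct, using an address learned during connectivity checks"],
  ["relay", "relayed through a TURN server"],
]);

// Hypothetical helper: read the token after "typ" in a candidate line.
function describeCandidate(candidateLine) {
  const match = /\styp\s+(\S+)/.exec(candidateLine);
  if (match === null || !CANDIDATE_ROUTES.has(match[1])) {
    return "unknown";
  }
  return CANDIDATE_ROUTES.get(match[1]);
}

// Made-up sample candidates; the addresses are examples only.
const sampleCandidates = [
  "candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321",
  "candidate:3 1 udp 41885439 198.51.100.20 3478 typ relay raddr 203.0.113.7 rport 54321",
];

for (const line of sampleCandidates) {
  console.log(describeCandidate(line));
}
// direct, using a local address
// direct, using the public address discovered through STUN
// relayed through a TURN server
```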

If your application gives a list of several STUN/TURN servers to the WebRTC API, the web browser will try to use the STUN servers first and, if the connection fails, it will automatically try to use the TURN servers.
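
Such a list is passed to the browser as a configuration object with an iceServers array. The sketch below builds one in JavaScript. The helper name buildRtcConfiguration is hypothetical, and the server URLs, user name, and password are placeholders that you would replace with your own servers.

```javascript
// Hypothetical helper: build the configuration object that is passed to
// RTCPeerConnection in a browser.
function buildRtcConfiguration(stunUrls, turnServers) {
  const iceServers = [];
  if (stunUrls.length > 0) {
    // One entry may list several URLs of the same kind.
    iceServers.push({ urls: stunUrls });
  }
  for (const turn of turnServers) {
    // TURN servers normally require credentials.
    iceServers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

// Placeholder servers; these host names and credentials are not real.
const rtcConfiguration = buildRtcConfiguration(
  ["stun:stun.example.org:3478"],
  [{ url: "turn:turn.example.org:3478", username: "demo-user", credential: "demo-password" }]
);

// In a browser: const peerConnection = new RTCPeerConnection(rtcConfiguration);
console.log(JSON.stringify(rtcConfiguration, null, 2));
```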

Preparing environment

We can prepare the environment by performing the following steps:

Create a folder for the whole application somewhere on your disk. Let’s call it my_rtc_project.
Make a directory named my_rtc_project/www. Here we will put all the client-side code (JavaScript files or HTML pages).
The signaling server’s code will be placed in a separate folder, so create a directory for it: my_rtc_project/apps/rtcserver/src.
Note that we will use Git, which is a free and open source distributed version control system. On Linux boxes, it can be installed using the default package manager. On Windows, I recommend installing and using this implementation: https://github.com/msysgit/msysgit.
If you’re using a Windows box, install msysgit and add the path to its bin folder to your PATH environment variable.

Installing Erlang

The signaling server is developed in the Erlang language. Erlang is a great choice for developing server-side applications for the following reasons:

It is very convenient and easy for prototyping
Its processes (actors) are very lightweight and cheap
It supports network operations without any external libraries
The code is compiled to bytecode that runs on the very powerful Erlang Virtual Machine

Some great projects

The following projects are developed using Erlang:

Yaws and Cowboy: These are web servers
Riak and CouchDB: These are distributed databases
Cloudant: This is a database service based on a fork of CouchDB
Ejabberd: This is an XMPP instant messaging server
Zotonic: This is a Content Management System
RabbitMQ: This is a message bus
Wings 3D: This is a 3D modeler
GitHub: This is a web-based hosting service for software development projects that use Git. GitHub uses Erlang for RPC proxies to Ruby processes
WhatsApp: This is a famous mobile messenger, sold to Facebook
Call of Duty: This computer game uses Erlang on server side
Goldman Sachs: This company uses Erlang in its high-frequency trading programs

A very brief history of Erlang

1982 to 1985: During this period, Ericsson starts experimenting with the programming of telecom systems. Existing languages are not suitable for the task.
1985 to 1986: During this period, Ericsson decides it must develop its own language with desirable features from Lisp, Prolog, and Parlog. The language should have built-in concurrency and error recovery.
1987: In this year, the first experiments with the new language, Erlang, are conducted.
1988: In this year, Erlang is first used by external users outside the lab.
1989: In this year, Ericsson works on a fast implementation of Erlang.
1990: In this year, Erlang is presented at ISS’90 and gets new users.
1991: In this year, a fast implementation of Erlang is released to users. Erlang is presented at Telecom’91 and has a compiler and a graphical interface.
1992: In this year, Erlang gets a lot of new users. Ericsson ports Erlang to new platforms, including VxWorks and Macintosh.
1993: In this year, Erlang gets distribution. This makes it possible to run a homogeneous Erlang system on heterogeneous hardware. Ericsson starts selling Erlang implementations and Erlang Tools. A separate organization within Ericsson provides support.
Erlang is available for many platforms. You can download and install it from the main website: http://www.erlang.org.