Connecting to Sakai is straightforward, and simple tasks, such as automatic course creation, take only a few tens of lines of programming effort.

There are significant advantages to having web services in the enterprise. If a developer writes an application that calls a number of web services, then the application does not need to know the hidden details behind the services. It just needs to agree on what data to send. This loosely couples the application to the services. Later, you can replace one web service with another. Programmers do not need to change the code on the application side. SOAP works well with most organizations' firewalls (http://en.wikipedia.org/wiki/Firewall), as SOAP uses the same protocol as web browsers. System administrators have a tendency to protect an organization's network by closing unused ports to the outside world. This means that most of the time there is no extra network configuration effort required to enable web services.

Another simplifying factor is that a programmer does not need to know the details of SOAP or REST, as there are libraries and frameworks that hide the underlying magic. For the Sakai implementation of SOAP, to add a new service is as simple as writing a small amount of Java code within a text file, which then is automatically compiled and run the first time the service is called. This is great for rapid application development and deployment, as the system administrator does not need to restart Sakai for each change. Just as importantly, the Sakai services use the well-known libraries from the Apache Axis project (http://ws.apache.org/axis/).

SOAP is an XML message passing protocol that, in the case of Sakai sites, sits on top of the Hyper Text Transfer Protocol (HTTP). HTTP is the protocol used by web browsers to obtain web pages from a server. The client sends messages in XML format to a service, including the information that the service needs, and then the service returns a message with the results or an error message. A readable reference to this interchange is the book Pro Apache XML by Poornachandra Sarang, PhD (http://www.freesoftwaremagazine.com/articles/book_review_pro_apache_xml).

The full definition of HTTP is given at http://www.w3.org/TR/soap12-part1.

The architects introduced SOAP-based web services first to Sakai and later RESTful services. Unlike SOAP, instead of sending XML via HTTP posts to one URL that points to a service, REST sends to a URL that includes information about the entity, such as a user, with which the client wishes to interact. For example, a REST URL for viewing an address book item could look similar to http://host/direct/addressbook_item/15. Applying URLs in this way makes understandable address spaces that are easier for a human to read. This more intuitive approach simplifies coding. Further, SOAP XML passing requires that the client and server parse the XML and at times, the parsing effort is expensive in CPU cycles and response times.

The Entity Broker is an internal service that makes life easier for programmers and helps them manipulate entities. Entities in Sakai are managed pieces of data such as representations of courses, users, grade books, and so on. In the newer versions of Sakai, the Entity Broker has the power to expose entities as RESTful services. In contrast, for SOAP services, if you wanted a new service, you would need to write it yourself. Over time, the Entity Broker exposes more and more entities RESTfully, delivering more hooks free to integrate with other enterprise systems.

Both SOAP and REST services sit on top of the HTTP protocol, which is explained in the next section of this article.

Protocols

This section explains how web browsers talk to servers in order to gather web pages. It explains how to use the telnet command and a visual tool called TCPMON (http://ws.apache.org/commons/tcpmon/tcpmontutorial.html) to gain insight into how web services and Web 2.0 technologies work.

Playing with Telnet

It turns out that message passing occurs via text commands between the browser and the server. Web browsers use HTTP (http://www.w3.org/Protocols/rfc2616/rfc2616.html) to get web pages and the embedded content from the server and to send form information to the server. HTTP talks between the client and server via text (7 bit ASCII) commands. When humans talk with each other, they have a wide vocabulary. However, HTTP uses fewer than twenty words.

You can experiment directly with HTTP using a Telnet client to send your commands to a web server. For example, if your demonstration Sakai instance is running on port 8080, the following command will get you the login page:

telnet localhost 8080
GET /portal/login

The GET command does what it sounds like and gets a web page. Forms can use the GET verb to send data at the end of the URL. For example, GET /portal/login?name=alan&age=15 is sending the variables name=alan and age=15 to the server.

Installing TCPMON

You can use the TCPMON tool to view requests and responses from a web browser such as Firefox. One of TCPMON's abilities is that it can act as an invisible man in the middle, recording the messages between the web browser and the server. Once set up, the requests sent from the browser go to TCPMON and TCPMON passes the request on to the server. The server passes back a response and then TCPMON, a transparent proxy (http://en.wikipedia.org/wiki/Proxy_server), returns the response to the web browser. This allows us to look at all requests and responses graphically.

First, you can set TCPMON up to listen on a given port number—by convention, normally, port 8888—and then you can configure your web browser to send its requests through the proxy. Then, you can type the address of a given page into the web browser, but instead of going directly to the relevant server, the browser sends the request to the proxy, which then passes it on and passes the response back. TCPMON displays both the request and responses in a window.

You can download TCPMON from http://ws.apache.org/commons/tcpmon/download.cgi.

After downloading and unpacking, you can, from within the build directory, run either tcpmon.bat for the Windows environment or tcpmon.sh for Unix/Linux environments. To configure a proxy, you can click the Admin tab and then set the Listen Port to 8888 and select the Proxy radio button. After that, clicking Add will create a new tab, where the requests and responses will later be displayed.

sakai-web-services-connecting-enterprise-part-1-img-0

Your favorite web browser now has to recognize the newly set up proxy. For Firefox 3, you can do this by selecting the menu option Edit/Preferences and then choosing the advanced tab and the network tab, as shown next. You will need to set the proxy options HTTP proxy to 127.0.0.1 and the port number to 8888. If you do this, you will need to ensure that the No proxies text input is blank. Clicking the OK button enables the new settings.

sakai-web-services-connecting-enterprise-part-1-img-1

To use the Proxy from within Internet Explorer 7 for a Local Area Network (LAN), you can edit the dialog box found under Tools | Internet Options | Connections | LAN settings.

sakai-web-services-connecting-enterprise-part-1-img-2

Once the proxy is working, typing http://localhost:8080/portal/login in the address bar will seamlessly return the login page of your local Sakai instance. Otherwise, you will see an error message similar to Proxy Server Refused Connection for Firefox or Internet Explorer cannot display the webpage.

To turn the proxy settings off, simply select the No Proxies radio box and click OK for Firefox 3, or unselect the Use the proxy server for the LAN tick box in Internet Explorer 7 and click OK.

Requests and returned status codes

When TCPMON is running a proxy on port 8888, it allows you to view the requests from the browser and the response in an extra tab, as shown in the following screen grab. Notice the extra information that the browser sends as part of the request. HTTP/1.1 defines the protocol and version level and the lines below the GET are header variables. The User-Agent defines which client sent the request. The Accept headers tell the server what the capabilities of the browser are, and the Cookie header defines the value stored in a cookie. HTTP is stateless, that is, in principle; each response is based only on the current request. However, to get around this, persistent information can be stored in cookies. Web browsers normally store their representation of a cookie as a little text file or in a small database on the end users' computers.

sakai-web-services-connecting-enterprise-part-1-img-3

Sakai uses the supporting features of a servlet container, such as Tomcat, to maintain state in cookies. A cookie stores a session ID, and when the server sees the session ID, it can look up the request's server-side state. Server-side state contains information such as whether the user is logged in or what he or she has ordered. The web browser deletes the local representation of the cookie each time the browser closes.

A cookie that is deleted when a web browser closes is known as a session cookie.

The server response starts with the protocol followed by a status number. HTTP/1.1 200 OK tells the web browser that the server is using HTTP version 1.1 and it was able to return the requested web page successfully. 2xx status codes imply success. 3xx status codes imply some form of redirection and tell the web browser where to try to pick up the requested resource. 4xx status codes are for client errors, such as malformed requests or lack of permission to obtain the resource. 4xx states are fertile grounds for security managers to look in log files for attempted hacking. 5xx status codes mostly have to do with a failure of the server itself and are mostly of interest to system administrators and programmers during the debugging cycle. In most cases, 5xx status numbers are about either high server load or a broken piece of code. Sakai is changing rapidly and even with the most vigorous testing, there are bound to be the occasional hiccups. You will find accurate details of the full range of status codes at: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html.

Another important part of the response is the Content-Type, which tells the web browser which type of material the response is returning so the browser knows how to handle it. For example, the web browser may want to run a plug-in for video types and display text natively. The Content-Length in characters is normally also given. After the header information is finished, there is a newline followed by the content.

Web browsers interpret any redirects that are returned by sending extra requests. Web browsers also interpret any HTML pages and make multiple requests for resources such as JavaScript files and images. Modern browsers do not wait until the server returns all the requests, but render the HTML page live as the server returns the parts.

The GET verb is not very efficient for posting a large amount of data, as the URL has a length limit of around 2000 characters. Further, the end user can see the form data, and the browser may encode entities such as spaces to make the URL unreadable. There is also a security aspect: if you are typing in passwords in forms using GET, others may see your password or other details. This is not a good idea, especially at Internet Cafés where the next user who logs on can see the password in the browsing history. The POST verb is a better choice. Let us take as an example the Sakai demonstration login page http://localhost:8080/portal/login. The login page itself contains a form tag that points with the POST method to the relogin page.

<form method="post" action="http://localhost:8080/portal/relogin" 
enctype="application/x-www-form-urlencoded">

Notice the HTML tag also defines the content type. Key features of the Post request compared to the GET are: the form values are stored as content after the header values, there is a newline between the end of the header and the data, and the request mentions data and the amount of data by the use of the Content-Length header value.

The essential POST values for a login form with user admin (eid=admin) and password admin (pw=admin) will look like:

POST http://localhost:8080/portal/relogin HTTP/1.1
Content-Type: application/x-www-form-urlencoded
Content-Length: 31

eid=admin&pw=admin&submit=Login

POSTs can contain much more information than GETs, and the request hides the values from the Address bar of the web browser. This is not secure. The header is just as visible as the URL, so POST values are also neither hidden nor secure. The only viable solution is for your web browser to encrypt your transactions using SSL/TLS (http://www.ietf.org/rfc/rfc2246.txt) for security, and this occurs every time you connect to a server using an HTTPS URL.

SOAP

Sakai uses the Apache Axis framework, which the developers have configured to accept SOAP calls via POST. SOAP sends messages in a specific XML format with the Content-Type, otherwise known as MIME type, application/soap+xml. A programmer does not need to know much more than that, as client libraries take care of the majority of the excruciating low-level details. An example SOAP message generated by the Perl module SOAP::Lite (http://www.soaplite.com/) for creating a login session in Sakai will look like the following Post data:

<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope  
 
 
soap:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" 
>
<soap:Body>
<login >
<c-gensym3 xsi_type="xsd:string">admin</c-gensym3>
<c-gensym5 xsi_type="xsd:string">admin</c-gensym5>
</login>
</soap:Body>
</soap:Envelope>

There is an envelope with a body containing data for the service to consume. The important point to remember is that both the client and the server have to be able to parse the specific XML schema. SOAP messages can include extra security features, but Sakai does not require these. The architects expect organizations to encrypt web services using SSL/TSL.

The last extra SOAP-related complexity is the Web Service Description Language (http://www.w3.org/TR/wsdl). Web services may change location or exist in multiple locations for redundancy. The service writer can define the location of the services and the data types involved with those services in another file, in XML format.

JSON

Also worth mentioning is JavaScript Object Notation (JSON) (http://tools.ietf.org/html/rfc4627), which is another popular format passed using HTTP. A significant improvement in the quality of the end user experience during web browsing occurred when web developers realized that they could force browsers to load parts of a web page in at a time. This asynchronous loading enables all kinds of whiz-bang features, such as when you type in a search term and can choose from a set of search term completions before pressing submit. Asynchronous loading delivers more responsive and richer web pages that feel more like traditional applications than a plain old web page. JSON is one of the formats of choice for passing asynchronous requests and responses.

The asynchronous communication normally occurs through HTTP GET or POST, but with a specific content structure that is designed to be human readable and script language parser-friendly. JSON calls have the file extension .json as part of the URL. As mentioned in RFC 4627, an example image object communicated in JSON looks like:

{
  "Image": {
     "Width": 800,
     "Height": 600,
     "Title": "View from 15th Floor",
     "Thumbnail": {
	"Url": "http://www.example.com/image/481989943",
	"Height": 125,
	"Width": "100"
     },
     "IDs": [116, 943, 234, 38793]
  }
}

To confuse the boundaries between client and server, a lot of the presentation and business logic is locked on the client side in scripting languages such as JavaScript. The scripting language orchestrates the loading of parts of pages and the generation of widget sets. Frameworks such as jQuery (http://jquery.com/) and MyFaces (http://myfaces.apache.org/) significantly ease the client-side programming burden.

REST

To understand REST, you need to understand the other verbs in HTTP (http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html). The full HTTP set is OPTIONS, GET, HEAD, POST, PUT, DELETE, and TRACE.

The HEAD verb returns from the server only the headers of the response without the content, and is useful for clients that want to see if the content has changed since the last request. PUT requests that the content in the request be stored at the particular location mentioned in the request. DELETE is for deleting the entity.

REST uses the URL of the request to route to the resource, and the HTTP verb GET is used to get a resource, PUT to update, DELETE to delete, and POST to add a new resource. In general, POST=create an item, PUT=update an item, DELETE=delete an item, and GET=return information on the item.

In SOAP, you are pointing directly towards the service the client calls or indirectly via the web service description. However, in REST, part of the URL describes the resource or resources you wish to work with. For example, a hypothetical address book application that lists all email addresses in HTML format would look similar to the following:

GET /email

To list the addresses in XML format or JSON format:

GET /email.xml
GET /email.json

To get the first email address in the list:

GET /email/1

To create a new email address, of course remembering to add the rest of email details to the end of the GET:

POST /email

And to delete address 5 in the list:

DELETE /email/5

To obtain address 5 in other formats such as JSON or XML, then use file extensions at the end of the URL, for example:

GET /email/5.json
GET /email/5.xml

RESTful services are more intuitively descriptive than SOAP services and they enable easy switching of the format from HTML to JSON to fuel dynamic, asynchronously-loaded web sites. Due to the direct use of HTTP verbs by REST, this methodology also fits well with the most common application type: CRUD (Create, Read, Update, Delete) applications, such as the site or user tools within Sakai.

Now that we have discussed the theory, in the next section, we shall discuss which Sakai-related SOAP services already exist.

Existing web services

Sakai has built in, by default, the most community-requested web services, and there are also a few more services in the contributed section of the source code repository. This section describes the currently available services and the next section explains an example use, creating a new user.

Recapping terminology

In general, developers write web services for other developer's code to connect to (consume). Therefore, terminology can be confusing. In Sakai, a realm is a set of roles and their associated permissions. When you create a site, a copy is made from a specific realm template for that particular site type. The permissions can then be modified for the roles in the site, and members added to the site with one or other of the specific roles. Internally, Sakai uses AuthzGroups to keep track of groups of users. An AuthzGroup is an authorization group (a group of users, each with a role and a set of permissions of functions assigned to each role). A site contains pages; when you click on the tool menu for a given tool, normally, you will see one tool displayed in a page. However, for the home page tool, you will see more tools contained within a page.