SOAP

SOAP, formerly known as Simple Object Access Protocol (until the expanded name was dropped in version 1.2), came around shortly after XML-RPC was released. It was created by a group of developers with backing from Microsoft. Interestingly, the creator of XML-RPC, David Winer, was also one of the primary contributors to SOAP. Winer released XML-RPC before SOAP when it became apparent to him that, although SOAP was still some way from being completed, there was an immediate need for some sort of web service protocol.

Like XML-RPC, SOAP is an XML-based web service protocol. SOAP, however, addresses many of the shortcomings of XML-RPC, namely the lack of user-defined data types, limited character set support, and rudimentary security. It is, quite simply, a more powerful and flexible protocol than REST or XML-RPC. Unfortunately, sacrifices come with that power: SOAP is also a much more complex and rigid protocol. For example, even though SOAP can stand alone, it is much more useful when you use another XML-based standard, called Web Services Description Language (WSDL), in conjunction with it. Therefore, in order to be proficient with SOAP, you should also be proficient with WSDL.

The most-levied criticism of SOAP is that it is overly complex. Indeed, SOAP is not simple. It is long and verbose, and you need to know how namespaces work in XML. SOAP can also rely heavily on other standards. This is true for most implementations of SOAP, including Microsoft Live Search, which we will be looking at. The most common external specification used by a SOAP-based service is WSDL, which describes its available services and, in turn, usually relies on XML Schema Definition (XSD) to describe its data types. In order to "know" SOAP, it is extremely useful to have some knowledge of WSDL and XSD. This will allow you to figure out how to use the majority of SOAP services.

We are going to take a "need to know" approach when looking at SOAP. Microsoft Live Search's SOAP API uses WSDL and XSD, so we will look at SOAP with the other two in mind. We will limit our discussion to how to gather the information about a web service that you, as a web service consumer, would need, and how to write SOAP requests against it using PHP 5. Even though this article just introduces you to the core necessities of SOAP, there is a lot of information and detail to cover.

SOAP is very meticulous, and you have to keep track of a fair number of things. Do not be discouraged. Take notes if you have to, and be patient.

All three specifications, SOAP, WSDL, and XSD, are maintained by the W3C and are available for your perusal. The official SOAP specification is located at http://www.w3.org/TR/soap/. The WSDL specification is located at http://www.w3.org/TR/wsdl. Finally, the recommended XSD specification can be found at http://www.w3.org/XML/Schema.

Web Services Description Language (WSDL) With XML Schema Definition (XSD)

Out of all the drawbacks of XML-RPC and REST, there is one that is prominent. Both of these protocols rely heavily on good documentation by the service provider in order to use them. Lacking this, you really do not know what operations are available to you, what parameters you need to pass in order to use them, and what you should expect to get back. Even worse, an XML-RPC or REST service may be poorly or inaccurately documented and give you inaccurate or unexpected results. SOAP addresses this by relying on another XML standard called WSDL to set the rules on which web service methods are available, how parameters should be passed, and what data type might be returned. A service's WSDL document, basically, is an XML version of the documentation. If a SOAP-based service is bound to a WSDL document, and most of them are, requests and responses must adhere to the rules set in the WSDL document, otherwise a fault will occur.

WSDL is an acronym for a technical language. When referring to a specific web service's WSDL document, people commonly refer to the document as "the WSDL" even though that is grammatically incorrect.

Because a WSDL document is XML-based, clients can automatically discover everything about the functionality of the web service. Human-readable documentation is technically not required for a SOAP service that uses a WSDL document, though it is still highly recommended. Let's take a look at the structure of a WSDL document and how we can use it to figure out what is available to us in a SOAP-based web service. Out of all three specifications that we're going to look at in relation to SOAP, WSDL is the most ethereal. Both supporters and detractors often call writing WSDL documents a black art. As we go through this, I will stress the main points and just briefly note other uses or exceptions.

Basic WSDL Structure

Beginning with a root definitions element, WSDL documents follow this basic structure:

    <definitions>
        <types>
        …
        </types>
        <message>
        …
        </message>
        <portType>
        …
        </portType>
        <binding>
        …
        </binding>
    </definitions>

As you can see, in addition to the definitions element, there are four main sections to a WSDL document: types, message, portType, and binding. Let's take a look at these in further detail.
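
As a rough illustration of how a client might walk this structure, the sketch below parses a stripped-down skeleton with Python's standard library and lists its top-level sections. The skeleton and the function name list_sections are assumptions made for illustration. A real WSDL document also carries namespace declarations and content inside each section.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free skeleton mirroring the structure shown above.
WSDL_SKELETON = """
<definitions>
    <types/>
    <message/>
    <portType/>
    <binding/>
</definitions>
"""


def list_sections(wsdl_text):
    """Return the tag names of the root element's direct children, in order."""
    root = ET.fromstring(wsdl_text)
    return [child.tag for child in root]


sections = list_sections(WSDL_SKELETON)
print(sections)
```

With a namespaced document, each tag would come back in ElementTree's {namespace-URI}name form rather than as a bare name.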

Google used to provide a SOAP service for their web search engine. However, this service is now deprecated, and no new developer API keys are given out. This is unfortunate because the service was simple enough to learn SOAP quickly, but complex enough to get a thorough exposure to SOAP. Luckily, the service itself is still working and the WSDL is still available. As we go through WSDL elements, we will look at the Google SOAP Search WSDL and Microsoft Live Search API WSDL documents for examples. These are available at http://api.google.com/GoogleSearch.wsdl and http://soap.search.msn.com/webservices.asmx?wsdl respectively.

definitions Element

This is the root element of a WSDL document. If the WSDL relies on other specifications, their namespace declarations are made here. Let's take a look at the definitions tag in Google's WSDL:

    <definitions name="GoogleSearch"
        targetNamespace="urn:GoogleSearch"
        
        
        
        
        
        >

The more common prefixes you'll run across are xsd for the schema namespace, wsdl for the WSDL framework itself, and soap and soapenc for SOAP bindings. As these namespaces refer to published standards, you will run across them regardless of the web service implementation. Note that some services use an equally common prefix, xs, for XML Schema. tns is another common prefix. It means "this namespace" and is a convention used to refer to the WSDL itself.
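
A prefix is only a local label. What identifies a namespace is the URI the prefix is bound to. The sketch below, which assumes the standard XML Schema namespace URI and two made-up one-line fragments, shows that a parser treats xsd and xs identically when both are bound to that URI.

```python
import xml.etree.ElementTree as ET

# The XML Schema namespace URI; the two fragments below are hypothetical.
XSD_NS = "http://www.w3.org/2001/XMLSchema"

WITH_XSD = '<xsd:element xmlns:xsd="%s" name="a"/>' % XSD_NS
WITH_XS = '<xs:element xmlns:xs="%s" name="a"/>' % XSD_NS


def qualified_tag(xml_text):
    """Return the root tag in ElementTree's {namespace-URI}local-name form."""
    return ET.fromstring(xml_text).tag


print(qualified_tag(WITH_XSD))
print(qualified_tag(WITH_XS))
```

Both calls print the same qualified name, which is why the choice of prefix in a given WSDL is a matter of convention only.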

types Element

In a WSDL document, data types used by requests and responses need to be explicitly declared and defined. The textbook answer that you'll find is that the types element is where this is done. In theory, this is true. In practice, this is mostly true. The types element is used only for special data types.

To achieve platform neutrality, WSDL defaults to, and most implementations use, XSD to describe its data types. In XSD, many basic data types are already included and do not need to be declared.

Common Built-in XSD Data Types

time

date

boolean

string

base64Binary

float

double

integer

byte

For a complete list, see the recommendation on XSD data types at http://www.w3.org/TR/xmlschema-2/.

If the web service utilizes nothing more than these built-in data types, there is no need to have special data types, and thus, types will be empty. So, the data types will just be referred to later, when we define the parameters.

There are three occasions where data types would be defined here:

If you want a special data type that is based on a built-in data type. Most commonly this is a built-in, whose value is restricted in some way. These are known as simple types.

If the data type is an object, it is known as a complex type in XSD, and must be declared.

An array, which can be described as a hybrid of the former two.

Let's take a look at some examples of what we will encounter in the types element.

Simple Type

Sometimes, you need to restrict or refine a value of a built-in data type. For example, in a hospital's patient database, it would be ludicrous to have the length of a field called Age to be more than three digits. To add such a restriction in the SOAP world, you would have to define Age here in the types section as a new type.

Simple types must be based on an existing built-in type. Unlike complex types, they cannot have children or properties. Generally, a simple type is defined with the simpleType element, with the name as an attribute, followed by the restriction or definition. If the simple type is a restriction, the built-in data type that it is based on is given in the base attribute of the restriction element.

For example, a restriction for an age can look like this:

    <xsd:simpleType name="Age">
        <xsd:restriction base="xsd:integer">
            <xsd:totalDigits value="3" />
        </xsd:restriction>
    </xsd:simpleType>
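
To make the effect of this restriction concrete, here is a minimal sketch that reads the totalDigits facet out of the definition and applies a simplified version of the check by hand. The xmlns declaration is added so the fragment parses on its own, and the function names are assumptions. A real SOAP toolkit or schema validator would perform this check for you.

```python
import xml.etree.ElementTree as ET

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# The Age definition from above, with an xmlns declaration added so that
# the fragment can be parsed in isolation.
AGE_TYPE = """
<xsd:simpleType xmlns:xsd="http://www.w3.org/2001/XMLSchema" name="Age">
    <xsd:restriction base="xsd:integer">
        <xsd:totalDigits value="3" />
    </xsd:restriction>
</xsd:simpleType>
"""


def max_digits(simple_type_xml):
    """Read the totalDigits facet value from a simpleType definition."""
    root = ET.fromstring(simple_type_xml)
    facet = root.find("xsd:restriction/xsd:totalDigits", NS)
    return int(facet.get("value"))


def is_valid_age(text, limit):
    """Simplified check: the text must be an integer with at most `limit` digits."""
    try:
        number = int(text)
    except ValueError:
        return False
    return len(str(abs(number))) <= limit


limit = max_digits(AGE_TYPE)
print(is_valid_age("42", limit), is_valid_age("1234", limit))
```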

Child elements of restriction define what is acceptable for the value. Here, totalDigits caps the number of digits the value may contain. A table of common restrictions follows:

| Restriction | Use | Applicable In |
|---|---|---|
| enumeration | Specifies a list of acceptable values. | All except boolean |
| fractionDigits | Defines the maximum number of decimal places allowed. | Decimal and types derived from it |
| length | Defines the exact number of characters or list items allowed. | Strings and all binaries |
| maxExclusive / maxInclusive | Defines the maximum value allowed. If Exclusive is used, the value cannot be equal to the definition. If Inclusive, it can be equal to, but not greater than, the definition. | All numeric and dates |
| minLength / maxLength | Defines the minimum and maximum number of characters or list items allowed. | Strings and all binaries |
| minExclusive / minInclusive | Defines the minimum value allowed. If Exclusive is used, the value cannot be equal to the definition. If Inclusive, it can be equal to, but not less than, the definition. | All numeric and dates |
| pattern | A regular expression defining the allowed values. | All |
| totalDigits | Defines the maximum number of digits allowed. | Decimal and types derived from it, including integers |
| whiteSpace | Defines how tabs, spaces, and line breaks are handled. Can be preserve (no changes), replace (tabs and line breaks are converted to spaces), or collapse (multiple spaces, tabs, and line breaks are converted to one space). | Strings and all binaries |
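
Of these, whiteSpace is the facet whose behavior is easiest to misread, so here is a small sketch of the three modes. The function name apply_whitespace is an assumption. Note that, per the XSD rules, collapse also removes leading and trailing spaces.

```python
import re


def apply_whitespace(value, mode):
    """Normalize a string according to a whiteSpace mode."""
    if mode == "preserve":
        return value
    # replace: each tab, line feed, and carriage return becomes one space
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # collapse: runs of spaces shrink to one; the ends are trimmed
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError("unknown whiteSpace mode: " + mode)


sample = "a\t b\n"
print(repr(apply_whitespace(sample, "replace")))
print(repr(apply_whitespace(sample, "collapse")))
```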

A practical example of a restriction can be found in the MSN Search Web Service WSDL. Look at the section that defines SafeSearchOptions.

    <xsd:simpleType name="SafeSearchOptions">
        <xsd:restriction base="xsd:string">
            <xsd:enumeration value="Moderate" />
            <xsd:enumeration value="Strict" />
            <xsd:enumeration value="Off" />
        </xsd:restriction>
    </xsd:simpleType>

In this example, the SafeSearchOptions data type is based on the string data type. Unlike a regular string, however, the value that SafeSearchOptions takes is limited by the restriction element, in this case by the enumeration elements it contains. SafeSearchOptions can only be one of the values in this enumeration list. That is, it can only have a value of "Moderate", "Strict", or "Off".
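
A client can recover the allowed values straight from the definition. The following sketch parses the fragment above (with an xmlns declaration added so that it stands alone) and collects the enumeration values. The function name allowed_values is an assumption.

```python
import xml.etree.ElementTree as ET

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# The SafeSearchOptions definition, with an xmlns declaration added so the
# fragment can be parsed in isolation.
SAFE_SEARCH_TYPE = """
<xsd:simpleType xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                name="SafeSearchOptions">
    <xsd:restriction base="xsd:string">
        <xsd:enumeration value="Moderate" />
        <xsd:enumeration value="Strict" />
        <xsd:enumeration value="Off" />
    </xsd:restriction>
</xsd:simpleType>
"""


def allowed_values(simple_type_xml):
    """Collect the value attribute of every enumeration facet, in order."""
    root = ET.fromstring(simple_type_xml)
    facets = root.findall("xsd:restriction/xsd:enumeration", NS)
    return [facet.get("value") for facet in facets]


ALLOWED = allowed_values(SAFE_SEARCH_TYPE)
print(ALLOWED)
print("Strict" in ALLOWED, "strict" in ALLOWED)
```

Enumeration values are compared exactly, so "strict" in lowercase is not an acceptable value here.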

Restrictions are not the only reason to use a simple type. Two other elements can take the place of restriction. The first is a list. If a type is a list, the value passed to it is a list of space-separated values. A list is defined with the list element, whose itemType attribute defines the allowed data type of each item. For example, the following defines a type named listOfValues whose items must all be integers.

    <xsd:simpleType name="listOfValues">
        <xsd:list itemType="xsd:integer" />
    </xsd:simpleType>
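
On the consuming side, such a value arrives as a single string. A minimal sketch of turning it into native integers, using a made-up sample value and the hypothetical function name parse_integer_list, might look like this:

```python
def parse_integer_list(text):
    """Split a whitespace-separated list value and convert each item to int.

    Raises ValueError if any item is not an integer.
    """
    return [int(item) for item in text.split()]


values = parse_integer_list("3 14 15 92")
print(values)
```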

The second is a union. A union combines two or more simple types, and a value is valid if it conforms to any one of the member types. This lets a single element accept more than one kind of value. Back to our age example, suppose our service was for a hospital's pediatrics ward that admits only those under 18 years old, and that the age is sometimes not known. Because a value only has to satisfy one member of a union, both bounds of the age range belong in the same restriction. The second member, added here purely for illustration, also allows the literal value unknown.

    <xsd:simpleType name="Age">
        <xsd:union>
            <xsd:simpleType>
                <xsd:restriction base="xsd:decimal">
                    <xsd:minInclusive value="0" />
                    <xsd:maxExclusive value="18" />
                </xsd:restriction>
            </xsd:simpleType>
            <xsd:simpleType>
                <xsd:restriction base="xsd:string">
                    <xsd:enumeration value="unknown" />
                </xsd:restriction>
            </xsd:simpleType>
        </xsd:union>
    </xsd:simpleType>
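
In XSD, a value is valid against a union when at least one of the union's member types accepts it. The sketch below models that rule with two hypothetical member checks, a decimal from 0 up to but not including 18 and the literal unknown. All names in it are assumptions chosen for illustration.

```python
from decimal import Decimal, InvalidOperation


def is_child_age(text):
    """Hypothetical member 1: a decimal from 0 up to, but not including, 18."""
    try:
        return Decimal(0) <= Decimal(text) < Decimal(18)
    except InvalidOperation:
        # Not a decimal at all (or not comparable), so this member rejects it.
        return False


def is_unknown(text):
    """Hypothetical member 2: the single literal 'unknown'."""
    return text == "unknown"


def matches_union(text, members):
    """A union accepts a value when at least one member type accepts it."""
    return any(member(text) for member in members)


AGE_MEMBERS = [is_child_age, is_unknown]
for candidate in ["7", "17.5", "unknown", "18", "adult"]:
    print(candidate, matches_union(candidate, AGE_MEMBERS))
```

Because one matching member is enough, constraints that must hold together, such as a lower and an upper bound, need to sit inside the same member type.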

Finally, it is important to note that while simple types are, especially in the case of WSDLs, used mainly in the definition of elements, they can be used anywhere that requires the definition of a value. For example, you may sometimes see an attribute being defined and a simple type structure being used to restrict its value.

Complex Type

Generically, a complex type is anything that can have multiple elements or attributes. This is opposed to a simple type, which holds a single value and has no child elements or attributes. A complex type is represented by the element complexType in the WSDL. The most common use for complex types is as a carrier for objects in SOAP transactions. In other words, to pass an object to a SOAP service, it needs to be serialized into an XSD complex type in the message.

The purpose of a complexType element is to explicitly define what other data types make up the complex type. Let's take a look at a piece of Google's WSDL for an example:

    <xsd:complexType name="ResultElement">
        <xsd:all>
            <xsd:element name="summary" type="xsd:string"/>
            <xsd:element name="URL" type="xsd:string"/>
            <xsd:element name="snippet" type="xsd:string"/>
            <xsd:element name="title" type="xsd:string"/>
            <xsd:element name="cachedSize" type="xsd:string"/>
            <xsd:element name=
                        "relatedInformationPresent" type="xsd:boolean"/>
            <xsd:element name="hostName" type="xsd:string"/>
            <xsd:element name=
                        "directoryCategory" type="typens:DirectoryCategory"/>
            <xsd:element name="directoryTitle" type="xsd:string"/>
        </xsd:all>
    </xsd:complexType>

The first thing to notice is how the xsd: prefix is used throughout types. This denotes that these elements and attributes are part of the XSD specification.

In this example, a data type called ResultElement is defined. We don't know exactly what it is used for right now, but we know that it exists. Each element tag denotes the complex type's equivalent of an object property. The first property is summary, and its type attribute tells us that it is a string, as are most properties of ResultElement. One exception is relatedInformationPresent, which is a Boolean. Another exception is directoryCategory, which has a data type of DirectoryCategory. The prefix used in its type attribute is typens, which tells us that it is not an XSD data type. To find out what it is, we'll have to look for the namespace declaration that declared typens.
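
The same reading can be done programmatically. The sketch below parses a shortened copy of the ResultElement definition, maps each property to the prefix and name of its type, and picks out the properties whose types are not built into XSD. The xmlns declaration is added so the fragment parses on its own, and the function name describe_properties is an assumption.

```python
import xml.etree.ElementTree as ET

NS = {"xsd": "http://www.w3.org/2001/XMLSchema"}

# Shortened copy of the ResultElement definition. The xmlns declaration is
# added here so that the fragment can be parsed in isolation.
RESULT_ELEMENT = """
<xsd:complexType xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 name="ResultElement">
    <xsd:all>
        <xsd:element name="summary" type="xsd:string"/>
        <xsd:element name="relatedInformationPresent" type="xsd:boolean"/>
        <xsd:element name="directoryCategory" type="typens:DirectoryCategory"/>
    </xsd:all>
</xsd:complexType>
"""


def describe_properties(complex_type_xml):
    """Map each property name to a (prefix, type name) pair."""
    root = ET.fromstring(complex_type_xml)
    properties = {}
    for element in root.findall("xsd:all/xsd:element", NS):
        prefix, _, local_name = element.get("type").partition(":")
        properties[element.get("name")] = (prefix, local_name)
    return properties


props = describe_properties(RESULT_ELEMENT)
# Properties whose type prefix is not xsd refer to types defined elsewhere.
custom = [name for name, (prefix, _) in props.items() if prefix != "xsd"]
print(props)
print(custom)
```

Comparing prefixes as plain strings is a shortcut that works for this fragment. A more careful client would resolve each prefix to its namespace URI first, since a different document could bind another prefix to XML Schema.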