Understanding Web Services

Why write this article?

It's been a while now since I wrote my first web service. I wrote it using WCF (Windows Communication Foundation). Honestly, on that day I wouldn't have been able to tell you the difference between WCF and a potato. All I knew was that a web service, once deployed to a server, was a piece of code that could be invoked by other code in existing applications, either on the same machine or on remote machines. This, I believe, is entirely true. However, the why and how remained in the shadows. Over the past year, while looking into RESTful services and how they differ from SOAP, I found a vast number of misconceptions surrounding this area of web programming, and I myself had fallen prey to most of them. So here, we'll talk about everything I've learnt about web services in this time, and I'll throw in acronyms like RPC and HATEOAS just for kicks.

Article Structure

In this article I'll be referring to a number of linked web pages. These together, when read in the order of appearance in this document, should help us form a very concrete understanding of the need for and the evolution of web services. They will also guide us in describing the technical architectures involved and the reasoning behind each programming style. I shall give a brief introduction to the concepts / content involved in each link and will go into any further details only where I deem that explanation of any particular bit is lacking or ambiguous.

At the beginning... Or somewhere in between

As I was hunting down the dawn of the web service era, I found a very interesting article written in Information Week by Jason Levitt way back in September 2001. It is a very simple exposition of the history of web services. The article talks about the efforts made in EDI (Electronic Data Interchange) and the eventual coming together of SOAP web services. So, the need for web services grew out of EDI between different business entities: there was no standard way for systems to communicate over a network. For example, let's say that a vendor owns a piece of code containing business logic that can be applied to the data in a number of client business processes. If each of these business processes is implemented in a different programming language by a different client, the lack of a common standard interface and data format makes it difficult for the vendor to integrate its single implementation with each separate business. The first thing we need in such a scenario is a standard data format that can be sent to the vendor system and that it will understand. This space was filled by XML.

Before XML and before the advent of web services, there were other technologies like DCOM (Distributed Component Object Model - Microsoft) and CORBA (Common Object Request Broker Architecture - Object Management Group). However, they lacked a single standard protocol and programming was relatively complex. As a result, they did not gain momentum within the industry. The rest of the article talks briefly about the beginnings of SOAP and its journey to becoming the framework it is today.

We'll move to another article now, which delves a little deeper into the evolution of web services, XML having been established as a standard. This write-up, and others following it, will be more technical in nature. The most logical implementation of a service invocation would be along the lines of making a procedure call with defined parameters, the most traditional programming paradigm. However, as the service is, by definition, meant to be distributed over a network, the invocation would have to be a remote procedure call (RPC). The first such framework built on XML was XML-RPC. It used HTTP as the transport and XML as the encoding.

You can read the specification here.
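
To make "HTTP as the transport and XML as the encoding" a little more concrete, here is a minimal sketch using Python's standard xmlrpc.client module. It only builds and decodes the XML payload; no HTTP request is sent. The procedure name add and its arguments are made up for illustration.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote procedure add(2, 3).
# This is the XML document that would travel in the body of an HTTP POST.
request_xml = xmlrpc.client.dumps((2, 3), methodname="add")
print(request_xml)

# Decoding the same payload recovers the parameters and the method name,
# which is what a server does before dispatching the call.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

The printed payload is a methodCall document containing the method name and one param element per argument, each wrapped in one of XML-RPC's own type tags.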

The few shortcomings it has are:

Use of only a small, fixed set of data types (scalars plus generic structs and arrays). No user-defined types are allowed as parameters.
XML-RPC does not follow the W3C XML Schema recommendation as it does not use XML namespaces and it defines its own data types.
XML-RPC is “bound to” HTTP whereas some applications might require other transfer protocols such as SMTP.
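
The first limitation can be seen in a small experiment with Python's standard xmlrpc.client. The Invoice class and the submit method name are hypothetical. As far as I can tell, the standard marshaller falls back to sending an object's attributes as a generic struct, so the type name never reaches the wire and the receiver gets a plain dictionary.

```python
import xmlrpc.client


class Invoice:
    """Hypothetical user-defined type, used only for illustration."""

    def __init__(self, number, total):
        self.number = number
        self.total = total


# Marshal a call that passes an Invoice instance as its only parameter.
payload = xmlrpc.client.dumps((Invoice(42, 99.5),), methodname="submit")

# Unmarshal it again, as the receiving side would.
(received,), method = xmlrpc.client.loads(payload)

print(type(received).__name__)   # the receiver sees a generic mapping
print("Invoice" in payload)      # the class name is not part of the message
```

Both sides have to agree out of band that this particular struct means "invoice"; the protocol itself has no way to say so.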

SOAP followed soon after, attempting to overcome the above.

(There is an excellent article here that compares XML-RPC and SOAP in greater detail: XML-RPC vs SOAP)

To SOAP.

The Web Services Description Language (WSDL) provides an XML grammar for describing SOAP Services. Further details:

Understanding WSDL

Keep up with the Web service styles (and uses)

A SOAP message is made up of three main elements: the Envelope, the Header and the Body. The Header is optional; the other two are mandatory.

The Envelope acts as a container encapsulating the other two elements and is also used to define XML namespace information. The Header contains information relevant for processes such as routing and authentication.

Thirdly, the Body contains the actual message content. Another special block that may appear in a message is the Fault block. If present, it appears within the Body.

When a client node sends a SOAP message to the server node, the message may reach the server directly or it may pass through one or more “intermediary nodes” first. It is likely that some, or even all, of the SOAP blocks within the Header target these intermediary nodes. As a SOAP message passes through an intermediary node, the Header may be modified, or removed completely, by the time the message reaches the target node. As a side note, the term “SOAP block” is used by SOAP to describe a block of data seen by the processor of the message as a single computational unit of data. The SOAP blocks within the Header are known as “Header blocks” and, likewise, the blocks within the Body are called “Body blocks”.
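
The structure described above can be sketched with Python's standard xml.etree.ElementTree. This is an illustration only, assuming the SOAP 1.1 envelope namespace; the Token header block, the GetQuote body block and their urn:example namespaces are invented, and strip_header stands in for whatever a real intermediary might do.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
ET.register_namespace("soap", SOAP_NS)


def soap(tag):
    """Return the namespace-qualified name of a SOAP envelope element."""
    return f"{{{SOAP_NS}}}{tag}"


def build_envelope(body_blocks, header_blocks=()):
    """Wrap body blocks (and optional header blocks) in an Envelope."""
    envelope = ET.Element(soap("Envelope"))
    if header_blocks:  # the Header is optional, so only emit it when needed
        header = ET.SubElement(envelope, soap("Header"))
        header.extend(header_blocks)
    body = ET.SubElement(envelope, soap("Body"))  # the Body is mandatory
    body.extend(body_blocks)
    return envelope


def strip_header(envelope):
    """Mimic an intermediary that consumes the Header (modifies in place)."""
    header = envelope.find(soap("Header"))
    if header is not None:
        envelope.remove(header)
    return envelope


# Hypothetical header block and body block.
token = ET.Element("{urn:example:auth}Token")
token.text = "abc123"
quote = ET.Element("{urn:example:quotes}GetQuote")
ET.SubElement(quote, "{urn:example:quotes}Symbol").text = "ACME"

message = build_envelope([quote], [token])
print(ET.tostring(message, encoding="unicode"))

forwarded = strip_header(message)
print(ET.tostring(forwarded, encoding="unicode"))
```

The first print shows an Envelope with both a Header and a Body; the second shows what the target node would receive if an intermediary removed the Header on the way.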

Now, continuing on the present article, we find further concepts involving SOAP, such as RPC style and Document Style (message centric) invocations. However, I find that the article refers to these as programming models for SOAP services. This is incorrect. They are, rather, messaging styles from which programming models may be inferred. I'll link three more articles that provide a clearer explanation of this:

SOAP's Two Messaging Styles

RPC/Literal and Freedom of Choice

SOAP Binding: Difference between Document and RPC Style Web Services

The primary feature of document/literal, and its key benefit compared to RPC/literal, is the use of a schema element declaration to completely describe the contents of soap:Body. This means you can tell what the message body Infoset contains just by looking at the schema and with no need for additional rules. Consequently you could take the schema describing a document/literal message and use it to validate the message. You can't do this with RPC/literal.

In the second article linked above we find two example SOAP messages (one RPC/literal and one document/literal) and their corresponding WSDLs. The WSDL for the RPC style service does not have the element “Example” defined under its types tag. (The types tag here holds the XSD against which a SOAP message can be validated.) So, looking at the SOAP message, we find that the schema (XSD / types) alone does not tell us what the message should contain. As a result, additional rules, which are documented in the SOAP spec, are required to validate this message.

Primarily, in RPC style invocations, method calls and responses are structured within the SOAP body as hierarchical XML elements, or structures. The root-level element name is the method name in the case of the request and an arbitrary value in the case of the response. The structure's child elements are the method's parameters or return values, and each of those elements holds the data value or values it represents.

There are no such constraints for Document Style messaging.
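
The RPC style constraint can be sketched in a few lines of Python. This is only an illustration: the helper rpc_request_body, the method name add and the namespace urn:example:calc are all made up, and a real toolkit would generate this element from the WSDL.

```python
import xml.etree.ElementTree as ET


def rpc_request_body(method_name, params, namespace="urn:example:calc"):
    """Build the single child of soap:Body for an RPC style request.

    The wrapper element is named after the method; each child element is
    named after a parameter and holds that parameter's value.
    """
    wrapper = ET.Element(f"{{{namespace}}}{method_name}")
    for name, value in params.items():
        ET.SubElement(wrapper, name).text = str(value)
    return wrapper


body_block = rpc_request_body("add", {"a": 2, "b": 3})
print(ET.tostring(body_block, encoding="unicode"))
```

In document style there is no such naming rule: the child of the Body is simply whatever element the schema declares, for instance a whole purchase order document.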

SOAP encoding is another concept. There are usually two options: "literal" and "encoded". If you choose "literal", you are saying that the XML Schema constructs your WSDL definitions refer to are concrete specifications of what will appear in your SOAP message bodies. If you choose "encoded", you are saying that the XML Schema constructs your WSDL definitions refer to are abstract specifications of what will appear in your SOAP message bodies; these can be made concrete by applying the rules defined by SOAP encoding. (The WSDL specification allows other encoding schemes as well, but alternatives are rarely if ever used.) Basically, encoding is used together with the SOAP data model. The data model is an abstract representation of data structures such as you might find in Java or C#, and the encoding is a set of rules to map that data model into XML so you can send it in SOAP messages. The SOAP data model and encoding are optional and, as a best practice, are not used. WSDL along with XSD is sufficient these days. As to why data models and encoding were a part of SOAP, I'll quote from another article:

When the first SOAP specification was written, the concepts behind Web services were still in their infancy. People were planning to use SOAP as a way to better integrate distributed object technologies like DCOM, CORBA, and RMI with native Internet technologies such as XML and HTTP. The goal was to build plumbing that produced and consumed XML-based messages instead of the various binary message formats favored by each technology (NDR, CDR, and JRMP, respectively).
In order for the clients and servers in a distributed application to produce and consume messages, they need to know how those messages are supposed to look. Most distributed object systems rely on a combination of compiled proxy/stub/skeleton code and binary representations of metadata (such as COM Type Libraries, CORBA Interface Repositories, or Java .class files) to provide that information. SOAP didn't change this. The authors of the SOAP specification assumed that an application developer would ensure that clients and servers had whatever information they needed to process SOAP messages correctly.
However, the SOAP authors realized that if they were not going to define a common way to describe messages, they should at least provide some guidance for how to map common object-oriented programming constructs to XML. They couldn't use XML Schema (XSD) to solve this problem; it was still far from completion. So they defined a data model based on graphs of untyped structures. Then they wrote the SOAP encoding rules, which explain how to serialize an instance of the SOAP data model to a SOAP message. It was left to SOAP implementers to map their own technologies to the SOAP data model.
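
To give a rough feel for the difference between "literal" and "encoded", here is a simplified Python sketch. It assumes the common convention of tagging each encoded value with an xsi:type attribute; the helper serialize and the element names are invented, and a real encoded message would also declare the xsd prefix and carry a soap:encodingStyle attribute.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)

# Simplified mapping from Python types to XML Schema type names.
XSD_TYPES = {bool: "boolean", int: "int", float: "double", str: "string"}


def serialize(name, value, encoded):
    """Serialize one value as an element, optionally with inline type info."""
    elem = ET.Element(name)
    # XML Schema booleans are written as lowercase true/false.
    elem.text = str(value).lower() if isinstance(value, bool) else str(value)
    if encoded:
        # Encoded: the message itself says how to interpret the value.
        elem.set(f"{{{XSI}}}type", f"xsd:{XSD_TYPES[type(value)]}")
    # Literal: no type hint; the schema referenced by the WSDL is the
    # concrete description of what this element contains.
    return elem


print(ET.tostring(serialize("quantity", 3, encoded=False), encoding="unicode"))
print(ET.tostring(serialize("quantity", 3, encoded=True), encoding="unicode"))
```

With XSD available to describe the message, the inline type hints add nothing, which is one way to see why the encoded form fell out of favour.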

Now that we have an insight into SOAP, let's head back to our original article. What was needed now was a way to discover web services once they had been developed. UDDI (Universal Description, Discovery and Integration), a standards-based registry system, was developed to advertise these SOAP services. How exactly UDDI works can be read here.

With SOAP, WSDL and UDDI in place, the industry had a standard for web services. It became wildly popular and is still very much in use.

In the next article we will take a look at another standard, REST, that is increasingly finding greater favour among developers for building web service APIs. That's when we'll play with HATEOAS. :-)