Documenting web applications

Documenting an entire web application can be surprisingly tricky because of the many different layers involved. Some web application frameworks support automatic documentation generation better than others. It's preferable to have fewer disparate parts. For example, Lisp, Smalltalk, and some Ruby frameworks are little more than internal DSLs that can be trivially redefined to produce documentation from the actual application code.

In general, Java frameworks are more difficult to limit to a single layer. Instead, we are confronted with HTML, JSP, JavaScript, Java, the framework itself, its configuration methodologies (XML, annotations, scripting languages, etc.), the service layers, business logic, persistence layers, and so on? Complete documentation generally means aggregating information from many disparate sources and presenting them in a way that is meaningful to the intended audience.

High-level overviews

The site map is obviously a reasonable overview of a web application. A site map may look like a simple hierarchy chart, showing a simple view of a site's pages without showing all of the possible links between pages, how a page is implemented, and so on.

documenting-our-application-apache-struts-2-part-2-img-0

This diagram was created by hand and shows only the basic outline of the application flow. It represents minor maintenance overhead since it would need to be updated when there are any changes to the application.

Documenting JSPs

There doesn't seem to be any general-purpose JSP documentation methodology. It's relatively trivial to create comments inside a JSP page using JSP comments or a regular Javadoc comment inside a scriptlet. Pulling these comments out is then a matter of some simple parsing. This may be done by using one of our favorite tools, regular expressions, or using more HTML-specific parsing and subsequent massaging.

Where it gets tricky is when we want to start generating documentation that includes elements such as JSP pages, which may be included using many different mechanisms—static includes, <jsp:include.../> tags, Tiles, SiteMesh, inserted via Ajax, and so on. Similarly, generating connections between pages is fraught with custom cases. We might use general-purpose HTML links, Struts 2 link tags, attach a link to a page element with JavaScript, ad infinitum/nauseum.

When we throw in the (perhaps perverse) ability to generate HTML using Java, we have a situation where creating a perfectly general-purpose tool is a major undertaking. However, we can fairly easily create a reasonable set of documentation that is specific to our framework by parsing configuration files (or scanning a classpath for annotations), understanding how we're linking the server-side to our presentation views, and performing (at least limited) HTML/JSP parsing to pull out presentation-side dependencies, links, and anything that we want documented.

Documenting JavaScript

If only there was a tool such as Javadoc for JavaScript. The JsDoc Toolkit provides Javadoc-like functionality for JavaScript, with additional features to help handle the dynamic nature of JavaScript code. Because of the dynamic nature, we (as developers) must remain diligent in both in how we write our JavaScript and how we document it.

Fortunately, the JsDoc Toolkit is good at recognizing current JavaScript programming paradigms (within reason), and when it can't, provides Javadoc-like tags we can use to give it hints.

For example, consider our JavaScript Recipe module where we create several private functions intended for use only by the module, and return a map of functions for use on the webpage. The returned map itself contains a map of validation functions. Ideally, we'd like to be able to document all of the different components.

Because of the dynamic nature of JavaScript, it's more difficult for tools to figure out the context things should belong to. Java is much simpler in this regard (which is both a blessing and a curse), so we need to give JsDoc hints to help it understand our code's layout and purpose.

A high-level flyby of the Recipe module shows a layout similar to the following:

var Recipe = function () {
    var ingredientLabel;
    var ingredientCount;
    // ...
    function trim(s) {
        return s.replace(/^s+|s+$/g, "");
    }
    function msgParams(msg, params) {
        // ...
    }
    return {
        loadMessages: function (msgMap) {
            // ...
        },
        prepare: function (label, count) {
            // ...
        },
        pageValidators: {
            validateIngredientNameRequired: function (form) {
                // ...
            },
            // ...
        }
    };
}();

We see several documentable elements: the Recipe module itself, private variables, private functions, and the return map which contains both functions and a map of validation functions. JsDoc accepts a number of Javadoc-like document annotations that allow us to control how it decides to document the JavaScript elements.

The JavaScript module pattern, exemplified by an immediately-executed function, is understood by JsDoc through the use of the @namespace annotation.

/**
* @namespace
* Recipe module.
*/
var Recipe = function () {
// ...
}();

We can mark private functions with the @private annotation as shown next:

/**
* @private
* Trims leading/trailing space.
*/
function trim(s) {
return s.replace(/^s+|s+$/g, "");
}

It gets interesting when we look at the map returned by the Recipe module:

return /** @lends Recipe */ {
/**
* Loads message map.
*
* <p>
* This is generally used to pass in text resources
* retrieved via <s:text.../> or <s:property
* value="getText(...)"/> tags on a JSP page in lieu
* of a normalized way for JS to get Java I18N resources
* </p>
*/
loadMessages: function (msgMap) {
_msgMap = msgMap;
},
// ...

The @lends annotation indicates that the functions returned by the Recipe module belong to the Recipe module. Without the @lends annotation, JsDoc doesn't know how to interpret the JavaScript in the way we probably intend the JavaScript to be used, so we provide a little prodding.

The loadMessages() function itself is documented as we would document a Java method, including the use of embedded HTML.

The other interesting bit is the map of validation functions. Once again, we apply the @namespace annotation, creating a separate set of documentation for the validation functions, as they're used by our validation template hack and not directly by our page code.

/**
* @namespace
* Client-side page validators used by our template hack.
* ...
*/
pageValidators: {
/**
* Insures each ingredient with a quantity
* also has a name.
*
* @param {Form object} form
* @type boolean
*/
validateIngredientNameRequired: function (form) {
// ...

Note also that we can annotate the type of our JavaScript parameters inside curly brackets. Obviously, JavaScript doesn't have typed parameters. We need to tell it what the function is expecting. The @type annotation is used to document what the function is expected to return. It gets a little trickier if the function returns different types based on arbitrary criteria. However, we never do that because it's hard to maintain.

JsDoc has the typical plethora of command-line options, and requires the specification of the application itself (written in JavaScript, and run using Rhino) and the templates defining the output format. An alias to run JsDoc might look like the following, assuming the JsDoc installation is being pointed at by the ${JSDOC} shell variable:

alias jsdoc='java -jar ${JSDOC}/jsrun.jar
${JSDOC}/app/run.js -t=${JSDOC}/templates/jsdoc'

The command line to document our Recipe module (including private functions using the -p options) and to write the output to the jsdoc-out folder, will now look like the following:

jsdoc -p -d=jsdoc-out recipe.js

The homepage looks similar to a typical JavaDoc page, but more JavaScript-like:

documenting-our-application-apache-struts-2-part-2-img-1

A portion of the Recipe module's validators, marked by a @namespace annotation inside the @lends annotation of the return map, looks like the one shown in the next image (the left-side navigation has been removed):

documenting-our-application-apache-struts-2-part-2-img-2

We can get a pretty decent and accurate JavaScript documentation using JsDoc, with only a minimal amount of prodding to help with the dynamic aspects of JavaScript, which is difficult to figure out automatically.

Documenting interaction

Documenting interaction can be surprisingly complicated, particularly in today's highly-interactive Web 2.0 applications. There are many different levels of interactivity taking place, and the implementation may live in several different layers, from the JavaScript browser to HTML generated deep within a server-side framework.

UML sequence diagrams may be able to capture much of that interactivity, but fall somewhat short when there are activities happening in parallel. AJAX, in particular, ends up being a largely concurrent activity. We might send the AJAX request, and then do various things on the browser in anticipation of the result.

More UML and the power of scribbling

The UML activity diagram is able to capture this kind of interactivity reasonably well, as it allows a single process to be split into multiple streams and then joined up again later. As we look at a simple activity diagram, we'll also take a quick look at scribbling, paper, whiteboards, and the humble digital camera.

Don't spend so much time making pretty pictures!

One of the hallmarks of lightweight, agile development is that we don't spend all of our time creating the World's Most Perfect Diagram™. Instead, we create just enough documentation to get our points across. One result of this is that we might not use a $1,000 diagramming package to create all of our diagrams. Believe it or not, sometimes just taking a picture of a sketched diagram from paper or a whiteboard is more than adequate to convey our intent, and is usually much quicker than creating a perfectly-rendered software-driven diagram.

documenting-our-application-apache-struts-2-part-2-img-3

Yes, the image above is a digital camera picture of a piece of notebook paper with a rough activity diagram. The black bars here are used to indicate a small section of parallel functionality, a server-side search and some activity on the browser. The browser programming is informally indicated by the black triangles. In this case, it might not even be worth sketching out. However, for moderately more complicated usage cases, particularly when there is a lot of both server- and client-side activity, a high-level overview is often worth the minimal effort.

The same digital camera technique is also very helpful in meetings where various documentation might be captured on a whiteboard. The resulting images can be posted to a company wiki, used in informal specifications, and so on.