Part 2: Google Maps
Section 1: Introduction to Part 2
When approaching any sufficiently high-level computing topic, it is often true that we need to be familiar with a broad range of underlying and related concepts to establish the foundation necessary to learn the new material. Virtually every technology not only borrows from, but also builds on others. Furthermore, it is often the case with programming projects that the bulk of the work comes at the beginning and involves carefully thinking through the problem, devising a solution, and building those parts of the project that we need to write before we can get to the code we want to write.
In the first part of the tutorial we wrote a Perl script which allows us to generate valid KML files from collections of geotagged photos. Using our new tool, we generated a number of files containing all of the data (about our photos) that we’ll use to populate our Google Maps. We also looked at the structure of KML, and covered a number of other important topics, which are key to understanding the material presented here. If you have not read part 1, do take the time at least to glance through it to confirm for yourself that you won’t have trouble following along with this discussion.
Now we turn our attention to Google Maps.
After this short introduction we’ll work through the rest of the tutorial in parts.
Part 1: This section discusses the Document Object Model (DOM), which is a formal methodology for negotiating XML files, like the KML files we created in part 1.
Part 2: Next we’ll look at XHTML and the XHTML file we’ll use to display our Google Map. As you’ll see, the file included with the project is no more than a shell, but it does suffice to allow us to present our map. Furthermore, it affords us an opportunity to talk about integrating XHTML, JavaScript, and the Google Maps API. There are a few important things we’ll need to understand.
Working with a simple page will help us avoid complicating the issues unnecessarily and will make it easy for you to relocate the map to a page of your own (maybe as part of a weblog for example) with a minimum amount of fuss.
Before we get to all of that, there are a couple of preliminary topics to talk about:
We’ve already looked at XML, describing it as an open, extensible markup language. We’ve said that XML is a metalanguage which allows us to define other languages with specific syntactic structures and vocabularies tailored to discrete problem domains. Furthermore we’ve said that KML is one such XML-based language, engineered to address the problem of describing geographical data and the positioning of geographical placemarks and other features to display on a map, among other applications.
We haven’t yet talked about XHTML which is a reimplementation of HTML as a standard, XML-based format. HTML (Hypertext Markup Language) has served as the preeminent language for authoring documents on the World Wide Web since the web’s inception in the late 1980s and early 1990s.
HTML is the markup language which makes hypertext, linking between documents on the web, possible. It defines the set of tags, i.e. elements and attributes, that indicate the structure of a document using plain-text codes included alongside the content itself. Just as the web was not the first hypertext system, HTML was not the first markup language, though both have become extremely important.
Without delving too deeply into the primordial history of markup, HTML is a descendant of SGML (Standard Generalized Markup Language) which is itself an offspring of GML (Generalized Markup Language).
The Wikipedia entry for GML offers this description of the language:
GML frees document creators from specific document formatting concerns such as font specification, line spacing, and page layout required by Script.
SCRIPT was an early text formatting language developed by IBM, and a precursor to modern document formatting and page description languages like PostScript and LaTeX, which describe the structure and appearance of documents. Not surprisingly, the goal of GML is echoed in the intent of HTML, though the two are far removed from each other.
Berners-Lee considered HTML to be an application of SGML from the very beginning, but with a clear emphasis on simplicity, winnowing down much of the overhead that had always characterized formal markup languages. In the early days, the web had to struggle for acceptance. The birth of the web is a story of humble beginnings. The significance of the web was by no means a foregone conclusion, and early adoption required that the markup syntax of the web be as accessible as possible. After all, more stodgy languages already existed.
The simplicity of HTML certainly contributed to the success of the web (among many other factors), but it also meant that the standard was lacking in key areas. As the web gained wider appeal there was a tendency to take advantage of the early permissiveness of the standard. Developers of mainstream web browsers exacerbated the problem significantly by tolerating noncompliant documents and encouraging the use of proprietary tags in defiance of efforts on the part of standards organizations to rein in the syntax of the language.
In the short term, this was a confusing situation for everyone and led to incompatibilities and erratic behavior among an ever increasing multitude of documents on the web and the various web browsers available, which differed in their support for, and interpretation of, what were in essence emerging ‘dialects’ of HTML. These web browsers differed not only from one another, but also frequently from one version to the next of the same application.
There was a real need to establish stability in the language. Lack of dependable syntax meant that the job of building a parser capable of adequately adhering to the standards, and at the same time accommodating various abuses of the standard, had become an exercise in futility. Unlike HTML, XHTML must strictly adhere to the XML standard, which means that any compliant XML parser can negotiate properly formatted documents with relative ease. This reliability paves the way for a more efficient and sophisticated web, resulting in not only more consistent rendering of content, but the development of software that is able to differentiate and act on content based on context, and relationships among data. This is one of the primary advantages of XML, as has already been discussed, and this is a key advantage of XHTML as well.
Long term, Berners-Lee, who currently serves as the director of the World Wide Web Consortium (W3C) and as a senior researcher at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), continues to actively pursue the vision of a Semantic Web. That is, a web that can be mined as a rich repository of data by intelligent software agents, which not only present information, perform basic pattern matching, and the like, but are capable of analysing information in a deeper sense.
From the W3C’s Semantic Web Roadmap we have this brief introduction to the concept:
The Web was designed as an information space, with the goal that it should be useful not only for human-human communication, but also that machines would be able to participate and help. One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the web. Leaving aside the artificial intelligence problem of training machines to behave like people, the Semantic Web approach instead develops languages for expressing information in a machine processable form.
Whether this vision accurately predicts the future of the web remains to be seen. At the moment the Semantic Web is a veritable stew of protocols, frameworks, concepts, extensions, namespaces, vocabularies, schema, ontologies, and other components. It is the subject of much debate, virtually all of it far beyond the scope of this tutorial. But this ‘brave new web’ is not purely academic.
The increased emphasis on standards-compliant markup has resulted in developers of web browsers and content creation tools steering their applications toward the standards. This in turn motivates authors to produce compliant documents. In fact the Strict variants of XHTML do not allow for deviation from the standards. Two immediate benefits of this work, whether or not it ultimately leads to some future web, are (1) more consistent document rendering across platforms among mainstream web browsers, and (2) the emergence of the Document Object Model (DOM). We’ll look at the DOM shortly.
Section 2: Object Models and the DOM
An object model is a collection of objects and the typically hierarchical, often complex, relationships among them, where the most generic definition of an object is simply a ‘thing’. Object models are all around us and are not limited to programming or the broader topic of computing for that matter. If, for example, you were to describe a car in terms of its object model:
You might start by talking about the chassis, or base frame, and then go on to describe the body of the car, its wheels, axles, exhaust system, the drivetrain, etc.; everything that is directly attached to the frame. For each of these you could describe various attributes and elements of that object; e.g., the body material is an attribute of the body object. The elements may in fact be other objects that collectively comprise the objects within which they are contained.
The body of the car contains the engine compartment, passenger compartment, and trunk elements. The engine compartment includes the engine which is itself a collection of smaller objects. The passenger compartment can be described as the collection of front and rear compartment objects, where the front compartment object includes at least a driver side compartment with a steering wheel, instrument panel, etc. These objects have their own attributes and elements perhaps consisting of other objects each of which is an object itself, with a more specific function than the objects in which they are contained.
If you’ve ever put together a toy model of a car you may remember seeing an exploded schematic diagram of the completed model which clearly shows all of the objects from the largest container elements to the smallest ties, screws and other fasteners that hold all of the pieces together. This is a nice way to visualize the object model of the toy car.
Of course the object model I have described is only an informal example. It may be fair to dispute the model I’ve sketched here. As long as you have a better general understanding of what an object model is then this example has served its purpose.
Object models are very common within the realms of computer technology, information technology, programming languages, and formal notation.
The Document Object Model
The Document Object Model (DOM) is a standard object model for describing, accessing and manipulating HTML documents and XML-based formats. The often quoted description of the DOM from the W3C’s site dedicated to the specification is:
The Document Object Model is a platform- and language-neutral interface that will allow programs and scripts to dynamically access and update the content, structure and style of documents. The document can be further processed and the results of that processing can be incorporated back into the presented page.
Remember that XHTML is a special case of XML (i.e. it is an XML-based format) and essentially no more than a formalization of HTML. Because it is an XML-based format the XML DOM applies. Furthermore, because XHTML describes a single format with a number of required structural elements and only a limited collection of allowable elements and attributes, the HTML DOM has a number of unique objects and methods for acting on those objects that are not available in the more generic XML DOM. Having said that, I want to emphasize that the two are more alike than they are different.
Technically we will look at the XML DOM here but nearly everything discussed is applicable to the HTML DOM as well. Though there are differences, some of which we’ll encounter later in this tutorial, as an introduction it is appropriate to limit ourselves to fundamental concepts, which the two share in common. We need both. We’ll rely on the XML DOM to parse the KML files we generated in part one of the tutorial to populate our Google Map, and the HTML DOM to add the map to our webpage. By the time we’ve completed this part of the tutorial hopefully you will appreciate just how integral the DOM is for modern web design and development, though we’ll only have skimmed the surface of it.
The concept of the DOM is actually quite easily understood. It will seem intuitive to anyone who has ever dealt with tree structures (from filesystem hierarchies to family trees). Even if you haven’t had any experience with this sort of data structure, you should anticipate being able to pick it up quickly.
Under the Document Object Model individual components of the structure are referred to as nodes. Elements, attributes and the text contained within elements are all nodes.
The DOM represents an XML document as an inverted tree with a root node at the top. As far as the structure of an actual XML document is concerned, the root is the element that contains all others; every other node in the tree is contained within it. In the KML files we’ve generated, the Document element is the root element.
Every other node is a descendant of Document. We can express the reciprocal relationship by stating that Document is an ancestor of every element other than itself.
Relationships among nodes of the document are described in familial terms. Direct descendants are called child nodes or simply children of their immediate ancestor, referred to as a parent.
In our KML files, for example, each Folder element is a child of the Document element (its parent), and each Placemark is a child of its enclosing Folder.
We can make a couple of other useful observations about this structure:
- Every node other than the root has exactly one parent.
- Parents may have any number of children including zero, though a node without any children won’t be referred to as a parent. (A node with no children is called a leaf.)
Implicitly there are other familial relationships among nodes. For example, elements with parents that are siblings could be thought of as ‘cousins’ I suppose, but it is unusual to see these relationships named or otherwise acknowledged.
There is one important subtlety. Text is always stored in a text node and never directly in some other element node. For example, the description elements in our KML files contain either plain text or html descriptions of associated Placemarks. This text is not contained directly in the description node. Instead the description node contains an unseen text node which holds the descriptive text. So the text is a grandchild of the description node, and a child of a text node, which is the direct descendant of description. Make sure that you understand this before continuing.
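We can check this text-node rule in code. The DOM is a language-neutral interface, so although our map will use JavaScript, Python’s xml.dom.minidom implements the same properties and is handy for experimenting outside the browser. (The one-line KML sample here is made up for illustration.)

```python
from xml.dom.minidom import parseString

# A minimal, hypothetical description element from one of our KML files.
doc = parseString('<Placemark><description>A geotagged photo</description></Placemark>')
desc = doc.getElementsByTagName('description')[0]

child = desc.firstChild      # not the text itself, but an unseen text node
print(child.nodeName)        # '#text'
print(child.nodeValue)       # 'A geotagged photo' -- the actual text
```

Note that the text is reached through the text node’s nodeValue, never directly from the description element.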
Because of the inherent structure of XML, we can unambiguously navigate a document without knowing anything else about it, except that it validates. We can move around the tree without being able to name the nodes before we begin. Starting at the root document node, we can traverse that node’s children, move laterally among siblings, travel more deeply from parent to child, and then work our way back up the tree negotiating parent relationships. We haven’t yet described how we move among siblings.
The DOM allows us to treat siblings as a list of nodes, and take advantage of the relationships that exist among elements in any list. Specifically, we can refer to the first (firstChild) and last (lastChild) nodes to position ourselves in the list. Once we are at some location in the list, we can refer to the previous (previousSibling) and next (nextSibling) nodes to navigate among siblings. Programmatically we can use the DOM to treat siblings in an XML data structure as we would any other list. For example, we can loop through sibling nodes working on each node in turn.
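Such a loop is short in practice: start at a parent’s firstChild and follow nextSibling until there are no more siblings. The sketch below uses Python’s xml.dom.minidom (the same DOM interface in a different language) with an invented Document fragment; the element names mirror our KML files but the values are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical slice of a KML Document with three children.
kml = ('<Document>'
       '<name>My Photos</name>'
       '<Folder><name>Trip</name></Folder>'
       '<Folder><name>Home</name></Folder>'
       '</Document>')
root = parseString(kml).documentElement

# Walk the root's children like a list: firstChild, then nextSibling
# repeatedly, until nextSibling returns None.
node = root.firstChild
names = []
while node is not None:
    names.append(node.nodeName)
    node = node.nextSibling
print(names)   # ['name', 'Folder', 'Folder']
```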
Keep in mind that we are using generic terminology, not referring to specific node names and we are relying only on the structure of XML which we know must be dependable if the document adheres to the standard. This will work well for our KML files, and it is certainly not limited to KML.
There are primarily two techniques we can use to find and manipulate elements in our XML structure using the DOM.
- Node Properties
Firstly, we can take advantage of relationships among element nodes as we have been discussing.
A number of node properties, some of which have already been mentioned, allow us to move between nodes in the structure. These include parentNode, firstChild, lastChild, previousSibling, and nextSibling.
If we look at a fragment of KML, similar to the KML files generated in part 1 of the tutorial, starting at the Placemark element…

  <Placemark>
    <name>…</name>
    <Snippet maxLines="1">…</Snippet>
    <description>…</description>
    <Point>
      <coordinates>…</coordinates>
    </Point>
  </Placemark>

…we see a number of these properties:
- <name> is the firstChild of <Placemark>
- <Point> is the lastChild of <Placemark>
- The nextSibling of <name> is <Snippet>
- The previousSibling of <description> is <Snippet>
- <Placemark> is the parentNode of <name>, <Snippet>, <description>, and <Point>
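These relationships can be verified directly. The DOM interface is language-neutral, so the sketch below uses Python’s xml.dom.minidom with a compact, made-up stand-in for one of our Placemarks (the element names follow the fragment; the values are invented).

```python
from xml.dom.minidom import parseString

# Hypothetical Placemark, written without whitespace between tags so
# that firstChild is the name element rather than a whitespace text node.
kml = ('<Placemark>'
       '<name>Photo 1</name>'
       '<Snippet maxLines="1">snippet</Snippet>'
       '<description>a photo</description>'
       '<Point><coordinates>-122.4,37.8,0</coordinates></Point>'
       '</Placemark>')
pm = parseString(kml).documentElement
desc = pm.getElementsByTagName('description')[0]

print(pm.firstChild.nodeName)              # 'name'    (firstChild of Placemark)
print(pm.lastChild.nodeName)               # 'Point'   (lastChild of Placemark)
print(pm.firstChild.nextSibling.nodeName)  # 'Snippet' (nextSibling of name)
print(desc.previousSibling.nodeName)       # 'Snippet' (previousSibling of description)
print(desc.parentNode.nodeName)            # 'Placemark'
```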
Secondly, we can use the method getElementsByTagName() to find any element regardless of the document structure. The method retrieves, as a nodeList, all matching elements that are descendants of the element on which we call it.

For example, the statement…

var list_of_nodes = getElementsByTagName("name");

…retrieves all <name> elements in the document and stores that list in the variable list_of_nodes.
What is a node list?
A node list (NodeList) is an object representing an ordered list of nodes, where each node represents an element in the XML document. The order of the NodeList is the same as the order of the elements in the document. Elements at the top of the document appear first in the list, and the first element returned, i.e. the first position in the list, is numbered 0.
Keep in mind that the list includes all <name> elements. If you look at the KML files we’ve generated you may notice that both <Folder> and <Placemark> elements contain <name>. We need to be aware that getElementsByTagName("name") will return all of these if we are starting at the document root. We can differentiate between these <name> elements in a number of different ways. For example we can insist that the node is a child of a Placemark node to exclude <name> elements that are the children of <Folder> elements.
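That filter amounts to one test on each node’s parentNode. Here is a sketch using Python’s xml.dom.minidom (the same language-neutral DOM interface) with an invented Folder containing two Placemarks:

```python
from xml.dom.minidom import parseString

# Hypothetical slice of one of our KML files: the Folder and both
# of its Placemarks each contain a <name> element.
kml = ('<Document><Folder>'
       '<name>Trip</name>'
       '<Placemark><name>Photo 1</name></Placemark>'
       '<Placemark><name>Photo 2</name></Placemark>'
       '</Folder></Document>')
doc = parseString(kml)

all_names = doc.getElementsByTagName('name')   # document order; index 0 first
placemark_names = [n for n in all_names
                   if n.parentNode.nodeName == 'Placemark']

print(len(all_names))                      # 3 -- includes the Folder's name
print(all_names[0].firstChild.nodeValue)   # 'Trip' -- the list starts at 0
print(len(placemark_names))                # 2 -- Folder's name excluded
```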
We need to be able to refer to various properties of these nodes if we are going to act on them in any reasonable way. The XML DOM exposes several useful node properties including nodeName, nodeValue, and nodeType.
nodeName: is (quite obviously) the name of the node. What might be less obvious is precisely how the DOM defines nodeName.
- The nodeName of an element is always its tag name, e.g. ‘Placemark’ is the nodeName of the <Placemark> element
- The attribute name is the nodeName of an attribute, e.g. the name of the maxLines attribute of <Snippet> is ‘maxLines’
- The name of any text node is always the string ‘#text’. e.g., the plain text or html that we’re using as our Placemark descriptions are each contained in a text node, as has already been discussed, and the name of this text node is ‘#text’, and not the name of the element which surrounds the value in the XML document
- The nodeName of the root document node is always the literal string ‘#document’.
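All four nodeName rules can be observed at once with Python’s xml.dom.minidom, which implements the same language-neutral DOM interface; the one-line sample document is made up:

```python
from xml.dom.minidom import parseString

doc = parseString('<Placemark><Snippet maxLines="1">hi</Snippet></Placemark>')
snippet = doc.getElementsByTagName('Snippet')[0]
attr = snippet.getAttributeNode('maxLines')

print(doc.documentElement.nodeName)  # 'Placemark' -- an element's tag name
print(attr.nodeName)                 # 'maxLines'  -- an attribute's name
print(snippet.firstChild.nodeName)   # '#text'     -- any text node
print(doc.nodeName)                  # '#document' -- the document node
```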
nodeValue: is what you might expect.
- The value of a text node is the text itself. So the value of the text node inside one of our Placemark description elements is all of the plain-text or html within the description tags. Again, as far as the DOM is concerned the text is actually contained within a text node which is a child of a description node
- The value of an attribute node is simply the attribute value, e.g. maxLines=”1″ has a nodeValue of 1
- nodeValue is not defined for the document node or for element nodes
nodeType: is certainly not something you could guess.
A specific value is assigned to each of the available types (categories) of nodes. The following is an incomplete list of common node types.
- Element, type 1
- Attribute, type 2
- Text, type 3
- Comment, type 8
- Document, type 9
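All five of these type codes, along with an attribute’s nodeValue, can be seen with Python’s xml.dom.minidom (the same DOM interface, outside the browser); the sample element is invented:

```python
from xml.dom.minidom import parseString

doc = parseString('<Snippet maxLines="1"><!-- a comment -->hello</Snippet>')
elem = doc.documentElement
attr = elem.getAttributeNode('maxLines')
comment = elem.firstChild        # the comment node comes first
text = comment.nextSibling       # then the text node

print(elem.nodeType)     # 1 (Element)
print(attr.nodeType)     # 2 (Attribute)
print(text.nodeType)     # 3 (Text)
print(comment.nodeType)  # 8 (Comment)
print(doc.nodeType)      # 9 (Document)
print(attr.nodeValue)    # '1' -- an attribute's nodeValue is its value
```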