Introduction

A scholar who never travels but stays at home is not worthy to be accounted a scholar. From my youth on I had the ambition to travel, but could not afford to wander over the three hundred counties of Korea, much less the whole world. So, carrying out an ancient practise, I drew a geographical atlas. And while gazing at it for long stretches at a time I feel as though I was carrying out my ambition . . . Morning and evening while bending over my small study table, I meditate on it and play with it and there in one vast panorama are the districts, the prefectures and the four seas, and endless stretches of thousands of miles.

WON HAK-SAENG
Korean Student
Preface to his untitled manuscript atlas of China during the Ming Dynasty, dated 1721.

To say that maps are important is an understatement almost as big as the world itself. Maps quite literally connect us to the world in which we all live, and by extension, they link us to one another. The oldest preserved maps date back nearly 4500 years. In addition to connecting us to our past, they chart much of human progress and expose the relationships among people through time.

Unfortunately, as a work of humankind, maps share many of the same shortcomings of all human endeavors. They are to some degree inaccurate and they reflect the bias of the map maker. Advancements in technology help to address the former issue, and offer us the opportunity to resist the latter. To the extent that it's possible for all of us to participate in the map making, the bias of a select few becomes less meaningful.

Google Earth and Google Maps are two applications that allow each of us to assert our own place in the world and contribute our own unique perspective. I can think of no better way to accomplish this than by combining maps and photography.

Photos reveal much about who we are, the places we have been, the people with whom we have shared those experiences, and the significant events in our lives. Pinning our photos to a map allows us to see them in their proper geographic context, a valuable way to explore and share them with friends and family. Photos can reveal the true character of a place, and afford others the opportunity to experience these destinations, perhaps faraway and unreachable for some of us, from the perspective of someone who has actually been there.

In this tutorial I'll show you how to 'pin' your photos using Google Earth and Google Maps. Both applications are free, and available for Windows, Mac OS X, and Linux. In the case of Google Earth there are requirements associated with installing and running what is a local application. Google Maps has its own requirements: primary among them a compatible web browser (the highly regarded FireFox is recommended).

In Google Earth, your photos show up on the map within the application complete with titles, descriptions, and other relevant information. You can choose to share your photos with everyone, only people you know, or even reserve them strictly for yourself.

Google Maps offers the flexibility to present maps outside of a traditional application. For example you can embed a map on a webpage pinpointing the location of one particular photo, or mapping a collection of photos to present along with a photo gallery, or even collecting all of your digital photos together on one dynamic map.

Over the course of a short couple of articles we'll cover everything you need to take advantage of both applications. I've put together two scripts to help us accomplish that goal. The first is a Perl script that works through your photos and generates a file in the proper format with all of the data necessary to include those photos in Google Earth. The second is a short bit of Javascript that works with the first file and builds a dynamic Google Map of those same photos. Both scripts are available for you to download, after which you are free to use them as is, or modify them to suit your own projects. I've purposefully kept the code as simple as possible to make them accessible to the widest audience, even those of you reading this who may be new to programming or unfamiliar with Perl, Javascript or both. I've taken care to comment the code generously so that everything is plainly obvious. I'm hopeful that you will be surprised at just how simple it is.

There a couple of preliminary topics to examine briefly before we go any further.

In the preceding paragraph I mentioned that the result of the first of our two scripts is a 'file in the proper format...'. This file, or more to the point the file format, is a very important part of the project. KML (Keyhole Markup Language) is a fairly simple XML-based format that can be considered the native "language" of Google Earth. That description begs the question, 'What is XML?'.

To oversimplify, because even a cursory discussion of XML is outside of the scope of this article, XML (Extensible Markup Language) is an open data format (in contrast to proprietary formats) which allows us to present information in such way that we communicate not only the data itself, but also descriptive information about the data and the relationships among elements of the data. One of the technical terms that applies is 'metalanguage', which approximately translates to mean a language that makes it possible to describe other languages. If you're unfamiliar with the concept, it may be difficult to grasp at first or it may not seem impressive. However, metalanguages, and specifically XML, are an important innovation (I don't mean to suggest that it's a new concept. In fact XML has roots that are quite old, relative to the brief history of computing). These metalanguages make it possible for us to imbue data with meaning such that software can make use of that data. Let's look at an example taken from the Wikipedia entry for KML.

<?xml version="1.0" encoding="UTF-8"?>
<kml >
<Placemark>
 <description>New York City</description>
 <name>New York City</name>
 <Point>
 <coordinates>-74.006393,40.714172,0</coordinates>
 </Point>
</Placemark>
</kml>

Ignore all of the pro forma stuff before and after the <Placemark>
tags and you might be able to guess what this bit of data represents. More importantly, a computer can be made to understand what it represents in some sense. Without the tags and structure, "New York City" is just a string, i.e. a sequence of characters. Considering the tags we can see that we're dealing with a place, (a Placemark), and that "New York City" is the name of this place (and in this example also its description). With all of this formal structure, programs can be made to roughly understand the concept of a Placemark, which is a useful thing for a mapping application.

Let's think about this for a moment. There are a very large number of recognizable places on the planet, and a provably infinite number of strings. Given a block of text, how could a computer be made to identify a place from, for example the scientific name of a particular flower, or a person's name? It would be extremely difficult.

We could try to create a database of every recognizable place and have the program check the database every time it encounters a string. That assumes it's possible to agree on a 'complete list of every place', which is almost certainly untrue. Keep in mind that we could be talking about informal places that are significant only to a small number of people or even a single person, e.g. 'the park where, as a child, I first learned to fly a kite'. It would be a very long list if we were going to include these sorts of places, and incredibly time-consuming to search.

Relying on the structure of the fragment above, we can easily write a program that can identify 'New York City' as the name of a place, or for that matter 'The park where I first learned to fly a kite'. In fact, I could write a program that pulls all of the place names from a file like this, along with a description, and a pair of coordinate points for each, and includes them on a map. That's exactly what we're going to do. KML makes it possible.

If I haven't made it clear, the structural bits of the file must be standardized. KML supports a limited set of elements (e.g. 'Placemark' is a supported element, as are 'Point' and 'coordinates'), and all of the elements used in a file must adhere to the standard for it to be considered valid.

The second point we need to address before we begin is, appropriately enough... where to begin? Lewis Carroll famously tells us to "Begin at the beginning and go on till you come to the end: then stop." Of course, Mr. Carroll was writing a book at the time. If "Alice's Adventures in Wonderland" were an article, he might have had different advice. From the beginning to the end there is a lot of ground to cover. We're going to start somewhere further along, and make up the difference with the following assumptions. For the purposes of this discussion, I am going to assume that you have:

Perl

Access to Phil Harvey's excellent ExifTool , a Perl library and command-line application for reading, writing, and editing metadata in images (among other file types). We will be using this library in our first script

A publicly accessible web server. Google requires the use of an API key by which it can monitor the use of its map services. Google must be able to validate your key, and so your site must be publicly available. Note that this a requirement for Google Maps only

Photos, preferably in a photo management application. Essentially, all you need is an app capable of generating both thumbnails and reduced size copies of your original photos . An app that can export a nice gallery for use on the web is even better

Coordinate data as part of the EXIF metadata embedded in your photos. If that sounds unfamiliar to you, then most likely you will have to take some additional steps before you can make use of this tutorial. I'm not aware of any digital cameras that automatically include this information at the time the photo is created. There are devices that can be used in combination with digital cameras, and there are a number of ways that you can 'tag' your photos with geographic data much the same way you would add keywords and other annotations.

Let's begin!

Part 1: Google Earth

Section 1: Introduction to Part 1

Time to begin the project in earnest. As I've already mentioned we'll spend the first half of this tutorial looking at Google Earth and putting together a Perl script whichthat, given a collection of geotagged photos, will build a set of KML files so that we can browse our photos in Google Earth. These same files will serve as the data source for our Google Maps application later on.

Section 2: Some Advice Before We Begin

Take your time to make sure you understand each topic before you move on to the next. Think of this as the first step in debugging your completed code. If you go slowly enough that you are to be able to identify aspects of the project that you don't quite understand, then you'll have some idea where to start looking for problems should things not go as expected. Furthermore, going slowly will give you the opportunity to identify those parts that you may want to modify to better fit the script to your own needs.

If this is new to you, follow along as faithfully as possible with what I do here the first time through. Feel free to make notes for yourself as you go, but making changes on the first pass may make it difficult for you to catch back on to the narrative and piece together a functional script. After you have a working solution, it will be a simple matter to implement changes one at a time until you have something that works for you. Following this approach it will be easy to identify the silly mistakes that tend to creep in once you start making changes.

There is also the issue of trust. This is probably the first time we're meeting each other, in which case you should have some doubt that my code works properly to begin with. If you minimize the number of changes you make, you can confirm that this works for you before blaming yourself or your changes for my mistakes. I will tell you up front that I'm building this project myself as we go. You can be certain at least that it functions as described for me as of the date attached to the article. I realize that this is quite different from being certain that the project will work for you, but at least it's something.

The entirety of my project is available for you to download. You are free to use all of it for any legal purpose whatsoever, including my photos in addition to all of the code, icons, etc. This is so you have some samples to use before you involve your own content. I don't promise that they are the most beautiful images you have ever seen, but they are all decent photos and properly annotated with the necessary metadata, including geographic tags.

Section 3: Photos, Metadata and ExifTool

To begin, we must have a clear understanding of what the Perl script will require from us. Essentially, we need to provide it with a selection of annotated image files, and information about how to reference those files.

A simple folder of files is sufficient, and will be convenient for us, both as the programmer and end user. The script will be capable of negotiating nested folders, and if a directory contains both images and other file types, non-image types will be ignored.

Typically, after a day of taking photos I'll have 100 to 200 that I want to keep. I delete the rest immediately after offloading them from the camera. For the files that are left, I preserve the original grouping, keeping all of the files together in a single folder. I place this folder of images in an appropriate location according to a scheme that serves to keep my complete collection of photos neatly organized. These are my digital 'negatives'. I handle all subsequent organization, editing, enhancements, and annotations within my photo management application. I use Apple Inc.'s Aperture; but there are many others that do the job equally well.

Annotating your photos is well worth the investment of time and effort, but it's important that you have some strategy in mind so that you don't create meaningless tags that are difficult to use to your advantage. For the purposes of this project the tags we'll need are quite limited, which means that going forward we will be able to continue adding photos to our maps with a reasonable amount of work.

The annotations we need are:

Caption
Latitude
Longitude
Image Date *
Location/Sub-Location
City
Province/State
Country Name
Event
People
ImageWidth *
ImageHeight *

* Values for these Exif tags are generated by your camera.

Note that these are labels used in Aperture, and are not necessarily consistent from one application to the next. Some of them are more likely than others to be used reliably. 'City' for example should be dependable, while the labels 'People', 'Events', and 'Location', among others, are more likely to differ. One explanation for these variations is that the meaning of these fields are more open to interpretation. Location, for example, is likely to be used to narrow down the area where the photo was taken within a particular city, but it is left to the person who is annotating the photo to decide that the field should name a specific street address, an informal place (e.g. 'home' or 'school'), or a larger area, for example a district or neighborhood. Fortunately things aren't so arbitrary as they seem.

Each of these fields corresponds to a specific tag name that adheres to one of the common metadata formats (Exif¹, IPTC², XMP³, and there are others). These tag names are consistent as required by the standards. The trick is in determining the labels used in your application that correspond to the well-defined tag names. Our script relies on these metadata tags, so it is important that you know which fields to use in your application.

This gives us an excuse to get acquainted with ExifTool⁴. From the project's website, we have this description of the application:

ExifTool is a platform-independent Perl library plus a command-line application for reading, writing, and editing meta information in image, audio, and video files...

ExifTool can seem a little intimidating at first. Just keep in mind that we will need to understand just a small part of it for this project, and then be happy that such a useful and flexible tool is freely available for you to use.

The brief description above states in part that ExifTool is a Perl library and command line application that we can use to extract metadata from image files.

With a single short command, we can have the app print all of the metadata contained in one of our image files. First, make sure ExifTool is installed. You can test for this by typing the name of the application at the command line.

$ exiftool

If it is installed, then running it with no options should prompt the tool to print its documentation. If this works, there will be more than one screen of text. You can page through it by pressing the spacebar. Press the 'q' key at any time to stop.

If the tool is not installed, you will need to add it before continuing. See the appendix at the end of this tutorial for more information.

Having confirmed that ExifTool is installed, typing the following command will result in a listing of the metadata for the named image:

$ exiftool -f -s -g2 /path/image.jpg

Where 'path' is an absolute path to image.jpg or a relative path from the current directory, and 'image.jpg' is the name of one of your tagged image files.

We'll have more to say about ExifTool later, but because I believe that no tutorial should ask the reader to blindly enter commands as if they were magic incantations, I'll briefly describe each of the options used in the command above:

-f, forces printing of tags even if their values are not found. This gives us a better idea about all of the available tag names, whether or not there are currently defined values for those tags.

-s, prints tag names instead of descriptions. This is important for our purposes. We need to know the tag names so that we can request them in our Perl code. Descriptions, which are expanded, more human-readable forms of the same names obscure details we need. For example, compare the tag name 'GPSLatitude' to the description 'GPS Latitude'. We can use the tag name, but not the description to extract the latitude value from our files.

-g2, organizes output by category. All location specific information is grouped together, as is all information related to the camera, date and time tags, etc. You may feel, as I do, that this grouping makes it easier to examine the output. Also, this organization is more likely to reflect the grouping of field names used by your photo management application.

If you prefer to save the output to a file, you can add ExifTool's -w option with a file extension.

$ exiftool -f -s -g2 -w txt path/image.jpg

This command will produce the same result but write the output to the file 'image.txt' in the current directory; again, where 'image.jpg' is the name of the image file. The -w option appends the named extension to the image file's basename, creates a new file with that name, and sets the new file as the destination for output.

The tag names that correspond to the list of Aperture fields presented above are:

metadata tag name

Aperture field label

Caption-Abstract

Caption

GPSLatitude

Latitude

GPSLongitude

Longitude

DateTimeOriginal

Image Date

Sub-location

Location/Sub-Location

City

City

Province-State

Province/State

Country-PrimaryLocationName

Country Name

FixtureIdentifier

Event

Contact

People

ImageWidth

Pixel Width

ImageHeight

Pixel Height

Section 4: Making Photos Available to Google Earth

We will use some of the metadata tags from our image files to locate our photos on the map (e.g. GPSLatitude, GPSLongitude), and others to describe the photos. For example, we will include the value of the People tag in the information window that accompanies each marker to identify friends and family who appear in the associated photo.

Because we want to display and link to photos on the map, not just indicate their position, we need to include references to the location of our image files on a publicly accessible web server. You have some choice about how to do this, but for the implementation described here we will (1) Display a thumbnail in the info window of each map marker and (2) include a link to the details page for the image in a gallery created in our photo management app.

When a visitor clicks on a map marker they will see a thumbnail photo along with other brief descriptive information. Clicking a link included as part of the description will open the viewer's web browser to a page displaying a large size image and additional details. Furthermore, because the page is part of a gallery, viewers can jump to an index page and step forward and back through the complete collection. This is a complementary way to browse our photos. Taking this one step further, we could add a comments section to the gallery pages or replace the gallery altogether, instead choosing to display each photo as a weblog post for example.

The structure of the gallery created from my photo app is as follows...

/ (the root of the gallery directory)

index.html
large-1.html
large-2.html
large-3.html
...
large-n.html

assets/
css/
img/
catalog/

pictures/
picture-1.jpg
picture-2.jpg
picture-3.jpg
...
picture-n.jpg

thumbnails/
thumb-1.jpg
thumb-2.jpg
thumb-3.jpg
...
thumb-n.jpg

The application creates a root directory containing the complete gallery. Assuming we do not want to make any manual changes to the finished site, publishing is as easy as copying the entire directory to a location within the web server's document root.

assets/: Is a subfolder containing files related to the theme itself. We don't need to concern ourselves with this sub-directory.

catalog/: Contains a single catalog.plist file which is specific to Aperture and not relevant to this discussion.

pictures/: Contains the large size images included on the detail gallery pages.

thumbnails/: This subfolder contains the thumbnail images corresponding to the large size images in pictures/.

Finally, there are a number of files at the root of the gallery. These include index pages and files named 'large-n.html', where n is a number starting at 1 and increasing sequentially e.g. large-1.html, large-2.html, large-3.html, ...

The index files are the index pages of our gallery. The number of index pages generated will be dictated by the number of image files in the gallery, as well as the gallery's layout and design. index.html is always the first gallery page.

The large-n.html files are the details pages of our gallery. Each page features an individual photo with links to the previous and next photos in sequence and a link back to the index. You can see the gallery I have created for this tutorial here:

http://robreed.net/photos/tutorial_gallery/

If you take the time to look through the gallery, maybe you can appreciate the value of viewing these photos on a map. Web-based photo galleries like this one are nice enough, but the photos are more interesting when viewed in some meaningful context.

There are a couple of things to notice about this gallery.

Firstly, picture-1.jpg, thumb-1.jpg, and large-1.html all refer to the same image. So if we pick one of the three files we can easily predict the names of the other two. This relationship will be useful when it comes to writing our script.

There is another important issue I need to call to your attention because it will not be apparent from looking only at the gallery structure. Aperture has renamed all of my photos in the process of exporting them.

The name of the original file from which picture-1.jpg was generated (as well as large-1.html and thumb-1.jpg) is 'IMG_0277.JPG', which is the filename produced by my camera. Because I want to link to these gallery files, not my original photos which will stay safely tucked away on a local drive, I must run the script against the photos in the gallery. I cannot run it against the original image files because the filenames referenced in the output are unrelated to the files in the gallery.

If my photo management app provided me the option of preserving the original filenames for the corresponding photos in the gallery, then I could run the script against the original image files or the gallery photos because all of the filenames would be consistent, but this is not the case. I don't have a problem as long as I run the script on the exported photos.

However, if I'm running the script against the photos in the web gallery, either the pictures or thumbnail images must contain the same metadata as the original image files. Aperture preserves the metadata in both. Your application may not.

A simple, dependable way to confirm that the metadata is present in the gallery files is to run ExifTool first against the original file and then the same photo in the pictures/ and thumbnails/ directories in the gallery. If ExifTool reports identical metadata, then you will have no trouble using one of pictures/ or thumbnails/ as your source directory. If the metadata is not present or not complete in the gallery files, you may need to use the script on your original image files. As has already been explained, this isn't a problem unless the gallery produces filenames that are inconsistent with the original filenames, as Aperture does. In this case you have a problem. You won't be able to run the script on the original image files because of the naming issue or on the gallery photos because they don't contain metadata.

Make sure that you understand this point.

If you find yourself in this situation, then your best bet is to generate files to use with your maps from your original photos in some other way, bypassing your photo management app's web gallery features altogether in favor of a solution that preserves the filenames, the metadata, or both. There is another option which involves setting up a relationship between the names of the original files and the gallery filenames. This tutorial does not include details about how to set up this association.

Finally, keep in mind that though we've looked at the structure of a gallery generated by Aperture, virtually all photo management apps produce galleries with a similar structure. Regardless of the application used, you should find:

A group of html files including index and details pages

A folder of large size image files

A folder of thumbnails

Once you have identified the structure used by your application, as we have done here, it will be a simple task to translate these instructions.

Section 5: Referencing Files Over the Internet

Now we can talk about how to reference these files and gallery pages so that we can create a script to generate a KML file that includes these references.

When we identify a file over the internet, it is not enough to use the filename, e.g. 'thumb-1.jpg', or even the absolute and relative path to the file on the local computer. In fact these paths are most likely not valid as far as your web server is concerned. Instead we need to know how to reference our files such that they can be accessed over the global internet, and the web more specifically. In other words, we need to be able to generate a URL (Uniform Resource Locator) which unambiguously describes the location of our files. The formal details of exactly what comprises a URL⁵; are more complicated than may be obvious at first, but most of us are familiar with the typical URL, like this one:

http://www.ietf.org/rfc/rfc1630.txt

which describes the location of a document titled "Universal Resource Identifiers in WWW" that just so happens to define the formal details of what comprises a URL.

http://www.ietf.org

This portion of the address is enough to describe the location of a particular web server over the public internet. In fact it does a little more than just specify the location of a machine. The http:// portion is called the scheme and it identifies a particular protocol (i.e. a set of rules governing communication) and a related application, namely the web. What I just said isn't quite correct; at one time, HTTP was used exclusively by the web, but that's no longer true. Many internet-based applications use the protocol because the popularity of the web ensures that data sent via HTTP isn't blocked or otherwise disrupted. You may not be accustomed to thinking of it as such, but the web itself is a highly-distributed, network-based application.

/rfc/

This portion of the address specifies a directory on the server. It is equivalent to an absolute path on your local computer. The leading forward slash is the root of the web server's public document directory. Assuming no trickiness on the part of the server, all content lives under the document root.

This tells us that rfc/ is a sub-directory contained within the document root. Though this directory happens to be located immediately under the root, this certainly need not be the case. In fact these paths can get quite long.

We have now discussed all of the URL except for:

rfc1630.txt

which is the name of a specific file. The filename is no different than the filenames on your local computer. Let's manually construct a path to one of the large-n.html pages of the gallery we have created.

The address of my server is robreed.net, so I know that the beginning of my URL will be:

http://robreed.net

I keep all of my galleries together within a photos/ directory, which is contained in the document root.

http://robreed.net/photos/

Within photos/, each gallery is given its own folder. The name of the folder I have created for this tutorial is 'tutorial_gallery'. Putting this all together, the following URL brings me to the root of my photo gallery:

http://robreed.net/photos/tutorial_gallery/

We've already gone over the directory structure of the gallery, so it should make sense you to that when referring to the 'large-1.html' detail page, the complete URL will be:

http://robreed.net/photos/tutorial_gallery/large-1.html

the URL of the image that corresponds to that detail page is:

http://robreed.net/photos/tutorial_gallery/pictures/picture-1.jpg

and the thumbnail can be found at:

http://robreed.net/photos/tutorial_gallery/thumbnails/thumb-1.jpg

Notice that the address of the gallery is shared among all of these resources. Also, notice that resources of each type (e.g. the large images, thumbnails, and html pages) share a more specific address with files of that same type.

If we use the term 'base address' to refer to the shared portions of these URLs, then we can talk about several significant base addresses:

The gallery base address: http://robreed.net/photos/tutorial_gallery/
The html pages base address: http://robreed.net/photos/tutorial_gallery/
The images base address: http://robreed.net/photos/tutorial_gallery/pictures/
The thumbnails base address: http://robreed.net/photos/tutorial_gallery/thumbnails/

Note that given the structure of this particular gallery, the html pages base address and the gallery base address are identical. This need not be the case, and may not be for the gallery produced by your application.

We can hard-code the base addresses into our script. For each photo, we need only append the associated filename to construct valid URLs to any of these resources. As the script runs, it will have access to the name of the file that it is currently evaluating, and so it will be a simple matter to generate the references we need as we go.

At this point we have discussed almost everything we need to put together our script. We have:

Created a gallery at our server, which includes our photos with metadata in tow

Identified all of the metadata tags we need to extract from our photos with the script and the corresponding field names in our photo management application

Determined all of the base addresses we need to generate references to our image files

Section 6: KML

The last thing we need to understand is the format of the KML files we want to produce.

We've already looked at a fragment of KML. The full details can be found on Google's KML documentation pages , which include samples, tutorials and a complete reference for the format.

A quick look at the reference is enough to see that the language includes many elements and attributes, the majority of which we will not be including in our files. That statement correctly implies it is not necessary for every KML file to include all elements and attributes. The converse however is not true, which is to say that every element and attribute contained in any KML file must be a part of the standard.

A small subset of KML will get us most, if not all, of what you will typically see in Google Earth from other applications.

Many of the features we will not be using deal with aspects of the language that are either:

Not relevant to this project, e.g. ground overlays (GroundOverlay) which "draw an image overlay draped onto the terrain"

Minute details for which the default values are sensible

There is no need to feel shortchanged because we are covering only a subset of the language. With the basic structure in place and a solid understanding of how to script the generation of KML, you will be able to extend the project to include any of the other components of the language as you see fit.

The structure of our KML file is as follows:

 1.    <?xml version="1.0" encoding="UTF-8"?>

2.    <kml >
3.        <Document>

4.            <Folder>
5.                <name>$folder_name</name>
6.                <description>$folder_description</description>

7.                <Placemark>
8.                    <name>$placemark_name</name>
9.                    <Snippet maxLines="1">
10.                        $placemark_snippet
11.                    </Snippet>
12.                    <description><![CDATA[
13.                        $placemark_description
14.                    ]]></description>
15.                    <Point>
16.                        <coordinates>$longitude,$latitude
</coordinates>
17.                    </Point>
18.                </Placemark>

19.            </Folder>

20.        </Document>
21.    </kml>

Line 1: XML header

Every valid KML file must start with this line and nothing else is allowed to appear before it. As I've already mentioned, KML is an XML-based language and XML requires this header.

Line 2: Namespace declaration

More specifically this is the KML namespace declaration, and it is another formality. The value of the >Line 3: <Document> is a container element representing the KML file itself. If we do not explicitly name the document by including a name element then Google Earth will use the name of the KML file as the Document element <name>.

The Document container will appear on the Google Earth 'Sidebar' within the 'Places' panel. Optionally we can control whether the container is closed or open by default. (This setting can be toggled in Google Earth using a typical disclosure triangle.) There are many other elements and attributes that can be applied to the Document element. Refer to the KML Reference for the full details.

Line 4: <Folder> is another container element. The files we produce will include a single <Folder> containing all of our Placemarks, where each Placemark represents a single image. We could create multiple Folder elements to group our Placemarks according to some significant criteria. Think of the Folder element as being similar to your operating system's concept of a folder.

At this point, note the structure of the fragment. The majority of it is contained within the Folder element. Folder, in turn, is an element of Document which is itself within the <kml> container. It should make sense that everything in the file that is considered part of the language must be contained within the kml element.

From the KML reference:

A Folder is used to arrange other Features hierarchically (Folders, Placemarks, NetworkLinks, or Overlays). A Feature is visible only if it and all its ancestors are visible.

Line 5: The name element identifies an object, in this case the Folder object. The text that appears between the name tags can be any plain text that will serve as an appropriate label.

Line 6: <description> is any text that seems to adequately describe the object.

The description element supports both plain text and a subset of HTML. We'll consider issues related to using HTML in <description> at the discussion of Placemark, lines 12 - 14.

Lines 7 - 17 define a <Placemark> element.

Note that Placemark contains a number of elements that also appear in Folder, including <name> (line 8), and <description> (lines 12 - 14). These elements serve the same purpose for Placemark as they do for the Folder element, but of course they refer to a different object.

I've said that <description> can include a subset of HTML in addition to plain text. Under XML, some characters have special meaning. You may need to use these characters as part of the HTML included in your descriptions. Angle brackets (<, >) for example surround tag names in HTML, but serve a similar purpose in XML. When they are used strictly as part of the content, we want the XML parser to ignore these characters.

We can accomplish this a few different ways:

We can use entity references, either numeric character references or character entity references, to indicate that the symbol appears as part of the data and should not be treated as part of the syntax of the language. The character '<', which is required to include an image as part of the description (something we will be doing), (e.g. <img src=... />) can safely be included as the character entity reference '<' or the numeric character reference '&#60'.

The character entity references may be easier to remember and recognize on sight but are limited to the small subset of characters for which they have been defined. The numeric references on the other hand can specify any ASCII character. Of the two types, numeric character references should be preferred. There are also Unicode entity references which can specify any character at all.

In the simple case of embedding short bits of HTML in KML descriptions, we can avoid the complication of these references altogether by enclosing the entire description in a CDATA⁶ element, which instructs the XML parser to ignore any special characters that appear until the end of the block set off by the CDATA tags.

Notice the string '<![CDATA[' immediately after the opening <description> tag within <Placemark>, and the string ]]> immediately before the closing </description> tag. If we simply place all of our HTML and plain text between those two strings, it will all be ignored by the parser and we are not required to deal with special characters individually. This is how we'll handle the issue.

Lines 9 - 11: <Snippet>

The KML reference does a good job of clearly describing this element.

From the KML reference:

In Google Earth, this description is displayed in the Places panel under the name of the feature. If a Snippet is not supplied, the first two lines of the <description> are used. In Google Earth, if a Placemark contains both a description and a Snippet, the <Snippet> appears beneath the Placemark in the Places panel, and the <description> appears in the Placemark's description balloon. This tag does not support HTML markup. <Snippet> has a maxLines attribute, an integer that specifies the maximum number of lines to display. Default for maxLines is 2.

Notice that in the block above at line 9, I have included a 'maxLines' attribute with a value of 1. Of course, you are free to substitute your own value for maxLines, or you can omit the attribute entirely to use the default value.

The only element we have yet to review is <Point>, and again we need only look to the official reference for a nice description.

From the KML reference:

A geographic location defined by longitude, latitude, and (optional) altitude. When a Point is contained by a Placemark, the point itself determines the position of the Placemark's name and icon.

<Point> in turn contains the element <coordinates> which is required.

From the KML reference:

A single tuple consisting of floating point values for longitude, latitude, and altitude (in that order).

The reference also informs us that altitude is optional, and in fact we will not be generating altitude values.

Finally the reference warns:

Do not include spaces between the three values that describe a coordinate.

This seems like an easy mistake to make. We'll need to be careful to avoid it.

There will be a number of <Placemark> elements, one for each of our images.

The question is how to handle these elements. The answer is that we'll treat KML as a 'fill in the blanks' style template. All of the structural and syntactic bits will be hard-coded, e.g. the XML header, namespace declaration, all of the element and attribute labels, and even the whitespace, which is not strictly required but will make it much easier for us to inspect the resulting files in a text editor. These components will form our template.

The blanks are all of the text and html values of the various elements and attributes. We will use variables as place-holders everywhere we need dynamic data, i.e. values that change from one file to the next or one execution of the script to the next.

Take a look at the strings I've used in the block above. $folder_name, $folder_description, $placemark_name, etc. For those of you unfamiliar with Perl, these are all valid variable names chosen to indicate where the variables slot into the structure. These are the same names used in the source file distributed with the tutorial.

Section 7: Introduction to the Code

At this point, having discussed every aspect of the project, we can succinctly describe how to write the code. We'll do this in 3 stages of increasing granularity.

Firstly, we'll finish this tutorial with a natural language walk-through of the execution of the script.

Secondly, if you look at the source file included with the project, you will notice immediately that comments dominate the code. Because instruction is as important an objective as the actual operation of the script, I use comments in the source to provide a running narration. For those of you who find this superabundant level of commenting a distraction, I'm distributing a second copy of the source file with much of the comments removed.

Finally, there is the code itself. After all, source code is nothing more than a rigorously formal set of instructions that describe how to complete a task. Most programs, including this one, are a matter of transforming input in one form to output in another. In this very ge