So far we have worked with relatively small sets of data; for larger collections, Bing Maps provides the Spatial Data Services. They offer the Data Source Management API to load large datasets into Bing Maps servers, and the Query API to query the data.

In this article, we will use the Geocode Dataflow API, which provides geocoding and reverse geocoding of large datasets. Geocoding is the process of finding geographic coordinates from other geographic data, such as street addresses or postal codes. Reverse geocoding is the opposite process, where coordinates are used to find their associated textual locations, such as addresses and postal codes. Bing Maps implements these processes by creating jobs on Bing Maps servers and querying them later. The whole process can be automated, which is ideal for huge amounts of data.

Please note that strict data usage rules apply to the Spatial Data Services (please refer to http://msdn.microsoft.com/en-us/library/gg585136.aspx for full details). At the time of writing, a user with a basic account can create up to 5 jobs in a 24-hour period.

Our task in this article is to geocode the addresses of ten of the biggest technology companies in the world, such as Microsoft, Google, Apple, Facebook, and so on, and then display them on the map. The first step is to prepare the file with the companies’ addresses.

Geocoding dataflow input data

The input and output data can be supplied in the following formats:

  • XML (content type application/xml)
  • Comma-separated values (text/plain)
  • Tab-delimited values (text/plain)
  • Pipe-delimited values (text/plain)
  • Binary (application/octet-stream) used with Blob Service REST API

We will use the XML format, for its clearer declarative structure.

Now, let’s open Visual Studio and create a new C# Console project named LBM.Geocoder. We then add a Data folder, which will contain all the data files and samples we’ll work with in this article, starting with the data.xml file that we need to upload to the Spatial Data servers to be geocoded.

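<?xml version="1.0" encoding="utf-8"?>
<GeocodeFeed xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode" Version="2.0">
    <GeocodeEntity Id="001" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
        <GeocodeRequest Culture="en-US" IncludeNeighborhood="1">
            <Address AddressLine="1 Infinite Loop"
                     AdminDistrict="CA" Locality="Cupertino" PostalCode="95014" />
        </GeocodeRequest>
    </GeocodeEntity>
    <GeocodeEntity Id="002" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
        <GeocodeRequest Culture="en-US" IncludeNeighborhood="1">
            <Address AddressLine="185 Berry St"
                     AdminDistrict="NY" Locality="New York" PostalCode="10038" />
        </GeocodeRequest>
    </GeocodeEntity>
    <!-- remaining GeocodeEntity elements omitted -->
</GeocodeFeed>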

The listing above is a fragment of that file with the addresses of the companies’ headquarters. Please note that the more address information we provide to the API, the better the quality of the geocoding we receive. In production, this file would be created programmatically, probably based on an address database. The Id of each GeocodeEntity could also be stored, so that the results are easier to match once they are fetched from the servers. (You can find the Geocode Dataflow Data Schema, Version 2, at http://msdn.microsoft.com/en-us/library/jj735477.aspx.)
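Since the feed would normally be generated rather than typed by hand, here is a minimal sketch of how that could look with LINQ to XML. The CompanyAddress type and the GeocodeFeedBuilder class are hypothetical names introduced only for illustration; the element names, attribute names, and namespace simply mirror the data.xml fragment and the Settings values used in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical record type for this sketch; not part of the article's project.
public class CompanyAddress
{
    public string Id { get; set; }
    public string AddressLine { get; set; }
    public string AdminDistrict { get; set; }
    public string Locality { get; set; }
    public string PostalCode { get; set; }
}

// Hypothetical helper that turns address records into a GeocodeFeed document.
public static class GeocodeFeedBuilder
{
    // Namespace of the Geocode Dataflow input schema
    // (the same value as Settings.GeocodeFeedXNamespace).
    private static readonly XNamespace Ns =
        "http://schemas.microsoft.com/search/local/2010/5/geocode";

    public static XDocument Build(IEnumerable<CompanyAddress> addresses)
    {
        // One GeocodeEntity per address, keyed by an Id we can match on later.
        var entities = addresses.Select(a =>
            new XElement(Ns + "GeocodeEntity",
                new XAttribute("Id", a.Id),
                new XElement(Ns + "GeocodeRequest",
                    new XAttribute("Culture", "en-US"),
                    new XAttribute("IncludeNeighborhood", "1"),
                    new XElement(Ns + "Address",
                        new XAttribute("AddressLine", a.AddressLine),
                        new XAttribute("AdminDistrict", a.AdminDistrict),
                        new XAttribute("Locality", a.Locality),
                        new XAttribute("PostalCode", a.PostalCode)))));

        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement(Ns + "GeocodeFeed",
                new XAttribute("Version", "2.0"),
                entities));
    }
}
```

Calling GeocodeFeedBuilder.Build(addresses).Save(path) would then write the feed to disk. LINQ to XML declares the namespace once on the root element instead of repeating it on every GeocodeEntity, which is equivalent XML.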

The job

Let’s add a Job class to our project:

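using System;
using System.IO;
using System.Linq;
using System.Net;
using System.Xml.Linq;

public class Job
{
    private readonly string dataFilePath;
    // URL of the created job; set when the server response is parsed.
    public string Link { get; private set; }
    public Job(string dataFilePath)
    {
        this.dataFilePath = dataFilePath;
    }
}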

The dataFilePath argument is the path to the data.xml file we created earlier.

Creating the job is as easy as calling a REST URL:

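public void Create()
{
    // Build the job URL: XML input, XML output, and the Bing Maps Key.
    var uri = String.Format("{0}?input=xml&output=xml&key={1}",
        Settings.DataflowUri, Settings.Key);
    var data = File.ReadAllBytes(dataFilePath);
    try
    {
        using (var wc = new WebClient())
        {
            wc.Headers.Add("Content-Type", "application/xml");
            var receivedBytes = wc.UploadData(uri, "POST", data);
            ParseJobResponse(receivedBytes);
        }
    }
    catch (WebException e)
    {
        // e.Response is null when no HTTP response was received at all
        // (for example, when the host could not be reached).
        var response = e.Response as HttpWebResponse;
        if (response != null)
        {
            Console.Error.WriteLine("Job creation failed with HTTP status {0}.",
                response.StatusCode);
        }
    }
}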

We place all the API URLs and other settings in the Settings class:

public class Settings
{
    public static string Key = "[YOUR BING MAPS KEY]";
    public static string DataflowUri =
        "https://spatial.virtualearth.net/REST/v1/dataflows/geocode";

    // Namespace of the REST responses (job creation and job status).
    public static XNamespace XNamespace =
        "http://schemas.microsoft.com/search/local/ws/rest/v1";
    // Namespace of the geocode feed (input and output data).
    public static XNamespace GeocodeFeedXNamespace =
        "http://schemas.microsoft.com/search/local/2010/5/geocode";
}

To create the job, we need to build the Dataflow URL from the base address, a Bing Maps Key, and parameters such as the input and output formats. We set both formats to XML.

Next, we use a WebClient instance to upload the data with an HTTP POST request. Then, we parse the server response:

private void ParseJobResponse(byte[] response)
{
    using (var stream = new MemoryStream(response))
    {
        var xDoc = XDocument.Load(stream);
        var job = xDoc.Descendants(Settings.XNamespace + "DataflowJob").FirstOrDefault();
        // FirstOrDefault returns null when the response has no DataflowJob node.
        if (job == null) return;
        var linkEl = job.Element(Settings.XNamespace + "Link");
        if (linkEl != null) Link = linkEl.Value;
    }
}

Here, we pass the stream created from the bytes received from the server to the XDocument.Load method. This produces an XDocument instance, which we will use to extract the data we need. We will apply a similar process throughout the article to parse XML content. Note that the appropriate XNamespace needs to be supplied in order to navigate through the document nodes.
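To see why the namespace matters, consider the following small sketch. The XML string is a hypothetical, heavily trimmed stand-in for a server response (the real payload contains more elements, and the link is a placeholder), and NamespaceDemo is an illustrative name rather than part of the project.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceDemo
{
    // Hypothetical, trimmed response used only to illustrate namespace handling.
    public const string SampleXml =
        "<Response xmlns=\"http://schemas.microsoft.com/search/local/ws/rest/v1\">" +
        "<DataflowJob><Link>https://example.invalid/job/123</Link></DataflowJob>" +
        "</Response>";

    // Returns the text of the first DataflowJob/Link element found in the
    // given namespace, or null when no such element exists.
    public static string FindLink(string xml, XNamespace ns)
    {
        var doc = XDocument.Parse(xml);
        var job = doc.Descendants(ns + "DataflowJob").FirstOrDefault();
        if (job == null) return null;
        var link = job.Element(ns + "Link");
        return link == null ? null : link.Value;
    }
}
```

With this sample, FindLink(SampleXml, XNamespace.None) finds nothing and returns null, because the elements live in the default namespace declared on the root. Passing the namespace from the xmlns attribute instead returns the link text.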

You can find a sample of the response inside the Data folder (jobSetupResponse.xml), which shows that the link to the job created is found under a Link element within the DataflowJob node.

Getting job status

Once we have set up a job, we can store the link in a data store, such as a database, and check its status later. The data will be available on the Microsoft servers for up to 14 days after creation.

Let’s see how we can query the job status:

public static JobStatus CheckStatus(string jobUrl)
{
    var result = new JobStatus();
    var uri = String.Format("{0}?output=xml&key={1}", jobUrl, Settings.Key);
    var xDoc = XDocument.Load(uri);
    var job = xDoc.Descendants(Settings.XNamespace + "DataflowJob").FirstOrDefault();
    if (job != null)
    {
        var linkEls = job.Elements(Settings.XNamespace + "Link").ToList();
        foreach (var linkEl in linkEls)
        {
            var nameAttr = linkEl.Attribute("name");
            if (nameAttr != null)
            {
                if (nameAttr.Value == "succeeded") result.SucceededLink = linkEl.Value;
                if (nameAttr.Value == "failed") result.FailedLink = linkEl.Value;
            }
        }
        var statusEl = job.Elements(Settings.XNamespace + "Status").FirstOrDefault();
        if (statusEl != null) result.Status = statusEl.Value;
    }
    return result;
}

As before, to query the API we first need to build the URL. This time, we do it by appending an output parameter and the Bing Maps Key to the job link.

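The response we get from the server stores the job status in a Status element of the DataflowJob node (the jobResponse.xml file inside the Data folder contains an example). The URLs of the results are stored in Link elements of the same node: the link we need has a name attribute with the value succeeded, while the one named failed points to the entries that could not be processed.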
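The JobStatus type is not listed in this article, and a job usually takes some time to finish, so a caller has to check the status more than once. The sketch below shows one possible shape for both. The JobStatus class contains only the members that CheckStatus touches, JobPoller is a hypothetical helper, and the "Pending" status value is an assumption that should be verified against the jobResponse.xml sample or the service documentation.

```csharp
using System;
using System.Threading;

// Minimal sketch of the JobStatus type; only the members CheckStatus uses.
public class JobStatus
{
    public string Status { get; set; }
    public string SucceededLink { get; set; }
    public string FailedLink { get; set; }
}

// Hypothetical helper that polls a status check until the job is no longer pending.
public static class JobPoller
{
    // Assumption: the service reports "Pending" while the job is still running.
    private const string PendingStatus = "Pending";

    // Calls checkStatus up to maxAttempts times, waiting 'delay' between calls,
    // and returns the last status obtained (which may still be pending).
    public static JobStatus WaitForCompletion(
        Func<JobStatus> checkStatus, TimeSpan delay, int maxAttempts)
    {
        JobStatus status = null;
        for (var attempt = 0; attempt < maxAttempts; attempt++)
        {
            status = checkStatus();
            // A missing status is treated as "not finished yet".
            var finished = status != null
                && status.Status != null
                && status.Status != PendingStatus;
            if (finished) break;
            if (attempt < maxAttempts - 1) Thread.Sleep(delay);
        }
        return status;
    }
}
```

Taking the status check as a delegate keeps the helper independent of the network. In the project, it could be called as JobPoller.WaitForCompletion(() => Job.CheckStatus(jobUrl), TimeSpan.FromSeconds(30), 20), assuming CheckStatus is a member of the Job class and with the delay chosen to stay well within the service's usage limits.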

Summary

When it comes to large amounts of data, the Spatial Data Services offer a number of interfaces to store and query user data, and to geocode addresses or reverse geocode geographical coordinates. The services perform these tasks by means of background jobs, which can be set up and queried through REST URLs.
