Querying and Selecting Data

12 min read

(For more resources related to this topic, see here.)

Constructing proper attribute query syntax

The construction of property attribute queries is critical to your success in creating geoprocessing scripts that query data from feature classes and tables. All attribute queries that you execute against feature classes and tables will need to have the correct SQL syntax and also follow various rules depending upon the datatype that you execute the queries against.

Getting ready

Creating the syntax for attribute queries is one of the most difficult and time-consuming tasks that you’ll need to master when creating Python scripts that incorporate the use of the Select by Attributes tool. These queries are basically SQL statements along with a few idiosyncrasies that you’ll need to master. If you already have a good understanding of creating queries in ArcMap or perhaps an experience with creating SQL statements in other programming languages, then this will be a little easier for you. In addition to creating valid SQL statements, you also need to be aware of some specific Python syntax requirements and some datatype differences that will result in a slightly altered formatting of your statements for some datatypes. In this recipe, you’ll learn how to construct valid query syntax and understand the nuances of how different datatypes alter the syntax as well as some Python-specific constructs.

How to do it…

Initially, we’re going to take a look at how queries are constructed in ArcMap, so that you can get a feel of how they are structured.

  1. In ArcMap, open C:ArcpyBookCh8Crime_Ch8.mxd.

  2. Right-click on the Burglaries in 2009 layer and select Open Attribute Table. You should see an attribute table similar to the following screenshot. We’re going to be querying the SVCAREA field:

  3. With the attribute table open, select the Table Options button and then Select by Attributes to display a dialog box that will allow you to construct an attribute query.

    Notice the Select * FROM Burglary WHERE: statement on the query dialog box (shown in the following screenshot). This is a basic SQL statement that will return all the columns from the attribute table for Burglary that meet the condition that we define through the query builder. The asterisk (*) simply indicates that all fields will be returned:

  4. Make sure that Create a new selection is the selected item in the Method dropdown list. This will create a new selection set.

  5. Double-click on SVCAREA from the list of fields to add the field to the SQL statement builder, as follows:

  6. Click on the = button.

  7. Click on the Get Unique Values button.

  8. From the list of values generated, double-click on ‘North’ to complete the SQL statement, as shown in the following screenshot:

  9. Click on the Apply button to execute the query. This should select 7520 records.

    Many people mistakenly assume that you can simply take a query that has been generated in this fashion and paste it into a Python script. That is not the case. There are some important differences that we’ll cover next.

  10. Close the Select by Attributes window and the Burglaries in 2009 table.

  11. Clear the selected feature set by clicking on Selection | Clear Selected Features.

  12. Open the Python window and add the code to import arcpy.

    import arcpy

  13. Create a new variable to hold the query and add the exact same statement that you created earlier:

    qry = "SVCAREA" = 'North'

  14. Press Enter on your keyboard and you should see an error message similar to the following:

    Runtime error SyntaxError: can't assign to literal (<string>, line1)

Python interprets SVCAREA and North as strings but the equal to sign between the two is not part of the string used to set the qry variable. There are several things we need to do to generate a syntactically correct statement for the Python interpreter.

One important thing has already been taken care of though. Each field name used in a query needs to be surrounded by double quotes. In this case, SVCAREA is the only field used in the query and it has already been enclosed by double quotes. This will always be the case when you’re working with shapefiles, file geodatabases, or ArcSDE geodatabases. Here is where it gets a little confusing though. If you’re working with data from a personal geodatabase, the field names will need to be enclosed by square brackets instead of double quotes as shown in the following code example. This can certainly leads to confusion for script developers.

qry = [SVCAREA] = 'North'

Now, we need to deal with the single quotes surrounding ‘North’. When querying data from fields that have a text datatype, the string being evaluated must be enclosed by quotes. If you examine the original query, you’ll notice that we have in fact already enclosed the word North with quotes, so everything should be fine right? Unfortunately, it’s not that simple with Python. Quotes, along with a number of other characters, must be escaped with a forward slash followed by the character being escaped. In this case, the escape sequence would be ‘.

  1. Alter your query syntax to incorporate the escape sequence:

    qry = "SVCAREA" = 'North'

  2. Finally, the entire query statement should be enclosed with quotes:

    qry = '"SVCAREA" = 'North''

In addition to the = sign, which tests for equality, there are a number of additional operators that you can use with strings and numeric data, including not equal (> <), greater than (<), greater than or equal to (<=), less than (>), and less than or equal to (>=).

Wildcard characters including % and _ can also be used for shapefiles, file geodatabases, and ArcSDE geodatabases. These include % for representing any number of characters. The LIKE operator is often used with wildcard characters to perform partial string matching. For example, the following query would find all records with a service area that begins with N and has any number of characters after.

qry = '"SVCAREA" LIKE 'N%''

The underscore character (_) can be used to represent a single character. For personal geodatabases the asterisk (*) is used to represent a wildcard character for any number of characters, while (?) represents a single character.

You can also query for the absence of data, also known as NULL values. A NULL value is often mistaken for a value of zero, but that is not the case. NULL values indicate the absence of data, which is different from a value of zero. Null operators include IS NULL and IS NOT NULL. The following code example will find all records where the SVCAREA field contains no data:


The final topic that we’ll cover in this section are operators used for combining expressions where multiple query conditions need to be met. The AND operator requires that both query conditions be met for the query result to be true, resulting in selected records. The OR operator requires that at least one of the conditions be met.

How it works…

The creation of syntactically correct queries is one of the most challenging aspects of programming ArcGIS with Python. However, once you understand some basic rules, it gets a little easier. In this section, we’ll summarize these rules. One of the more important things to keep in mind is that field names must be enclosed with double quotes for all datasets, with the exception of personal geodatabases, which require braces surrounding field names.

There is also an AddFieldDelimiters() function that you can use to add the correct delimiter to a field based on the datasource supplied as a parameter to the function. The syntax for this function is as follows:


Additionally, most people, especially those new to programming with Python, struggle with the issue of adding single quotes to string values being evaluated by the query. In Python, quotes have to be escaped with a single forward slash followed by the quote. Using this escape sequence will ensure that Python does in fact see that as a quote rather than the end of the string.

Finally, take some time to familiarize yourself with the wildcard characters. For datasets other than personal geodatabases, you’ll use the (%) character for multiple characters and an underscore (_) character for a single character. If you’re using a personal geodatabase, the (*) character is used to match multiple characters and the (?) character is used to match a single character. Obviously, the syntax differences between personal geodatabases and all other types of datasets can lead to some confusion.

Creating feature layers and table views

Feature layers and table views serve as intermediate datasets held in memory for use specifically with tools such as Select by Location and Select Attributes. Although these temporary datasets can be saved, they are not needed in most cases.

Getting ready

Feature classes are physical representations of geographic data and are stored as files (shapefiles, personal geodatabases, and file geodatabases) or within a geodatabase. ESRI defines a feature class as “a collection of features that shares a common geometry (point, line, or polygon), attribute table, and spatial reference.”

Feature classes can contain default and user-defined fields. Default fields include the SHAPE and OBJECTID fields. These fields are maintained and updated automatically by ArcGIS. The SHAPE field holds the geometric representation of a geographic feature, while the OBJECTID field holds a unique identifier for each feature. Additional default fields will also exist depending on the type of feature class. A line feature class will have a SHAPE_LENGTH field. A polygon feature class will have both, a SHAPE_LENGTH and a SHAPE_AREA field.

Optional fields are created by end users of ArcGIS and are not automatically updated by GIS. These contain attribute information about the features. These fields can also be updated by your scripts.

Tables are physically represented as standalone DBF tables or within a geodatabase. Both, tables and feature classes, contain attribute information. However, a table contains only attribute information. There isn’t a SHAPE field associated with a table, and they may or may not contain an OBJECTID field.

Standalone Python scripts that use the Select by Attributes or Select by Location tool require that you create an intermediate dataset rather than using feature classes or tables. These intermediate datasets are temporary in nature and are called Feature Layers or Table Views. Unlike feature classes and tables, these temporary datasets do not represent actual files on disk or within a geodatabase. Instead, they are “in memory” representations of feature classes and tables. These datasets are active only while a Python script is running. They are removed from memory after the tool has executed. However, if the script is run from within ArcGIS as a script tool, then the temporary layer can be saved either by right-clicking on the layer in the table of contents and selecting Save As Layer File or simply by saving the map document file.

Feature layers and table views must be created as a separate step in your Python scripts, before you can call the Select by Attributes or Select by Location tools. The Make Feature Layer tool generates the “in-memory” representation of a feature class, which can then be used to create queries and selection sets, as well as to join tables. After this step has been completed, you can use the Select by Attributes or Select by Location tool. Similarly, the Make Table View tool is used to create an “in-memory” representation of a table. The function of this tool is the same as Make Feature Layer. Both the Make Feature Layer and Make Table View tools require an input dataset, an output layer name, and an optional query expression, which can be used to limit the features or rows that are a part of the output layer. In addition, both tools can be found in the Data Management Tools toolbox.

The syntax for using the Make Feature Layer tool is as follows:

arcpy.MakeFeatureLayer_management(<input feature layer>, <output layer
name>,{where clause})

The syntax for using the Make Table View tool is as follows:

Arcpy.MakeTableView_management(<input table>, <output table name>,
{where clause})

In this recipe, you will learn how to use the Make Feature Layer and Make Table View tools. These tasks will be done inside ArcGIS, so that you can see the in-memory copy of the layer that is created.

How to do it…

Follow these steps to learn how to use the Make Feature Layer and Make Table View tools:

  1. Open c:ArcpyBookCh8Crime_Ch8.mxd in ArcMap.

  2. Open the Python window.

  3. Import the arcpy module:

    import arcpy

  4. Set the workspace:

    arcpy.env.workspace = "c:/ArcpyBook/data/CityOfSanAntonio.gdb"

  5. Start a try block:


  6. Make an in-memory copy of the Burglary feature class using the Make Feature Layer tool. Make sure you indent this line of code:

    flayer = arcpy.MakeFeatureLayer_management("Burglary","Burglary_

  7. Add an except block and a line of code to print an error message in the event of a problem:

    print "An error occurred during creation"

  8. The entire script should appear as follows:

    import arcpy
    arcpy.env.workspace = "c:/ArcpyBook/data/CityOfSanAntonio.gdb"
    flayer = arcpy.MakeFeatureLayer_management("Burglary","Burglary_
    print "An error occurred during creation"

  9. Save the script to c:ArcpyBookCh8CreateFeatureLayer.py.

  10. Run the script. The new Burglary_Layer file will be added to the ArcMap table of contents:

  11. The Make Table View tool functionality is equivalent to the Make Feature Layer tool. The difference is that it works against standalone tables instead of feature classes.

  12. Remove the following line of code:

    flayer = arcpy.MakeFeatureLayer_management("Burglary","Burglary_

  13. Add the following line of code in its place:

    tView = arcpy.MakeTableView_management("Crime2009Table",

  14. Run the script to see the table view added to the ArcMap table of contents.

How it works…

The Make Feature Layer and Make Table View tools create in-memory representations of feature classes and tables respectively. Both the Select by Attributes and Select by Location tools require that these temporary, in-memory structures be passed in as parameters when called from a Python script. Both tools also require that you pass in a name for the temporary structures.

There’s more…

You can also apply a query to either the Make Feature Layer or Make Table View tools to restrict the records returned in the feature layer or table view. This is done through the addition of a where clause when calling either of the tools from your script. This query is much the same as if you’d set a definition query on the layer through Layer Properties | Definition Query.

The syntax for adding a query is as follows:

MakeFeatureLayer(in_features, out_layer, where_clause)
MakeTableView(in_table, out_view, where_clause)

Subscribe to the weekly Packt Hub newsletter

* indicates required


Please enter your comment!
Please enter your name here