Getting started with Google Data Studio: An intuitive tool for visualizing BigQuery Data

0
657
7 min read

Google Data Studio is one of the most popular tools for visualizing data. It can be used to pull data directly out of Google’s suite of marketing tools, including Google Analytics, Google AdWords, and Google Search Console. It also supports connectors for database tools such as PostgreSQL and BigQuery, it can be accessed at datastudio.google.com. In this article, we will learn to visualize BigQuery Data with Google Data Studio.

This article is an excerpt from the book, Learning Google BigQuery, written by Thirukkumaran Haridass and Eric Brown. This book will serve as a comprehensive guide to mastering BigQuery, and utilizing it to get useful insights from your Big Data.

The following steps explain how to get started in Google Data Studio and access BigQuery data from Data Studio:

  1. Setting up an account: Account setup is extremely easy for Data Studio. Any user with a Google account is eligible to use all Data Studio features for free:

Google Data Studio Account set up

  1. Accessing BigQuery data: Once logged in, the next step is to connect to BigQuery. This can be done by clicking on the DATA SOURCES button on the left-hand-side navigation:

Google Data Studio Beta

  1. You’ll be prompted to create a data source by clicking on the large plus sign to the bottom-right of the screen.
  2. On the right-hand-side navigation, you’ll get a list of all of the connectors available to you. Select BigQuery:

Connectors

  1. At this point, you’ll be prompted to select from your projects, shared projects, a custom query, or public datasets. Since you are querying the Google Analytics BigQuery Export test data, select Custom Query.
  2. Select the project you would like to use.
  3. In the Enter Custom Query prompt, add this query and click on the Connect button on the top right:
SELECT trafficsource.medium as Medium, COUNT(visitId) as Visits

FROM `google.com:analytics- bigquery.LondonCycleHelmet.ga_sessions_20130910` GROUP BY Medium

This query will pull the count of sessions for traffic source mediums for the Google Analytics account that has been exported.

  1. The next screen shows the schema of the data source you have created. Here, you can make changes to each field of your data, such as changing text fields to date fields or creating calculated metrics:

BigQuery

  1. Click on Create Report.
  2. Then click on Add to Report. At this point, you will land on your report dashboard. Here, you can begin to create charts using the data you’ve just pulled from BigQuery. Icons for all the chart types available are shown near the top of the page.
  3. Hover over the chart types and click on the chart labeled Bar Chart; then in the grid, hold your right-click button to draw a rectangle.

A bar chart should appear, with the Traffic Source Medium and Visit data from the query you ran:

Untitled Report

A properties prompt should also show on the right-hand side of the page:

Bar Chart properties

Here, a number of properties can be selected for your chart, including the dimension, metric, and many style settings. Once you’ve completed your first chart, more charts can be added to a single page to show other metrics if needed.

For many situations, a single bar graph will answer the question at hand. Some situations may require more exploration. In such cases, an analyst might want to know whether the visit metric influences other metrics such as the number of transactions. A scatterplot with visits on the x axis and transactions on the y axis can be used to easily visualize this relationship.

Making a scatterplot in Data Studio

The following steps show how to make a scatterplot in Data Studio with the data from BigQuery:

  1. Update the original query by adding the transaction metric. In the edit screen of your report, click on the bar chart to bring up the chart options on the right-hand- side navigation. Click on the pencil icon next to the data source titled BigQuery to edit the data source.
  2. Click on the left-hand-side arrow icon titled Edit Connection:

BigQuery3In the dialog titled Enter Custom Query, add this query:

SELECT trafficsource.medium as Medium, COUNT(visitId) as Visits, SUM(totals.transactions) AS Transactions FROM `google.com:analytics-

bigquery.LondonCycleHelmet.ga_sessions_20130910` GROUP BY Medium
  1. Click on the button titled Reconnect in order to reprocess the query. A prompt should emerge, asking whether you’d like to add a new field titled Transactions. Click on Apply.
  2. Click on Done.
  3. Once you return to the report edit screen, click on the Scatter Chart button() and use your mouse to draw a square in the report space:

Visits

The report should autoselect the two metrics you’ve created.

  1. Click on the chart to bring up the chart edit screen on the right-hand-side navigation; then click on the Style tab.
  2. Click on the dropdown under the Trendline option and select Linear to add a linear trend line, also known as linear regression line. The graph will default to blue, so use the pencil icon on the right to select red as the line color:

Trendline

Visits

Making a map in Data Studio

Data Studio includes a map chart type that can be used to create simple maps. In order to create maps, a map dimension will need to be included in your data, along with a metric. Here, we will use the Google BigQuery public dataset for Medicare data. You’ll need to create a new data source:

  1. Accessing BigQuery data: Once logged in, the next step is to connect to BigQuery. This can be done by clicking on the DATA SOURCES button on the left-hand-side navigation.
  2. You’ll be prompted to create a data source by clicking on the large plus sign to the bottom-right of the screen.
  3. On the right-hand-side navigation, you’ll get a list of all of the connectors available to you. Select BigQuery.
  4. At this point, you’ll be prompted to select from your projects, shared projects, a custom query, or public datasets. Since you are querying the Google Analytics BigQuery Export test data, select Custom Query.
  5. Select the project you would like to use.
  6. In the Enter Custom Query prompt, add this query and click on the Connect button on the top right:
SELECT CONCAT(provider_city,", ",provider_state) city, AVG(average_estimated_submitted_charges) avg_sub_charges FROM `bigquery-public-data.medicare.outpatient_charges_2014`

WHERE apc = '0267 - Level III Diagnostic and Screening Ultrasound' GROUP BY 1

ORDER BY 2 desc

This query will pull the average of submitted charges for diagnostic ultrasounds by city in the United States. This is the most submitted charge in the 2014 Medicaid data.

  1. The next screen shows the schema of the data source you have created. Here, you can make changes to each field of your data, such as changing text fields to date fields or creating calculated metrics:

Outpatient charges

  1. Click on Create Report.
  2. Then click on Add to Report. At this point, you will land on your report dashboard. Here, you can begin to create charts using the data you’ve just pulled from BigQuery. Icons for all the chart types available are shown near the top of the page.
  3. Hover over the chart types and click on the chart labeled Map  Chart; then in the grid, hold your right-click button to draw a rectangle.
  4. Click on the chart to bring up the Dimension Picker on the right-hand-side navigation, and click on Create New Dimension:

Create New Dimension

Right click on the City dimension and select the Geo type and City subtype. Here, we can also choose other sub-types (Latitude, Longitude, Metro, Country, and so on).

City dimension

Data Studio will plot the top 500 rows of data (in this case, the top 500 cities in the results set).

Hovering over each city

Hovering over each city brings up detailed data:

Lancaster Sub changes

Data Studio can also be used to roll up geographic data. In this case, we’ll roll city data up to state data.

  1. From the edit screen, click on the map to bring up the Dimension Picker and click on Create New Dimension in the right-hand-side navigation.
  2. Right-click on the City dimension and select the Geo type and Region subtype. Google uses the term Region to signify states:

City Dimensions

Once completed, the map will be rolled up to the state level instead of the city level. This functionality is very handy when data has not been rolled up prior to being inserted into BigQuery:

Final Map in BigQuery

Other features of Data Studio

  • Filtering: Filtering can be added to your visualizations based on dimensions or metrics as long as the data is available in the data source
  • Data joins: Data for multiple sources can be joined to create new, calculated metrics
  • Turnkey integrations with many Google Marketing Suite tools such as Adwords and Search Console

We explored various features of Google Data Studio and learnt to use them for visualizing BigQuery data.To know about other third party tools for reporting and visualization purpose such as R and Tableau, check out the book Learning Google BigQuery.

Learning Google BigQuery

Read Next:

Getting Started with Data Storytelling

What is Seaborn and why should you use it for data visualization?

Pandas is an effective tool to explore and analyze data – Interview Insights


Subscribe to the weekly Packt Hub newsletter

* indicates required

LEAVE A REPLY

Please enter your comment!
Please enter your name here