Adswerve

How to Connect Google Analytics with AppEngine (Python) and d3js: Another Heatmap Tutorial


January 19, 2015

In my last post I showed how to connect Google Analytics with R. In this post we will build another heat map, this time as a Google AppEngine (Python) application connected to a Google Analytics account.

This post will give you a starting point for developing Google Analytics applications in AppEngine. You will learn how to create an AppEngine application and access Google Analytics data from it. The resources used in this post are publicly available and linked. To make it easier for you, I have created a GitHub repository for this project, which you can use as a starting point for projects of your own. Commits in the GitHub repository correspond to the steps in this post.

The final (enhanced) product is available at http://ga-heatmap.appspot.com. Go ahead and try it out.

d3js heatmap of traffic


What is Google AppEngine?

Google AppEngine is a platform for developing and hosting scalable web applications. It supports languages such as Python, PHP, Java and Go.

What is d3js?
d3js is a JavaScript library that is most often used to visualize data.

1. Set up Google AppEngine application

  1. Create a new application in Google AppEngine. To do this go to http://appengine.google.com.
    Create AppEngine Application

    Create a new AppEngine application.


  2. If you do not have it installed yet, download and install the Google AppEngine SDK for Python from https://cloud.google.com/appengine/downloads.
  3. When you’re done installing the SDK, open it and create a new application with the same “Application ID” that you used in the AppEngine web interface.
    New Application

    New Application ID

    Use the same Application ID as online.

  4. Run the app and make sure you can see the default “Hello World” page on localhost at the selected port.
  5. Your app has been successfully set up. Navigate to the folder where the application was created. You specified this location as the Application Directory when you created the application.

2. Import Google Analytics API libraries

This is what your application directory should look like.

Application directory after step 2.

First we need to connect our application with the Google APIs Client Library for Python. To do this we need to add four modules to our application. Three of them (apiclient, oauth2client, uritemplate) are available as part of the Google API Client Library for Python here.

  1. Download the zip file
  2. Copy the apiclient, oauth2client and uritemplate folders into your application folder.

We also need to include httplib2.

  1. Download the zip file
  2. Copy the httplib2 folder into your application folder.

* To make it easier to start your own application, these libraries are included in the GitHub repository.
** The libraries can also be installed with pip install, but because of the way AppEngine works you need to make sure that all four modules end up in the application folder.
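
A missing folder usually only shows up later as an import error, so a quick check can save some time. The following is a minimal sketch and not part of the original project; the function name missing_libraries is hypothetical, and the only assumption is that the four folders listed above sit directly inside the application folder.

```python
import os

# The four packages this tutorial expects to find in the application folder.
REQUIRED_LIBRARIES = ("apiclient", "oauth2client", "uritemplate", "httplib2")


def missing_libraries(app_dir):
    """Return the names of required library folders absent from app_dir."""
    return [name for name in REQUIRED_LIBRARIES
            if not os.path.isdir(os.path.join(app_dir, name))]
```

Calling missing_libraries with the path of your application folder should return an empty list once all four folders are in place.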

3. Client Secrets File

Select APIs from left menu


To connect our application with the Google Analytics API, we also need to enable the API for the project in the Google Developers Console and then store the client secret file within our AppEngine application.

  1. Navigate to https://console.developers.google.com/project
  2. From the list of projects select the one you have created in the AppEngine’s web interface (in our case ga-heatmap)
  3. In the left menu select APIs and enable Google Analytics 
  4. Now click Credentials in the left menu
  5. Click Create new Client ID
  6. Select Web Application (you may also need to configure a consent screen in this step)
  7. Under Authorized JavaScript origins, enter the local and deployed URLs of your application. In our case that would be http://localhost:8080 and http://ga-heatmap.appspot.com. The appspot subdomain is created for you automatically based on your AppEngine application ID. Authorized redirect URIs should also be filled in automatically; if they are not, use the same URIs as in the previous field and add “/oauth2callback” to each.
    Creating a Web Application client ID


  8. Once you create the client ID, download the client secret JSON
    Download Client Secret JSON


  9. Rename it to client_secret.json and add it to the application directory.
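
If authorization fails later, the downloaded file is a good first place to look. Below is a minimal sketch of a sanity check, written under the assumption that a Web Application client secret file stores its fields (client_id, client_secret, redirect_uris) under a top-level "web" key; compare this with your own file before relying on it. The function name check_client_secret is hypothetical.

```python
import json


def check_client_secret(path, expected_origins):
    """Return a list of problems found in a downloaded client secret file."""
    with open(path) as handle:
        secrets = json.load(handle)

    # Assumption: Web Application credentials live under the "web" key.
    web = secrets.get("web")
    if web is None:
        return ["no 'web' section: was a Web Application client ID created?"]

    problems = []
    for key in ("client_id", "client_secret"):
        if not web.get(key):
            problems.append("missing " + key)

    # Each origin should have a matching /oauth2callback redirect URI.
    redirects = web.get("redirect_uris", [])
    for origin in expected_origins:
        callback = origin.rstrip("/") + "/oauth2callback"
        if callback not in redirects:
            problems.append("missing redirect URI " + callback)
    return problems
```

For this tutorial you would call it with 'client_secret.json' and the two origins from step 7; an empty list means nothing obvious is missing.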

3. Google Analytics User Authorization

After setting everything up, we’re finally ready to do some coding. In this step we will mostly be following the Google API Client Library Guide.
Don’t worry if you don’t understand all of the code. Again, all of it is available in the GitHub repository and the Google API Guide.

  1. Open main.py (finally) in your favorite text editor.
  2. Add the decorator’s callback path and handler to the route list of your WSGIApplication, so that it looks something like this:
    app = webapp2.WSGIApplication([
        ('/', MainHandler),
        (decorator.callback_path, decorator.callback_handler()),
    ], debug=True)
    
  3. Now create a Google Analytics decorator and service. Place them before the MainHandler class. For this step we also import functions from the libraries we have added:
    import os
    from apiclient.discovery import build
    from google.appengine.ext import webapp
    from oauth2client.appengine import OAuth2DecoratorFromClientSecrets
    
    decorator = OAuth2DecoratorFromClientSecrets(
        os.path.join(os.path.dirname(__file__), 'client_secret.json'),
        'https://www.googleapis.com/auth/analytics.readonly')
    
    service = build('analytics', 'v3')
    
  4. Use the OAuth decorator on the main get method like so:
    class MainHandler(webapp2.RequestHandler):
      @decorator.oauth_required
      def get(self):
        self.response.write('Hello world!')
    
    
  5. Navigate to http://localhost:8080 (or your selected port). If you get the Google Accounts sign-in page, you’re on the right track. Awesome job!
  6. Once a user authorizes the application to access their Google Analytics account, the application can start reading the data.

*For more Google Analytics API calls go to https://developers.google.com/apis-explorer/#p/analytics/v3/.

4. Get the data

To keep this example simple, we will assume that the Google Analytics view ID is passed as a parameter in the URL, for example http://localhost:8080?viewId=12345678. Usually we would use an HTML select element to make it easy for the user to pick a view.

To learn how to get a specific view id look at the 4th step of the GA and R heatmap blog post.

  1. Import httplib2. You will also need os (skip it if you already imported it in the previous step):
    import os
    import httplib2
    
  2. Inside the MainHandler’s get method, construct an http object like so:
    http = decorator.http()
  3. Now it is finally time for the query. We will query the sessions metric with the hour and dayOfWeek dimensions for the first week of December 2014.
    report = service.data().ga().get(
        ids='ga:%s' % self.request.get("viewId"),
        metrics='ga:sessions',
        dimensions='ga:hour,ga:dayOfWeek',
        start_date='2014-12-01',
        end_date='2014-12-07').execute(http)
  4. The report variable now holds a dictionary of the GA data for the selected date range, dimensions and metrics. To learn more about its structure, see Google’s reference.
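
Because the view ID comes straight from the URL, it can be worth checking it before calling the API. The sketch below shows one way to do that; the helper names build_view_ids and rows_from_report are hypothetical, and two things are assumptions rather than facts from this post: that view IDs are purely numeric, and that the report dictionary may lack the rows key when the query matches no sessions.

```python
import re


def build_view_ids(raw_view_id):
    """Turn the viewId URL parameter into the 'ga:<id>' form, or return None."""
    candidate = (raw_view_id or "").strip()
    # Assumption: view IDs are numeric; reject anything else early.
    if not re.match(r"^[0-9]+$", candidate):
        return None
    return "ga:" + candidate


def rows_from_report(report):
    """Return the report rows, or an empty list when the key is missing."""
    return report.get("rows", [])
```

Inside the get method you could pass build_view_ids(self.request.get("viewId")) as the ids argument and show an error message when it returns None.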

Finally, we have all the data stored in a variable in our application. Now let’s clean it and display it using d3js.

5. Clean the data

The data that we will use is in the rows key of the report dictionary, as a Python list in the following format:

[ [hour, dayOfWeek, sessions], [hour, dayOfWeek, sessions], ...]

Hours start at 0 and go to 23 (corresponding to military time), and days of the week start at 0 (Sunday) and go to 6 (Saturday).

The format we need the data in for the d3js visualization that we will be using is:

[{day: <day>, hour:<hour>, value:<sessions>}, {day: <day>, hour:<hour>,  value:<sessions>}… ]

In this case days start at 1, and so do the hours.

So we need to transform this list of lists:
[[0, 0, 12], [0, 1, 13]…]

to this list of dictionaries:
[{day:1, hour:1, value:12}, {day:2, hour:1, value:13}]

To do this, let’s add the following code after we assign the GA data to the report variable.

    cleanedData = []
    for row in report['rows']:
        # row is [hour, dayOfWeek, sessions]; shift hour and day to start at 1
        rowDictionary = {"day": int(row[1]) + 1, "hour": int(row[0]) + 1, "value": int(row[2])}
        cleanedData.append(rowDictionary)
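
The same transformation can be tried outside of AppEngine. The following is a small standalone sketch; the function name clean_rows and the sample rows are made up for illustration, and the sample assumes the API delivers each value as a string (which is why int() is applied to every field).

```python
def clean_rows(rows):
    """Convert [hour, dayOfWeek, sessions] rows into heatmap dictionaries."""
    cleaned = []
    for hour, day_of_week, sessions in rows:
        cleaned.append({
            "day": int(day_of_week) + 1,   # 0-6 becomes 1-7
            "hour": int(hour) + 1,         # 0-23 becomes 1-24
            "value": int(sessions),
        })
    return cleaned


# Hypothetical sample rows, not real Google Analytics data.
sample_rows = [["00", "0", "12"], ["00", "1", "13"], ["23", "6", "7"]]
cleaned_sample = clean_rows(sample_rows)
```

Wrapping the loop in a function like this also makes it easy to test the conversion without a Google Analytics account.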

6. Use the data to construct a cool looking Heatmap using d3js

For this step we will use the code from http://bl.ocks.org/tjdecke/5558084, but instead of the .tsv file with preloaded values we are going to use cleanedData.

  1. Save the HTML file from the d3js heatmap tutorial in the application folder as index.html.
  2. To get the cleaned data from the Python script into the HTML we will use the jinja2 library. To import and set up jinja2, add the following two lines before the MainHandler class.
    import jinja2
    JINJA_ENVIRONMENT = jinja2.Environment(loader=jinja2.FileSystemLoader(os.path.dirname(__file__)), autoescape=True, extensions=['jinja2.ext.autoescape'])
  3. You will also need to change the app.yaml file. Open app.yaml in your application directory and add the following lines to the libraries section at the end of the file (the last two rows in the default app.yaml).
    - name: jinja2
      version: latest
    
  4. Now, inside MainHandler, at the end of the get method, reference the index.html file and pass the data to it like so:
        template = JINJA_ENVIRONMENT.get_template('index.html')
        self.response.write(template.render({'cleanedData': cleanedData}))
  5. In the index.html file, remove the part where the data is read from the .tsv file and replace it with the “cleanedData” variable. To access variables passed from Python inside the HTML, use double curly braces: “{{ cleanedData }}”.
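
One thing to watch for: with autoescape enabled, jinja2 escapes quote characters, so printing a Python list of dictionaries directly into a script block may not produce valid JavaScript. A possible workaround, shown here as a sketch rather than as what the repository does, is to serialize the data to JSON in Python first. The helper name to_template_json is hypothetical.

```python
import json


def to_template_json(cleaned_data):
    """Serialize the cleaned rows to a JSON string for the template."""
    # sort_keys keeps the output stable, which makes it easier to compare.
    return json.dumps(cleaned_data, sort_keys=True)


# Hypothetical usage inside the get method:
#   template.render({'cleanedData': to_template_json(cleanedData)})
# and in index.html, assuming the values are numbers only:
#   var data = {{ cleanedData|safe }};
example = to_template_json([{"day": 1, "hour": 1, "value": 12}])
```

The safe filter tells jinja2 not to escape the string, which is only reasonable here because the data consists of integers produced by our own code.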

7. Et Voila

Navigate to http://localhost:8080?viewId=[viewId] and you should see something like this:

d3js heatmap of traffic

d3js heatmap of traffic, based on the data from the user’s GA account

Test it out at http://ga-heatmap.appspot.com/.

Look through all the code for the steps on GitHub. Feel free to fork it and work on your own GA project using this one as a base. Let us know in the comments if you have any questions or suggestions.