May 30, 2013
 

Some reports available in the web frontend of Google AdWords cannot be downloaded using the AdWords API. One of them is the billing summary of the clients of an MCM account, which includes the total budget, the total spent and the total remaining. With this information, a monitoring system can automatically generate an alert when the remaining budget falls below a given amount.

This post explains how to automate the login process and the download of that report using a Perl script.

0. Prerequisites

The procedure explained in this post requires the following CPAN modules to be installed:

  • WWW::Mechanize::GZip
  • LWP::Protocol::https

To install those modules on a Debian or Ubuntu system, the SSL development libraries (typically the libssl-dev package) need to be installed first.

1. Login process

When the URL http://adwords.google.com is requested from a browser without an active AdWords session, we are redirected to the login form:

[Screenshot: AdWords login form]

After entering the username and password, a POST request is sent to the URL that validates the login data (https://accounts.google.com/ServiceLoginAuth). The browser then follows a chain of 302 redirects through some other URLs until the AdWords home page is loaded.

In this process, the browser receives the cookies required to access the adwords frontend as a validated user.

This whole process can be automated in Perl using the WWW::Mechanize::GZip module available from CPAN.
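The login steps described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function name adwords_login is made up for this example, the form field names 'Email' and 'Passwd' are an assumption about the login form, and it assumes that plain HTTP redirects are enough to reach the home page.

```perl
use strict;
use warnings;
use WWW::Mechanize::GZip;

# Log in to AdWords and return the Mechanize object (which holds the
# session cookies) together with the HTML of the page reached at the end.
sub adwords_login {
    my ($email, $password) = @_;

    my $mech = WWW::Mechanize::GZip->new(
        cookie_jar => {},   # in-memory cookie jar for the session cookies
        autocheck  => 1,    # die on any failed request
    );

    # Without a session, this request ends up at the login form.
    $mech->get('http://adwords.google.com');

    # Assumption: the login form fields are named 'Email' and 'Passwd'.
    # The form is submitted with a POST and the redirects are followed.
    $mech->submit_form(
        with_fields => { Email => $email, Passwd => $password },
    );

    return ($mech, $mech->content);
}
```

A call such as my ($mech, $home_html) = adwords_login($email, $password); would then give us the object to reuse for the later requests and the HTML of the home page.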

The AdWords home page contains a token that will be required later to download the report. The home page also contains the values of ‘effectiveUserId’ and ‘customerId’ (often used in URLs as the arguments ‘__u’ and ‘__c’). These values are easy to extract from the retrieved HTML code using regular expressions.
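As an illustration of the extraction, here is a small sketch. The HTML fragment is invented for the example and the "name":"value" layout is an assumption, so the pattern has to be adapted to the markup actually returned by the home page; extract_value is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical fragment standing in for the HTML of the home page.
# The real markup may differ, so the pattern below is an assumption.
my $home_html = <<'HTML';
<script>var cfg = {"token":"AbCdEf123:456","effectiveUserId":"1111","customerId":"2222"};</script>
HTML

# Return the value of a quoted "name":"value" pair, or undef if absent.
sub extract_value {
    my ($html, $name) = @_;
    return $html =~ /"\Q$name\E"\s*:\s*"([^"]*)"/ ? $1 : undef;
}

my $token = extract_value($home_html, 'token');
my $u     = extract_value($home_html, 'effectiveUserId');
my $c     = extract_value($home_html, 'customerId');

die "token not found in the home page\n" unless defined $token;
print "token=$token __u=$u __c=$c\n";
```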

2. Downloading the budget summary report

In the home page, the “Budget” tab gives access to the budget summary report for the clients managed by the MCM account.

There is also a button to download the report as a CSV file:

[Screenshot: AdWords budget summary report]

Using Firebug to analyze the requests sent by the browser when the report download button is pressed, we can see that a POST request is sent to the URL “https://adwords.google.com/mcm/file/ClientSummary”. The request includes a “token” and an “mcsSelector” parameter. The value of “token” is the one retrieved previously from the home page, and the value of “mcsSelector” specifies the format of the report to be downloaded.

Again, we can reproduce this request in our script using Mechanize.
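The request could be reproduced along these lines. This is only a sketch: download_client_summary is a name chosen for the example, $mech is assumed to be the logged-in Mechanize object, and the value of mcsSelector is not reproduced here, so it has to be copied from the request observed in Firebug.

```perl
use strict;
use warnings;

# Request the client summary report and return the response body.
# $mech is assumed to be an already logged-in WWW::Mechanize::GZip object,
# $token the value taken from the home page, and $mcs_selector the value
# of the mcsSelector parameter copied from the request seen in Firebug.
sub download_client_summary {
    my ($mech, $token, $mcs_selector) = @_;

    $mech->post(
        'https://adwords.google.com/mcm/file/ClientSummary',
        { token => $token, mcsSelector => $mcs_selector },
    );

    # With autocheck enabled Mechanize dies on its own when the request
    # fails; this check covers objects created with autocheck disabled.
    die 'Report download failed with status ', $mech->status, "\n"
        unless $mech->success;

    return $mech->content;
}
```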

Note that the report is received encoded as “UTF-16 little endian” and has to be converted to “UTF-8”.
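The conversion can be sketched with the core Encode module. The sample bytes below are made up for the example and stand in for the downloaded report; stripping a leading byte order mark is an extra precaution, not something the report is known to need.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Sample bytes standing in for the downloaded report: "A<TAB>B<NL>"
# encoded as UTF-16 little endian (each character followed by a zero byte).
my $raw_report = "A\x00\t\x00B\x00\n\x00";

# Decode the UTF-16LE bytes into Perl characters.
my $text = decode('UTF-16LE', $raw_report);

# Drop a leading byte order mark, in case there is one.
$text =~ s/^\x{FEFF}//;

# Encode the characters as UTF-8 bytes, ready to be saved or parsed.
my $utf8_report = encode('UTF-8', $text);
```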

3. Processing the report

Finally, we will process the report to implement the alert on low remaining budget. The received data are in TSV (tab separated values) format, and there are many possible ways to parse this format in Perl.
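One possible way to do it is sketched below. The column names (Client, Budget, Spent, Remaining), the sample rows and the threshold are all invented for the example; the real report may use different headers and number formats, so the code has to be adapted accordingly.

```perl
use strict;
use warnings;

# Alert threshold for the remaining budget (arbitrary example value).
my $threshold = 100;

# Hypothetical report contents, already converted to UTF-8.
# The column names are assumptions, not the real report headers.
my $report = join "\n",
    join("\t", 'Client',   'Budget',  'Spent',  'Remaining'),
    join("\t", 'Client A', '1000.00', '950.00', '50.00'),
    join("\t", 'Client B', '2000.00', '500.00', '1500.00'),
    '';

# Split into non-empty lines and take the first one as the header.
my @lines  = grep { length } split /\r?\n/, $report;
my @header = split /\t/, shift @lines;

my @alerts;
for my $line (@lines) {
    # Map each field to its column name; -1 keeps trailing empty fields.
    my %row;
    @row{@header} = split /\t/, $line, -1;

    # Assumes plain decimal numbers without thousands separators.
    push @alerts, $row{Client} if $row{Remaining} < $threshold;
}

print "Low budget remaining: $_\n" for @alerts;
```

For more complex files, a CPAN module such as Text::CSV with a tab separator would be a sturdier choice than split.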

References:

http://stackoverflow.com/questions/15071327/no-content-disposition-header-in-response-from-mechanize

 Posted by at 6:36 am
