I built a tool for a customer that grabs info from Google Analytics and inserts the data into a database. Why they wanted to do this is beyond me. They can actually get the exact same info from GA. But who am I to argue with a paying customer?
Anyway, back when I created this, the only way to get the data was to have Google email it and then extract it from the XML file that was attached to the email. Then Google changed how they did the emails and the tool broke. The customer didn't tell me for almost 8 months that it wasn't working. When they finally did, I told them we should upgrade to the new API. What a mess!
Thanks to Jen's Bits I got a good start on how to do it. I didn't have to figure out the authentication portion. After that, though, it became dicey. Jen's example shows how to get a list of the accounts and the overall visitor count but nothing more. I had to go back and wade through the GA documentation, which is not easy. They do things in a really squirrelly way, like geeks gone mad trying to come up with a new way to do APIs. Please, geeks, stick with what is already known and make it easier on us working stiffs. I hate having to figure out what you are talking about when you say "segmentation" and "ids". Spell it out with examples. Then we have the problem of tacking on ga: to the front of all the variables.
The first thing to figure out is that there are 2 basic kinds of things you can ask for in a GA API query: metrics and dimensions. Metrics you can probably figure out, but "dimensions"??? Turns out you can mix and match what you want to pull up, but only in certain combinations. Some metrics cannot be combined with certain dimensions, and vice versa.
And then you have to put in filters. They MUST be URL encoded, and the documentation shows you the encoded operators in a table.
So to pull unique visits for a specific page you have to do something like this:
https://www.google.com/analytics/feeds/data?ids=ga:1234&metrics=ga:newVisits&dimensions=ga:PagePath&filters=ga:PagePath%3D%3D/blog/blogpage.cfm&start-date=#startDate#&end-date=#endDate#
The dates must be formatted like yyyy-mm-dd.
The ids=ga:1234 part is the id for the website you are accessing. Where do you get that? From the basic account call that you can find at Jen's Bits. It is in the response; you just have to pull it out, which is not part of the code given there. But basic XML skills make it a no-brainer.
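For anyone who wants to see the "pull it out" step spelled out, here is a rough sketch in Python (not ColdFusion, and not the code from Jen's Bits). It assumes the account feed is an Atom document where each entry carries the id in a `dxp:tableId` element; that element name, the namespace URI, and the sample feed are assumptions for illustration, so check them against the XML you actually get back.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down account feed. A real response has many more
# elements; the tableId element name and namespace here are assumptions.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:dxp="http://schemas.google.com/analytics/2009">
  <entry>
    <title>www.mysite.com</title>
    <dxp:tableId>ga:1234</dxp:tableId>
  </entry>
  <entry>
    <title>blog.mysite.com</title>
    <dxp:tableId>ga:5678</dxp:tableId>
  </entry>
</feed>"""

NAMESPACES = {
    "atom": "http://www.w3.org/2005/Atom",
    "dxp": "http://schemas.google.com/analytics/2009",
}


def table_ids(feed_xml):
    """Return a {site title: table id} mapping from an account feed."""
    root = ET.fromstring(feed_xml)
    ids = {}
    for entry in root.findall("atom:entry", NAMESPACES):
        title = entry.findtext("atom:title", default="", namespaces=NAMESPACES)
        table_id = entry.findtext("dxp:tableId", default="", namespaces=NAMESPACES)
        if table_id:  # skip entries that carry no id
            ids[title] = table_id
    return ids


if __name__ == "__main__":
    for site, table_id in table_ids(SAMPLE_FEED).items():
        print(site, table_id)
```

The value you get back already includes the ga: prefix, so it can go straight into the ids parameter.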
Now you see my metric is newVisits, so I tell it that is what I want by putting ga: on the front of it. There is a list of what you can call in the tech documentation. Notice how I had to say "dimensions=ga:PagePath" first. Instead of something simple like "ga:PagePath=/blog/blogpage.cfm" (a fictitious page, just use a path from your root), it takes 2 steps. The second step is the filters parameter, which says what the path is: "filters=ga:PagePath%3D%3D/blog/blogpage.cfm". That %3D%3D is a URL-encoded "==". You can do other filters of course.
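Putting the pieces together, here is a minimal sketch of building that query URL in Python rather than ColdFusion. The function name `build_query_url` and its parameters are made up for illustration; the endpoint, parameter names, and date format are the ones from the URL above. It only builds the string and does not handle authentication or make the request.

```python
from datetime import date
from urllib.parse import quote

BASE_URL = "https://www.google.com/analytics/feeds/data"


def build_query_url(table_id, metric, dimension, page_path, start, end):
    """Build a data feed URL for one metric, filtered to one page path.

    build_query_url is a hypothetical helper, not part of any GA library.
    """
    # The filter is "<dimension>==<path>". quote() turns "==" into "%3D%3D";
    # "/" and ":" are left readable to match the URL shown above.
    filter_expr = quote(f"ga:{dimension}=={page_path}", safe="/:")
    params = [
        ("ids", table_id),                 # e.g. "ga:1234", prefix included
        ("metrics", f"ga:{metric}"),
        ("dimensions", f"ga:{dimension}"),
        ("filters", filter_expr),
        ("start-date", start.isoformat()), # isoformat() gives yyyy-mm-dd
        ("end-date", end.isoformat()),
    ]
    return BASE_URL + "?" + "&".join(f"{key}={value}" for key, value in params)


if __name__ == "__main__":
    print(build_query_url("ga:1234", "newVisits", "PagePath",
                          "/blog/blogpage.cfm",
                          date(2010, 1, 1), date(2010, 1, 31)))
```

Keeping the dimension name in one argument means the dimensions parameter and the filter cannot drift apart, which is the 2-step thing that tripped me up.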
Then I wanted to do the home page, index.cfm. Do you think the path is just index.cfm? No. It is /index.cfm. So you have to think of it like http://www.mysite.com/ and take everything after the "m" in "com" as the path. So start with the "/" and go from there.
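If you would rather not eyeball it, the standard library can pull the path out of a full URL for you. A small sketch in Python (again, not ColdFusion); `ga_page_path` is a hypothetical helper name, and it ignores any query string, which may or may not match what GA records for your pages.

```python
from urllib.parse import urlparse


def ga_page_path(url):
    """Return the part of the URL after the host, starting with "/"."""
    path = urlparse(url).path
    # A bare host such as http://www.mysite.com has an empty path;
    # treat that as the site root.
    return path or "/"


if __name__ == "__main__":
    print(ga_page_path("http://www.mysite.com/index.cfm"))
    print(ga_page_path("http://www.mysite.com/blog/blogpage.cfm"))
```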
One key thing is that your page must have the GA code on it. Seems obvious, but apparently not, judging from people doing this in PHP, Java, Ruby, etc. Of course, then they are not in their right minds anyway since they aren't using ColdFusion. :)
Finally, I noticed the results were not the same as what is on the actual GA website. Off, but not by a whole lot. I can chalk that up to the data in the API call not being up-to-the-minute accurate.