
Why we'll use Google Universal Analytics over Mixpanel and KISSMetrics

53 points | duwip | 12 years ago | blog.fleex.tv

25 comments


snide|12 years ago

As someone who has built several top-1000 trafficked websites over the past decade, here is what the publishing industry definitely needs out of an analytics program.

1. Please give me a report that can prove that my user traffic is real.

2. Please give me a report that can prove that the traffic is healthy.

I know that I can get this from analytics now, but it needs to be the focus.

For a decade I've competed against content websites that for the most part game SEO traffic, build click traps and generally pollute the Internet with secondary-source content. I've always had fairly large audiences on my sites, with healthy 50% returning visitor rates. However, when it came to getting ad dollars, I always lost to competitors who had much larger volume, mostly because they were either buying meaningless inbound links or using some other scam like click-trap "we recommend this hot girl talking about prostate cancer" photos to goose their numbers. Meanwhile we'd create quality content and my sites would have hundreds of comments, while theirs would have very few. It didn't matter that my audience was more engaged; advertisers bought volume.

I just need something that I can show to an advertiser (or even better, that they have access to and can compare) that says... hey, this website isn't a constructed fabrication made to fake volume and take your money you sucker. This is a real website.

A lot of the industry right now is based upon buying links from aging front-door portals (Yahoo, MSN, AOL) which still do ungodly amounts of traffic with a mostly Internet-illiterate audience. Sites buy these links, convert them into CPM click traps on their targeted magazine sites and sell their inventory to advertisers who don't know that the whole thing is a shell game. They think they're buying ads on a hot new site with explosive growth.

lukestevens|12 years ago

Chartbeat is attempting to help with 1 & 2 with "Engaged Time": https://chartbeat.com/publishing/for-editorial/ . (I've no connection).

I'm building an analytics interface for GA though and would love to chat about what else publishers need in an analytics interface - luke at itsninja if you'd like to chat.

alexatkeplar|12 years ago

Hey snide - I think we can probably help you at Snowplow Analytics. We warehouse all your atomic event data (including page views and in-page pings, which are very hard to fake) with IP address, browser fingerprint, 1st-party cookie, optional 3rd-party cookie, optional business-defined user ID, user timezone, browser features, user agent... If that sounds useful for proving your audience to advertisers, get in touch!

omarchowdhury|12 years ago

Do you have more information on how I can see a live view of the practices you discussed in your last paragraph? I was under the impression that traffic from the top portals was costly and not exactly suitable as a component in an arbitrage play like you mentioned.

joevandyk|12 years ago

I started saving all my page views in a PostgreSQL database. The schema is pretty simple.

I have the following tables:

    sessions
      session_id (uuid type)
      created_at
    
 
    page_views
      page_view_id
      session_id
      created_at
      site_id
      path
      query_string (hstore)
      user_agent
      referral_url
      ip_address
      user_id
      http_method (get, post, etc)
      details (hstore, used to tag page views/actions)
   
This allows me to simply query all my page views against data in my live database. I can see the path a user took to place an order. I can easily integrate A/B tests. If someone uses a coupon on the site and we want to see if they later came back and viewed/purchased more, we can easily write a SQL query to figure that out. We can simply figure out lifetime customer value, even if the customer is not logged in. If we're getting a large amount of traffic from a certain affiliate, we can alert our staff.
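To make the "path a user took to place an order" idea concrete, here is a minimal sketch. It is not the commenter's code: it uses Python's built-in `sqlite3` in place of PostgreSQL, keeps only a few of the columns listed above, and assumes a hypothetical `POST /orders` request marks a placed order. The sample rows are made up.

```python
import sqlite3

# In-memory stand-in for the PostgreSQL tables described above (subset of columns).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions (
    session_id TEXT PRIMARY KEY,   -- uuid type in the original schema
    created_at TEXT NOT NULL
);
CREATE TABLE page_views (
    page_view_id INTEGER PRIMARY KEY,
    session_id   TEXT NOT NULL REFERENCES sessions(session_id),
    created_at   TEXT NOT NULL,
    path         TEXT NOT NULL,
    http_method  TEXT NOT NULL
);
""")

# Made-up sample data for one session.
conn.execute("INSERT INTO sessions VALUES ('s1', '2013-01-01 10:00:00')")
conn.executemany(
    "INSERT INTO page_views (session_id, created_at, path, http_method) "
    "VALUES (?, ?, ?, ?)",
    [
        ("s1", "2013-01-01 10:00:05", "/", "get"),
        ("s1", "2013-01-01 10:01:10", "/products/42", "get"),
        ("s1", "2013-01-01 10:02:30", "/cart", "get"),
        ("s1", "2013-01-01 10:03:45", "/orders", "post"),  # hypothetical order endpoint
    ],
)


def path_to_order(conn, session_id):
    """Return the paths viewed in a session, in order, up to the first order POST.

    Returns an empty list if the session never placed an order.
    """
    rows = conn.execute(
        "SELECT path, http_method FROM page_views "
        "WHERE session_id = ? ORDER BY created_at, page_view_id",
        (session_id,),
    ).fetchall()
    steps = []
    for path, method in rows:
        steps.append(path)
        if path == "/orders" and method == "post":
            return steps
    return []


print(path_to_order(conn, "s1"))
```

Because page views live next to the application tables, the same kind of query could be joined against orders or coupons directly, which is the point the comment makes.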

It's really awesome to be able to have your data in the same place. Having analytics data spread out to GA made it difficult to match that data against ours. If we need to scale out to multi-terabytes, postgres_fdw will make querying against the analytical database simple.

Since we're also tracking affiliate purchases to pay out commissions, I also have another table that stores additional information about a page view if it came from an affiliate site (click id, the affiliate network, etc).

Here's the plpgsql function I use for saving the sessions and page views: https://gist.github.com/joevandyk/f63523cdd1a3aa75d0ec

duwip|12 years ago

Yeah, we do that kind of stuff as well. At least you know what your data means. But when you start getting millions of hits a day, you won't necessarily want to spend some time scaling your system... In that case leaving it to the pros and focusing instead on your product may prove the most sensible move.

mikeknoop|12 years ago

The last paragraph is important. I spent some time on this earlier this week when I learned about Universal Analytics, but quickly discovered that UserID tracking hasn't shipped yet.

Can anyone on the GA team speculate about a release date for the uid bits?

j_s|12 years ago

I was not aware that the new analytics would track users. One interpretation of section 7 of the Google Analytics Terms of Service is that tracking individuals is not allowed:

http://www.google.com/analytics/terms/us.html

  > You will not [...] use the Service to track, collect or 
  > upload any data that personally identifies an individual

http://productforums.google.com/forum/#!topic/analytics/tTaq...

  > you cannot store names or ip addresses in a custom var, 
  > but you can store ids that need your backend to resolve 
  > into a person identification

Brandon0|12 years ago

Tracking an individual is different from storing personally identifiable information. As a way to track you, I can assign you an arbitrary (or seemingly arbitrary) userID that is unique to you but does not personally identify you. This arbitrary userID is meaningless to any third parties. What I cannot assign you is your name, email address, or even IP address as a way to track you, since anyone who sees that information could figure out who it belongs to.
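One way to produce a "seemingly arbitrary" userID of the kind described here is a keyed hash of the internal account ID. The sketch below is only an illustration: `SECRET_KEY` and `opaque_user_id` are hypothetical names, and whether such an ID satisfies any particular terms of service is not something the code establishes.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it must never be sent to the browser or a third party.
SECRET_KEY = b"replace-with-a-server-side-secret"


def opaque_user_id(internal_user_id: int) -> str:
    """Derive a stable, opaque ID from an internal account ID.

    The same account always maps to the same ID, but without SECRET_KEY the ID
    cannot be mapped back to the account, and it carries no name, email or IP.
    """
    digest = hmac.new(
        SECRET_KEY, str(internal_user_id).encode("utf-8"), hashlib.sha256
    )
    return digest.hexdigest()[:32]


print(opaque_user_id(42))
```

Only the backend, which holds the key and the accounts table, can resolve the ID into a person, which matches the forum answer quoted above.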

jamiequint|12 years ago

This article is really making a big deal out of nothing. All the "major issues" brought up here only create problems in edge cases. When you're trying to drive growth or understand your users (the purpose of metrics at the end of the day) you should not be focused on edge cases.

In most cases the reason you care about tracking logged-out -> logged-in behavior is to measure onboarding behavior, understanding what the user does pre-signup so you can do a better job of driving signups. Signup is not a multi-client process in the common case so being able to track multi-client behavior pre-signup doesn't really matter at all.

duwip|12 years ago

Agreed, these are edge cases. They did create a lot of questions for me though, and made the whole thing rather confusing as a user.

As to how much of an issue these edge cases represent, I find it hard to get a real sense of it. I guess it really depends on the situation, what you want to measure and the user experience you offer to your visitors.

taf2|12 years ago

My gripe about Google Universal Analytics (analytics.js vs. ga.js) is broken backwards compatibility: cookie data is no longer stored in the same way, and this was an interface many add-ons/systems have used and depended on since the days of Urchin.

Otherwise, new interface is pretty slick, features look good, the API to send data server side is so much nicer.

broken compatibility just kinda sucks though
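For readers who have not seen the server-side API mentioned above, here is a rough sketch of building a pageview hit for the Universal Analytics Measurement Protocol (v1). The parameter names `v`, `tid`, `cid`, `t` and `dp` come from that protocol; the tracking ID, client ID and path are placeholders, `build_pageview_hit` is a hypothetical helper, and the sketch only builds the payload without sending it.

```python
import uuid
from urllib.parse import parse_qs, urlencode

# Collection endpoint of the v1 Measurement Protocol (not contacted in this sketch).
COLLECT_URL = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id: str, client_id: str, path: str) -> str:
    """Return a form-encoded payload for a single pageview hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID, e.g. UA-XXXXX-Y
        "cid": client_id,    # anonymous client ID
        "t": "pageview",     # hit type
        "dp": path,          # document path
    }
    return urlencode(params)


# Placeholder values for illustration only.
payload = build_pageview_hit("UA-XXXXX-Y", str(uuid.uuid4()), "/pricing")
print(payload)
```

Sending it would be an HTTP POST of `payload` to `COLLECT_URL`; that step is left out so the example has no network side effects.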

jdangu|12 years ago

> For one, there can’t be 2 [clientID, userID] couples with the same userID: with the way mixpanel does things, this is essentially a technically impossible scenario (...) And yet one user can access your site through different clients, leading to a systematic overestimation of the number of visitors hitting your site.

Really? Can anyone confirm this behavior? I'm pretty sure KissMetrics doesn't have this limitation.

losvedir|12 years ago

Indeed, and this is why we ended up choosing KM over MP. With KM you just "identify" a visitor whenever you want and if there's already another anonymous cookie, it'll tie together all events retroactively. We couldn't find an easy way to do this with MP when we looked at it.
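The retroactive "identify" behavior described here can be sketched with a simple alias table. This is a toy model, not KISSmetrics' or Mixpanel's implementation; `EventStore` and all IDs below are hypothetical.

```python
from collections import defaultdict


class EventStore:
    """Toy event log in which anonymous IDs can later be tied to a known user."""

    def __init__(self):
        self.events = []  # (visitor_id, event_name), exactly as recorded
        self.alias = {}   # anonymous id -> known user id

    def track(self, visitor_id, name):
        self.events.append((visitor_id, name))

    def identify(self, anonymous_id, user_id):
        # Recording the alias is enough: resolution happens at query time,
        # so events tracked before this call are attributed retroactively.
        self.alias[anonymous_id] = user_id

    def resolve(self, visitor_id):
        return self.alias.get(visitor_id, visitor_id)

    def events_by_person(self):
        grouped = defaultdict(list)
        for visitor_id, name in self.events:
            grouped[self.resolve(visitor_id)].append(name)
        return dict(grouped)


store = EventStore()
store.track("anon-laptop", "viewed_landing")
store.track("anon-phone", "viewed_pricing")
store.identify("anon-laptop", "user-7")
store.track("anon-laptop", "signed_up")
store.identify("anon-phone", "user-7")  # same person on a second client

people = store.events_by_person()
print(people)
```

Counting the keys of `people` gives one visitor, while counting raw client IDs would give two, which is the overestimation the quoted article worries about.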

KaoruAoiShiho|12 years ago

Last I checked, user-based analytics is directly against the Google TOS. You are not supposed to store any identifying information about specific users, probably because Google has been under privacy scrutiny. So not only is Google not for user-based tracking, they prohibit it, making them a real non-starter in any case.

duwip|12 years ago

Check out the Google I/O video I mention in the article if you need convincing. As far as not collecting user data for privacy reasons, I think Brandon0's comment says it all.