Analytics Without Google

203 points| luxurytent | 5 years ago |justbartek.ca

89 comments

[+] epoch_100|5 years ago|reply
Props to the author for ditching Google. I just want to point out that there are a LOT of non-Google analytics options out there; leaving Google Analytics really isn’t too hard if you’re willing to take the leap.

Plug: those who want a modern client-side analytics tool that’s free, self hosted, and open source might consider Shynet [0]. (Disclosure: I maintain it.) It’s a bit simpler/cleaner than Matomo, but exists in the same category.

[0] https://github.com/milesmcc/shynet

[+] tjbiddle|5 years ago|reply
Google Ads still dominates though, and if you're doing paid advertising - you're not just shooting yourself in the foot, but you're lopping off a limb - or two. I wish it wasn't the case.

Any non-Google analytics options integrate well with Google Ads, especially for retargeting?

[+] Danbana|5 years ago|reply
I feel like this is meant to be used for a website. Do you have an article about how to use it from an app? Maybe I'd have to make some URLs the app hits to download 1 bit or something if I want to track a certain action: app open, features used.
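One way to read the "URLs the app hits" idea is a tiny HTTP endpoint that the app requests once per event. The sketch below only builds such a request URL. The host `stats.example.com`, the `/hit` path, and the `event` and `version` parameter names are made up for illustration and are not part of Shynet's API.

```python
from urllib.parse import urlencode


def build_event_url(base_url, event, **props):
    """Build a GET URL that encodes one tracked event and its properties."""
    # urlencode escapes the values, so event names with spaces are safe.
    query = urlencode({"event": event, **props})
    return f"{base_url}?{query}"


# Hypothetical endpoint; the app would request this URL when it opens.
url = build_event_url("https://stats.example.com/hit", "app_open", version="1.2.0")
print(url)
```

The server side would then only need to log each request to that path, which any access-log-based tool could count.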
[+] slykar|5 years ago|reply
The thing I'm interested in when using Google Analytics is tracking user path to see the bounce rate in a checkout process for example. You can calculate conversion rates for different user segments. People who just want to see "how many visits I got" don't benefit from GA. Developers often miss the point of GA, because they do not work in sales.
[+] marcus_holmes|5 years ago|reply
OK, so let's pull this apart:

> The thing I'm interested in when using Google Analytics is tracking user path to see the bounce rate in a checkout process for example

There are lots of ways to get this data direct from the servers, and much, much more reliably and effectively.

> You can calculate conversion rates for different user segments.

Maybe. You only have data on those people who don't have adblockers that block GA. Which is a sizable segment of the internet. Also that they're not using a VPN, etc. Stats collected via GA are inherently less reliable than stats collected from your own site because GA is easily blocked.

Plus, there have been reports since forever that GA's stats are just not that reliable in the first place.

I've experienced this myself - I used to admin a WP blog, and the numbers from the site log and GA were around 20% different.

If you're relying on GA stats to calculate the results of any a/b testing, then you need to put in at least a +/- 25% error factor (i.e. if the A test converts 10% better than the B test, you have no idea whether that's a real thing, or a product of GA giving you inaccurate data - you'd need at least a 25% swing to begin to think it might be a real customer preference).

> Developers often miss the point of GA, because they do not work in sales.

Yes. But have you listened to their objections, rather than dismissing them as "not working in sales"? Developers do have some knowledge of this subject. And there are lots of ways of getting this data that don't increase page load and compromise security like GA does.

[+] sdoering|5 years ago|reply
Bounce Rate in the checkout funnel? Do you mean the "Exit Rate" at specific steps in the funnel?

Just to clarify, because as far as I know from some years in the industry, in every tool I got to know a "Bounce" is defined as a single tracking hit from a user id without further measured activity.

Meaning after the first measured page view (the entry to the site) leaving the site couldn't be called a bounce.
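To make the distinction concrete, here is a minimal sketch that computes both numbers from a list of sessions. It assumes each session is a non-empty, ordered list of page paths. The sample paths are invented, and real tools may define these metrics with more nuance.

```python
from collections import Counter


def bounce_rate(sessions):
    """Share of sessions that consist of exactly one tracked hit."""
    if not sessions:
        return 0.0
    bounces = sum(1 for hits in sessions if len(hits) == 1)
    return bounces / len(sessions)


def exit_rates(sessions):
    """Per page: number of sessions ending there divided by views of that page."""
    views = Counter()
    exits = Counter()
    for hits in sessions:
        views.update(hits)      # every hit counts as a view
        exits[hits[-1]] += 1    # the last hit of a session is the exit
    return {page: exits[page] / views[page] for page in views}


# Invented sample data: four sessions through a small shop.
sessions = [
    ["/home"],
    ["/home", "/cart", "/checkout"],
    ["/home", "/cart"],
    ["/cart", "/checkout", "/thanks"],
]

print(bounce_rate(sessions))
print(exit_rates(sessions))
```

In this sample only the first session is a bounce, while `/checkout` still has an exit rate of 50% because one of its two views was the last hit of a session.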

Nonetheless: if one is interested in a privacy-aware solution to replace GA with a focus on funnels, eCommerce, and such, I would probably (up to a certain scale of traffic) recommend taking a look at Matomo (formerly Piwik). It can be run on your own server, has a lot of the basic functions of GA and a great API, and can be used for goal/path analysis as well as marketing performance reporting.

If a company/site reaches a certain scale I would probably recommend to use a paid solution like Adobe Analytics (or paid GA), after having done an evaluation into the real needs of said company/site.

[+] nicbou|5 years ago|reply
I run a fairly simple website supported by donations and affiliate income. Even then, Analytics offers more than view counts. Here are a few questions it can answer for me:

- Is supporting IE11 financially justified?

- Which articles generate the most income, and why?

- Which components are useful, and which are just noise?

- Where are my visitors from? (it affects how I can help them)

[+] sputr|5 years ago|reply
Annoyed with cookie consent on your page?

Easy: use a 1st party self-hosted tracker (matomo/piwik), anonymize the last digits of the IP, respect DNT and provide an opt-out (matomo provides an embeddable widget) on your privacy policy page.

And bam, no legal need for cookie consent OR notification! No popup at all! And you still get perfectly usable statistics for most applications.
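The two technical steps named above, IP anonymization and honoring DNT, can be sketched in a few lines. This is an illustration, not Matomo's implementation: the choice to zero the last octet of IPv4 addresses and keep only a /48 prefix for IPv6 is an assumption, and `should_track` is a hypothetical helper name.

```python
import ipaddress


def anonymize_ip(ip):
    """Zero the host part of an address so it no longer identifies one visitor."""
    addr = ipaddress.ip_address(ip)
    # Assumed prefix lengths: drop the last octet for IPv4, keep /48 for IPv6.
    prefix = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def should_track(headers):
    """Return False when the request carries a 'DNT: 1' header."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("dnt") != "1"


print(anonymize_ip("203.0.113.77"))
print(should_track({"DNT": "1"}))
```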

[+] kevingrahl|5 years ago|reply
I’m no expert here but wouldn’t that go against the E-privacy Directive 2009/136/EC, according to which you must:

Receive users’ consent before you use any cookies except strictly necessary cookies?

[+] esperent|5 years ago|reply
One possible benefit that I don't see discussed much in this context is bypassing ad blockers. I run a tech focused website and Google analytics registers about 10k visits/month. I figure that a good chunk of my visitors have ublock and so they don't show up. Presumably, alternative analytics or self hosted analytics are not blocked, so I'd get more accurate stats. Is this a correct assumption?
[+] sdoering|5 years ago|reply
As I also said in another comment, you are correct in assuming that you are bypassing ad blockers with this solution.

This solution takes your server log files and does its analysis/reporting based on them. As these logs are written whenever the browser accesses the resources (sites), you get clear stats of how often your resources (sites/files) got requested.

As others stated, nobody will filter out bots for you, so this will inflate the numbers with traffic from "non-users". But "adblock users" will also show up.

As always - web stats are but an approximation of the real world. Their analysis depends on a lot of factors. From experience with different clients I know of cases where ad blockers kept up to 25% of traffic from showing up in analytics on sites with a more tech-savvy audience.
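As a rough illustration of log-based counting, the sketch below tallies successful GET requests per path. It assumes lines in the common/combined access log layout used by Apache and nginx. The regular expression is a simplification, the sample lines are invented, and this is not how GoAccess itself parses logs.

```python
import re
from collections import Counter

# Simplified pattern for: ip ident user [time] "METHOD path proto" status size
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def count_paths(lines):
    """Count GET requests answered with status 200, grouped by path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("method") == "GET" and match.group("status") == "200":
            counts[match.group("path")] += 1
    return counts


# Invented sample lines.
sample = [
    '203.0.113.5 - - [10/Jun/2020:10:00:00 +0000] "GET /post-1 HTTP/1.1" 200 5120',
    '203.0.113.6 - - [10/Jun/2020:10:00:05 +0000] "GET /post-1 HTTP/1.1" 200 5120',
    '203.0.113.7 - - [10/Jun/2020:10:00:09 +0000] "GET /missing HTTP/1.1" 404 312',
    '203.0.113.5 - - [10/Jun/2020:10:01:00 +0000] "GET /post-2 HTTP/1.1" 200 2048',
]

print(count_paths(sample).most_common())
```

Every request reaches the log regardless of what the browser blocks, which is why ad-block users appear here and why bot traffic does too.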

[+] dmkii|5 years ago|reply
It’s true, but GA also does a pretty good job at filtering out bot traffic for example, so it depends on what your definition of more accurate is. There are also ways to send GA hits through a sub domain to make it look self hosted and bypass ad blockers.
[+] JackWritesCode|5 years ago|reply
Bypassing ad-blockers is simple enough, just roll your own custom domain. I'm working on a blog post that covers how we (Fathom) did this with Caddy as a multi-region reverse proxy
[+] AdriaanvRossum|5 years ago|reply
Thanks for the mention of Simple Analytics [1], Bartek. We love being a paid service, though we know it doesn’t suit everybody’s needs. This way we don’t need to find any other way of making money (with the data of our customers).

[1] https://simpleanalytics.com

[+] gpanders|5 years ago|reply
I noticed this post mentions GoatCounter but says that it doesn't meet his needs because it isn't self-hosted.

It is not by default, but you certainly can self host it, and the author has been quite open about that being a viable path for people interested in doing so. I self host GoatCounter myself and it works very well.

[+] galfarragem|5 years ago|reply
+1 to GoatCounter. Not affiliated with it, just a happy (free) user on a low traffic/hobby project. Simple and no BS. More than enough if you just want to know the big picture and care about privacy. The data I get is this (my data is public but it’s up to you): https://slowernews.goatcounter.com/
[+] Carpetsmoker|5 years ago|reply
It looks like I have some work to do on clarifying the homepage here, hah.
[+] luxurytent|5 years ago|reply
Hey, yeah. This is my bad! I didn’t dig deep into GoatCounter but on second glance I can see the link to source. I amended my article to clarify what I know (not much)
[+] owenshen24|5 years ago|reply
Also a happy donating user for the free plan. (No affiliation, I just like the project.)
[+] johnchristopher|5 years ago|reply
Nice. Can it be used to compare this year's first quarter mobile usage for Dutch-speaking visitors who came from Facebook to last year's same profiles but who came from the newsletter?
[+] wldlyinaccurate|5 years ago|reply
I'm not associated with them at all, but Fathom are worth looking into if you want an analytics platform that respects user privacy: https://usefathom.com/
[+] JackWritesCode|5 years ago|reply
Hey, thanks for the shoutout. We're a great option for so many reasons but here are a few recent things:

* We allow our users to bypass ad-blockers and exclude themselves from tracking their own page views (https://usefathom.com/blog/custom-domains-embed-code)

* We are both full time on Fathom, bootstrapped (we've turned down millions in VC) and profitable (https://usefathom.com/blog/quit)

* We are adding in unlimited uptime monitoring on Friday (SMS, Telegram, Slack and email notifications)

* We run on auto-scaling infrastructure and are used by individuals, small businesses, governments and some of the biggest companies in the world

[+] joe5150|5 years ago|reply
GoAccess is really cool. Matomo is free and self-hosted if you want more of the features of Google Analytics like tracking time on page.
[+] XCSme|5 years ago|reply
I am also working on a similar product to Matomo, it's not free but it also provides some of the Matomo's premium features for a much cheaper price: https://usertrack.net
[+] AndrewStephens|5 years ago|reply
I like this blog post and I support anyone who removes a third party tracker from their site.

There are companies that live and die by analytics and demographics but your personal blog doesn't need the information that GAnalytics sucks up.

GoAccess (suggested in the article) is a fine choice, although I found it did not do a good job of filtering out bot hits. For most people this might not be a big deal but it annoyed me.

In the end I just wrote[0] a simple hit counter that triggers off a js beacon.

[0] https://github.com/andrewstephens75/visitlog

[+] celsoazevedo|5 years ago|reply
Something like Matomo running on a cheap VPS (less than $5/month) is a good Google Analytics replacement for small projects.

It's cheaper than Simple Analytics (mentioned in the article) and no data is sent to a service operated by a 3rd party because it's self-hosted.

[+] XCSme|5 years ago|reply
I am also working on a similar product to Matomo, it also provides some of the Matomo's premium features for a much cheaper price: https://usertrack.net
[+] chatmasta|5 years ago|reply
GoAccess is cool but it won't help you much if you have a static website or if you serve the majority of requests from a CDN. In those cases, Matomo (previously piwik) is a good solution for client-side JS-based analytics similar to Google Analytics.
[+] sradman|5 years ago|reply
I’m not familiar with GoAccess or Matomo but after some Googling it looks like Matomo requires self-hosting its PHP/MySQL server.

If you are using a CDN then I’m assuming that multiple access.log files need to be aggregated at some point unless the CDN has a service that automates the process. Logstash and the full ELK Stack (or alternative) seem to be required when multiple heterogeneous servers are involved in serving content. Browser-side JavaScript analytics seems to avoid the DevOps surrounding the ELK Stack. GoAccess seems like a minimalist solution when you control a single httpd-style server and can run a local daemon to process the access.log file.

As a side note, it looks like you need an Enterprise account to use Cloudflare’s Logpush Service or Logpull API. Amazon’s CloudFront has an advantage here.

Avoiding Google Analytics is non-trivial for the lazy and/or price conscious.

[+] PenguinCoder|5 years ago|reply
How does Matomo work better with a static site, than GoAccess? GoAccess reads the server logs and creates metrics. Matomo requires javascript; that isn't 'static website'.
[+] zubspace|5 years ago|reply
I like GoAccess for the following reasons:

* No javascript.

* Does not need a database.

* Realtime stats in html format.

I use it on my blog, but I also believe that the numbers are completely inflated. I don't trust them. This has been discussed on GitHub a few times [1], so don't expect accurate numbers (yet).

It's also hard to see what's going on recently on your server, because you only get totals. I'd love it if I could change the time interval of the shown html stats.

I like the way GoAccess is going though and I hope it will improve.

[1] https://github.com/allinurl/goaccess/issues/964#issuecomment...

[+] darekkay|5 years ago|reply
Interesting, I've switched from Google Analytics to GoAccess ~5 years ago. I let both analytics run for a month and compared the results. The relative numbers were very similar (so I get the same information about what blog posts are most popular), but the absolute numbers were in fact lower for GoAccess. It might be because tech blog visitors are using AdBlockers more often (and hence block GA).

> I'd love if I could change the time interval of the shown html stats.

GoAccess displays the data that you pass it. So while it doesn't have any date filter option (at least the last time I checked), you can just filter your logs beforehand. There's an even simpler solution that I'm using: set logrotate to a specific time frame (e.g. weekly), so you can pass "access.log" to GoAccess to only get the latest stats. You can still pass "access.log*" to get ALL stats at the same time.
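The "filter your logs beforehand" option could look roughly like this. It is a sketch that assumes the usual `[10/Jun/2020:10:00:00 +0000]` timestamp layout of common/combined access logs and an English locale for month names. The sample lines and the cutoff date are invented.

```python
import re
from datetime import datetime, timezone

# The timestamp is the first bracketed field of each log line.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def lines_since(lines, cutoff):
    """Yield only the log lines whose timestamp is at or after the cutoff."""
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # skip lines without a timestamp
        try:
            stamp = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        except ValueError:
            continue  # skip lines with an unexpected date layout
        if stamp >= cutoff:
            yield line


# Invented sample lines.
sample = [
    '203.0.113.5 - - [01/Jun/2020:09:00:00 +0000] "GET /old HTTP/1.1" 200 512',
    '203.0.113.6 - - [10/Jun/2020:10:00:00 +0000] "GET /new HTTP/1.1" 200 512',
]
cutoff = datetime(2020, 6, 8, tzinfo=timezone.utc)
kept = list(lines_since(sample, cutoff))
print(kept)
```

The surviving lines can then be written to a file or piped on to the log analyzer, so it only ever sees the chosen time window.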

[+] girzel|5 years ago|reply
I feel like all I need is a good script to filter out bots and spiders. I could take the rest from there.
[+] luxurytent|5 years ago|reply
This is an interesting thought. I wonder if I could pre process my log files to get rid of this noise ahead of piping to a log analyzer.
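A pre-processing step along those lines might be sketched as follows. It assumes the combined log format, where the user agent is the last quoted field, and uses a short, hand-picked list of substrings as the bot heuristic. The list is an assumption and will miss crawlers that send a browser-like user agent.

```python
import re

# Hand-picked hints; an illustrative list, not an exhaustive one.
BOT_HINTS = ("bot", "crawler", "spider", "slurp", "curl", "wget")
QUOTED = re.compile(r'"([^"]*)"')


def is_probable_bot(line):
    """Guess from the user agent whether a combined-format log line is a bot hit."""
    fields = QUOTED.findall(line)
    if len(fields) < 3:
        return False  # no user agent field, so there is nothing to judge by
    agent = fields[-1].strip().lower()
    if agent in ("", "-"):
        return True  # treat a missing user agent as automated traffic
    return any(hint in agent for hint in BOT_HINTS)


def drop_bots(lines):
    """Keep only the lines that do not look like bot traffic."""
    return [line for line in lines if not is_probable_bot(line)]


# Invented sample lines.
sample = [
    '203.0.113.5 - - [10/Jun/2020:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/77.0"',
    '66.249.66.1 - - [10/Jun/2020:10:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.9 - - [10/Jun/2020:10:00:02 +0000] "GET /feed HTTP/1.1" 200 512 "-" "curl/7.68.0"',
]
humans = drop_bots(sample)
print(len(humans))
```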
[+] monus|5 years ago|reply
I’m not associated with Countly (https://count.ly/), but I heard good things from my friends using it. It’s open source and makes money with an enterprise edition.
[+] tobilg|5 years ago|reply
If you run on AWS and want cheap and privacy-focussed website analytics, you might also look at https://ownstats.cloud/
[+] wharfjumper|5 years ago|reply
Thanks. That fits our use-case. Now just need to figure out how to do the visualizations...
[+] traceroute66|5 years ago|reply
I was going to say "what about Matomo". But then Matomo are still lurking around in the dark ages of Python 2 and are not worthy of recommendation until they pull their finger out and sort out their analytics scripts that rely on that obsolete thing (https://github.com/matomo-org/matomo-log-analytics/pull/242, https://github.com/matomo-org/matomo-log-analytics/issues/3 etc. etc.)
[+] sdoering|5 years ago|reply
I am totally not a friend of anyone still using py2.x (esp. as it has EOLed [1]).

Nonetheless, one doesn't need to use log analysis and py2.x to use Matomo as a GA replacement in the frontend.

So yeah - they should fix the log analysis part with py2 -> py3. But that doesn't invalidate the whole tool imho.

[1]: https://www.python.org/dev/peps/pep-0373/

[+] fareesh|5 years ago|reply
Has anyone had the experience of seeing over-reporting of some metrics in Google Analytics as compared to an internal tracking system? Is this data generally seen to be 100% reliable?
[+] markosaric|5 years ago|reply
Nice work! Every time I see the "de-Google-ing" posts, there are people in comments saying but can it do this or can it integrate with that Google product.

I'm working on a Google Analytics alternative myself [1] and we make it clear that it is not meant as a clone or a full blown replacement of Google Analytics.

Some people are fine running GA and are happy to integrate with Google Ads and the rest of the Google ecosystem.

On the other hand, some would prefer to focus more on privacy of their visitors or on not having to get cookie / GDPR consent or on having a faster loading website or support a more independent web etc. And alternative solutions to Google products such as these are more meant for those use cases.

[1]: https://plausible.io/