Props to the author for ditching Google. I just want to point out that there are a LOT of non-Google analytics options out there; leaving Google Analytics really isn’t too hard if you’re willing to take the leap.
Plug: those who want a modern client-side analytics tool that’s free, self hosted, and open source might consider Shynet [0]. (Disclosure: I maintain it.) It’s a bit simpler/cleaner than Matomo, but exists in the same category.
Google Ads still dominates though, and if you're doing paid advertising - you're not just shooting yourself in the foot, but you're lopping off a limb - or two. I wish it wasn't the case.
Any non-Google analytics options integrate well with Google Ads, especially for retargeting?
I feel like this is meant to be used for a website. Do you have an article about how to use it from an app? Maybe I'd have to make some URLs the app hits to download 1 bit or something if I want to track a certain action: app open, features used.
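A rough sketch of the "URL the app hits" idea, in Python. The endpoint `https://stats.example.com/hit` and the parameter names `event` and `v` are made up for illustration; they are not part of Shynet or any other tool mentioned here. The app would request the resulting URL and ignore the tiny response.

```python
from urllib.parse import urlencode

# Hypothetical first-party endpoint; replace with whatever your server exposes.
TRACK_URL = "https://stats.example.com/hit"


def build_event_url(event: str, app_version: str) -> str:
    """Build the URL an app would request to record one action."""
    query = urlencode({"event": event, "v": app_version})
    return f"{TRACK_URL}?{query}"


# One URL per tracked action, e.g. app open or a feature being used.
app_open_url = build_event_url("app_open", "1.2.0")
```

Each request then shows up in the server's access log, so the same log-based tooling used for a website can count app events too.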
The thing I'm interested in when using Google Analytics is tracking user path to see the bounce rate in a checkout process for example. You can calculate conversion rates for different user segments. People who just want to see "how many visits I got" don't benefit from GA. Developers often miss the point of GA, because they do not work in sales.
> The thing I'm interested in when using Google Analytics is tracking user path to see the bounce rate in a checkout process for example
There are lots of ways to get this data direct from the servers, and much, much more reliably and effectively.
> You can calculate conversion rates for different user segments.
Maybe. You only have data on those people who don't have adblockers that block GA. Which is a sizable segment of the internet. Also that they're not using a VPN, etc. Stats collected via GA are inherently less reliable than stats collected from your own site because GA is easily blocked.
Plus, there have been reports since forever that GA's stats are just not that reliable in the first place.
I've experienced this myself - I used to admin a WP blog, and the numbers from the site log and GA were around 20% different.
If you're relying on GA stats to calculate the results of any a/b testing, then you need to put in at least a +/- 25% error factor (i.e. if the A test converts 10% better than the B test, you have no idea whether that's a real thing, or a product of GA giving you inaccurate data - you'd need at least a 25% swing to begin to think it might be a real customer preference).
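To make that rule of thumb concrete, here is a small Python sketch. It is only the heuristic described above (ignore any difference smaller than an assumed measurement error), not a statistical significance test, and the 0.25 margin and the sample rates are illustrative assumptions.

```python
def lift_exceeds_margin(rate_a: float, rate_b: float, margin: float = 0.25) -> bool:
    """Return True if A's relative lift over B is larger than the assumed error margin."""
    relative_lift = (rate_a - rate_b) / rate_b
    return abs(relative_lift) > margin


# A converts at 5.5% vs. B at 5.0%: a 10% relative lift, inside the 25% margin.
small_swing = lift_exceeds_margin(0.055, 0.05)

# A converts at 6.5% vs. B at 5.0%: a 30% relative lift, outside the margin.
large_swing = lift_exceeds_margin(0.065, 0.05)
```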
> Developers often miss the point of GA, because they do not work in sales.
Yes. But have you listened to their objections, rather than dismissing them as "not working in sales"? Developers do have some knowledge of this subject. And there are lots of ways of getting this data that don't increase page load and compromise security like GA does.
Bounce Rate in the checkout funnel? Do you mean the "Exit Rate" at specific steps in the funnel?
Just to clarify, because as far as I know from some years in the industry, in every tool I got to know a "Bounce" is defined as a single tracking hit from a user id without further measured activity.
Meaning that once a visitor has gone past the first measured page view (the entry to the site), leaving the site can't be called a bounce.
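A small Python sketch of the distinction, assuming each session is simply a list of the pages viewed, in order. The page paths and session data are invented for illustration; real tools define sessions in their own ways.

```python
from collections import Counter


def bounce_rate(sessions):
    """Share of sessions that consist of exactly one tracked hit."""
    if not sessions:
        return 0.0
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    return bounces / len(sessions)


def exit_rates(sessions):
    """Per page: sessions that ended on the page divided by all views of the page."""
    views = Counter()
    exits = Counter()
    for pages in sessions:
        views.update(pages)
        if pages:
            exits[pages[-1]] += 1
    return {page: exits[page] / views[page] for page in views}


sessions = [
    ["/product"],                                   # single hit: a bounce
    ["/product", "/cart"],                          # left at the cart
    ["/product", "/cart", "/checkout"],             # left at checkout
    ["/product", "/cart", "/checkout", "/thanks"],  # completed
]

overall_bounce = bounce_rate(sessions)
per_step_exit = exit_rates(sessions)
```

With this data only the first session is a bounce, while the cart and checkout steps each have their own exit rate, which is the number a funnel analysis is usually after.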
Nonetheless: if one is interested in a privacy-aware solution to replace GA with a focus on funnels, eCommerce, and such, I would probably (up to a certain scale of traffic) recommend taking a look at Matomo (formerly Piwik). It can be run on your own server, has a lot of the basic functions of GA and a great API, and can be used for goal/path analysis as well as marketing performance reporting.
If a company/site reaches a certain scale I would probably recommend using a paid solution like Adobe Analytics (or paid GA), after evaluating the real needs of said company/site.
I run a fairly simple website supported by donations and affiliate income. Even then, Analytics offers more than view counts. Here are a few questions it can answer for me:
- Is supporting IE11 financially justified?
- Which articles generate the most income, and why?
- Which components are useful, and which are just noise?
- Where are my visitors from? (it affects how I can help them)
Easy: use a first-party self-hosted tracker (Matomo/Piwik), anonymize the last digits of the IP, respect DNT, and provide an opt-out (Matomo provides an embeddable widget) on your privacy policy page.
And bam, no legal need for cookie consent OR notification! No popup at all!
And you still get perfectly usable statistics for most applications.
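For anyone rolling their own, the two technical steps (anonymizing the IP and respecting DNT) might look roughly like this in Python. This is not Matomo's code; the mask widths (/24 for IPv4, /48 for IPv6) and the plain `headers` dict are assumptions made for the example.

```python
import ipaddress


def anonymize_ip(ip: str) -> str:
    """Zero out the host part of an address so it no longer identifies one visitor."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48  # assumed mask widths
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def should_track(headers: dict) -> bool:
    """Skip tracking when the browser sent 'DNT: 1'.

    Assumes header names are stored exactly as 'DNT'; real frameworks
    usually offer case-insensitive header lookup.
    """
    return headers.get("DNT") != "1"


masked_v4 = anonymize_ip("203.0.113.57")
masked_v6 = anonymize_ip("2001:db8:abcd:1234::1")
```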
One possible benefit that I don't see discussed much in this context is bypassing ad blockers. I run a tech focused website and Google Analytics registers about 10k visits/month. I figure that a good chunk of my visitors have uBlock and so they don't show up. Presumably, alternative analytics or self hosted analytics are not blocked, so I'd get more accurate stats. Is this a correct assumption?
As I also said in another comment, you are correct in assuming that you are bypassing ad blockers with this solution.
This solution takes your server log files and does its analysis/reporting based on them. As these logs are written when the browser accesses the resources (sites), you get clear stats of how often your resources (sites/files) got requested.
As others stated, nobody will filter out bots for you. So this will inflate the numbers with traffic from "non-users". But "adblock users" will also show up.
As always - web stats are but an approximation of the real world. Their analysis depends on a lot of factors. From experience with different clients I know of cases where ad blockers blocked up to 25% of traffic from showing up in analytics on sites with a more tech savvy audience.
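To illustrate what log-based counting with a crude bot filter could look like, here is a Python sketch. It assumes the common "combined" access log format, and the substring list in `BOT_HINTS` is a naive stand-in for real bot detection. It is not how GoAccess works internally.

```python
import re
from collections import Counter

# Assumed "combined" log format: ip, identity, user, [time], "request", status, size, "referer", "agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Naive heuristic: user agents containing these substrings are treated as bots.
BOT_HINTS = ("bot", "crawler", "spider")


def count_page_hits(lines):
    """Count requests per path, skipping unparsable lines and obvious bots."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        agent = match.group("agent").lower()
        if any(hint in agent for hint in BOT_HINTS):
            continue
        hits[match.group("path")] += 1
    return hits


sample = [
    '203.0.113.5 - - [10/Apr/2020:13:55:36 +0000] "GET /post HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '198.51.100.7 - - [10/Apr/2020:13:56:01 +0000] "GET /post HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '203.0.113.9 - - [10/Apr/2020:13:57:12 +0000] "GET /about HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

hits = count_page_hits(sample)
```

Requests from visitors with ad blockers are counted here like any others, because the log is written on the server. Bots that do not announce themselves in the user agent slip through as well, which is the inflation mentioned above.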
It’s true, but GA also does a pretty good job at filtering out bot traffic for example, so it depends on what your definition of more accurate is. There are also ways to send GA hits through a sub domain to make it look self hosted and bypass ad blockers.
Bypassing ad-blockers is simple enough, just roll your own custom domain. I'm working on a blog post that covers how we (Fathom) did this with Caddy as a multi-region reverse proxy
Thanks for the mention of Simple Analytics [1] Bartek. We love being a paid service, though we know it doesn’t suit everybody’s needs. This way we don’t need to find any other way of making money (with the data of our customers).
I noticed this post mentions GoatCounter but says that it doesn't meet his needs because it isn't self-hosted.
It is not by default, but you certainly can self host it, and the author has been quite open about that being a viable path for people interested in doing so. I self host GoatCounter myself and it works very well.
+1 to GoatCounter. Not affiliated with it, just a happy (free) user on a low traffic/hobby project. Simple and no BS. More than enough if you just want to know the big picture and care about privacy. The data I get is this (my data is public but it’s up to you): https://slowernews.goatcounter.com/
Hey, yeah. This is my bad! I didn’t dig deep into GoatCounter but on second glance I can see the link to source. I amended my article to clarify what I know (not much)
Nice. Can it be used to compare this year's first quarter mobile usage for Dutch-speaking visitors who came from Facebook to last year's same profiles but who came from the newsletter?
GoatCounter and others use non-identifiable hashes to track a unique visit, but they only retain that hash for some time[1]. I think in your example, you'd have to use a solution that uniquely identifies a session and indeed, keeps track of it.
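As a generic illustration of the idea (not GoatCounter's actual implementation; see their docs at [1] for that), a short-lived visit hash could be built from the IP, the user agent, and a random salt that is thrown away on a schedule. The class name and the choice of SHA-256 are assumptions made for this Python sketch.

```python
import hashlib
import secrets


class VisitHasher:
    """Derive visit IDs that stop being linkable once the salt is rotated."""

    def __init__(self):
        self._salt = secrets.token_bytes(16)

    def rotate(self):
        """Discard the old salt; earlier IDs can no longer be reproduced."""
        self._salt = secrets.token_bytes(16)

    def visit_id(self, ip: str, user_agent: str) -> str:
        data = self._salt + ip.encode() + b"|" + user_agent.encode()
        return hashlib.sha256(data).hexdigest()


hasher = VisitHasher()
before = hasher.visit_id("203.0.113.5", "Mozilla/5.0")
same_period = hasher.visit_id("203.0.113.5", "Mozilla/5.0")
hasher.rotate()
after = hasher.visit_id("203.0.113.5", "Mozilla/5.0")
```

Within one salt period the same visitor maps to the same ID, so unique visits can be counted. After rotation the IDs differ, which is why a year-over-year comparison of the "same profiles" is out of reach for this kind of design.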
I'm not associated with them at all, but Fathom are worth looking into if you want an analytics platform that respects user privacy: https://usefathom.com/
I am also working on a similar product to Matomo. It's not free, but it provides some of Matomo's premium features for a much cheaper price: https://usertrack.net
I like this blog post and I support anyone who removes a third party tracker from their site.
There are companies that live and die by analytics and demographics but your personal blog doesn't need the information that GAnalytics sucks up.
GoAccess (suggested in the article) is a fine choice, although I found it did not do a good job of filtering out bot hits. For most people this might not be a big deal but it annoyed me.
In the end I just wrote[0] a simple hit counter that triggers off a js beacon.
I am also working on a similar product to Matomo. It provides some of Matomo's premium features for a much cheaper price: https://usertrack.net
GoAccess is cool but it won't help you much if you have a static website or if you serve the majority of requests from a CDN. In those cases, Matomo (previously piwik) is a good solution for client-side JS-based analytics similar to Google Analytics.
I’m not familiar with GoAccess or Matomo but after some Googling it looks like Matomo requires self-hosting its PHP/MySQL server.
If you are using a CDN then I’m assuming that multiple access.log files need to be aggregated at some point unless the CDN has a service that automates the process. Logstash and the full ELK Stack (or alternative) seem to be required when multiple heterogeneous servers are involved in serving content. Browser-side JavaScript analytics seems to avoid the DevOps surrounding the ELK Stack. GoAccess seems like a minimalist solution when you control a single httpd-style server and can run a local daemon to process the access.log file.
As a side note, it looks like you need an Enterprise account to use Cloudflare’s Logpush Service or Logpull API. Amazon’s CloudFront has an advantage here.
Avoiding Google Analytics is non-trivial for the lazy and/or price conscious.
How does Matomo work better with a static site, than GoAccess? GoAccess reads the server logs and creates metrics. Matomo requires javascript; that isn't 'static website'.
I use it on my blog, but I also believe that the numbers are completely inflated. I don't trust them. This has been discussed on GitHub a few times [1], so don't expect accurate numbers (yet).
It's also hard to see what's going on recently on your server, because you only get totals. I'd love it if I could change the time interval of the shown HTML stats.
I like the way GoAccess is going though and I hope it will improve.
Interesting, I've switched from Google Analytics to GoAccess ~5 years ago. I let both analytics run for a month and compared the results. The relative numbers were very similar (so I get the same information about what blog posts are most popular), but the absolute numbers were in fact lower for GoAccess. It might be because tech blog visitors are using AdBlockers more often (and hence block GA).
> I'd love if I could change the time interval of the shown html stats.
GoAccess displays the data that you pass it. So while it doesn't have any date filter option (at least the last time I checked), you can just filter your logs beforehand. There's an even simpler solution that I'm using: set logrotate to a specific time frame (e.g. weekly), so you can pass "access.log" to GoAccess to only get the latest stats. You can still pass "access.log*" to get ALL stats at the same time.
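A minimal Python sketch of the "filter your logs beforehand" route, assuming timestamps in the usual `[10/Apr/2020:13:55:36 +0000]` form and an English/C locale for month names. The function names are made up for the example.

```python
from datetime import datetime, timezone


def parse_log_time(line: str):
    """Extract the bracketed timestamp from a log line, or None if absent/invalid."""
    start = line.find("[")
    end = line.find("]", start)
    if start == -1 or end == -1:
        return None
    try:
        return datetime.strptime(line[start + 1:end], "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None


def lines_since(lines, cutoff: datetime):
    """Yield only the log lines whose timestamp is at or after the cutoff."""
    for line in lines:
        timestamp = parse_log_time(line)
        if timestamp is not None and timestamp >= cutoff:
            yield line


sample = [
    '203.0.113.5 - - [09/Apr/2020:23:59:59 +0000] "GET /old HTTP/1.1" 200 100 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [11/Apr/2020:08:00:00 +0000] "GET /new HTTP/1.1" 200 100 "-" "Mozilla/5.0"',
]

cutoff = datetime(2020, 4, 10, tzinfo=timezone.utc)
recent = list(lines_since(sample, cutoff))
```

The filtered lines can then be written to a temporary file and handed to GoAccess in place of the full log.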
I’m not associated with Countly ( https://count.ly/ ), but I heard good things from my friends using it. It’s open source and makes money with an enterprise edition.
Has anyone had the experience of seeing over-reporting of some metrics in Google Analytics as compared to an internal tracking system? Is this data generally seen to be 100% reliable?
Nice work! Every time I see the "de-Google-ing" posts, there are people in comments saying but can it do this or can it integrate with that Google product.
I'm working on a Google Analytics alternative myself [1] and we make it clear that it is not meant as a clone or a full blown replacement of Google Analytics.
Some people are fine running GA and are happy to integrate with Google Ads and the rest of the Google ecosystem.
On the other hand, some would prefer to focus more on privacy of their visitors or on not having to get cookie / GDPR consent or on having a faster loading website or support a more independent web etc. And alternative solutions to Google products such as these are more meant for those use cases.
[+] [-] epoch_100|5 years ago|reply
[0] https://github.com/milesmcc/shynet
[+] [-] tjbiddle|5 years ago|reply
[+] [-] Danbana|5 years ago|reply
[+] [-] slykar|5 years ago|reply
[+] [-] marcus_holmes|5 years ago|reply
[+] [-] sdoering|5 years ago|reply
[+] [-] nicbou|5 years ago|reply
[+] [-] sputr|5 years ago|reply
[+] [-] kevingrahl|5 years ago|reply
Receive users’ consent before you use any cookies except strictly necessary cookies?
[+] [-] esperent|5 years ago|reply
[+] [-] sdoering|5 years ago|reply
[+] [-] dmkii|5 years ago|reply
[+] [-] JackWritesCode|5 years ago|reply
[+] [-] AdriaanvRossum|5 years ago|reply
[1] https://simpleanalytics.com
[+] [-] gpanders|5 years ago|reply
[+] [-] galfarragem|5 years ago|reply
[+] [-] Carpetsmoker|5 years ago|reply
[+] [-] luxurytent|5 years ago|reply
[+] [-] owenshen24|5 years ago|reply
[+] [-] johnchristopher|5 years ago|reply
[+] [-] luxurytent|5 years ago|reply
[1] https://github.com/zgoat/goatcounter/blob/master/docs/sessio...
[+] [-] wldlyinaccurate|5 years ago|reply
[+] [-] JackWritesCode|5 years ago|reply
* We allow our users to bypass ad-blockers and exclude themselves from tracking their own page views (https://usefathom.com/blog/custom-domains-embed-code)
* We are both full time on Fathom, bootstrapped (we've turned down millions in VC) and profitable (https://usefathom.com/blog/quit)
* We are adding in unlimited uptime monitoring on Friday (SMS, Telegram, Slack and email notifications)
* We run on auto-scaling infrastructure and are used by individuals, small businesses, governments and some of the biggest companies in the world
[+] [-] joe5150|5 years ago|reply
[+] [-] XCSme|5 years ago|reply
[+] [-] AndrewStephens|5 years ago|reply
[0] https://github.com/andrewstephens75/visitlog
[+] [-] celsoazevedo|5 years ago|reply
It's cheaper than Simple Analytics (mentioned in the article) and no data is sent to a service operated by a 3rd party because it's self-hosted.
[+] [-] XCSme|5 years ago|reply
[+] [-] chatmasta|5 years ago|reply
[+] [-] sradman|5 years ago|reply
[+] [-] PenguinCoder|5 years ago|reply
[+] [-] zubspace|5 years ago|reply
* No javascript.
* Does not need a database.
* Realtime stats in html format.
[1] https://github.com/allinurl/goaccess/issues/964#issuecomment...
[+] [-] darekkay|5 years ago|reply
[+] [-] darekkay|5 years ago|reply
* https://news.ycombinator.com/item?id=22813168
* https://news.ycombinator.com/item?id=21890027
* https://news.ycombinator.com/item?id=19883876
* https://news.ycombinator.com/item?id=18810035
[+] [-] girzel|5 years ago|reply
[+] [-] luxurytent|5 years ago|reply
[+] [-] monus|5 years ago|reply
[+] [-] tobilg|5 years ago|reply
[+] [-] wharfjumper|5 years ago|reply
[+] [-] traceroute66|5 years ago|reply
[+] [-] sdoering|5 years ago|reply
Nonetheless, one doesn't need log analysis and py2.X to use Matomo as a GA replacement in the frontend.
So yeah - they should fix the log analysis part with py2 -> py3. But that doesn't invalidate the whole tool imho.
[1]: https://www.python.org/dev/peps/pep-0373/
[+] [-] fareesh|5 years ago|reply
[+] [-] markosaric|5 years ago|reply
[1]: https://plausible.io/