
Umami: Self-hosted open-source alternative to Google Analytics

820 points| bananaoomarang | 5 years ago |umami.is

227 comments

[+] mcao|5 years ago|reply
Hi everyone!

Author of Umami here. I totally did not expect this response so it looks like you all hugged my little server to death. The demo should be back up now.

A little background. This is a side project I started 30 days ago because I was tired of how slow and complicated Google Analytics was. I just wanted something really simple and fast that I could browse quickly without diving through layers of menus. So I created Umami to track my own websites and then open sourced it. The stack is React, Redux, and Next.js with a PostgreSQL backend.

Would be happy to answer any questions you have.

[+] sorenbs|5 years ago|reply
This is a really cool project. I’m happy to see that you are using Prisma for data access. If you are interested we can set up a shared Slack channel so you can provide feedback and we can make sure we support everything you need for this project :-)
[+] anderspitman|5 years ago|reply
Since it's self-hosted, is there a reason you went with Postgres rather than something simpler like SQLite or even flat files?
[+] ksec|5 years ago|reply
Is there any reason why you decided to create your own instead of using one of the currently available open source solutions? (Other than that it is fun to do it yourself :D)

I am wondering why in the past 2 years we went from having few or no GA alternatives to all of a sudden having dozens of them.

I am genuinely curious.

[+] malisper|5 years ago|reply
One of the claims of Umami is that it's GDPR compliant:

> Umami does not collect any personally identifiable information so it is GDPR and CCPA compliant. No cookie notices are needed because Umami does not use cookies.

From auditing the source code, this doesn't seem to be the case. First, it claims it doesn't use cookies, but it clearly uses localStorage to store a "sessionKey"[0].

The other claim, that Umami is GDPR and CCPA compliant because it does not collect any personally identifiable information, is only half true. While the data collected isn't PII (because you can't use it on its own to identify a user), it's still "personal data". This is because the "sessionKey" stored alongside all events is actually a pseudonymous user identifier. It's really just a hash of the user's IP along with a few other properties[1]. Because the data Umami collects, when combined with some other data, can be attributed back to the user, the data is still considered "personal data". That means you're still subject to most of GDPR, such as GDPR deletion requests[2].
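To make the point about pseudonymous identifiers concrete, here is a minimal sketch in Python. It is not Umami's actual code (which is JavaScript); the function name `session_key`, its inputs, and the salt are all assumptions for illustration. The idea is only that hashing an IP with a few stable properties gives the same value on every visit, so events can still be linked to one visitor.

```python
import hashlib


def session_key(ip: str, user_agent: str, website_id: str,
                salt: str = "example-salt") -> str:
    """Hypothetical helper: derive a stable identifier from request properties."""
    raw = "|".join([website_id, ip, user_agent, salt])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


first_visit = session_key("203.0.113.7", "Mozilla/5.0", "site-1")
second_visit = session_key("203.0.113.7", "Mozilla/5.0", "site-1")
other_visitor = session_key("203.0.113.8", "Mozilla/5.0", "site-1")

# The same visitor properties always map to the same key...
print(first_visit == second_visit)   # True
# ...while a different IP yields a different key.
print(first_visit == other_visitor)  # False
```

The hash hides the raw IP, but anyone who holds the IP and the other inputs can recompute the key and find that visitor's events, which is why it is treated as personal data.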

[0] https://github.com/mikecao/umami/blob/f4ca353b5c68750bf391e5...

[1] https://github.com/mikecao/umami/blob/master/lib/session.js#...

[2] https://gdpr-info.eu/art-17-gdpr/

[+] eric4smith|5 years ago|reply
Lots of home-grown analytics are very privacy focussed these days and do not use cookies. That's a good thing.

For simple sites like blogs, simple low volume ecommerce, etc.

But for more "serious" eCommerce, SaaS-based applications and sites that are concerned with marketing on email, social and web, then optimizing what you show them, and finally generating leads for salespeople to call or actual sales...

Cookies or local storage, or some way of tracking the customer across all the channels and their actions are essential.

If one can avoid using Google Analytics, then that's a good thing also.

But let's get real -- the idea of a cookie-less future is not gonna happen because people actually do business on the web.

[+] andrewzah|5 years ago|reply
I have been using goatcounter [0] and love the simplicity. I used to use Matomo, but they want a lot of money to see the referrals from google search/etc. And it's a heavier dependency. Goatcounter is a drop-in golang binary.

[0]: https://github.com/zgoat/goatcounter

[+] lxe|5 years ago|reply
I've seen a bunch of these simple self-hosted log dashboards here on HN, but I don't think they directly compare with google analytics, which is just a much more powerful and much much more complicated product. Not to say this isn't a great product, but it really isn't an alternative to GA.
[+] slg|5 years ago|reply
I wonder how many users actually use those advanced features. As someone who has only ever used GA to help provide insight into developmental priorities (i.e. not for marketing), this doesn't help too much. For example, this tells you the browser but it doesn't tell you the browser version. It tells you the device being used, but it doesn't tell you the resolution of that device. It tells you the country of your visitors, but it doesn't tell you the user's language. It tells you pages users visit, but it doesn't tell you the order in which they visit them.

This isn't a criticism of Umami. It looks like a nice clean app that accomplishes what it is trying to do. But if this is all you needed from Google Analytics then that tool was overkill in the first place.

[+] swores|5 years ago|reply
Alternative doesn't have to mean it offers exactly the same - for example a bike is an alternative travel option compared to a bus.
[+] ggregoire|5 years ago|reply
Do you know any good resources to learn the intricacies of Google Analytics and its related marketing concepts?
[+] arielm|5 years ago|reply
This looks really nice! If... you’re only looking for high level numbers for something like a personal blog or a simple landing page for a mobile app.

I wouldn’t call this a replacement to Google Analytics.

The reason to have something like Google Analytics is to track traffic at a more granular level, and with very specific intent.

Some of the things I _rely_ on include:

- custom parameters
- segments
- goals
- A/B testing
- specific views

And that’s just the short list.

Now, I use Analytics heavily because we spend a lot of effort on growth, both organic (content, seo) and paid (ads), so knowing what’s going on at that level is essential.

If you don’t, there’s not much reason to use something like GA.

[+] vs4vijay|5 years ago|reply
[+] XCSme|5 years ago|reply
Thanks for mentioning https://www.userTrack.net, I'm the author and still working full-time on improving it. Let me know if you have any questions/remarks about userTrack.
[+] gentleman11|5 years ago|reply
The problem with Matomo (not their fault) is that Microsoft flags your site as distributing malware and you disappear from search engines. You have to fill out a bunch of forms to fix it. It’s listed in the Matomo FAQ and is basically either from a bot falsely reporting you, or some other glitch. It’s why my blog is still invisible to Bing users: if you visit in Edge, you get huge menacing red warnings.
[+] tiffanyh|5 years ago|reply
There are a bunch of Github "awesome software" lists.

One thing I haven't seen is someone categorize open source web traffic analytics into Client Side Analytics (via javascript) and Web Server Log analytics.

Since each approach drastically changes the data collected and reported.

[+] tedivm|5 years ago|reply
Fathom started as open source, but the founders stopped supporting the open source project. It's basically abandoned at this point, with no new releases in almost two years and only updates to the README.
[+] woutr_be|5 years ago|reply
I’ve been using GoatCounter for a couple of months now, it’s great! Super simple interface with all the data you need.
[+] ahnick|5 years ago|reply
+1 for Plausible.

I've been using it for our name generator product Mashword (https://mashword.com) and it was really straightforward to implement. It's reasonably priced, has a clean interface and graphs, is privacy protecting and supports using your own domain for pulling in the js include.

[+] poidos|5 years ago|reply
+1 for Matomo. Very easy to set up a self-hosted instance, good documentation, and works well. NB: My site is pretty low-traffic.
[+] diafygi|5 years ago|reply
Of these, do any have a funnel tracking feature that shows what visitors went through a specific series of pages/events? Seeing how users moved about the site and seeing how many converted is a deal breaker for me.
[+] Deukhoofd|5 years ago|reply
Any privacy oriented analytics tools not purely focused on website analytics?
[+] asddubs|5 years ago|reply
posthog is really unfortunately named
[+] thinkmassive|5 years ago|reply
A comparison of Umami and Matomo (formerly Piwik) would be helpful since they seem very similar. I looked at both websites and didn't see any mention of the other project.
[+] colechristensen|5 years ago|reply
Is there a similar product that does this server side (without injected javascript telemetry) with http logs?
[+] anderspitman|5 years ago|reply
Keep in mind that if you're using a CDN (e.g. Cloudflare), your absolute numbers will be way off.
[+] eli|5 years ago|reply
You end up with a lot of noise from bots and crawlers (using bogus user agents) if you're just looking at server logs.
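As a rough illustration of the server-log approach and the bot problem, here is a small Python sketch that counts page views from access log lines. It assumes the common "combined" log format; the `BOT_HINTS` list and the function name `count_page_views` are made up for this example. Note that a crawler sending a browser-like user agent would slip straight through this filter, which is exactly the noise issue.

```python
import re
from collections import Counter

# Assumed layout: the "combined" access log format used by nginx and Apache.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately naive list of user-agent substrings.
BOT_HINTS = ("bot", "crawler", "spider")


def count_page_views(lines):
    """Count successful GET requests per path, skipping self-declared bots."""
    views = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a combined-format line
        agent = match.group("agent").lower()
        if any(hint in agent for hint in BOT_HINTS):
            continue
        if match.group("method") == "GET" and match.group("status") == "200":
            views[match.group("path")] += 1
    return views


sample = [
    '203.0.113.7 - - [18/Aug/2020:10:00:00 +0000] "GET /blog HTTP/1.1" 200 512 '
    '"https://news.ycombinator.com/" "Mozilla/5.0"',
    '198.51.100.2 - - [18/Aug/2020:10:00:01 +0000] "GET /blog HTTP/1.1" 200 512 '
    '"-" "Googlebot/2.1"',
]
print(count_page_views(sample))  # Counter({'/blog': 1})
```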
[+] ln_00|5 years ago|reply
to be honest, if you are using nginx, just run https://goaccess.io/. It collects the same information as umami and is even more lightweight, since it just runs whenever you tell it to.

just add the command as a cron job, and you get an auto generated static dashboard. very neat.

[+] chrisblackwell|5 years ago|reply
I'm very excited to see this space heating up. It seems that for years we defaulted to using Google Analytics and no one wanted to enter the market. Now there are plenty of alternatives, many of them open source.
[+] dzink|5 years ago|reply
It needs more granularity of OS versions and browser versions. Knowing which iOS version your users have is important to decide on what base level version you need for an iOS app, for example.
[+] eden_h|5 years ago|reply
When I've seen GA used or recommended to people, it's because their use case is tracking the marketing performance of their website.

Tackling the privacy focus for GA is great, but there are a good number of products out there that already fill that niche, not to mention the requirements of the privacy crowd usually being a venture unto itself.

If you wanted to make it relatively competitive for marketing, the simplest addition would be adding labelling via regex for referrers.

i.e. - Some users want to be able to group Baidu, Google, DuckDuckGo, into a single bucket for comparison. Some users want to break them down into common market segments by country. "https://www.baidu.com/link?url=FyYbCZqj65Vc7A4XeSNrOcQCS2qFX...

is from your live demo referrers, and makes it difficult to actually assess the amount of traffic from Baidu. Using a regex label means that users can break down traffic from Paid/Organic marketing fairly quickly, and start to build up dashboards they can use.

If you ever extended it to allow multiple labels for each hit, could re-run the regex over past data, and could build reports off it, you'd easily have a benefit over GA that would start to wean the marketing crowd off it.
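A rough Python sketch of what regex labelling could look like. The rule names, the patterns, and the `labels_for` helper are all hypothetical, and the referrer URLs are invented examples rather than data from the demo. Each hit can carry several labels, and because the rules are just data they can be re-run over stored referrers whenever they change.

```python
import re
from collections import Counter

# Hypothetical rules: (label, pattern). A referrer may match more than one.
RULES = [
    ("Search", re.compile(r"^https?://([\w-]+\.)*(google|baidu|duckduckgo)\.")),
    ("Baidu", re.compile(r"^https?://([\w-]+\.)*baidu\.")),
    ("Social", re.compile(r"^https?://([\w-]+\.)*(twitter|facebook)\.")),
]


def labels_for(referrer):
    """Return every label whose pattern matches, or ["Other"] if none do."""
    matched = [label for label, pattern in RULES if pattern.search(referrer)]
    return matched or ["Other"]


# Stored referrers from past hits (invented for this example).
past_referrers = [
    "https://www.baidu.com/link?url=example",
    "https://www.google.com/",
    "https://duckduckgo.com/",
    "https://example.org/post",
]

# Re-running the rules over past data produces the report buckets.
totals = Counter(label for ref in past_referrers for label in labels_for(ref))
print(totals["Search"], totals["Baidu"], totals["Other"])  # 3 1 1
```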

[+] hitekker|5 years ago|reply
For something this simple, I was hoping to see an option for SQLite, not just MySQL and PostgreSQL.
[+] busymichael|5 years ago|reply
Congrats on launching -- really impressive. One important issue that these self hosted analytics solve is ad blocking. Ad blocking by users really undermines the ability of a site or app to figure out what is working and not working. When you host your own analytics, you can get usability information for all of your users, not just those that don't block. That allows you to make a better product.

I have been working on something similar at https://argyle.cc -- we combine cloud analytics with a self-hosted analytics collector js. That gives you the best of both worlds: privacy focused, user respecting analytics, but full featured reporting in the cloud and ad-blocker resistance. It also allows event tracking to be done over js/web or in-line/server side.

[+] marcus_holmes|5 years ago|reply
I'd love to use this. But 34 dependencies?

I know ~10 of them are React, and there are some in there that make sense. But I haven't got the time to audit them all, and re-audit them every time any of those dependencies update.

And escape-string-regexp? Really? it's literally 2 lines of code [0]. Why have I got to give the maintainer of that project commit access to this program that will be seeing potentially sensitive data?

Why, if the developer couldn't come up with those 2 lines themselves, isn't this a Stack Overflow copy/paste?

[0] https://github.com/sindresorhus/escape-string-regexp/blob/ma...

[+] m90|5 years ago|reply
Is there a way for me as a user to opt out of this tool other than relying on third party tools like uBlock? I'm starting to get annoyed by so many "privacy focused" tools with literally no consent options at all.
[+] shattl|5 years ago|reply
Wow, Piwik is now Matomo, how fast time flies!
[+] nickthemagicman|5 years ago|reply
Am I wrong for thinking that Google Analytics has bad UI?

As a noob at UI it was bizarre and unintuitive for me.

Just finding the region locations of the traffic was odd and didn't make immediate sense.