Datomic is Free | WingNews

[+] augustl|2 years ago|reply

Datomic's is perfect for probably 90% of small-ish backoffice systems that never has to be web scale (i.e. most of what I do at work).

Writing in a single thread removes a whole host of problems in understanding (and implementing) how data changes over time. (And a busy MVCC sql db spends 75% of its time doing coordination, not actual writes, so a single thread applying a queue of transactions in sequence can be faster than your gut feeling might tell you.)

Transactions as first-class entities of the system means you can easily add meta-data for every change in the system that explains who and why the change happened, so you'll never again have to wonder "hmm, why does that column have that value, and how did it happen". Once you get used to this, doing UPDATE in SQL feels pretty weird, as the default mode of operation of your _business data_ is to delete data, with no trace of who and why!

Having the value of the entire database at a point in time available to your business logic as a (lazy) immutable value you can run queries on opens up completely new ways of writing code, and lets your database follow "functional core, imperative shell". Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?

Looking forward to see what this does for the adoption of Datomic!

[+] electroly|2 years ago|reply

> Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?

This one confused me. The obvious reason why you don't want the whole working set of your database in the app server's memory is because you have lots of app servers, whereas you only have one database[1]. This suggests that you put the working set of the database in the database, so that you still only need the one copy, not in the app servers where you'd need N copies of it.

The rest of your post makes sense to me but the thing about keeping the database's working set in your app server's memory does not. That's something we specifically work to avoid.

[1] Still talking about "non-webscale" office usage here, that's the world I live in as well. One big central database server, lots of apps and app servers strewn about.

[+] epolanski|2 years ago|reply

> Datomic's is perfect for probably 90% of small-ish backoffice systems that never has to be web scale (i.e. most of what I do at work).

So is any cloud-managed db offering and at that scale we talking very small costs anyway.

Why datomic instead?

[+] xpe|2 years ago|reply

> Datomic's is perfect for probably 90% of small-ish backoffice systems that never has to be web scale (i.e. most of what I do at work).

I don’t think I agree with this as stated. It is too squishy and subjective to say “perfect”.

More broadly, the above is not and should not be a cognitive “anchor point” for reasonable use cases for Datomic. Making that kind of claim requires a lot more analysis and persuasion.

[+] fulafel|2 years ago|reply

> Someone needs to have the working set of your database in memory, why shouldn't it be your app server and business logic?

This is Ions in the Cloud version, or for on-prem version the in-process peer library.

[+] Lutger|2 years ago|reply

Datomic always seemed like a really cool thing to use. However, I'm not familiar with Clojure or any other JVM based language, nor do I have the time to learn it. And I can't find any supported way to use it with other languages (I'm not even talking about popular frameworks), or am I missing something?

It doesn't feel like the people behind Datomic actually want to have users outside of the Clojure world, which will be rather limiting to adoption.

[+] brundolf|2 years ago|reply

Something I've been curious about: how well (or badly) would it scale to do something similar on a normal relational DB (say, Postgres)?

You could have one or more append-only tables that store events/transactions/whatever you want to call them, and then materialized-views (or whatever) which gather that history into a "current state" of "entities", as needed

If eventual-consistency is acceptable, it seems like you could aggressively cache and/or distribute reads. Maybe you could even do clever stuff like recomputing state only from the last event you had, instead of from scratch every time

How bad of an idea is this?

[+] Scarbutt|2 years ago|reply

Datomic's is perfect for probably 90% of small-ish backoffice systems that never has to be web scale

How do they scale it for Nubank? (millions of users)

[+] spariev|2 years ago|reply

One thing which is quite hard to do in Datomic is simple pagination on a large sorted dataset, as one can easily do with LIMIT/OFFSET in MySQL for example. There are solutions for some of the cases, but general case is not solved, as far as I remember (it’s been a while I used it extensively)

[+] pachico|2 years ago|reply

You seem to describe the Event Source paradigm rather than a database :)

[+] JimmyRuska|2 years ago|reply

> doing UPDATE in SQL feels pretty weird, as the default mode of operation of your _business data_ is to delete data, with no trace of who and why!

It's a good idea to version your schema changes using something like liquibase into git, that gets rid of at least some of those pains. Liquibase works on a wide variety of databases, even graphs like Neo4j

I got the same feeling in Erlang many times, once write operations start getting parallel you worry about atomic operations, and making an Erlang process centralize writes through its message queue always feels natural and easy to reason about.

[+] mst|2 years ago|reply

I guess NuBank (Cognitect's owners) have concluded that the paid licensing business wasn't worth the hassle compared to having the developer time involved spent on other things.

Releasing only binaries, while I understand people being grumpy about it, seems like an interesting way of keeping their options open going forwards. Since it was always closed source, it now being 'closed source but free' is still a net win.

The Datomic/Cognitect/NuBank relationship is an interesting symbiotic dynamic and while I'm sure we can all think of ways it might go horribly wrong in future I rather hope it doesn't.

[+] motoboi|2 years ago|reply

Very probably they understood that having a property database makes hiring and onboarding more difficult than necessary.

Open sourcing the database helps on that.

[+] samuell|2 years ago|reply

Question to people having used Datomic:

Based on experience with Prolog, I always thought using Datalog in a database like Datomic would mean being able to model your data model using stored queries as a very expressive way of creating "classes". And that by modeling your data model using nested such queries, you alleviate the need for an ORM, and all the boilerplate and duplication of defining classes both in SQL and as objects in OO code ... since you already modelled your data model in the database.

Does Datomic live up to that vision?

[+] bvanderveen|2 years ago|reply

From experience:

Datomic Cloud is slow, expensive, resource intensive, designed in the baroque style of massively over-complicated CloudFormation astronautics. Hard to diagnose performance issues. Impossible to backup. Ran into one scenario where apparently we weren't quick enough to migrate to the latest version, AWS had dropped support for $runtime in Lambda, and it became impossible to upgrade the CloudFormation template. Had to write application code to export/reimport prod data from cluster to another—there was no other migration path (and yes, we were talking to their enterprise support).

We migrated to Postgres and are now using a 10th of the compute resources. Our p99 response times went from 1.3-1.5s to under 300ms once all the read traffic was cut over.

Mother Postgres can do no wrong.

Still, Datomic seems like a cool idea.

[+] lgrapenthin|2 years ago|reply

As someone who is using Datomic Pro in production for many years now I must agree with you. One time I began a project with Datomic Cloud and it was a disaster similar to what you described. I learned a lot about AWS, but after about half a year we switched to Datomic Pro.

There were some cool ideas in Datomic Cloud, like IONs and its integrated deployment CLI. But the dev workflow with Datomic Pro in the REPL, potentially connected to your live or staging database is much more interactive and fun than waiting for CodeDeploy. I guess there is a reason Datomic Pro is the featured product on datomic.com again. It appears that Cognitect took a big bet with Datomic Cloud and it didn't take off. Soon after the NuBank acquisition happened. That being said, Datomic Cloud was not a bad idea, it just turned out that Datomic Pro/onPrem is much easier to use. Also of all their APIs, the "Peer API" of Pro is just the best IME, especially with `d/entity` vs. "pull" etc.

[+] JulianWasTaken|2 years ago|reply

I don't doubt your story of course, and I love Postgres, but comparing apples to oranges no?

Datomic's killer feature is time travel.

Did you simply not use that feature once you moved off Datomic (and if so why'd you pick Datomic in the first place)? Or are you using Postgres using some extension to add in?

[+] outworlder|2 years ago|reply

Are they _forcing_ you to use CloudFormation? Or is it just the officially supported mechanism?

> Mother Postgres can do no wrong.

I'll say that Postgres is usually the answer for the vast majority of use-cases. Even when you think you need something else to do something different, it's probably still a good enough solution. I've seen teams pitching other system just because they wanted to push a bunch of JSON. Guess what, PG can handle that fine and even run SQL queries against that. PG can access other database systems with its foreign data wrappers(https://wiki.postgresql.org/wiki/Foreign_data_wrappers).

The main difficulty is that horizontally scaling it is not trivial(although not impossible, and that can be improved with third party companies).

[+] ggleason|2 years ago|reply

> Datomic Cloud is slow, expensive, resource intensive, designed in the baroque style of massively over-complicated CloudFormation astronautics. Hard to diagnose performance issues. Impossible to backup.

You should give TerminusDB a go (https://terminusdb.com/), it's really OSS, the cloud version is cheap, fast, there are not tons of baroque settings, and it's easy to backup using clone.

TermiusDB is a graph database with a git-like model with push/pull/clone semantics as well as a datalog.

[+] ithrow|2 years ago|reply

I guess this is why datomic.com front page now defaults to datomic pro and not cloud.

[+] panick21_|2 years ago|reply

They should have just focused on having a great Kubernetes setup experience, their focus on this cloud stuff always seemed strange to me.

[+] avodonosov|2 years ago|reply

Why backups were impossible? Couldn't you backed up the storage resources?

[+] froggertoaster|2 years ago|reply

> Mother Postgres can do no wrong.

Simple, eloquent, damn true.

[+] martypitt|2 years ago|reply

Only the binaries are made available, not the source, which is interesting.

I guess they don't claim to be open source, they're claiming to be free, which is - in itself - awesome.

Last time I checked, you couldn't push binaries to maven central, without also releasing the source. That may have changed.

[+] miroljub|2 years ago|reply

They say it's under the Apache 2 licence, so it is open source.

EDIT: I was wrong. They actually released binaries under the Apache licence, not the source code. Which is, mildly said, deceptive. I don't even have an idea what that actually means.

[+] eternalban|2 years ago|reply

Someone (forget who but he worked there) was giving a presentation of Datomics in some downtown (NYC) bank circa 2014 iirc. Per the presenter -- iirc someone asked a specific technical question -- even people working for the company don't get to see the full source. Only a small team has access to the full source, and he said he wasn't one of them.

[+] casion|2 years ago|reply

Maven actually has a tutorial on publishing binaries without source. So I assume it's ok when they tell you how to do it.

[+] blatant303|2 years ago|reply

Datomic is an event-sourced db, and it makes it hard to introduce retroactive corrections to the data when your program's semantic already rely on using datomic's time travelling abilities: at one point you'll need to to distinguish between event time and recording time as explained in this excellent blog post:

https://vvvvalvalval.github.io/posts/2018-11-12-datomic-even...

This is why I' rather use XTDB [1], a database similar to datomic in spirit, but with bitemporality baked in.

[1] https://www.xtdb.com

[+] adamfeldman|2 years ago|reply

What is Datomic, you ask? It's a database written in Clojure. https://hn.algolia.com/?q=datomic

  Datomic is an operational database management system - designed for transactional, domain-specific data. It is not designed to be a data warehouse, nor a high-churn high-throughput system (such as a time-series database or log store).
  It is a good fit for systems that store valuable information of record, require developer and operational flexibility, need history and audit capabilities, and require read scalability.

(via https://docs.datomic.com/pro/getting-started/brief-overview....)

[+] derefr|2 years ago|reply

> Is it Open Source?

> Datomic binaries are provided under the Apache 2 license which grants all the same rights to a work delivered in object form.

So... no?

(I say that, but "Datomic binaries" presumably refers to compiled JVM class files; and JVM bytecode is notoriously easy to decompile back to legible source code, with almost all identifiers intact. Would Apache-licensing a binary, imply that you have the right to decompile it, publish an Apache-licensed source-code repo of said decompilation, and then run your own FOSS project off of that?)

[+] hombre_fatal|2 years ago|reply

Aside, I remember HN in 2009 or so where Clojure was a daily homepage staple and Rich Hickey was putting out his talks about Clojure and code design.

I watched a lot of that and used Clojure fulltime for five years. Wonder what he's up to these days.

[+] beders|2 years ago|reply

Every single day I wish the architects at my current job had chosen Datomic instead of Postgresql. It would have saved us so so much time and trouble. The time traveling ability alone would have been so useful so many times.

Also the ability to annotate transactions is awesome.

So many goodies.

Here's a good summary:

https://medium.com/@val.vvalval/what-datomic-brings-to-busin...

[+] brianwawok|2 years ago|reply

Datomic LOOKED cool 10-12 years when it first came out. But they started from day 1 with a price, so most people, me included, just passed over it.

I think they went way too fast to commercial, and needed to go a freemium model to actually get market share.

[+] xpe|2 years ago|reply

This doesn’t quite reflect the history. Datomic had various free/trial options. They evolved a little bit. Someone who watched the pricing and licenses very closely probably could do a better timeline than I could.

[+] rektide|2 years ago|reply

It notably powers Roam, a kind of interesting notes/mini Notion product.

There's a reasonably interesting writeup of the tech details that helps show off Atomics value some, https://www.zsolt.blog/2021/01/Roam-Data-Structure-Query.htm... https://news.ycombinator.com/item?id=29295532

[+] jchw|2 years ago|reply

Not complaining about the actual announcement itself here: seems pretty sweet all things considered, But: the "Is it Open Source?" section should lead with "No." It's not a complicated question, and it's not a complicated answer. I think it's weird to talk about having "all the same rights" without explaining why that matters particularly (it does matter, it's just not explained much!) but it is somewhat tangential to the question being posed which has a very clear and straightforward answer.

I hope more companies consider this unusual arrangement at least as an alternative to other approaches. Permissively licensed binaries can come in handy, though it certainly comes with it's risks. For example, Microsoft released the binaries for its WebView2 SDK under the BSD license; this is nice of course, but the side-effect is that we can (and did) reverse engineer the loader binary back to source code. I suspect that's unlikely to happen for any substantially large commercial product, and I am not a lawyer so I can't be sure this isn't still legally dubious, but it's still worth considering: the protections of a EULA are completely gone here, if you just distribute binaries under a vanilla permissive open source license.

[+] thayne|2 years ago|reply

> Is it Open Source?

> Datomic binaries are provided under the Apache 2 license which grants all the same rights to a work delivered in object form.

That doesn't answer the question at all. I assume the answer is no, because otherwise they would just say yes, and have a link to the source code somewhere. But that is such a weird, and possibly duplicitous way to answer.

[+] kgwxd|2 years ago|reply

I really like Clojure and the ideas behind Datomic but free without source is a trap, every time. They have to make money somehow, but they already sold to a bank. If that bank wants devs willing to work on their systems after the current generation moves on, I think they'd be better off going open source and to continue paying good devs to work on it. Everyone already knows lock-in is bad for businesses. Devs will seek non-proprietary solutions first, if they can't find it, there are already plenty of proven proprietary solutions they'll settle on way before Datomic. Open the source, sell the support.

[+] fulafel|2 years ago|reply

> Datomic Cloud will be available on AWS Marketplace with no additional software cost.

This is cool as well. It's a CloudFormation template based product you can deploy from AWS Marketplace.

[+] CrimsonCape|2 years ago|reply

Since the conversation seems to be focusing on the Apache 2.0 license, what would you do? Clearly there isn't a lot of precedent for "closed-source, free-to-use" licenses.

In this case Datomic maintains development control over their product and "source of truth" is still themselves, and the implicit assumption is that you enthusiastically use their product for free with no strings attached because you respect them as the source of truth.

[+] yencabulator|2 years ago|reply

> Clearly there isn't a lot of precedent for "closed-source, free-to-use" licenses.

Freeware has been a thing for mere four decades now.

https://en.wikipedia.org/wiki/Freeware

[+] endisneigh|2 years ago|reply

The price is certainly right, but has anyone used this in production? What was your experience like?

[+] ddellacosta|2 years ago|reply

My personal experience was using Datomic backed by DynamoDB, at the second Clojure company I worked at. In particular I remember feeling like it was hard to anticipate and understand its performance characteristics in particular, and how indices can be leveraged effectively. Maybe if we had chosen Postgres as a backing store that would have been better? I dunno.

Using it was pretty nice at the scale of a small startup with a motivated team, but scaling it up organizationally-speaking was a challenge due to Datalog's relative idiosyncrasy and poor tooling around the database itself. This was compounded by the parallel challenge of keeping a Clojure codebase from going spaghetti-shaped, which happens in that language when teams scale without a lot of "convention and discipline"--it may be easier to manage otherwise. All of that said, this was years ago so maybe things have changed.

At this point I'd choose either PostgreSQL or SQLite for any project I'm getting started with, as they are both rock-solid, full-featured projects with great tooling and widespread adoption. If things need to scale a basic PostgreSQL setup can usually handle a lot until you need to move to e.g. RDS or whatever, and I'm probably biased but I think SQL is not really that much worse than Datalog for common use-cases. Datalog is nice though, don't get me wrong.

EDIT: one point I forgot to make: the killer feature of being an immutable data store that lets you go back in time is in fact super cool, and it's probably exactly what some organizations need, but it is also costly, and I suspect the number of organizations who really need that functionality is pretty small. The place I was at certainly didn't, which is probably part of the reason for the friction I experienced.

[+] Naomarik|2 years ago|reply

https://sayartii.com/ is using Datomic stored on postgres that I have set up on Linode. That was all done back in 2020 and haven't needed to touch it. Site now gets ~180M monthly reqs and I store an enormous amount of analytic data on Datomic (was supposed to be temporary) so users can see impressions/clicks per day for each advertisement. I'm surprised it's still working.

Development experience is extremely nice using clojure. I've used it for two other projects and has been very reliable. My latest project didn't really need any of its features compared to a traditional rdbms but I opted for it anyways so I don't have to write sql.

[+] jackrusher|2 years ago|reply

Yes, it's used by many companies in production. There's a partial list here:

https://www.datomic.com/customers.html

[+] adamfeldman|2 years ago|reply

One example: https://www.datomic.com/nubanks-story.html

[+] raybb|2 years ago|reply

I haven't used it but I guess Nubank uses it. https://www.youtube.com/watch?v=qIdrT6r77gA

[+] xmlblog|2 years ago|reply

Netflix, Facebook (Meta), Nubank, and many others.

[+] AzzieElbab|2 years ago|reply

Congratulations to Rich Hickey's children!! I hope your college experience was excellent. Disclaimer: that is how Rich explained why Datomic stayed closed source.

[+] avodonosov|2 years ago|reply

If you work with Datomic, my little helper functions may be useful: https://github.com/avodonosov/datomic-helpers

368 comments