top | item 37103532


pixel_tracing | 2 years ago

I’d like to push back on these suggestions a bit.

1. Why use Postgres distributed cluster vs say an incremental store that supports real time data like Materialize? A streaming database sounds like the right fit for your requirements, no? Under 1 minute of real-time latency? Can distributed Postgres do that efficiently? (I’ve never tried distributed Postgres.)

2. Why use typescript at all? Why not pick a language with class and type validation baked into the language itself (like Rust or Go)? It sounds like programmer error is the root cause of the issue, which should be looked into as a mitigation step (add real class validation at a minimum).

3. Regarding audit tables, are you also keeping audit tables for user and events tables too? That would seem… excessive, and the data is now duplicated (especially with billions of rows). Doesn’t the database come with audit tables baked into it?

habitue|2 years ago

2. TypeScript is the language they're using. Generally, people don't rewrite their entire codebase in a different language because they ran across a bug that a different language would hypothetically have prevented. Additionally, this is a validation/coercion issue. It is a risk at the boundary of any two systems: you have to translate network bytes into a type your type system understands. If you translate it wrong, the type system is helpless to save you.

1. A streaming database isn't necessarily what they need here; Postgres is plenty fast for most use cases. I'd move to a specialized tool like Materialize if they've squeezed all of the juice out of Postgres.

3. Postgres doesn't have audit tables built in, though there are multiple tools and plugins for Postgres that can hook into things and do it for you. They went with a custom trigger solution; maybe that was sufficient for their use case.
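
To make the boundary point in (2) concrete, here is a minimal sketch of what runtime validation at a network boundary might look like in TypeScript. The `TrackedEvent` shape and both function names are hypothetical, invented for illustration; this is not Heap's schema or code.

```typescript
// Hypothetical event shape, for illustration only.
interface TrackedEvent {
  userId: number;
  name: string;
}

// Checked parse: every field is verified at runtime before the value
// is allowed to carry the TrackedEvent type.
function parseEvent(raw: string): TrackedEvent {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("event must be a JSON object");
  }
  const { userId, name } = data as Record<string, unknown>;
  if (typeof userId !== "number") {
    throw new Error("userId must be a number");
  }
  if (typeof name !== "string") {
    throw new Error("name must be a string");
  }
  return { userId, name };
}

// Unchecked parse: the cast satisfies the compiler, but nothing is
// verified, so a string userId flows through typed as a number.
function parseEventUnchecked(raw: string): TrackedEvent {
  return JSON.parse(raw) as TrackedEvent;
}
```

Given a payload like `{"userId": "7", "name": "click"}`, `parseEvent` throws, while `parseEventUnchecked` returns an object whose `userId` is a string at runtime despite its declared type. A statically typed language faces the same translation step when it decodes bytes; the difference is only in how hard it is to skip the check.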

malisper|2 years ago

I used to work at Heap, although I left 4 years ago.

> Why use Postgres distributed cluster vs say an incremental store that supports real time data like Materialize

Materialize didn't exist when Heap was founded 10 years ago. Also, Materialize is dependent on knowing what queries you are running up front. Not to mention Heap is dealing with petabytes of data. Materialize only recently introduced multi-node support, so I would be surprised if it's being used at that kind of scale.

> Why use typescript at all?

Heap was originally written in CoffeeScript. It was the decision the semi-technical CEO made. Migrating to TypeScript was the best option that allowed Heap to keep their existing codebase.

> Regarding audit tables, are you also keeping audit tables for user and events tables too?

No. Only the distributed metadata had audit logging when I was there.

> Doesn’t the database come with audit tables baked into it?

No