top | item 44838210

(no title)

j_kao | 6 months ago

Author here! We were really motivated to turn a "distributed system" problem into a "monolithic system" problem from an operations perspective, and felt this was achievable with current hardware. That's why we went with in-process, embedded storage systems like RocksDB and Tantivy.

Memory-mapping lets us get pretty far, even with global coverage. We are always able to add more RAM, especially since we're running in the cloud.

Backfills and data updates are also trivial and can be performed in an "immutable" way, without having to reason about what's currently in ES/Mongo: we just re-index everything with the same binary on a separate node and ship the final assets to S3.
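
For readers who want a concrete picture of the "immutable" re-index pattern described above, here is a minimal sketch. It is not the author's code: `build_index`, `publish`, the directory layout, and the `current` symlink are all hypothetical, and a local temp directory stands in for both the separate build node and S3. The idea is only that each build writes a fresh, versioned set of assets and the live pointer is swapped in one atomic step, so nothing ever mutates the index that is currently being served.

```python
import os
import tempfile
from pathlib import Path


def build_index(records, out_dir: Path) -> None:
    """Hypothetical stand-in for the real indexer: writes one sorted file."""
    out_dir.mkdir(parents=True)
    (out_dir / "index.txt").write_text("\n".join(sorted(records)))


def publish(root: Path, version: str, records) -> Path:
    """Build a fresh versioned index, then atomically repoint `current`."""
    target = root / f"index-{version}"
    build_index(records, target)            # never touches the live index
    tmp_link = root / f".current-{version}.tmp"
    os.symlink(target.name, tmp_link)       # relative link inside root
    os.replace(tmp_link, root / "current")  # atomic rename on POSIX
    return target


# Assumption: a temp directory plays the role of the shared asset store.
root = Path(tempfile.mkdtemp())
publish(root, "v1", ["berlin", "amsterdam"])
publish(root, "v2", ["berlin", "amsterdam", "tokyo"])

# Readers always go through `current`, so they see a complete build.
live = (root / "current" / "index.txt").read_text().splitlines()
```

Because the previous build (`index-v1` here) is left in place, rolling back would just mean repointing `current` at the older directory.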

benjiro|6 months ago

Why not just use an open-source solution like ParadeDB?

ParadeDB = Postgres with the pg_search plugin (built on Tantivy). Need anything else, like vectors or whatever? Get the corresponding plugins for Postgres.

The only thing you're missing is an LSM solution like RocksDB. See OrioleDB, which is supposed to become a pluggable storage engine for Postgres but is not yet out of beta.

Feels like people reinvent the wheel very often.

farsa|6 months ago

What was your experience like putting such a thing together?