t90fan | 2 years ago

Running regular incremental repairs is the norm, as nodes will from time to time have trouble talking to each other for real-world network reasons, or will go down for things like OS patching. We had a daily cron job for it. I come from the software side, not the DBA side of things, but my main advice from running Cassandra at scale in production (it was part of an Apigee stack) is, basically, don't! It was not very reliable, would consume huge volumes of memory (especially during repairs), bandwidth (a repair is very chatty as it has to sync lots of data) and disk space (tombstoning meant deleted records took up space until compaction ran), was generally not much fun to manage, and it was difficult to hire people who knew much about it. I would not build a solution myself using it going forward. We also had to periodically (weekly) do "full" repairs to work around Cassandra bugs, silent data corruption, etc.

leokennis|2 years ago

We (as in, my company, not me myself) run large Cassandra clusters in the critical path of bank transaction processing (in the order of 2-25 million payments per day, each requiring a lot of database queries) and it's going pretty well...

https://www.youtube.com/watch?v=0QsLU9na2uE

But yes, you win some (mainly resilience, availability and disaster avoidance; tunable consistency may help you too), you lose some.

hsjqllzlfkf|2 years ago

To do 2-25 million transactions per day you might as well use SQLite. Sounds like this was a career development push more than anything.
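
For scale, here is a rough back-of-the-envelope conversion of those daily figures into average per-second rates. It assumes the load is spread evenly over 24 hours, which real payment traffic is not, and it counts payments rather than the (unstated) number of queries behind each one. The function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_per_second(per_day: int) -> float:
    """Average rate per second, assuming perfectly even load over a day."""
    return per_day / SECONDS_PER_DAY


# Daily payment volumes quoted upthread: 2 million to 25 million.
low = average_per_second(2_000_000)
high = average_per_second(25_000_000)

print(f"{low:.1f} to {high:.1f} payments/s on average")
# prints "23.1 to 289.4 payments/s on average"
```

Peak-hour rates and the per-payment query fan-out would push the real database load well above these averages.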

darkstar_16|2 years ago

We run a Cassandra cluster in production and it's a pretty small cluster, yet everything you mentioned resonates. We do use Cassandra Reaper to automate some of the tasks, but in general no one on the team wants to touch Cassandra.

hardwaresofton|2 years ago

Thanks for sharing your experience -- I know I've spent a lot of time in the past worrying about FS corruption, but generally expecting that the database sitting on top of it should never get corrupted, mostly because I use postgres so much.

I don't have the experience you do in this situation, but my first reaction to this was definitely "don't use Cassandra". But I also never really understood the use-case where Cassandra shines as a solution either (seems like only companies with a lot of data really seem to get wins from it?)

rickette|2 years ago

Can recommend https://cassandra-reaper.io/ for most of the management stuff you're mentioning. Still not free though, running Cassandra requires (some) effort in my experience.

TideAd|2 years ago

Scaling up also takes up a lot of resources so you're never able to scale up in response to load without hosing your database even more.