ivansmf | 12 years ago

Hi - I figured you know what you're doing based on the details you picked on. I suspect you'd have way more fun reading the raw logs, but that would make for a post that is incredibly hard to parse. I will try to figure out a venue for them.

It was 100M keys per loader, so 30x100M for the low latency cluster and 45x100M for the high latency cluster. I sized the run to keep going long enough to get through half a dozen major compactions, not necessarily to eat storage space.
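For anyone who wants the totals spelled out, here is a quick sketch of that arithmetic. The per-loader key count and the loader counts come from the paragraph above; the variable and dictionary names are just illustrative.

```python
# 100M keys per loader, as stated above.
KEYS_PER_LOADER = 100_000_000

# Loader counts per cluster, as stated above; the labels are illustrative names.
loaders = {"low_latency": 30, "high_latency": 45}

# Total keys written into each cluster over the whole run.
total_keys = {cluster: n * KEYS_PER_LOADER for cluster, n in loaders.items()}

for cluster, keys in total_keys.items():
    print(f"{cluster}: {keys:,} keys")
```

That works out to 3 billion keys for the low latency cluster and 4.5 billion for the high latency one.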

Most of the IO is sequential, which really makes no difference for our backend (sorry, I have to leave it at that). Read performance ends up affecting overall performance once the system reaches its limit of pending compactions, which I capped in the config. I could probably get a higher number without the cap, but it looked more realistic to set limits, and I would not deploy a server without them.

I had to make the fsyncs more frequent, not less, to reduce the odds of an operation sitting behind a long flush. This was a counter-intuitive finding for me, but it makes sense considering Persistent Disks throttle both IO sizes and IOPS. We do that to make sure a noisy neighbouring VM won't affect yours. So I turned on trickle_fsync and tuned the maximum sync size so that there is never one slow flush, just a lot of smaller ones.
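To illustrate the general idea of many small syncs instead of one big one, here is a minimal Python sketch. It is not how Cassandra implements trickle_fsync; the function name `write_with_trickle_fsync` and the `sync_interval_bytes` parameter are hypothetical, and the sizes in the usage note are made-up values, not the ones I used.

```python
import os
import tempfile


def write_with_trickle_fsync(path, chunks, sync_interval_bytes):
    """Write chunks to path, calling fsync whenever at least
    sync_interval_bytes of data have been written since the last sync.

    Returns the number of fsync calls made (hypothetical helper for
    illustration only).
    """
    syncs = 0
    unsynced = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            unsynced += len(chunk)
            if unsynced >= sync_interval_bytes:
                # Push a bounded amount of dirty data to disk now, so no
                # later operation has to wait behind one huge flush.
                f.flush()
                os.fsync(f.fileno())
                syncs += 1
                unsynced = 0
        if unsynced:
            # Sync whatever is left over at the end.
            f.flush()
            os.fsync(f.fileno())
            syncs += 1
    return syncs


def demo_sync_counts():
    """Write 1 MiB twice: once with 64 KiB syncs, once with a single sync."""
    chunks = [b"x" * 4096] * 256  # 256 chunks of 4 KiB = 1 MiB
    with tempfile.TemporaryDirectory() as tmp:
        small = write_with_trickle_fsync(
            os.path.join(tmp, "small"), chunks, 64 * 1024)
        large = write_with_trickle_fsync(
            os.path.join(tmp, "large"), chunks, 1024 * 1024)
    return small, large
```

Calling `demo_sync_counts()` writes the same 1 MiB either as 16 small syncs or as one large one. On a device that throttles both IO size and IOPS, each of the small syncs stays short, which is the effect I was after.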

Tuning the Java GC was for the same reason - I did not want long GC cycles. The settings I used were based on the guidance from Java memory ergonomics. The DataStax distribution has tips and hints in the cassandra-env.sh file itself, which I read and followed. I also read more about Java memory than I ever want to read again from various vendors. As the joke goes, "I did even the things that contradicted the other things" before getting to the settings you saw.

I hope you appreciate the fact I set limits on some of the most dangerous knobs, for instance limiting the RPC server and using HSHA as opposed to unbounded sync.

I understand the limitations of the benchmark I used, and the limitations of the post. I tried to stay within recommended settings for all subsystems, set safe limits for everything I found dangerous, and I picked the workload because that is what our customers do. I don't advocate one storage solution over another; we love everyone that buys our solutions :)

BTW, while we do not sell Cassandra as a service, we do sell Cloud SQL (https://developers.google.com/cloud-sql/). Maybe the sales guys will give me a cut :)

Sharing the config was more work, but I suspected I was going to learn something by doing it. Thank you for the feedback!

PS: sorry for the delay in replying - Y Combinator says I am posting too much. I am not quite sure at what point I will hit my daily quota of replies.
