aristus|1 year ago
After a few minutes of loading data, the kernel calmed down and it worked like a champ. Millions of transactions per second across billions of records, on a $500 computer... and a card that cost more than my car.
Definitely wouldn't do it that way these days, but it was an impressive bit of kit.
sargun|1 year ago
Somehow we ended up with a FusionIO card in tow. We went from something like 5,000 read QPS to 300k read QPS on pgbench, using the cheapest 2TB card.
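For context, read QPS numbers like that typically come from pgbench's built-in select-only script. A sketch of the sort of invocation involved (scale factor, client count, and database name here are illustrative, not the original setup):

    pgbench -i -s 1000 benchdb            # initialize at scale factor 1000
    pgbench -S -c 64 -j 8 -T 60 benchdb   # select-only: 64 clients, 8 threads, 60 seconds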
Ever since then, I’ve thought of vertical scaling as far more tenable than I originally gave it credit for. It turns out hardware can do a lot more than we think.
hinkley|1 year ago
But the tricky bit there is that you may need the write response to carry the results of the read that a successful write normally triggers. Otherwise you have to solve replication-lag problems on the replica.
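One way to do that with Postgres is to have the write itself return the row, so the client never touches a possibly-lagging replica. A minimal sketch via PgJDBC; the connection details and the orders table are made up:

    import java.sql.*;

    public class WriteThenReturn {
        public static void main(String[] args) throws SQLException {
            try (Connection primary = DriverManager.getConnection(
                     "jdbc:postgresql://primary:5432/app", "app", "secret");
                 PreparedStatement ps = primary.prepareStatement(
                     "INSERT INTO orders (item, qty) VALUES (?, ?)"
                     + " RETURNING id, created_at")) {
                ps.setString(1, "widget");
                ps.setInt(2, 3);
                // executeQuery yields the RETURNING row: the "read" is part
                // of the write's own response, so no replica is involved.
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        System.out.printf("id=%d created_at=%s%n",
                                rs.getLong("id"), rs.getTimestamp("created_at"));
                    }
                }
            }
        }
    }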
immibis|1 year ago
Not to mention that individual servers, no matter how expensive, cost a tiny fraction of equivalent cloud capacity.
Remember the LMAX Disruptor hype? Their pattern was essentially to funnel all the data for the entire business logic onto one core, and make sure that core doesn't take any bullshit - write the fastest L1-cacheable nonblocking serial code with input and output in ring buffers. Pipelined business processes can use one core per pipeline stage. They benchmarked 20 million transactions per second with this pattern - in 2011. They ran a stock exchange on it.
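The Disruptor itself is a Java library; a minimal single-producer setup looks roughly like this (the event type and handler body are toy stand-ins, and the buffer size is an arbitrary power of two):

    import com.lmax.disruptor.EventHandler;
    import com.lmax.disruptor.RingBuffer;
    import com.lmax.disruptor.dsl.Disruptor;
    import com.lmax.disruptor.util.DaemonThreadFactory;

    public class DisruptorSketch {
        // Mutable slot, preallocated per ring-buffer entry: no GC on the hot path.
        static class TradeEvent { long value; }

        public static void main(String[] args) {
            Disruptor<TradeEvent> disruptor = new Disruptor<>(
                    TradeEvent::new, 1 << 14, DaemonThreadFactory.INSTANCE);

            // One handler per pipeline stage; chaining further stages with
            // .then() gives one core per stage, as described above.
            disruptor.handleEventsWith((EventHandler<TradeEvent>)
                    (event, sequence, endOfBatch) -> process(event.value));
            disruptor.start();

            RingBuffer<TradeEvent> ring = disruptor.getRingBuffer();
            for (long i = 0; i < 1_000_000; i++) {
                // Claim a slot, mutate it in place, publish: no locks, no allocation.
                ring.publishEvent((event, seq, v) -> event.value = v, i);
            }
            disruptor.shutdown();
        }

        static void process(long v) { /* business logic would go here */ }
    }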
linsomniac|1 year ago
As an experiment, I sent them a 600GB Intel SSD in a laptop-drive form factor. They took down the secondary node, installed the SSD, and brought it back up. We let DRBD sync the arrays, then failed the primary node over to this SSD node. I added the SSD to the volume group, then did a "pvmove" to move the blocks from the 8-drive array to the SSD, and over the next few hours the load steadily dropped to nothing.
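For the curious, the LVM side of that is only a few commands (the device and volume group names here are placeholders, not the actual setup):

    # /dev/md0 is the old 8-drive array, /dev/sdb1 the new SSD
    pvcreate /dev/sdb1           # label the SSD as an LVM physical volume
    vgextend vg00 /dev/sdb1      # add it to the existing volume group
    pvmove /dev/md0 /dev/sdb1    # migrate extents off the old array, online
    vgreduce vg00 /dev/md0       # then retire the old array from the VG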
It was fun to replace 8x 3.5" 10K drives with something that fit comfortably in the palm of my hand.
hinkley|1 year ago