top | item 45469716

privatelypublic | 4 months ago

I'm interested in... why? What are you building where loading data from disk is so lopsided vs. CPU load from compiling, or network load/latency? (One 200ms "is this the current git repo?" check is a heck of a lot of NVMe latency... and it's going to be closer to 2s than 200ms.)

finaard|4 months ago

I'm running the same setup - our larger builders have two 32-core EPYCs with 2TB of RAM. We were doing that type of setup almost two decades ago at a different company, and at this one for over a decade now - back then it was the only option for speed.

Nowadays NVMe drives might indeed be able to get close - but we'd probably still need to span multiple SSDs (reducing the cost savings), and the developers there are incredibly sensitive to build times. If a 5-minute build suddenly takes 30 seconds more, we have some unhappy developers.

Another reason is that it'd eat SSDs like candy. Current enterprise SSDs have something like a 10,000 TBW rating, which we'd exceed in the first month. So we'd either buy cheap consumer SSDs and replace them every few days, or enterprise SSDs and replace them every few months - or stick with the RAM setup, which over the life of the build system will be cheaper than constantly buying SSDs.

jauntywundrkind|4 months ago

> Current enterprise SSDs have something like a 10000 TBW rating

Running the numbers to verify: a mixed read-write enterprise SSD will typically be rated for 3 DWPD (drive writes per day) across its 5-year warranty. At 2TB, that works out to 10,950 TBW, so that roughly checks out. If endurance were a concern, upgrading to a higher capacity would linearly increase it - for example the Kioxia CD8P-V. https://americas.kioxia.com/en-us/business/ssd/data-center-s...
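The arithmetic above is easy to check directly (assuming the 3 DWPD, 2TB, and 5-year figures quoted):

```python
# Back-of-the-envelope SSD endurance check.
# Assumed inputs: 3 drive writes per day (DWPD), 2 TB capacity, 5-year warranty.
dwpd = 3
capacity_tb = 2
warranty_years = 5

# Total terabytes written (TBW) the drive is rated for over the warranty period.
tbw = dwpd * capacity_tb * 365 * warranty_years
print(tbw)  # 10950
```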

Finding it a bit hard to imagine build machines working that hard, but I could believe it!

trogdor|4 months ago

> Current enterprise SSDs have something like a 10000 TBW rating, which we'd exceed in the first month

Wow. What’s your use case?

rbanffy|4 months ago

> If a 5 minute build suddenly takes 30 seconds more we have some unhappy developers

They sound incredibly spoiled. Where should I send my CV?

motorest|4 months ago

> I'm interested in... why? What are you building where loading data from disk is so lopsided vs. CPU load from compiling (...)

This has been the basic pattern for ages, particularly with large C++ projects. C++ builds, especially since the introduction of multi-CPU and multi-core systems, tend to become I/O-bound workflows, especially during linking.

Creating a RAM disk is one of the most basic, low-effort strategies for improving build times, and I think it was the main driver behind a few commercial RAM drive apps.
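As a sketch, on Linux this is a two-line affair with tmpfs (the mount point and size here are arbitrary examples, not anything from the thread):

```shell
# Mount a RAM-backed filesystem for build output (requires root).
# /mnt/rambuild and the 16G size are illustrative only.
mkdir -p /mnt/rambuild
mount -t tmpfs -o size=16G tmpfs /mnt/rambuild

# Then point the build tree at it, e.g. (hypothetical project path):
# cmake -S . -B /mnt/rambuild/myproject && cmake --build /mnt/rambuild/myproject
```

tmpfs pages live in the page cache and can spill to swap under memory pressure, so the size limit is the main knob to watch.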

john01dav|4 months ago

Why do we need commercial RAM drive apps when Linux has tmpfs, or is this a historical thing?

mikepurvis|4 months ago

For the ROS ecosystem you're often building dozens or hundreds of small CMake packages, and those configure steps are very I/O-bound - it's a ton of "does this file exist," "what's in this file," compile this tiny test program, etc.

I assume the same would be true for any project that is configure-heavy.

bob1029|4 months ago

> one 200ms "is this the current git repo?" check is a heck of a lot of NVMe latency... and it's going to be closer to 2s than 200ms

I don't know where you're buying your NVMe drives, but mine usually respond within a hundred microseconds.