Moral_ | 3 years ago

A big reason they've had to build most of this stuff themselves is that they decided, for some reason, to use FreeBSD.

I remember the NUMA work they did; I was in a meeting with them as a Linux developer at Intel at the time. They had bought, or were saying they were going to buy, NVMe drives from Intel, which got them access to "on the ground" kernel developers and CPU people from Intel. Instead of talking about NVMe, they spent the entire meeting asking us about how the Linux kernel handles NUMA and corner cases around memory and scheduling. If I recall correctly, they asked if we could help them upstream BSD code for NVMe and NUMA. I think that meeting even included some L9 or super-high-up NUMA CPU guy from Hillsborough whom they somehow convinced to join.

The conversation and technical discussion was quite fun, but it was sort of funny to us at the time that they were having to do all this work on the BSD kernel that had been solved years ago for Linux.

Technical debt I guess.

cperciva|3 years ago

Netflix tried Linux. FreeBSD worked better.

Thaxll|3 years ago

It's hard to believe in 2022. Google, Amazon, FB, etc. all use Linux, all CDNs use Linux as well, and some services serve even more traffic than Netflix (YouTube). BSD being faster than Linux is a myth; the fact that 99% of those run on Linux means more people have worked on those problems, which means it's most likely always faster.

The funny thing is the rest of Netflix runs on Ubuntu; only the edge CDN runs on BSD.

throw0101c|3 years ago

*At the time when they created the OCA project.

If someone was going to do a similar comparison now the results could be different.

dboreham|3 years ago

By some definition of better.

jeffbee|3 years ago

I still don't get the NUMA obsession here. It seems like they could have saved a lot of effort and a huge number of PowerPoint slides by building a box with half of these resources and no NUMA: one CPU socket with all the memory and one PCIe root complex and all the disks and NICs attached thereto. It would be half the size, draw half the power, and be way easier to program.

drewg123|3 years ago

This is a testbed to see what breaks at higher speed. Our normal production platforms are indeed single socket and run at 1/2 this speed. I've identified all kinds of unexpected bottlenecks on this testbed, so it has been worth it.

We invested in NUMA back when Intel was the only game in town, and they refused to give enough IO and memory bandwidth per socket to scale to 200Gb/s. Then AMD EPYC came along. And even though Naples was single-socket, you had to treat it as NUMA to get performance out of it. With Rome and Milan, you can run them in 1NPS mode and still get good performance, so NUMA is used mainly for forward-looking performance testbeds.

jiggawatts|3 years ago

Modern CPUs like the AMD EPYC server processor are "always NUMA", even in single-socket configurations!

They have 9 chips on what is essentially a tiny, high-density motherboard. Effectively they are 8-socket server boards that fit in the palm of your hand.

The dual-socket version is effectively a 16-socket motherboard with a complex topology configured in a hierarchy.

Take a look at some "core-to-core" latency diagrams. They're quite complex because of the various paths possible: https://www.anandtech.com/show/16214/amd-zen-3-ryzen-deep-di...

Intel is not immune to this either. Their higher core-count server processors have two internal ring-bus networks, with some cores "closer" to PCIe devices or certain memory buses: https://semiaccurate.com/2017/06/15/intel-talks-skylakes-mes...

Bluecobra|3 years ago

If you are buying servers at scale, the costs of single-processor servers will certainly add up compared with buying two-processor servers. If you buy single-proc servers, that is double the number of chassis, rail kits, power supplies, power cables, drives, iLO/iDRAC licenses, etc.

muststopmyths|3 years ago

Can you buy non-NUMA mainstream CPUs though? Honest question, because I’d love to be rid of that BS too.

ksec|3 years ago

Is NUMA a solved issue on Linux? Correct me if I am wrong, but I was under the impression that it may be better handled under certain conditions, while NUMA, the problem in itself, is hardly solved.

alberth|3 years ago

Maybe Brendan Gregg can further enlighten his new coworkers at Intel on why Netflix chose both AMD & FreeBSD.