With chiplets being a thing these days, I guess this is really a question of where in the overall system the best place is to put the package boundary and socket/pins? Another angle on that same question would be what Apple is doing with in-package LPDDR5 (which I think I heard AMD is copying with a custom line for... Microsoft/Azure, I think it was?).
My understanding is that one factor pushing hyperscalers towards dual-socket was the cost of the network fabric - for the longest time, having two CPU sockets per NIC / per leaf switch port was the overall system price/performance sweet spot for many workloads. More sockets required more expensive CPUs, while single-socket servers needed twice as many NICs and twice as many top-of-rack switch ports.
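A back-of-the-envelope sketch of that sweet spot, with made-up placeholder prices (every number below is hypothetical, just to show the shape of the argument):

    # Hypothetical per-core cost model for the fabric-cost argument above.
    # Every price here is an illustrative placeholder, not real market data.
    NIC_COST = 300          # one NIC per server (assumed)
    SWITCH_PORT_COST = 400  # amortized leaf switch port per server (assumed)
    CPU_COST = 2500         # per socket (assumed)
    CHASSIS_COST = 1000     # board, PSU, sheet metal (assumed)
    CORES_PER_SOCKET = 64

    def cost_per_core(sockets: int) -> float:
        """Total server cost divided by total core count."""
        server = sockets * CPU_COST + NIC_COST + SWITCH_PORT_COST + CHASSIS_COST
        return server / (sockets * CORES_PER_SOCKET)

    for s in (1, 2):
        print(f"{s}S server: ${cost_per_core(s):.2f}/core")

With these placeholders the 2S box wins because the NIC and switch port amortize over twice as many cores; the gap closes as fabric gets cheaper relative to CPUs.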
With newer/faster ethernet standards you still need twice as many NICs but you can often split the lanes coming out of a switch chip and use a Y cable.
EPYC brings plenty of NUMA complexity in a single socket, unfortunately. If you just want system performance riddles to solve, one socket gives you plenty. I seem to recall Facebook publicly announcing that they switched their web servers to single-socket more than eight years ago. Since then, Netflix has written several times about how carefully they keep the two sides of a 2S server from interfering with each other, and I've always wondered why they bother; why not just saw the system in half and save themselves the trouble?
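For anyone who hasn't had the pleasure: a minimal Linux-only sketch of what "keeping to one side" looks like in practice, reading the node topology from standard sysfs paths and pinning the current process to node 0 (Python 3.9+ assumed):

    # Enumerate NUMA nodes via sysfs and pin this process to node 0's CPUs.
    import os
    from pathlib import Path

    def node_cpus(node: int) -> set[int]:
        """Parse a cpulist like '0-63,128-191' for one NUMA node."""
        text = Path(f"/sys/devices/system/node/node{node}/cpulist").read_text()
        cpus: set[int] = set()
        for part in text.strip().split(","):
            lo, _, hi = part.partition("-")
            cpus.update(range(int(lo), int(hi or lo) + 1))
        return cpus

    nodes = sorted(int(p.name[4:]) for p in Path("/sys/devices/system/node").glob("node[0-9]*"))
    print("NUMA nodes:", nodes)

    # First-touch allocation means memory allocated after pinning will
    # mostly land on node 0, so we never cross to the other side.
    os.sched_setaffinity(0, node_cpus(nodes[0]))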
Netflix's CDN appliances are optimized to reduce the space and effort ISPs need to install them.
They want a small box, because ISPs have limited space.
They want a single LACP group, because ISPs have limited ports, and to use only one IP address, because ISPs have limited addresses.
And they want to make it easy to plug in properly, so that they can reduce communication with the ISP.
These all add up to favoring a dual-socket node over two single-socket nodes in one box. Although, as single-socket capabilities increase, they may end up with a single-socket node instead.
Not mentioned in this is the issue of memory scaling.
DRAM price per GB has been roughly flat for well over a decade - consumer prices hit $4/GB in 2011 and have fluctuated around there ever since - most of the drop in real cost since then has been due to the declining value of that $4. Prices for large enterprise/hyperscaler buyers are probably similar, since it's a low-margin commodity market.
Two sockets gets you more memory channels and more DIMM slots, but as flat memory prices push the RAM/CPU ratio down, and per-channel bandwidth increases with DDR5, that becomes less important.
Of course that’s one of those things you can’t really say to customers, kind of like “you don’t really need 250hp in a passenger sedan”.
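The channel arithmetic behind that, as a quick sanity check (DDR5-4800 vs DDR4-3200, 64-bit channels, taking 12-channel EPYC as the single-socket case):

    # Peak bandwidth per channel = transfer rate (MT/s) * 8 bytes.
    def channel_gbps(mt_per_s: int) -> float:
        return mt_per_s * 8 / 1000

    ddr4 = channel_gbps(3200)  # 25.6 GB/s
    ddr5 = channel_gbps(4800)  # 38.4 GB/s
    print(f"per channel: DDR4-3200 {ddr4:.1f} GB/s, DDR5-4800 {ddr5:.1f} GB/s")
    print(f"1S x 12ch: {12 * ddr5:.0f} GB/s, 2S x 24ch: {24 * ddr5:.0f} GB/s")

So a single 12-channel DDR5 socket already delivers roughly 460 GB/s peak, more than two sockets' worth of 8-channel DDR4 used to.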
I thought that most of the reason Apple's newer CPUs are supposed to be so good for LLMs is that the in-package memory lets them have more channels than usual?
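Roughly, yes; the win is bus width. Quick arithmetic, using the commonly cited M1 Max configuration (512-bit LPDDR5-6400 in package) against a typical two-channel DDR5-5600 desktop:

    # Peak bandwidth = bus width (bytes) * transfer rate (MT/s).
    def bandwidth_gbps(bus_bits: int, mt_per_s: int) -> float:
        return bus_bits / 8 * mt_per_s / 1000

    print(f"M1 Max (512-bit LPDDR5-6400): {bandwidth_gbps(512, 6400):.0f} GB/s")  # ~410
    print(f"Desktop (128-bit DDR5-5600):  {bandwidth_gbps(128, 5600):.0f} GB/s")  # ~90

That bandwidth gap is what matters for LLMs, since token generation is typically memory-bandwidth-bound.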
If single-threaded perf is all that matters, probably the EPYC 9175F: Zen 5, 16 chiplets, one core per chiplet. Each core has 32MB of L3. Boosts to 5GHz. 128 lanes of PCIe 5.0.
If/when they make a V-Cache version of this, it'll most likely be even better: Zen 5 V-Cache doesn't have the clock-speed penalty that previous generations did (because the cache die is underneath the cores instead of on top), and 96MB of L3 per core would be monstrous.
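If you want to check the cache-per-core picture on a live Linux box, it's all in sysfs (sketch below; index3 is the L3 on typical x86 systems, adjust if your topology differs):

    # Report the L3 size and sharing for CPU 0 (Linux, typical x86 layout).
    from pathlib import Path

    l3 = Path("/sys/devices/system/cpu/cpu0/cache/index3")
    size = (l3 / "size").read_text().strip()            # e.g. "32768K"
    shared = (l3 / "shared_cpu_list").read_text().strip()
    print(f"CPU0 L3: {size}, shared with CPUs {shared}")
    # On a one-core-per-CCD part like the 9175F, the shared list should be
    # just that core's SMT siblings, i.e. the L3 is effectively private.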
> (..) you really want those 2x 12 memory channels a Dual EPYC system offers (...)
I had to check and I was amazed that there are companies selling workstations with dual EPYC processors, providing a whopping 256 CPU cores and over 2TB of DDR5. All in a desktop form factor. Amazing.