top | item 35891071

Terminal135 | 2 years ago

The repo claims that the servers themselves throttle the GPUs, but isn't it the GPUs themselves that throttle, or maybe the OS? Neither of those is controlled by the server (hopefully), so is there a different system at play here?

f_devd|2 years ago

I can actually answer this (it's how I stumbled onto the repo): it's done through a signal from the motherboard called PWRBRK# (Power Brake), pin 30 on the PCIe connector. It tells the PCIe device to hold a low-power mode; in the case of Nvidia GPUs that's about 50 W (300 MHz out of 2100 MHz in my case).

You can check whether it's active using `nvidia-smi -q | grep Slowdown`, as shown in the post.
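For reference, here is the sort of line that grep matches. The excerpt below is a hypothetical sample of `nvidia-smi -q` output (exact labels and spacing vary by driver version; older drivers print these under "Clocks Throttle Reasons"), filtered the same way you would filter the live command:

```shell
# Hypothetical excerpt of `nvidia-smi -q` output (labels vary by driver version)
sample='HW Slowdown                    : Active
HW Thermal Slowdown            : Not Active
HW Power Brake Slowdown        : Active'

# On a live system: nvidia-smi -q | grep Slowdown | grep -v "Not Active"
active=$(printf '%s\n' "$sample" | grep Slowdown | grep -v "Not Active")
printf '%s\n' "$active"
```

An active "HW Power Brake Slowdown" line is the one that indicates the PWRBRK# signal is being asserted, as opposed to thermal or software power-cap throttling.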

csdvrx|2 years ago

No, that's controlled by the server: try `lspci -vv` on any Linux system and look at the link speed and width, e.g. `LnkSta: Speed 8GT/s, Width x2` (x2 means 2 lanes).

Try:

`sudo lspci -vv | grep -P "[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]|downgrad" | grep -B1 downgrad`

Besides the speed, you can run into another problem: lane-count limitations.

For example, AMD CPUs have a lot of lanes, but unless you have an EPYC, most of them are not exposed, so the PCH tries to spread its meager set among the devices connected to your PCI bus, and if you have an x16 GPU, but also a WiFi adapter, a WWAN card and a few identical NVMe drives, you may find only one of the NVMe drives benchmarks at the throughput you expect.
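A quick way to spot this with `lspci` itself: recent pciutils annotate `LnkSta` with "(downgraded)" whenever the negotiated speed or width is below what `LnkCap` advertises, which is what the grep above keys on. A sketch using a hypothetical captured excerpt (on a live system you'd pipe `sudo lspci -vv` instead):

```shell
# Hypothetical `lspci -vv` excerpt for one device; LnkCap is what the link
# can do, LnkSta is what was actually negotiated.
sample='01:00.0 VGA compatible controller: (hypothetical GPU)
		LnkCap:	Port #0, Speed 16GT/s, Width x16
		LnkSta:	Speed 8GT/s (downgraded), Width x8 (downgraded)'

# On a live system: sudo lspci -vv | grep -B2 downgrad
downgraded=$(printf '%s\n' "$sample" | grep downgrad)
printf '%s\n' "$downgraded"
```

A device like this negotiated half the speed and half the lanes it is capable of, so it tops out at a quarter of its rated link bandwidth.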

toast0|2 years ago

> For example, AMD CPUs have a lot of lanes, but unless you have an EPYC, most of them are not exposed, so the PCH tries to spread its meager set among the devices connected to your PCI bus, and if you have an x16 GPU, but also a WiFi adapter, a WWAN card and a few identical NVMe drives, you may find only one of the NVMe drives benchmarks at the throughput you expect.

Most AM4 boards put an x16 slot direct to the CPU, plus a direct-linked x4 NVMe slot. That's 20 of the 24 lanes; the other 4 lanes go to the chipset, which all the rest of the peripherals sit behind. (There's some USB and other I/O from the CPU, too.) AM5 CPUs added another 4 lanes, which usually feed a second CPU-attached x4 slot.

Early AM4 boards might not have a CPU-attached x4 NVMe slot, and then those 4 CPU lanes might not be exposed, and the A300/X300 chipsetless boards don't tend to expose everything, but where else are you seeing AMD boards that don't expose all the CPU lanes?

ilyt|2 years ago

> For example, AMD CPUs have a lot of lanes, but unless you have an EPYC, most of them are not exposed, so the PCH tries to spread its meager set among the devices connected to your PCI bus, and if you have an x16 GPU, but also a WiFi adapter, a WWAN card and a few identical NVMe drives, you may find only one of the NVMe drives benchmarks at the throughput you expect.

Example from my X670E board:

* first NVMe slot: x4 Gen 5

* second: x4 Gen 4

* 2 USB ports connected to the CPU (10/5 Gbit)

and EVERYTHING ELSE goes through an x4 Gen 4 PCIe link, including 3 additional NVMe slots, 7 SATA ports, a bunch of USB ports, a few x1 PCIe slots, networking, etc.
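Back-of-envelope math for why that hurts: everything behind the chipset shares one x4 Gen 4 uplink, i.e. 16 GT/s per lane across 4 lanes, less 128b/130b encoding overhead (ignoring packet and protocol overhead on top of that):

```shell
# x4 Gen 4 uplink: 16 GT/s per lane * 4 lanes * 128/130 encoding, in GB/s
uplink=$(awk 'BEGIN { printf "%.2f", 16 * 4 * 128 / 130 / 8 }')
echo "x4 Gen 4 chipset uplink: ~${uplink} GB/s shared by everything behind it"
```

A single Gen 4 x4 NVMe drive can saturate roughly that much on its own, which is why several chipset-attached NVMe drives benchmarked together can't all hit full speed.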

formerly_proven|2 years ago

PCIe devices can only draw a limited wattage until the host clears them for higher power. There is also a separate power-brake mechanism (an optional part of PCIe) mentioned in the article; it was originally proposed for PCIe by Nvidia, so it seems likely their GPUs support it.