The limit is the number of outstanding cache line requests to the memory controller. CPUs have a fixed number of slots for this, around 10-12 usually. Intel calls them LFBs (Line Fill Buffers) and AMD MSHRs (Miss Status Holding Registers). When the slots are filled, the CPU can issue no more requests and has to wait for them to complete. Apple M chips (probably) have more slots and the memory is physically packaged together with the CPU, so they get better numbers.
foota|1 month ago
IgorPartola|1 month ago
memoriuaysj|1 month ago
it's why GPUs have their memory chips in a circle around the GPU chip.