T3OU-736 | 9 months ago
On SGI, the CrayLink HW had performance counters visible via Performance Co-Pilot (PCP).
On Linux, NUMA archs have similar tooling (numastat, Intel's PCM, other tools). Whether NUMA locality matters depends on the workload, but if the OS/tooling does not expose the counters, it isn't even possible to quantify the impact.
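As a rough illustration of what "exposed counters" look like on Linux, here is a minimal Python sketch that reads the per-node `numastat` files under `/sys/devices/system/node`, which is where the kernel publishes the page-allocation counters that `numastat` reports. The helper names (`parse_numastat`, `read_all_nodes`, `miss_ratio`) are made up for this example, and note that these are allocation counters (`numa_hit`, `numa_miss`, etc.), not interconnect latency counters like the CrayLink ones, so treat the ratio as a coarse hint only.

```python
from pathlib import Path

# Standard sysfs location for per-node NUMA statistics on Linux.
NODE_ROOT = Path("/sys/devices/system/node")


def parse_numastat(text):
    """Parse 'name value' lines (the numastat file format) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_all_nodes(root=NODE_ROOT):
    """Return {node_name: counters} for every node directory found.

    Yields an empty dict on systems without this sysfs tree.
    """
    stats = {}
    for path in sorted(root.glob("node[0-9]*/numastat")):
        stats[path.parent.name] = parse_numastat(path.read_text())
    return stats


def miss_ratio(counters):
    """Fraction of allocations that landed on a node other than the preferred one."""
    hits = counters.get("numa_hit", 0)
    misses = counters.get("numa_miss", 0)
    total = hits + misses
    return misses / total if total else 0.0


if __name__ == "__main__":
    for node, counters in read_all_nodes().items():
        print(f"{node}: numa_hit={counters.get('numa_hit', 0)} "
              f"numa_miss={counters.get('numa_miss', 0)} "
              f"miss_ratio={miss_ratio(counters):.4f}")
```

The counters are cumulative since boot, so to attribute anything to a particular workload you would sample before and after the run and compare the deltas.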
Because of the sheer physical size of SGI's larger ccNUMA systems (AFAIK, AMD's NUMA derives from SGI's ccNUMA), IRIX had an option to auto-migrate workloads once certain CPU-to-working-memory latency thresholds were reached.