top | item 11509209

(no title)

melchebo | 10 years ago

A lot of their bugs are induced when you have a cgroup with multithreaded/-process tasks and a cgroup with just a few using processor time.

Just testing one benchmark will not show it, unless you have something else running too.

discuss

brendangregg|10 years ago

Ok, depends what you mean by something else running. Another PID? That's why I tested make -j32. I also tested multi-threaded applications from a single PID (and more threads than our CPU count), since that best reflects our application workloads.

They ought to be posting it to lkml, where many engineers regularly do performance testing. I've looked enough to think that my company isn't really hurt by this.

melchebo|10 years ago

Just read the paper, it's explained. Or read the presentation, it uses pictures.

Basically they run R, a single threaded statistics tool which is setup to hog a core, and in some other cgroup a wildly multithreaded tool. If you have a NUMA system (check with `lstopo`) then it's possible that the scheduler thinks the many tasks in one domain of cores is balanced with just R on one core of another domain. Meaning you can have several (ex: 7 out of 8) cores idle. It has to do with the way hierarchical rebalancing is coded, and that their 8x 8-core AMD machine has a deep hierarchy.