(no title)
pjdesno | 1 month ago
$ time ./wc-avx2 < bible-100.txt
82113300
real 0m0.395s
user 0m0.196s
sys 0m0.117s
"System" time is the amount of CPU time spent in the kernel on behalf of your process, or at least a fairly good guess at that. (e.g. it can be hard to account for time spent in interrupt handlers) With an old hard drive you would probably still see about 117ms of system time for ext4, disk interrupts, etc. but real time would have been much longer. $ time ./optimized < bible-100.txt > /dev/null
real 0m1.525s
user 0m1.477s
sys 0m0.048s
Here we're bottlenecked on CPU time - 1.477s + 0.048s = 1.525s. The CPU is busy for every millisecond of real time, either in user space or in the kernel.In the optimized case:
real 0m0.395s
user 0m0.196s
sys 0m0.117s
0.196 + 0.117 = 0.313, so we used 313ms of CPU time but the entire command took 395ms, with the CPU idle for 82ms.In other words: yes, the author managed to beat the speed of the disk subsystem. With two caveats:
1. not by much - similar attention to tweaking of I/O parameters might improve I/O performance quite a bit.
2. the I/O path is CPU-bound. Those 117ms (38% of all CPU cycles) are all spent in the disk I/O and file system kernel code; if both the disk and your user code were infinitely fast, the command would still take 117ms. (but those I/O tweaks might reduce that number)
Note that the slow code numbers are with a warm cache, showing 48ms of system time - in this case only the ext4 code has to run in the kernel, as data is already cached in memory. In the cold cache case it has to run the disk driver code, as well, for a total of 117ms.
No comments yet.