rrnewton | 3 years ago
But to the second part of your question about deterministic benchmarking, that is really a separate question. Hermit defines a deterministic notion of virtual time, which is based on the branches retired and system calls executed by all threads. When you run hermit with `--summary`, it reports a total "Elapsed virtual global (cpu) time", which is completely deterministic:
$ hermit run --summary /bin/date
...
Elapsed virtual global (cpu) time: 5_039_700ns
Therefore, any program that runs under hermit can get this deterministic notion of performance. We figured that could be useful for setting performance regression tests with very small regression margins (<1%), which you can't do on normal noisy systems. Compilers are one place I've worked where we wanted smaller performance regression alarms (for generated code) than we could achieve in practice. We haven't actually explored this application yet, though. There's a whole small field of people studying performance modeling and prediction, and anyone wanting to try this deterministic benchmarking approach might take some of that knowledge and build a more accurate performance model, one better correlated with wall time than Hermit's current virtual time is.
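To make the regression-test idea concrete, here's a minimal sketch of how a CI check might consume hermit's `--summary` output. The parsing and the 1% threshold are my own assumptions; only the summary line format is taken from the output quoted above.

```python
import re

# Hypothetical baseline, e.g. recorded from an earlier `hermit run --summary`.
BASELINE_NS = 5_039_700
MARGIN = 0.01  # 1% regression threshold, feasible because virtual time is deterministic

def parse_virtual_time(summary: str) -> int:
    """Extract the deterministic virtual time (in ns) from hermit's --summary output.

    Hermit prints underscore-separated nanoseconds, e.g.
    "Elapsed virtual global (cpu) time: 5_039_700ns".
    """
    m = re.search(r"Elapsed virtual global \(cpu\) time:\s*([\d_]+)ns", summary)
    if m is None:
        raise ValueError("no virtual time found in summary output")
    return int(m.group(1).replace("_", ""))

def check_regression(summary: str, baseline_ns: int = BASELINE_NS) -> bool:
    """Return True if virtual time exceeds the baseline by more than MARGIN."""
    return parse_virtual_time(summary) > baseline_ns * (1 + MARGIN)

# Using the output line quoted above:
sample = "Elapsed virtual global (cpu) time: 5_039_700ns"
print(parse_virtual_time(sample))  # 5039700
print(check_regression(sample))    # False: identical to the baseline
```

In practice you'd feed this the captured stderr/stdout of `hermit run --summary <prog>`; because the virtual time is deterministic, any change beyond the margin reflects a real change in the program rather than system noise.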
dekhn | 3 years ago
rrnewton | 3 years ago
$ hermit run --epoch=2022-01-01T00:00:00Z /bin/date
Fri Dec 31 16:00:00 PST 2021
We, somewhat eccentrically, put it in the last millennium by default. It used to default to the original Unix epoch back on 12/31/1969, but that was causing some software to be very unhappy ;-).
The reproducibility guarantee is that the behavior of the program is a deterministic function of its initial configuration. The epoch setting is one aspect of that initial configuration (as are file system inputs, RNG seeds, etc).
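The guarantee can be pictured as a pure function from initial configuration to behavior. Here is a toy illustration (not hermit's implementation): the `Config` fields below are stand-ins for the knobs mentioned above, and the "program" is a deterministic function of them.

```python
import hashlib
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    # Toy stand-ins for hermit's initial configuration knobs.
    epoch: str           # e.g. "2022-01-01T00:00:00Z"
    rng_seed: int        # seeded RNG, so the random sequence is fixed
    fs_fingerprint: str  # stand-in for file system inputs

def run_program(cfg: Config) -> str:
    """A deterministic 'program': identical configs give identical behavior."""
    rng = random.Random(cfg.rng_seed)  # seeded, so draws are reproducible
    data = f"{cfg.epoch}:{rng.random()}:{cfg.fs_fingerprint}"
    return hashlib.sha256(data.encode()).hexdigest()

cfg = Config(epoch="2022-01-01T00:00:00Z", rng_seed=42, fs_fingerprint="abc123")
assert run_program(cfg) == run_program(cfg)  # same config, same behavior
```

Change any field of the configuration, and the output changes; keep them all fixed, and every run is bit-identical, which is the shape of the reproducibility guarantee described above.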