Holy crap. I was just writing a blog post to complain about the state of Rust benchmarking, and I think this might address most of my points. The biggest one is the ability to have benchmarks colocated within the library like tests; that's been my single largest annoyance.
Wow, this really looks fantastic from the examples, great work!
Are you saying in terms of how long the benchmark takes? Have you tried `--quick`? The duration of the test doesn't matter so much for the time it takes Criterion to benchmark - what Criterion is trying to do is run the function enough times that it thinks it has a statistically defensible estimate of how expensive your code is.
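To make that concrete: the idea behind "stop at statistical significance" can be sketched in a few lines. This is a toy illustration, not Criterion's actual algorithm (which is considerably more sophisticated, with outlier detection and bootstrapping): sample the function repeatedly and stop once the standard error of the mean is small relative to the mean.

```rust
use std::time::Instant;

/// Arithmetic mean of the samples.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Standard error of the mean: sample std deviation / sqrt(n).
fn std_error(xs: &[f64]) -> f64 {
    let m = mean(xs);
    let var = xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / (xs.len() - 1) as f64;
    (var / xs.len() as f64).sqrt()
}

/// Time `f` repeatedly, stopping as soon as the relative standard error
/// drops below `target` (or `max_samples` is hit). Returns the mean in ns.
/// A toy version of "terminate quickly once the estimate looks defensible".
fn quick_bench<F: FnMut()>(mut f: F, target: f64, max_samples: usize) -> f64 {
    let mut samples = Vec::new();
    loop {
        let start = Instant::now();
        f();
        samples.push(start.elapsed().as_nanos() as f64);
        if samples.len() >= 10 {
            let m = mean(&samples);
            // Guard against a zero mean from sub-ns timer resolution.
            let rse = if m > 0.0 { std_error(&samples) / m } else { 0.0 };
            if rse < target || samples.len() >= max_samples {
                return m;
            }
        }
    }
}

fn main() {
    let est = quick_bench(
        || { std::hint::black_box((0u64..1000).sum::<u64>()); },
        0.01,
        10_000,
    );
    println!("~{est:.0} ns per call");
}
```

The point is that a cheap function converges in very few samples, so the wall-clock cost of the benchmark tracks the noisiness of the measurement, not a fixed iteration count.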
vlovich123 | 2 years ago
It’s also nice to see that it can report multiple counters in parallel. I put up a similar feature[1] for criterion recently, but I fear the project isn’t being maintained anymore…
Haven’t looked deeply into divan yet, but the other things I need from criterion are: statistical guarantees on the results, terminating quickly once statistical significance is reached (`--quick`), a comparison of the delta from a previous benchmark run, and running async code. Wonder how this stacks up.
[1] https://github.com/bheisler/criterion.rs/pull/722
vlovich123 | 2 years ago
* Async support not there yet
* Statistical power is principled (it follows a good paper on how to pick the number of iterations), but the variance analysis isn’t implemented yet. If I’m reading the README correctly, its execution mode is effectively equivalent to `--quick`.
* Baseline evaluation not there yet.
My overall gripe remains that this stuff isn’t available natively within the Rust/cargo ecosystem. I should be able to swap out benchmarking frameworks without having to rewrite my entire codebase to try out a new one.
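To be fair, one part of this is already framework-agnostic: cargo's escape hatch for custom harnesses is the same no matter which framework sits behind it. A minimal sketch (the bench name is a placeholder):

```toml
# Cargo.toml: opt out of the built-in libtest harness so a
# third-party framework (criterion, divan, ...) can take over.
[[bench]]
name = "my_bench"   # corresponds to benches/my_bench.rs (hypothetical name)
harness = false

[dev-dependencies]
divan = "0.1"       # or criterion = "0.5"
```

The painful part is everything inside `benches/my_bench.rs`: attribute macros, bencher types, and reporting are all framework-specific, which is exactly the rewrite cost being complained about.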
super_scooper | 2 years ago
I am looking forward to machine-readable output getting implemented. I was using criterion recently and couldn't for the life of me get CSV output to work correctly (and according to their docs it's a feature they are looking to remove). Ended up writing a python script to scrape all the data out of the folders of JSON files it makes.
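For anyone in the same spot, a scraper along those lines doesn't have to be long. This is a sketch, and the directory layout and field names (`target/criterion/<bench>/new/estimates.json` with a `mean.point_estimate` in nanoseconds) are assumptions about Criterion's on-disk output, so verify them against your own `target/criterion` folder:

```python
import csv
import json
import sys
from pathlib import Path


def collect_estimates(root: Path) -> list[dict]:
    """Walk <root>/<bench>/new/estimates.json files and pull out the
    mean point estimate for each benchmark.

    Assumes one directory level per benchmark; grouped benchmarks
    nest deeper and would need a recursive glob instead."""
    rows = []
    for est_file in sorted(root.glob("*/new/estimates.json")):
        data = json.loads(est_file.read_text())
        rows.append({
            "benchmark": est_file.parent.parent.name,
            "mean_ns": data["mean"]["point_estimate"],
        })
    return rows


def write_csv(rows: list[dict], out) -> None:
    """Emit the collected estimates as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=["benchmark", "mean_ns"])
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    write_csv(collect_estimates(Path("target/criterion")), sys.stdout)
```

Run it from the crate root after `cargo bench` and redirect stdout to a `.csv` file.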