I've long had the idea of writing a Valgrind tool to help with this by analyzing struct usage. Something to profile how hot or cold the various fields of my structs are, and also to correlate which fields in a struct are frequently accessed together (i.e., within N cycles of each other). A tool for the "profile" part of "profile before optimizing", to go with the optimizations you mentioned.
I'm not sure how feasible this is. But if someone else wants to steal this idea and implement it for me, be my guest. :-)
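To make the idea a bit more concrete, here is a rough sketch of only the analysis half, assuming some instrumentation layer has already produced a trace of (cycle, field) accesses sorted by cycle. Everything here is hypothetical: `struct access`, `struct field_profile`, `profile_trace`, and `NFIELDS` are made-up names, and none of this is Valgrind API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NFIELDS 4  /* hypothetical: number of fields in the struct being profiled */

/* One recorded access: when it happened and which field index it touched. */
struct access {
    uint64_t cycle;
    unsigned field;  /* assumed to be < NFIELDS */
};

struct field_profile {
    uint64_t hits[NFIELDS];              /* per-field access counts (hot vs. cold) */
    uint64_t together[NFIELDS][NFIELDS]; /* pairs seen within the window */
};

/* Aggregate a trace that is sorted by ascending cycle.
 * Two accesses to different fields count as "together" when they are
 * at most `window` cycles apart. */
static void profile_trace(const struct access *trace, size_t n,
                          uint64_t window, struct field_profile *out)
{
    memset(out, 0, sizeof *out);
    for (size_t i = 0; i < n; i++) {
        out->hits[trace[i].field]++;
        /* Look ahead only as far as the window reaches. */
        for (size_t j = i + 1;
             j < n && trace[j].cycle - trace[i].cycle <= window; j++) {
            unsigned a = trace[i].field, b = trace[j].field;
            if (a != b) {
                out->together[a][b]++;
                out->together[b][a]++;
            }
        }
    }
}
```

Fields with high `hits` are candidates for the hot part of the struct, and pairs with high `together` counts are candidates for sharing a cache line. The hard part, collecting the trace cheaply and mapping addresses back to fields, is left out entirely.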
The problem with this kind of instrumentation is that it is very expensive to collect, and that overhead can skew the collected data away from true runtime behavior. Maybe that is still good enough! (It also feels difficult to implement.)
He shows the complete opposite: how to separate data so that it doesn't end up on the same cache line. This is of course nonsense for single-threaded access, but beneficial for concurrent access.
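For the concurrent case, a minimal C11 sketch of that separation might look like the following. The 64-byte line size is an assumption (common on x86-64, not universal), and the struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed cache line size; check your target */

/* Both counters sit next to each other, so two threads updating them
 * separately would contend for the same line (false sharing). */
struct packed_counters {
    long produced;
    long consumed;
};

/* Each counter is aligned to its own cache line. */
struct split_counters {
    alignas(CACHE_LINE) long produced;
    alignas(CACHE_LINE) long consumed;
};

/* Nonzero if two member offsets fall on the same line, assuming the
 * enclosing object itself starts on a line boundary. */
static int same_line(size_t off_a, size_t off_b)
{
    return off_a / CACHE_LINE == off_b / CACHE_LINE;
}

static_assert(offsetof(struct split_counters, consumed) == CACHE_LINE,
              "second counter should start on the next cache line");
```

The cost is size: `struct split_counters` takes two full lines instead of a fraction of one, which is exactly why this is a loss in single-threaded code.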
a_e_k|2 years ago
loeg|2 years ago
rurban|2 years ago