I know that at least two engineers from the runtime team have seen the post in the #darkarts channel of the Gophers Slack. One of them left a fire emoji :).
Thanks! And to answer your question: No, it won't speed up Go programs for now. This was mostly a fun research project for me.
nikolayasdf123|1 year ago
felixge|1 year ago
I'll probably bring it up in the biweekly Go runtime diagnostics sync [1] next Thursday, but my guess is that they'll reach the same conclusion as me: Neat trick, but not a good idea for the runtime until hardware shadow stacks become widely available and accessible.
[1] https://github.com/golang/go/issues/57175
nickcw|1 year ago
Do you see this speeding up real world Go programs?
felixge|1 year ago
The low-hanging fruit for speeding up stack unwinding in the Go runtime is switching to frame pointer unwinding in more places. In go1.21 we contributed patches to do this for the execution tracer. For the upcoming go1.23 release, my colleague Nick contributed patches to upgrade the block and mutex profilers. Once the go1.24 tree opens, we're hoping to tackle the memory profiler as well as copystack. The latter would benefit all Go programs, even those not using profiling. But it's likely going to be a relatively small win (<= 1%).
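For readers unfamiliar with the technique, here is a minimal sketch of what frame pointer unwinding does. It runs over a simulated stack, not real memory, and it is not the Go runtime's code: the function name `unwindFP`, the word-indexed `stack` slice, and the use of 0 as the end-of-chain marker are all assumptions made for illustration. The only detail taken from real hardware is the amd64 convention that a frame's saved frame pointer sits directly below its return address.

```go
package main

import "fmt"

// unwindFP walks a simulated stack by following the frame pointer chain.
// stack models memory as 8-byte words, and fp is the word index of the
// current frame's saved frame pointer. As on amd64, the word at fp holds the
// caller's frame pointer and the word at fp+1 holds the return address.
// A frame pointer of 0 marks the end of the chain (an assumption of this
// model). max bounds the walk so a corrupt chain cannot loop forever.
func unwindFP(stack []uint64, fp uint64, max int) []uint64 {
	var pcs []uint64
	for fp != 0 && len(pcs) < max {
		if fp+1 >= uint64(len(stack)) {
			break // chain points outside the stack: stop instead of reading garbage
		}
		pcs = append(pcs, stack[fp+1]) // record this frame's return address
		fp = stack[fp]                 // hop to the caller's frame
	}
	return pcs
}

func main() {
	// Three nested frames, innermost at the lowest index because stacks
	// grow down. Index 0 is unused so that 0 can serve as the sentinel.
	stack := []uint64{
		0,      // 0: unused
		3,      // 1: innermost frame, saved fp points at the middle frame
		0x3000, // 2: innermost frame, return address
		5,      // 3: middle frame, saved fp points at the outermost frame
		0x2000, // 4: middle frame, return address
		0,      // 5: outermost frame, saved fp = 0 ends the chain
		0x1000, // 6: outermost frame, return address
	}
	for _, pc := range unwindFP(stack, 1, 32) {
		fmt.Printf("%#x\n", pc)
	}
}
```

Each step is two loads and no table lookups, which is why this style of unwinding is cheap compared to consulting per-function metadata for every frame.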
Once all of this is done, shadow stacks have the potential to make things even faster. But the problem is that we'll be deep in diminishing-returns territory at that point. Speeding up stack capturing is great when it makes up 80-90% of your overhead (this was the case for the execution tracer before frame pointers). But once we're down to 1-2% (the current situation for the execution tracer), another 8x speedup is not going to buy us much, especially when it has downsides.
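The diminishing-returns argument can be checked with a little arithmetic. The sketch below assumes the 1-2% figure is the share of total CPU time spent capturing stacks, and treats all of it as time the 8x speedup applies to, so the result is an upper bound. The helper name `savedCPU` is hypothetical.

```go
package main

import "fmt"

// savedCPU returns the fraction of total CPU time that is freed up when an
// activity costing `cost` (a fraction of total CPU time) becomes `speedup`
// times faster. The activity still costs cost/speedup afterwards.
func savedCPU(cost, speedup float64) float64 {
	return cost * (1 - 1/speedup)
}

func main() {
	// Costs taken from the 1-2% range mentioned above, with the 8x speedup.
	for _, cost := range []float64{0.01, 0.02} {
		saved := savedCPU(cost, 8)
		fmt.Printf("cost %.0f%%: an 8x speedup saves about %.3f%% of CPU time\n",
			cost*100, saved*100)
	}
}
```

Under these assumptions, even a perfect 8x speedup recovers less than two percentage points of CPU time, which is the sense in which it "is not going to buy us much".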
The only future in which shadow stacks could speed up real Go programs is one where we decide to drop frame pointer support in the compiler, which could provide 1-2% speedup for all Go programs. Once hardware shadow stacks become widely available and accessible, I think that would be worth considering. But that's likely to be a few years down the road from now.