Much of this can be learned hands-on, at a reasonable scale, by getting yourself a microcontroller. Not an RTOS or Linux or any of that, just a bare microcontroller without an OS. Learn it, learn its internal fetch architecture, get comfortable with its timings, and watch the latency numbers climb when you introduce external memory such as SD cards and the like. Knowing how to read the assembly output and see how the instruction cycles add up in the pipeline also helps, because then you at least know what is actually happening. That makes it much easier to apply the same careful mentality here, which is ultimately what this whole optimization game is about: optimizing where time is spent, with what data. Otherwise, someone telling you that so-and-so takes nanoseconds or microseconds will sound alien to you, because you've never been exposed to an environment where you regularly count clock cycles. So consider this a learning opportunity.
simonask|2 months ago
A lot of code can be pessimized by golfing instruction counts, hurting instruction-level parallelism and microcode optimizations by introducing false data dependencies.
Compilers outperform humans here almost all the time.
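To make the dependency point concrete, here is an illustrative sketch in plain C (not from this comment): a reduction through a single accumulator chains every add through the previous one, while independent accumulators let a superscalar core overlap them. Compilers often apply this transform themselves at higher optimization levels, which is exactly why hand-golfed code can get in their way.

```c
#include <stddef.h>

/* One accumulator: each addition depends on the previous result, so the
 * loop runs at roughly one add per add-latency regardless of core width. */
long long sum_chained(const int *a, size_t n) {
    long long s = 0;
    for (size_t i = 0; i < n; i++) s += a[i];
    return s;
}

/* Four independent accumulators: four dependency chains that the CPU can
 * execute in parallel. Optimizing compilers frequently do this (or
 * vectorize) on their own for integer sums. */
long long sum_unrolled(const int *a, size_t n) {
    long long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    long long s = s0 + s1 + s2 + s3;
    for (; i < n; i++) s += a[i];  /* leftover tail */
    return s;
}
```

Both functions compute the same sum; the difference only shows up in throughput, and only on data large enough to keep the pipeline busy.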
Pannoniae|2 months ago
However, that doesn't mean looking at the generated asm, or even writing some, is useless! Just because you can't globally outperform the compiler doesn't mean you can't do it locally! If you know where the bottleneck is, and you make those few functions great, that's a force multiplier for you and your program.
jesse__|2 months ago
I'm going to be annoying and nerd-snipe you here. It's, generally, really easy to beat the compiler.
https://scallywag.software/vim/blog/simd-perlin-noise-i
dardeaup|2 months ago
Can you explain what this phrase means?
barfoure|2 months ago
Compilers make mistakes too, and they can output genuinely erroneous code. But that's a different topic.