I have had the pleasure of working with lots of other people's code of varying styles and quality. Nothing is harder to read and understand than code that consists of deeply nested calls from one little helper (or wrapper) function to another. Nothing is easier to read and understand than code that just flows straight through from top to bottom of a big function.
There are other tradeoffs too: code reuse, speed, worst-case speed, and the probability of introducing bugs. If you haven't read the article, do; it's worth it.
I love that Carmack tries to measure which styles introduce more bugs! Who else does that? Seriously, I would love to see more of that.
Interestingly, Casey Muratori accidentally demonstrates during one of his Handmade Hero sessions that the compiler won't always be able to optimize certain bits of code that are put in a function as opposed to being inlined.
In the video, he inlines a very simple function and his game gets twice as fast for no apparent reason. It's instructive to watch him dive into the generated assembly to figure out why.
Imagine how fantastic it must be working with someone like Carmack. Sure, the first few code reviews would be fairly traumatic - as you quickly realise just how much faster and generally better he is than you - but I think after a little while you could just relax and try to absorb as much as possible.
I love how everything in these emails is delivered as a calm series of reflections, chronicling with great honesty his own changing opinions over time - nothing is a diktat.
I also found it rather heartening that he makes the same copy/paste mistakes that the rest of us do - how many times have you duplicated a line and put "x" or "width" on both lines..? Seemingly Carmack can actually tell you how many times he's done that!
Conventional wisdom says that you should write a function `is_player_untouched` and substitute the composite expressions with function calls, but the code in question can be refactored in a much more straightforward way:
Had the function body been split into more functions for "clarity", you would be doing duplicate calls to `is_player_untouched()` which would go unnoticed because they'd be buried deep in the call graph.
This leads to brittle code. It's easy to check that the is_player_untouched state doesn't change between those two conditions when writing the code, but when someone else edits that code it's also easy to introduce something that will break it; the monolithic/big function makes it hard to keep track of assumptions like this, and it leads to a lot of code changes in those big functions as well. Even worse, these kinds of bugs usually end up being hard to test if you're not covered by unit tests (and games rarely are), and big functions go against testing practices.
Hopefully, that should be something the compiler could do on its own; assuming the language has appropriate semantics, it could identify the common expression, prove that its value can't have changed in between, and then do the substitution itself.
Is there any language where this is implemented, or is the effort too great for the gains?
If you occasionally inline all the functions and unroll all the loops, you can find optimizations that even the compiler won't be able to make.
For example, in quaternion-based rotation math, there exists a "sandwich product" where you take the (non-commutative) product of the transform and the input, followed by the product of that result and the conjugate of the transform.
It turns out that several of the embedded multiplication terms cancel out in that double operation, and if you avoid calculating the cancelled terms in the first place, you can do a "sandwich product" in about 60% of the total floating-point operations of two consecutive product operations.
In the application that used spatial transforms and rotations, the optimized quaternion functions were faster than the 4x4 matrix implementation, whereas the non-optimized quaternion functions were slightly slower. That change alone (adding an optimized sandwich product function) cut maybe 30 minutes off of our longest bulk data processing times.
You would never be able to figure that out from this:
out = ( rotation * in ) * ( ~rotation );
You have to inline all the operations to find the terms that cancel (or collapse into a scalar multiplication).
I think there's a big point that's being missed here. Carmack is conflating inlining code with writing functional code. These are different things.
I'd agree that if the majority of your code is mutating state, it makes sense to mash all that together in one place. You want to keep an eye on the dirty stuff.
But on the other hand, inlining pure functions that don't use or mutate any global state doesn't make sense to me. Why is making it "not possible to call the function from other places" a benefit?
>> Why is making it "not possible to call the function from other places" a benefit?
When it's a pure function that's not a problem. When it changes state then you lose track of ordering and such. That's his point, state changes need to be kept in the one big function so you can keep track of them easily.
And of course, almost all interesting software has mutable state. Otherwise you're just doing a computation and looking for a single output.
Carmack addressed this, at least as I read it. One of his concluding guidelines was that purely functional methods are a virtue, and that methods should be made purely functional wherever possible (he phrases it in terms of "const"). He did also say, however, that in his domain non-trivial pure-functional methods rarely apply.
The thing that struck me was Carmack's relentless pursuit of perfection. I can't think of many people who'd describe a single frame of input latency as a cold sweat moment!
When you are making videogames, a (video) frame of latency is a big deal.
When I worked on Guitar Hero and Rock Band, we worried about sub-frame latency (timing is more important when you're hitting a drum than when you're firing a gun).
I particularly liked the analogy with hardware where you do an operation unconditionally and eventually inhibit the result. Thinking about this, I would compare software written in that fashion with a well-oiled, smooth engine.
Long functions might read more easily, but you lose some testing precision. I think the recent focus on testing has led to shorter functions with as little responsibility as possible. When short functions fail a test, you have a smaller surface area in which to look for the cause.
Doesn’t this imply the need for a new language feature, so that well-defined sections of inline code can be pulled out, initial conditions set in a testing environment, and then executed independently?
I guess this could trip up if the compiler optimisations available when considering all the code at once mean that the out-of-context code actually does something different in testing...
This is a pet peeve of mine. If you made a block of code into a separate function, I'm assuming it's called from multiple places. Or maybe it used to be. Or will be soon. But still.
I'm surprised at the enthusiasm for really long functions here. I find my experience is just the opposite—I find it much more difficult to read code written that way than when the different sections are broken up into smaller functions.
It is of course essential that the smaller functions be well-named and manage side-effects carefully. That is, they should either be pure functions, or the side effects should be "what the function does", so that readers of the main function don't generally need to read the function's code to understand its side effects.
Yes, a long monolithic block is hard to read. But he's suggesting using comments and braces to visually separate it into blocks, so the end result is a happy halfway point between monolithic and broken-down functions.
The intro suggests that he agrees that pure functions are an even better solution.
Well, yes, but in games, you have a lot of global state. You wind up manipulating so much state that you may indeed want this kind of large function so that you can see everything.
A side benefit to having separate, clearly named functions: they often play better with tools.
You can jump to them by name from a completely different part of the code (with some editor support). You see them in the headers of your Git hunks so you don't lose context.
In languages that are heavy on type inference, like Haskell or Python+MyPy, they make for a convenient boundary at which you can assert your types, to help with type errors.
I wonder how much of this change is because he can no longer keep track of so much state in his head (or just doesn't want to).
I only say this because I've gone through a similar transition of valuing my mental computation time in the last 20 years of coding :).
The efficiency of inlining is compelling when you code the whole thing at once, in one session. Once you decide to break the work up over multiple sessions, it's too much to keep in your head over multiple days (or weeks).
Although I am a fan of the LISP school of program design (minimal global state; build small functions and macros, make sure they work, and then build more functions and macros on top of that, until you have an abstraction you can build your app on), Carmack raises some interesting points: if you're handling a lot of global, mutable state, you may want to abstract minimally, so that you can see where that state is mutating, which makes bugs easier to spot.
Maybe the key idea is to put all the global state mutations as closely together as possible, so they can be compared and contrasted as easily as possible.
The argument in the article is actually a "for programmers" argument. Mainly that sometimes it's easier to reason about inline code than deeply chained functions, because you don't necessarily know all the implications of calling that function. Or you don't know the potential starting states when writing the function.
>I now strongly encourage explicit loops for everything, and hope the compiler unrolls it properly.
I get why this is a thing. Sometimes an unrolled loop is faster. But if this is really an issue, why isn't there a [UnRoll] modifier or a preprocessor or something that handles that for you?
Something like this:
for (int i = 0; i < n; i++) {
    dothing(x[i]);
}
versus:
unroll for (int i = 0; i < n; i++) {
    dothing(x[i]);
}
Only the compiler / preprocessor would unroll the second one. You have the best of both worlds with a reduced chance of subtle errors.
One of his definitions of a pure function is one that only has by-value parameters and doesn't change state. Am I correct in thinking that C++11 lambdas let you be explicit about this? Writing lambdas (named, if necessary, like a normal function) with empty capture lists ("[]") prevents the compiler from allowing you to accidentally use variables from outside the scope of the function's parameters, forcing you to work in a more pure style. In C++, what other method might help you enforce purity?
Correct me if I'm wrong (I only skimmed it), but this is less about not liking inlining than about having deterministic/time-bounded performance.
These are two separate/orthogonal issues, I doubt he would turn his nose up at the processor doing less work iff it was also deterministic and had predictable worst-case timing.
I think Carmack is conflating FP with good abstractions.
Haskell abstractions are often good because they flow from category theory and there are usually well established mathematical laws associated with them. I'm thinking of the "monad laws" and the "monoid laws."
Mathematicians tend to create abstractions if the abstraction satisfies coherent and provable properties. Programmers tend to be less rigorous about what and how they abstract.
There is nothing about C++ that prevents making good abstractions. It's just the culture of the language. Industry programmers are taught to not duplicate code and to keep functions short but they are not taught the fundamentals of what makes a good abstraction.
The Handmade Hero video referenced above: https://www.youtube.com/watch?v=B2BFbs0DJzw
imtringued | 9 years ago:
This also results in the subexpressions dynXAxis.y and dynYAxis.y being lifted outside the loop. Now we've moved half of the code outside the loop, but we aren't done yet. The same can be done with PixelPx and dx; the trick is to then replace dynXAxis.x with […]. Expanding yields […], and we can now lift that subexpression out of the loop. The only thing still done in the loop is […], which can be further simplified to […]. The same happens with nYAxis.x.
EDIT: Sorry for the bad formatting. The markdown parser ate my asterisks, so I put things into code blocks, which requires a new line each time.
samlittlewood | 9 years ago:
http://blogs.valvesoftware.com/abrash/valve-how-i-got-here-w...
(And interesting history about Valve)
smegel | 9 years ago:
Hopefully because he is saying "did you really think vain attempts at premature optimization were going to impress me?".
yannickt | 9 years ago:
It's an interesting read. Previous HN discussion: https://news.ycombinator.com/item?id=5383650
apeace | 9 years ago:
> Why is making it "not possible to call the function from other places" a benefit?
How about calling that code from a unit test!
qwertyuiop924 | 9 years ago:
Not a bad idea.
p0nce | 9 years ago:
Robert C. Martin encourages style B because it reads top-down and replaces comments with names.
anotherhacker | 9 years ago:
It seems counter-intuitive but in the long run this mentality best serves the business.
Animats | 9 years ago:
It's not particularly helpful to a server that's fielding vast numbers of requests of various types.