Should programmers learn machine code?
45 points | RiderOfGiraffes | 16 years ago | reply
It had 16KB of RAM, and permanent storage was sound recorded on a fairly standard external cassette tape drive. I can still draw the waveform of a bit, and the start/stop patterns. You could program it in a simple BASIC that had limited variables and no subroutine parameters.
No parameters in subroutines!
But I was hooked. I wrote two BASIC programs, then ran out of patience. It had to run faster! A 1.77 MHz Z80 running interpreted BASIC wasn't fast enough.
So I smashed the stack. I won't go into the details, but I bootstrapped into assembler (the BASIC did have PEEK and POKE) and wrote a compiler. The BASIC version, two copies of the machine code version, and the variables, all fitted into the 16KB memory. Two copies because it wasn't relocatable, so I compiled to copy 1, used that to compile to copy 2, then machine-code saved copy 2 to cassette.
It was cool writing different variants of sort, then sorting the data that was in the memory mapped screen. You could see the heavier items fall to the bottom in quicksort, or migrate one at a time to the top in a bubble sort. It was pretty raw, and enormous fun.
Why do I tell you this?
Today at work I took a subroutine that was taking 171ms per call and making the GUI run as slow as a snail on valium, and rewrote it to take less than 10ms per call. The methods I used were straight out of the techniques I learned in those first 3 months of machine code programming (not assembler, machine code), and my colleagues couldn't really understand it.
There followed an impromptu "training session" in which I explained how CPU cores work, how machine code maps onto them, how assembly code corresponds to machine code, and how the instructions, in some sense, actually match the hardware.
A lot of what I talked about is now outdated; it is, after all, 30 years old. But the techniques are still applicable on occasion. They were amazed that they didn't know this stuff, and intrigued as to how I did.
It's just what I grew up with.
Should programmers in C++, Haskell, Lisp, Python, etc, know these things? Or are they really mostly irrelevant, becoming more so, and soon to be known only by true specialists and dinosaurs like me?
jacquesm | 16 years ago | reply
As for the why of it:
No other programming language will give you the feeling that it is you who controls the computer the way assembly does, except for the lucky few who get to program microcode.
Assembler is what it all eventually boils down to; it gives you a first-hand view of the von Neumann bottleneck, and of how small the letterbox is through which the CPU views memory.
After you've had a good long hard look at that, your programs will never be the same.
stcredzero | 16 years ago | reply
Machine language, operating systems, and the implementation of programming languages: if you're missing any of those three, you've got an important hole in your knowledge as a programmer.
Much of why computers and their software are put together the way they are clicks into place once you have an integrated view of those three.
mkn | 16 years ago | reply
My interest in programming stems from wanting to use computers to solve problems, not from theoretical interest in how computers work. Most of my colleagues in the UW engineering department I was in learned just enough Matlab to get the homework done. I learned enough to program a solution to the homework. I originally learned QBasic to write a Mandelbrot-set generator on a 286, because I wanted to generate the Mandelbrot set.
Should a web developer learn machine code? I dunno. You can get pretty far in web development with a good head, some Railscasts and tutorials, and a bit of practice. Maybe the question is "When should a web developer learn machine code?" I think the answer to that question, if he/she is a working web developer, is, "When he/she has to, when it comes up."
For myself, I look at these--let's be frank--esoteric things like machine code, higher-order whatever, compiler-design, and so on and think, "That may be interesting, but I've got work to do and fun to have. I'll wait until I have to know it before I learn it." I've already had to learn about lambdas. Very fun stuff, but also very useful stuff.
Now, should the macho programmer learn machine code? Of course. But the macho programmer should also be manufacturing his own computer chips as well, to better fit with his pure and obviously correct programming methods.
arghnoname | 16 years ago | reply
That said, "learn it when you need it" is problematic because of the case where you don't know what you don't know. In my own experience, I've learned lots of things and then thought back to previous problems and had little epiphanies about better solutions. People who don't experience that probably aren't improving very much.
Perhaps it would be useful if various communities (like webdev, etc) put together curriculum guides for "things you don't know you don't know, but should know." Of course, what should be on the list would be a source of bitter contention I'm sure.
Sapient | 16 years ago | reply
Before multi-core processors became common, I had a major argument with another senior developer at work who completely believed that simply adding threads would speed up execution of his functions (that's what threads do, right?!).
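To make the misconception concrete (a minimal sketch of my own, certainly not his code): on a single-core machine, splitting CPU-bound work across two threads doesn't make it finish any sooner; the threads just take turns on the one core.

    #include <pthread.h>
    #include <stdio.h>

    /* CPU-bound busy work: no I/O, nothing that blocks. */
    static void *spin(void *arg)
    {
        (void)arg;
        volatile unsigned long sum = 0;
        for (unsigned long i = 0; i < 500000000UL; i++)
            sum += i;
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        /* On a single core these two threads simply take turns, so the
           wall-clock time is the same as doing the work sequentially.
           Threads only speed up CPU-bound code when there are spare
           cores to run them on. */
        pthread_create(&a, NULL, spin, NULL);
        pthread_create(&b, NULL, spin, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }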
Edit: The point is, programmers don't know nearly enough about how their programs interact with the hardware they run on.
amalcon | 16 years ago | reply
Case in point: I once had a C++ program in which replacing malloc() with a wrapper that rounded all allocations over 1KB up to multiples of 1KB actually increased execution speed by 20%. Those may not have been the exact numbers, but that's the general idea. Guess why.
. . .
The program was allocating many large objects with small variations in size. The system's default malloc/free implementation worked efficiently for small objects, but stored large allocations in a linked list, bucketed by size. In the worst case (all objects large and of different sizes), freeing n objects takes O(n^2) time, and this program came very close to that worst case. The rounding prevented it by collapsing the allocations into far fewer buckets.
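In outline, the wrapper amounts to something like this (an illustrative sketch with made-up names, not the original code):

    #include <stdlib.h>

    /* Round any allocation over 1 KB up to the next multiple of 1 KB, so
       large blocks fall into far fewer of the allocator's size buckets. */
    static void *rounded_malloc(size_t size)
    {
        const size_t kb = 1024;
        if (size > kb)
            size = (size + kb - 1) & ~(kb - 1);
        return malloc(size);
    }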
Of course, that version never saw production use. It was just used to demonstrate the nature of this particular performance problem.
yan | 16 years ago | reply
daeken | 16 years ago | reply
I also have to recommend the book Reversing by Eldad Eilam if you're interested in low-level hacking from a reversing perspective.
russell | 16 years ago | reply
Some friends and I are going to start an old programmers' home, like the old sailors' homes. Obviously we should include you as a charter member.
gruseom | 16 years ago | reply
That's interesting. Who? What? How?
petercooper | 16 years ago | reply
C is about as low as you need to go nowadays. x86 was critical to know in the 90s because of performance issues, but since the majority of software today runs on some sort of VM or leans heavily on frameworks (CLR, JVM, OS X's frameworks), dropping to x86 is pointless in all but the most unusual of projects.
The only reason I'd still advise a basic familiarity with assembler is to dispel the notion that C is as low as it gets: you can get totally different results from different C compilers (and even incompatible results in some nasty cases), and it's worth at least knowing there's something underneath that you can take a basic look at.
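For instance, even something this small is allowed to behave differently under different compilers (a tiny illustrative case, nothing more):

    #include <stdio.h>

    static int counter(void)
    {
        static int n = 0;
        return ++n;
    }

    int main(void)
    {
        /* The order in which the two arguments are evaluated is
           unspecified in C, so one compiler may print "1 2" and another
           "2 1" -- both are correct. */
        printf("%d %d\n", counter(), counter());
        return 0;
    }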
Sapient | 16 years ago | reply
http://www.pbm.com/~lindahl/mel.html
psadauskas | 16 years ago | reply
An 8-part series on LWN, with comments: http://lwn.net/Articles/250967/ -- also as a single PDF: http://people.redhat.com/drepper/cpumemory.pdf
jacquesm | 16 years ago | reply
varjag | 16 years ago | reply
gruseom | 16 years ago | reply
Alan Kay says that people who are serious about software have to be serious about hardware.
flogic | 16 years ago | reply
antonovka | 16 years ago | reply
* Why your program behaves the way it does
* Why your performance profile looks the way it does
* The skills you need to debug the inevitable problems in your high-level tools
I highly recommend Write Great Code: Understanding the Machine:
http://www.amazon.com/Write-Great-Code-Understanding-Machine...
AndrewDucker | 16 years ago | reply
kragen | 16 years ago | reply
RiderOfGiraffes | 16 years ago | reply
Actually, most of what I did are standard techniques that assembly programmers know, C programmers have seen, and web programmers don't need to bother with.
I can't actually post code here because it's commercial in confidence, yada yada yada, but I will mention the techniques.
Firstly, they were processing an entire image, then transforming it, then extracting the part they wanted. We reworked the program flow to be demand-driven, which meant that large amounts of the image weren't processed at all because they were never needed.
Then we analysed the data and deduced that, by taking logs, we only needed a dynamic range of 6 bits. We therefore packed the entire image into a byte map with the top two bits of each byte set to 0.
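In rough outline, the idea is something like this (an illustrative sketch; the scaling constants are invented, not the real ones):

    #include <math.h>
    #include <stdint.h>

    /* Map a raw 16-bit sample onto a 6-bit log scale (0..63).  The top
       two bits of every byte stay zero, leaving headroom for the
       word-at-a-time arithmetic later on. */
    static uint8_t to_log6(uint16_t v)
    {
        double scaled = log1p((double)v) * 63.0 / log1p(65535.0);
        return (uint8_t)(scaled + 0.5);
    }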
Processing 4 bytes at a time in words saved enormous amounts of memory access, and avoided register spills. Unrolling loops by the right amount meant that we stayed in cache, but had less overhead.
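For example (again a sketch of the idea, not the real routine, and it assumes the buffer length is a multiple of four words): with every byte at most 63, the per-byte sums stay below 128, so you can add four samples in one 32-bit operation and unroll the loop without any carry crossing a byte boundary.

    #include <stddef.h>
    #include <stdint.h>

    /* Add one byte map into another a word at a time: one 32-bit add
       replaces four byte-wide adds.  Unrolled by four words. */
    static void add_rows(uint32_t *dst, const uint32_t *src, size_t words)
    {
        for (size_t i = 0; i < words; i += 4) {
            dst[i]     += src[i];
            dst[i + 1] += src[i + 1];
            dst[i + 2] += src[i + 2];
            dst[i + 3] += src[i + 3];
        }
    }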
Once the processing had been re-worked, we looked at the assembly produced, and found that most of the time the processing could be separated into a double indirection.
Finally, by further reducing the resolution of the system, computing an approximation, then reworking those parts of the data where the errors were too large we managed to halve the work being done.
It was a good two days of analysis, and the code shrank by a factor of 2, but it's now tough to understand. We have documented the results. It's embedded work, and yes, we finally went to assembly to get the last 30% of performance.
Every part of it is standard in its field. I think some of my programmers hadn't written routines to work byte-wise, eight bytes at a time, in a 64-bit word, and others hadn't done the transform via logs to throw away unnecessary resolution. Another had never analysed unrolling a loop against staying in cache. We used tricks such as (x & -x) to find the least set bit of a number, although done byte-wise in a 32-bit word that becomes something like (x & (~x + 0x01010101)).
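For those who haven't met the trick, the scalar version and its byte-wise cousin look like this (sketch only, not the production code):

    #include <stdint.h>

    /* Isolate the lowest set bit of x: in two's complement -x equals
       ~x + 1, so the AND keeps only the lowest 1 bit. */
    static uint32_t lowest_bit(uint32_t x)
    {
        return x & -x;
    }

    /* The same idea applied to each byte of a word: add 1 to every byte
       of ~x instead of 1 to the whole word.  A zero byte lets a carry
       leak into its neighbour, so real code guards that case. */
    static uint32_t lowest_bit_per_byte(uint32_t x)
    {
        return x & (~x + 0x01010101u);
    }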
And so on. A large collection of esoteric tricks, all at once.
Let me finish by saying that programs can't be made to run faster, they can only be made to do less work. An interesting aphorism to keep in mind when writing "optimised" code.
wildjim | 16 years ago | reply
And it's true: the best optimising compiler in the world cannot work out the intentions you had for your code -- sometimes choosing one type of algorithm will optimise quite differently from what you intended -- and if you have even a basic understanding of what's happening on the hardware, you're already ahead.
Funnily enough, in my last job I met someone who had the same sort of experience and understanding of the actual hardware, and we lamented that no one is growing up these days understanding the low-level stuff: bus dynamics, the interaction between I/O, memory and the CPU, etc.
Doubly interesting: in my current job I met someone who understood JVM bytecode as well as assembler, and he claimed it was key to getting truly great performance out of Java.
Sapient | 16 years ago | reply
gjm11 | 16 years ago | reply
I'd guess that much more software slowness comes from other failings, though. Poorly chosen algorithms that would make for slow performance even if written in machine code and microoptimized by experts. Forgetting that a random disc read takes millions of CPU cycles. That sort of thing.
(Most of the programmers I've worked with have been writing embedded stuff, or close enough to others who are doing so, that they have some exposure to assembler-level thinking. Though I expect those of us who grew up in the age of 8-bit micros could still teach most of them a trick or two.)
jacquesm | 16 years ago | reply
Once you have a thorough understanding of how to program a CPU and familiarity with the instruction set, you'll probably take a look at the output of your assembler to see what it produces.
That and a reference manual will give you a taste for machine code.
As long as you do not use macros, it is pretty straightforward to convert machine code back into assembler; macros muddy the waters here because a single assembler macro can generate lots of machine code.
There is value in machine code too, because the instruction set is designed to facilitate decoding by the simplest set of microcode that achieves a given amount of functionality. Silicon is expensive, so it pays off to design your microcode, and by extension the instruction set, this way. Machine code opcodes can usually be split up into 'fields'.
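For instance, on the Z80 the register-to-register load opcodes are laid out as 01 ddd sss, with ddd the destination register field and sss the source; pulling the fields apart is nothing more than masking and shifting (an illustrative sketch):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t opcode = 0x78;                   /* LD A,B on the Z80 */
        uint8_t group  = (opcode >> 6) & 0x03;   /* 01: register load group */
        uint8_t dst    = (opcode >> 3) & 0x07;   /* 111: register A */
        uint8_t src    = opcode & 0x07;          /* 000: register B */
        printf("group=%d dst=%d src=%d\n", group, dst, src);
        return 0;
    }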
Microcode has an elegance and a beauty of its own, but it may be that - just like machine language - it's an acquired taste.
I realize riderofgiraffes wanted to make a distinction between learning assembly (which lots of people do) and learning machine language (which probably fewer people do), but since the one can be trivially translated into the other, machine language is more an exercise in hand assembly and in studying the layout of opcodes than anything else; it comes with the territory for anybody who takes assembly seriously.
Unlike for instance "C", which you could be programming for a lifetime without ever looking at the assembly output.
If you used an assembler to translate your assembly program - without macros - into machine language, and you then worked through the resulting dump using nothing but a reference manual and a pile of paper, you'd effectively be executing machine language. They go hand in hand.
But learning assembly (usually) comes first.
If you want to go one step further, you can create your machine language programs without the aid of an assembler. I once knew enough 6502 and 6809 opcodes and operands by heart to do this for programs that were longer than was probably normal. Even today I still remember a couple of them, though I haven't had any use for that lately; I should probably 'garbage collect' ;)
I have found that particular skill to be of less value, because as soon as I could I wrote an assembler to do those pesky branch computations for me.
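Those branch computations are the genuinely tedious part: on the 6502, for example, a branch stores a signed 8-bit offset measured from the instruction that follows it. Roughly, the arithmetic an assembler takes off your hands (a sketch, in C just for illustration):

    #include <stdint.h>
    #include <stdio.h>

    /* 6502-style relative branch: the operand is a signed offset from the
       address of the instruction *after* the two-byte branch, and it has
       to land within -128..+127. */
    static int branch_offset(uint16_t branch_addr, uint16_t target)
    {
        return (int)target - (int)(branch_addr + 2);
    }

    int main(void)
    {
        printf("%d\n", branch_offset(0x0600, 0x060A));   /* prints 8 */
        return 0;
    }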
The main driving factor here was the cost of an assembler: on pocket money it was a lot cheaper to go to the library and look up the opcodes, then go home and program the thing, than to spend a year's worth of 'income'.
Hand-assembling small bits of code (say, a subroutine less than 50 bytes long to speed something up when no assembler was available) did get me some pretty scared looks from people I met, though ;)
The benefits of machine language over assembler are there, but the 'return on investment' is probably diminishing at that point. Many people would argue that that already goes for assembly; obviously I'm not one of them.
spectre | 16 years ago | reply
acg | 16 years ago | reply
By far the best attribute of a programmer is the desire to constantly improve their code and their experience. Sometimes, though, a wealth of experience can lock a programmer into a fixed way of thinking. I know quite a number of C++ programmers who just don't seem to get OO, for instance. Others may not approach a problem in the best way.
Your journey seems to imply that new programmers should follow your path to get an understanding of machines. But a lot of that technology is no longer relevant. Multi-core systems are fairly modern, and hacking Java bytecode is probably just as good a preparation as hacking a Tandy. In fact, many of those old microcomputers were much simpler than even some VM architectures.
It sounds like you may have been hired onto that team precisely because of this sort of experience. However, not every optimization these days is a low-level optimization. I note the YouTube architecture and how much of the service was/is written in a scripting language.
It wouldn't take long to find an open-source project where you could easily make use of these skills. Far from this being dinosaur knowledge, there are whole industries that still rely on it: just perhaps not on the desktop.
zweinz | 16 years ago | reply
SamAtt | 16 years ago | reply
The reason I say that is that modern languages have abstracted most of the direct interaction away, so a preoccupation with what's happening at the assembly level can be more of a distraction if your language of choice isn't already second nature to you.
noonespecial | 16 years ago | reply
logic | 16 years ago | reply
It might be "woefully obsolete", but a derivative of the HC11 is in an engine control unit (ECU) from a 90s-era Mitsubishi model that quite a few folks, myself included, take a hobby interest in as a means of making our cars go a little faster than is reasonable or was intended, on the cheap.
Another platform (that I have a much stronger personal interest in) is SH4-based. 256k of ROM (1M on newer versions of the car), 256k of RAM (512k on the newer version), and that's all you get to control a modern fuel-injected, variable-timing four-cylinder engine in real time.
This "old stuff" is still very applicable, depending on the field; the automotive industry is wrapped tightly in a time distortion field. ;)
coryrc | 16 years ago | reply
jacquesm | 16 years ago | reply