There are so many layers of abstraction in the Windows API
that it’s a miracle that anyone could maintain it - which probably
explains the increasing level of bloat in new versions of Windows.
Are there fewer layers of abstraction in other popular OSes?
Yes – it's a product of the way Windows was actually two operating systems, the Windows 3/95/98/ME lineage and the newer OS/2 / NT lineage.
The NT side had the advantage of being designed at all rather than growing organically by overworked developers hacking in whatever they needed right now, and had a number of assumptions (e.g. not starting with the 16-bit API real-mode model) which avoided some gnarly hacks.
The problem was compatibility: most apps had been developed on 16-bit Windows 3 or, later, Win95, and at the time Microsoft's dominance was far from a given, so they were pathologically afraid of breaking compatibility. That meant the Windows “platform” included a lot of weird semi- or undocumented corners designed to avoid breaking specific apps, and the NT side had to reimplement bugwards-compatible versions of most of them to avoid breaking shipped apps.
(This might seem excessive – and I would generally agree – but it's important to remember that much of the damage was done in the pre-internet era when shipping updates to software meant putting a box in the mail with a pile of floppies or, for the really rich people, CDs. Getting someone to upgrade to a version of an app which didn't rely on an implementation quirk could take many years.)
Raymond Chen has written about this extensively at http://blogs.msdn.com/b/oldnewthing/ and one of my favorite examples is the Shell Folders registry key:
The closest you come to this on another mainstream OS is OS X, where they maintained Carbon (i.e. the supported subset of the classic Mac OS APIs) on top of the modern core, but that was both more limited and rapidly deprecated, because Apple is far less concerned with breaking backwards compatibility than Microsoft used to be.
Really interesting article, sounds like a fun adventure. Couple of points:
1.) "0x12b9b0a5. This equals 314159269 in decimal. Yep, that’s the first 8 digits of pi right there" -- actually, it's not. The first 8 digits of pi are 3.14159265; it's not clear why the Microsoft developers ended with a 9 instead of a 5. Perhaps a mistake? Perhaps to help with coprime-ness?
2.) The "ANSI C" version uses mbstowcs which isn't any kind of ANSI C function I've ever heard of.
Fantastic article, really enjoyed reading, many thanks.
Those are the first 9 digits if you count 3 - the 9 at the end is the 9th digit, not the 8th, so the first 8 digits are correct. You're correct about the ANSI C part though, it's actually C99 - I've corrected that part. Thanks.
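For anyone who wants to check the arithmetic in this sub-thread, here is a quick sketch in Python. The variable names are mine, and the pi digits are simply typed in as a string, so nothing here comes from the article's code.

```python
# Hex constant quoted from the article.
constant = 0x12b9b0a5
constant_digits = str(constant)

# First ten digits of pi (3.141592653...), written without the decimal point.
PI_DIGITS = "3141592653"

# Count how many leading digits of the constant agree with pi.
matching = 0
for ours, theirs in zip(constant_digits, PI_DIGITS):
    if ours != theirs:
        break
    matching += 1

print(constant_digits, matching)
```

This prints 314159269 and 8: the first eight digits agree with pi, and the ninth is a 9 where pi has a 5.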
It claims to be "standard C", which usually these days means ISO C rather than ANSI C. For details about mbstowcs, see the C99 standard, section 7.20.8.1.
> I’d need to write a device driver to call it, which is both something I’ve never touched at all before, and something that’s nigh impossible without access to the DDK or NDK.
The routine itself would be copyrighted, but someone should be able to make a clean room reimplementation of the original algorithm, assuming it's not protected by patents. (IANAL)
A few years ago I implemented the storage system for a special-purpose diagnostic camera. The specification defined (very long) filenames for the saved images using a timestamp and some other data. I used a mostly off-the-shelf microcontroller/NAND/USB mass storage reference implementation, hooked it up via a side-channel to the FPGA, and had everything working pretty nicely. Until the test harness that just continuously commanded pictures to be taken reached 105 iterations. After that, the camera timed out waiting on the storage subsystem to store the image.
The problem turned out to be the code that found the 8.3 filename: it tried longfi~1.bin, checked whether that file existed and, if so, incremented to longfi~2.bin, then checked that, and so on. It never did the checksum trick described here; it just kept iterating. (Bear in mind this was a tiny 8-bit microcontroller that didn't have the RAM to read all the directory entries at once and keep them around for comparison.) After 104 collisions, finding a free 8.3 filename this way took longer than the timeout period.
Of course, we only cared about the long filename and never saw the 8.3 filename, so my fix was simply to use an appropriate hash of the long filename to ensure a good probability of uniqueness.
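To make the difference concrete, here is a rough Python sketch of the two approaches. It is not the firmware's code: the function names, the sample filename, and the choice of CRC32 as the "appropriate hash" are all assumptions of mine, since the comment does not say which hash or name layout was used.

```python
import zlib


def split_name(long_name):
    """Split into (stem, extension), keeping at most 3 extension characters."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = long_name, ""
    return stem, ext[:3]


def linear_short_name(long_name, exists):
    """Naive scheme: try ~1, ~2, ... and count how many existence checks it takes."""
    stem, ext = split_name(long_name)
    checks = 0
    n = 1
    while True:
        suffix = f"~{n}"
        candidate = f"{stem[:8 - len(suffix)]}{suffix}.{ext}"
        checks += 1
        if not exists(candidate):
            return candidate, checks
        n += 1


def hashed_short_name(long_name):
    """Hypothetical fix: derive the 8-character base from a hash of the long name."""
    _, ext = split_name(long_name)
    digest = zlib.crc32(long_name.encode("utf-8")) & 0xFFFFFFFF
    return f"{digest:08x}.{ext}"


# Simulate 104 earlier files whose long names share the same prefix.
taken = set()
for _ in range(104):
    name, _checks = linear_short_name("longfilename_0001.bin", taken.__contains__)
    taken.add(name)

# The next file needs one existence check per earlier collision, plus one.
final_name, final_checks = linear_short_name("longfilename_0001.bin", taken.__contains__)
hashed_name = hashed_short_name("longfilename_0001.bin")
```

In this model the 105th file costs 105 directory lookups with the linear scheme, while the hashed name is computed without looking at the directory at all. A real implementation would still want one existence check to catch the rare hash collision.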
If the checksums collide, then the number after the tilde is incremented again (e.g. SOBC84~2.ASP). This time, it won't stop at ~4, so you can go up to ~10 and beyond. The file name will be shortened accordingly to fit the number in (e.g. SOBC8~10.ASP). This was tested on Windows 7 x64.
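The shortening described here can be sketched in a few lines of Python. This only models the naming pattern from the two examples above; the function name, and the assumption that the numbered suffix always eats into the 8-character base from the right, are mine and not taken from Windows itself.

```python
def numbered_short_name(base, n, ext):
    """Fit '~n' into an 8-character base by trimming the base from the right."""
    suffix = f"~{n}"
    return f"{base[:8 - len(suffix)]}{suffix}.{ext}"


# "SOBC84" stands for the two leading characters plus the four checksum digits.
second = numbered_short_name("SOBC84", 2, "ASP")
tenth = numbered_short_name("SOBC84", 10, "ASP")
print(second, tenth)
```

With these inputs it reproduces the two names quoted above, SOBC84~2.ASP and SOBC8~10.ASP.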
[+] [-] WalterGR|10 years ago|reply
[+] [-] acdha|10 years ago|reply
http://blogs.msdn.com/b/oldnewthing/archive/2003/11/03/55532...
[+] [-] voidlogic|10 years ago|reply
This is about application servers, not OSes, and it's old, but it's a pretty famous example:
https://ma.ttias.be/system-calls-in-apache-linux-vs-iis-wind...
[+] [-] jstanley|10 years ago|reply
[+] [-] Quackmatic|10 years ago|reply
[+] [-] to3m|10 years ago|reply
[+] [-] poizan42|10 years ago|reply
Well, the DDK is called the WDK now, but anyway, how does one not have access to it? It's right here: https://msdn.microsoft.com/en-us/windows/hardware/hh852365
[+] [-] rplnt|10 years ago|reply
I thought that Windows 2000 source leaked. Would seem like an easier path. Though certainly not better (for either the author or a reader).
[+] [-] userbinator|10 years ago|reply
[+] [-] ape4|10 years ago|reply
[+] [-] RandomBK|10 years ago|reply
[+] [-] tryp|10 years ago|reply
[+] [-] TheLoneWolfling|10 years ago|reply
[+] [-] userbinator|10 years ago|reply
[+] [-] Quackmatic|10 years ago|reply