top | item 8427852

Fixing a 37-year-old bug by merging a 22-year-old fix

268 points| tscherno | 11 years ago |cvsweb.openbsd.org

72 comments

order
[+] kazinator|11 years ago|reply
This fix leaves a variable uninitialized, and depends on funny logic to later give it a value.

The general structure should be

  FILE *fp = stdin; /* sane default */
then override with a new pointer from fopen if necessary.

In fact, looking at this more closely, note how fp is scoped to the body of the for loop. That's where it should be defined:

   for (firsttime = 1; ; firsttime = 0) {
     FILE *fp = stdin;

     /* logic flips it to fp = fopen(...) as necessary */

   }
[Note: the C compiler Joy was working with probably didn't support block scope declarations except at the top of the function. This comment applies to merging the fix using the environment of today.]

Note how the fclose at the bottom of the loop ends up closing stdin in the case that fp == stdin! It ensures that this only happens once, but you have to analyze it to see that this is true. I would write this as:

     if (fp != stdin) /* defensive! */
       fclose(fp);
Making it an infinite loop terminated by calling exit is unnecessary. There is already a continue in the loop, and a break in an inner loop; another break instead of the exit, and a "return status;" at the bottom wouldn't hurt. Better yet:

  /* Enter loop whenever there are args left to
     process, or even if there are no args if it is
     the first time through (to process stdin instead). */

  for (firsttime = 1; firsttime || *argv; firsttime = 0) {
    FILE *fp = stdin;

    if (*argv) {
       /* have args; do the fopen to replace stdin */
       /* set status = 1 and continue if cannot fopen */
    }

    /* do head logic */

    if (fp != stdin)
      fclose(fp);
  }

  return status;
C itself was only a couple of years old when that code was written originally, and the systems were small. Whereas I've been using it since 1989. Not criticizing the original!
[+] saurik|11 years ago|reply
I would argue your insistence on giving the variable a "sane default" is a matter of opinion. I personally am much happier seeing a variable left uninitialized than ever seeing it double-assigned, as to me a fundamental semantic logic error has occurred if you are overriding the value of this kind of variable, especially as part of a conditional later where it might not be clear whether or not it happened: I really would prefer to look at the variable being assigned and think "ok, now we know the value". The compiler is entirely able to determine whether or not the variable has been initialized in all possible control flows for a case like this (at least, once we move the scope of the variable to inside of the for loop, of course), so there is no safety benefit to hoisting that default assignment out, nor is it saving a meaningful amount of code: it only serves to obfuscate the "default" control flow by making what had been an explicit if/else into an implicit if/default :/.
[+] michael_h|11 years ago|reply
As I understand it, you can fclose(stdin) without issue on almost all POSIX systems.
[+] mnarayan01|11 years ago|reply
Not sure that scoping to the block + unconditionally doing the initialization actually helps readability, though I do think doing one or the other would.

Also the original is super efficient which is pretty cool, and (at least when correct) is arguably more readable (probably not so much re. extendable) since it seems like there's no way to look at the code and think you know what it's doing when you actually do not.

[+] userbinator|11 years ago|reply
The extremely odd for-loop structure feels to me like someone was trying very hard to avoid a goto, and in the process obfuscated the flow quite a bit more. The special case of head'ing stdin is similar to the classic "loop and a half problem" (which is really only a problem if you religiously abstain from goto use.) I'd consider this a more straightforward way to do it:

    FILE *f;
    int i = 1;
    ...
    if(argc < 2) {
       f = stdin;
       goto do_head;
    }
    ...
    while(i<argc) {
       f = fopen(argv[i], "r");
       /* check for failure to open */
       ...
      do_head:
       /* head'ing code goes here */
       ...
       fclose(f);
       i++;
    }

That being said, OpenBSD code is still far more concise and readable than GNU; compare coreutils' head.c here:

http://code.metager.de/source/xref/gnu/coreutils/src/head.c

GNU head has a bit more functionality, but I don't think the increase in complexity - nearly 10x more lines - is proportional to that.

[+] ceejayoz|11 years ago|reply
In a 24-year-old version control system.

edit: Wow, not sure what's offensive here...

[+] brynet|11 years ago|reply
I posted this in the reddit thread today, after being part of a larger discussion last week.

Theo de Raadt co-created AnonCVS back in 1995/1996. He and Chuck Cranor from AT&T Labs Research published a paper in 1999 about it.

OpenBSD was the first open source project to make access to their source repository public to non-developers, before AnonCVS you needed commit access to review history, annotations and even checkout. OpenBSD had a read-only CVS tree even before their website was created in 1996.

There are no known prior cited examples of a project providing this level of transparency to the development process, most open source projects at the time only provided "snapshots" of development in the form of compressed source directory tarballs, excluding RCS files. SUP and FreeBSD's ctm(1) only let you merge "current" remote changes to a local non-versioned directory.

http://www.openbsd.org/papers/anoncvs-paper.pdf

http://www.openbsd.org/papers/anoncvs-slides.pdf

And here's an additional historical tidbit relating to FreeBSD and AnonCVS: https://twitter.com/canadianbryan/status/517714611089719296

[+] idlewords|11 years ago|reply
You're probably getting downvoted because a lot of OpenBSD threads derail into fruitless discussions of their use of CVS.
[+] tux3|11 years ago|reply
People here downvote what they don't like
[+] cordite|11 years ago|reply
One of my coworkers found a data integrity causing bug that has been around for 11 years.

He and his team lead had no idea how such a bug missed 11 years of peer-QA through all edits on that module / activity.

It was as simple as converting to string and back and hitting an edge case on internationalization with commas vs periods.

[+] pacaro|11 years ago|reply
My favourite decimal separater i18n bug was in a product where we were reading a time dilation factor from a config file. In the default shipping file (which basically nobody ever changed) it was specified as "1.0". We would then read this using (this is in .Net) Double.parse() which honors the users locale if none is specified, so for all users in Europe, and probably elsewhere, this was interpreted as 10 ('.' is a grouping construct and more or less ignored). So the software was running 10x slower than expected.
[+] burtonator|11 years ago|reply
Those are the ones which scare me the most.. when you find them, they're potentially devastating, and you haven't hit them.

Another I've seen is the potential landmine that can go of at ANY time and then one day, 5 years in the future, it hits you at 2AM and you can't figure out WTF is happening.

[+] couchand|11 years ago|reply
Right now I'm working on our company's new commissions payment system. Thanks to you I'm going to have nightmares tonight.
[+] ape4|11 years ago|reply
I like that the -number option is "obsolete". I still use it.
[+] alexmchale|11 years ago|reply
That's pretty funny. I'd say that the majority of the time that I call head or tail, it's using that syntax.
[+] gojomo|11 years ago|reply
If OpenBSD has anyone young enough, would've been a fun twist to have someone who wasn't even born at the time of the fix make the commit.
[+] Scuds|11 years ago|reply
> for (firsttime = 1; ; firsttime = 0) {

does this line predate the while loop?

or boolean values in C?

[+] cremno|11 years ago|reply
>does this line predate the while loop?

It's the other way around. Very early C didn't have a for loop. It seems it was introduced in Unix V5: http://minnie.tuhs.org/cgi-bin/utree.pl?file=V5/usr/c/c00.c

And a for loop is basically just a while one, but they are sometimes used in different ways than the usual `for (size_t i = 0; i < len; ++i)`. One of my favorites is http://git.musl-libc.org/cgit/musl/tree/src/internal/floatsc....

>or boolean values in C?

1 and 0 are boolean values in C. But C didn't have a boolean type until C99 (called _Bool or bool after including <stdbool.h>).

[+] mbreese|11 years ago|reply
It's not a while loop... well, maybe a "while 1 {}" loop.

It's closer to an iterator over argv.

First, it sets the firsttime=1 flag, then it tries to open the first argument (file). After the first arg, it sets firsttime=0. It only breaks out of the loop once argv is exhausted.

[+] ghshephard|11 years ago|reply
What's the common way of doing this in C? Is it normally something like this?:

  firsttime=True
  while (True) {
  ..
  ..
  ..
  firsttime=False
  }
[+] piran|11 years ago|reply
C99 introduced the boolean stdlib, so yes it predates that.

Also, as the other guy said, that code has while loops.

[+] staticshock|11 years ago|reply
while loop right above the for loop:

> while ((ch = getopt(argc, argv, "n:")) != -1) {

bool was introduced in C99

[+] duaneb|11 years ago|reply
There are no booleans in C, just ints. I think c99 added a 'bool' type, but it's just typedef'd to being an int.
[+] aalbertson|11 years ago|reply
Looking through all these comments, my only response is "effing nerds..." <3