

ktRolster | 8 years ago

*shrug* It doesn't fall over. I've done it, the OpenBSD team has done it, and DJB has done it. Maybe something is wrong with your implementation that I can help you with?

WalterBright|8 years ago

I'm curious. Got links?

ktRolster|8 years ago

OpenBSD takes a fairly minimalist approach, which is vaguely described here: http://www.freebsdforums.org/forums/showthread.php?threadid=... They basically replace the unsafe functions with ones that are easier to use correctly (strlcpy and strlcat in place of strcpy and strcat, for example). Their idea is that it isn't the format of the C string (null-terminated) that causes security issues, it's the poorly defined functions, with weird corner cases that are hard to get right. It's worked well for their use cases.
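To make the "easier to use" point concrete, here is a minimal sketch of a bounded copy with strlcpy-like semantics. The names bounded_copy and copy_fits are hypothetical, written for illustration only; this is not OpenBSD's source. The idea is that the function always NUL-terminates and reports the length it wanted to write, so truncation is a single comparison rather than a corner case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper modeled on the strlcpy contract: copies at most
 * dstsize - 1 bytes, always NUL-terminates when dstsize > 0, and returns
 * strlen(src) so the caller can detect truncation. */
size_t bounded_copy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);
    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Returns 1 if src fit completely in dst, 0 if it was truncated. */
int copy_fits(char *dst, const char *src, size_t dstsize)
{
    return bounded_copy(dst, src, dstsize) < dstsize;
}
```

Compare that with strncpy, which leaves the destination unterminated when the source is too long, so every caller has to remember to patch that up by hand.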

DJB did something similar in qmail. I don't recall the details, but you can look at the source code as easily as I can, and it eliminated security problems.

When I'm working in Java, I find that most of my string parsing uses the split() function. This is a pain in C, because even if you had a split() function you'd need to deal with memory allocations. Most of those allocation headaches are solved with a memory pool. In my own library, I also added runtime, grammar-based parsing functionality. So to parse a CSV line you might do something like this:

    /* Grammar: a line S is a WORD, or a WORD followed by a comma and another S. */
    const char *g = " S   -> WORD | WORD , S;"
                    "WORD -> [^,]";
    results = parsegram(g, inputString);

Grammar parsing + memory pools make string parsing in C easier than in Java. The biggest difficulty with this kind of library is that, to do it right, you need to be something of a Unicode expert, and that's tough.
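
For the split()-plus-memory-pool idea mentioned above, a rough sketch might look like the following. Everything here (Pool, pool_init, pool_alloc, pool_free, split) is a hypothetical name made up for illustration, not the library described in this comment, and it works on bytes only, so it sidesteps the Unicode questions entirely. Unlike Java's split(), it keeps empty fields.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bump-allocator pool: every allocation comes out of one
 * block, and pool_free releases all of them at once. */
typedef struct {
    char  *base;
    size_t used;
    size_t cap;
} Pool;

int pool_init(Pool *p, size_t cap)
{
    p->base = malloc(cap);
    p->used = 0;
    p->cap  = p->base ? cap : 0;
    return p->base != NULL;
}

void *pool_alloc(Pool *p, size_t n)
{
    /* Round the offset up so pointer arrays stay aligned. */
    size_t align = sizeof(void *);
    size_t start = (p->used + align - 1) / align * align;
    if (start > p->cap || n > p->cap - start)
        return NULL;
    p->used = start + n;
    return p->base + start;
}

void pool_free(Pool *p)
{
    free(p->base);
    p->base = NULL;
    p->used = p->cap = 0;
}

/* Splits s on the single byte sep (which must not be '\0'). Fields are
 * copied into the pool; returns a pool-allocated array of field pointers
 * and stores the field count in *count. Returns NULL on a bad separator
 * or if the pool runs out of space. */
char **split(Pool *p, const char *s, char sep, size_t *count)
{
    if (sep == '\0')
        return NULL;

    size_t n = 1;
    for (const char *c = s; *c; c++)
        if (*c == sep)
            n++;

    char **fields = pool_alloc(p, n * sizeof *fields);
    if (!fields)
        return NULL;

    size_t i = 0;
    const char *start = s;
    for (;;) {
        const char *end = strchr(start, sep);
        size_t len = end ? (size_t)(end - start) : strlen(start);
        char *copy = pool_alloc(p, len + 1);
        if (!copy)
            return NULL;
        memcpy(copy, start, len);
        copy[len] = '\0';
        fields[i++] = copy;
        if (!end)
            break;
        start = end + 1;
    }
    *count = n;
    return fields;
}
```

The caller never frees individual fields: it calls pool_init once, splits as many lines as fit, and then a single pool_free releases everything, which is what removes most of the bookkeeping.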