yorhel | 12 years ago

Hmm? The stack buffer in yxml will never cause parsing to block, and there are no added dependencies on... anything? I don't think that buffer causes any problems even on a size-restricted microcontroller that doesn't have malloc(). As long as you can find ~512 bytes or so of free memory you can parse a lot of files.

The only situations in which that buffer would cause an error are when the application supplies a buffer that is too small, or when the document is far too deeply nested or has extremely long element/property names. Both the maximum nesting level and the name lengths should, IMO, be limited in the parser in order to protect against malicious documents. Most parsers have separate settings for those; yxml simplifies things by letting the application control the size of a buffer.

The stack buffer in yxml is also used to make the API a bit easier to use. With the buffer I can pass element/property names as a single zero-terminated C string to the application. Without it I would have to use the same mechanism as used for attribute values and element contents, and that mechanism isn't all that easy to use. (This is the one case where I chose convenience over simplicity, but I kinda wanted the validation anyway, so that wasn't really a problem.)
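
To make the idea concrete, here is a minimal sketch of a name stack that lives entirely in caller-supplied memory. It is not yxml's actual code: `namestack`, `ns_init`, `ns_push`, `ns_top`, `ns_pop` and `ns_demo` are hypothetical names, and the layout (names stored back to back, each zero-terminated, behind a leading zero byte) is an assumption chosen for illustration. It shows how one buffer size can bound both nesting depth and name length, and how the innermost name stays available as a plain C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name stack; illustrative only, not yxml's actual code.
 * Layout inside the caller's buffer: "\0name1\0name2\0..." */
typedef struct {
    char  *buf;   /* caller-owned storage, no malloc() involved */
    size_t size;  /* total bytes the caller handed over */
    size_t used;  /* bytes in use, including the leading sentinel */
} namestack;

/* Returns 0 on success, -1 if the buffer cannot hold the sentinel. */
static int ns_init(namestack *s, void *buf, size_t size)
{
    if (buf == NULL || size < 1)
        return -1;
    s->buf = buf;
    s->size = size;
    s->buf[0] = '\0';   /* sentinel so backward scans always stop */
    s->used = 1;
    return 0;
}

/* Push a non-empty name. Returns -1 when it does not fit, which is
 * how deep nesting and overlong names both get rejected. */
static int ns_push(namestack *s, const char *name)
{
    size_t n = strlen(name);
    if (n == 0 || n + 1 > s->size - s->used)
        return -1;
    memcpy(s->buf + s->used, name, n + 1);  /* copies the '\0' too */
    s->used += n + 1;
    return 0;
}

/* Innermost open name as a zero-terminated string, or NULL if empty. */
static const char *ns_top(const namestack *s)
{
    size_t p;
    if (s->used <= 1)
        return NULL;
    p = s->used - 2;                /* last character of the last name */
    while (s->buf[p - 1] != '\0')   /* walk back to the name's start */
        p--;
    return s->buf + p;
}

/* Drop the innermost name; a no-op on an empty stack. */
static void ns_pop(namestack *s)
{
    const char *top = ns_top(s);
    if (top != NULL)
        s->used = (size_t)(top - s->buf);
}

/* Usage: 16 bytes hold "root" and "item" but not a 13-character name. */
int ns_demo(void)
{
    char mem[16];
    namestack s;
    assert(ns_init(&s, mem, sizeof mem) == 0);
    assert(ns_push(&s, "root") == 0);
    assert(ns_push(&s, "item") == 0);
    assert(ns_push(&s, "averylongname") == -1);
    assert(strcmp(ns_top(&s), "item") == 0);
    ns_pop(&s);
    assert(strcmp(ns_top(&s), "root") == 0);
    return 0;
}
```

The application picks the limit simply by choosing how many bytes to hand to `ns_init`, so there is no separate depth or name-length setting to configure.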

mtdewcmu | 12 years ago

Ah. I have not yet looked too far into how yxml works. I never came up with a perfectly satisfactory solution to the zero-buffer problem myself, but you've hit on a lot of the things that make it a hassle either way. It's almost impossible to write a usable XML parser under the assumption that it will not buffer and will not be guaranteed access to more than one character at a time. I started developing one like that based on a goto-driven state machine, but I stopped working on it, because the interface was going to be too inconvenient.

What I meant about blocking was blocking on malloc. It sounds like you're expecting the caller to take care of allocation?