Labeled Tab-separated Values

21 points | naoya | 13 years ago | ltsv.org

9 comments

[+] beering|13 years ago|reply
So what is the purpose of replacing a common format with one that wastes space repeating labels? Many tools already support the combined format, and you can look up the fields in the config if you really can't figure them out on your own.

Also, this linkbait title is just asking for a mod edit.

[+] rjbond3rd|13 years ago|reply
> So what is the purpose of replacing a common format with one that wastes space repeating labels?

Some time ago, we had to migrate dozens of HTML forms from legacy servers. We ended up implementing a generic forms handler to process all form submissions.

Initially, we logged all submissions to simple tab-delimited files. But as it turns out, some HTML field types, when left blank by the user, leave no trace in the query string.

So plain tab-delimited was not an option, and the answer turned out to be exactly this format.
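
A minimal sketch of the problem, not the handler we actually ran. The field names (`name`, `subscribe`, `email`) and the helpers `to_tsv` and `to_ltsv` are hypothetical. An unchecked checkbox never appears in the query string, so a positional row shifts its columns, while a labeled row stays unambiguous. Values containing tabs are not handled here.

```python
from urllib.parse import parse_qsl

def to_tsv(query):
    # Positional: keep only the values, in the order they arrive.
    pairs = parse_qsl(query, keep_blank_values=True)
    return "\t".join(value for _, value in pairs)

def to_ltsv(query):
    # Labeled: keep "label:value" so a missing field cannot shift the others.
    pairs = parse_qsl(query, keep_blank_values=True)
    return "\t".join(f"{label}:{value}" for label, value in pairs)

# Hypothetical submissions: the second user left the checkbox unchecked,
# so "subscribe" is absent from the query string altogether.
full = "name=Ann&subscribe=on&email=ann%40example.com"
partial = "name=Bob&email=bob%40example.com"

print(to_tsv(full).split("\t"))     # second column holds the checkbox value
print(to_tsv(partial).split("\t"))  # second column now holds the email
print(to_ltsv(partial))             # each value still carries its label
```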

[+] hcr|13 years ago|reply
It's difficult to extend the combined format, even though demand to output more information to access logs, such as response time, is increasing.
[+] bwooce|13 years ago|reply
Ok, it's sensible to make these logs easier to parse.

I don't understand why an entirely new Tag-Value scheme was invented though, and this article doesn't attempt to justify it. Maybe it's not new and I just haven't heard of it?

Why not use JSON, ASN.1 BER, or any other scheme with existing, mature encoders and parsers?

[+] tingletech|13 years ago|reply
A lot of tooling, especially in legacy processes, is built around record-per-line, semi-specified formats such as tab-separated values and comma-separated values. The advancement here is that the values are labeled rather than positional. The use case is taking some legacy process (like some crazy bash script a sysadmin wrote 8 years ago) and tweaking it so that you can add and remove columns from text files more easily.
[+] tingletech|13 years ago|reply
I think this is sort of clever. Parsing is simple (split the line on tab, no `:` in the label, no escaping) and you can add/remove/re-order the fields. By the time you gzip it up it seems like repeating the labels on every row would not add that much weight.
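
A rough sketch of that parsing rule, assuming each field is split on its first `:` so values may contain colons themselves. The function name `parse_ltsv_line` and the sample labels (`host`, `status`, `reqtime`) are only for illustration.

```python
def parse_ltsv_line(line):
    """Parse one labeled tab-separated record into a dict of label -> value."""
    record = {}
    for field in line.rstrip("\n").split("\t"):
        if not field:
            continue  # skip empty fields, e.g. from a trailing tab
        # Labels contain no ':', so the first ':' always ends the label.
        label, _, value = field.partition(":")
        record[label] = value
    return record

# Field order does not matter once every value carries its label.
a = parse_ltsv_line("host:127.0.0.1\tstatus:200\treqtime:0.012\n")
b = parse_ltsv_line("reqtime:0.012\thost:127.0.0.1\tstatus:200\n")
print(a == b)
```

Since each record becomes a dict, a consumer that only reads `status` keeps working when someone adds or drops another field upstream.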