top | item 19958241

(no title)

dbro | 6 years ago

While not exactly what you asked for, I wrote something similar called csvquote ( https://github.com/dbro/csvquote ) which transforms "typical" CSV or TSV data to use the ASCII characters for field separators and record separators, and also allows for a reverse transform back to regular CSV or TSV files.

It is handy for pipelining UNIX commands so that they can handle data that includes commas and newlines inside fields. In this example, csvquote is used twice in the pipeline, first at the beginning to make the transformation to ASCII separators and then at the end to undo the transformation so that the separators are human-readable.

> csvquote foobar.csv | cut -d ',' -f 5 | sort | uniq -c | csvquote -u

It doesn't yet have any built-in awareness of UTF or multi-byte characters, but I'd be happy to receive a pull request if it's something you're able to offer.

discuss

No comments yet.