[lnav](https://www.lnav.org) is a terrific little tool, a "mini-ETL" of sorts with an embedded SQLite client and a clean, powerful interface. Its sweet spot is logfiles, but given regex-based custom formats, works great with any semi-structured input. Lnav easily handles a few million rows at a time. IME it pairs really really well with eg mitmproxy/mitmdump for client request logs, as well as webserver logs.
Thanks for linking that. It's going to make my life easier this week, and I had not heard of it. I was weighing setting up something like Graylog for some troubleshooting and kind of dreading it. lnav looks like a perfect middle-ground between that and my wiki page full of grep commands.
This looks like a great resource. The tools you'd like to have for a specific problem are often quite un-googlable. So you either need complex hacks to get inferior tools to work or you spend an hour googling the tools for a tiny problem.
Of course, it would be even better if you could easily tell which of the dozen JSON query tools is the best choice for the task at hand, or which you should code if you only want to ever use one of them.
In fact I'd love if someone would like to share their set of tried-and-true tools. Personally I mostly go with the POSIX tools, plus jq or gawk on occasion (but I have to read their docs every single time...).
One thing I could suggest for the XML list is xmllint. It can be really useful for converting xml to canonical format so you can then use diff to compare it.
E.g. something like
diff <(xmllint —c14 first.xml) <(xmllint —c14 second.xml)
I’d love to heat about more command line SOAP tools if anyone can recommend some.
kdb+/q is another really good choice for dsv[1] and json[2]. You can certainly create single-file databases (if you really want to e.g. for exchange), but splayed table[3] is faster so you'd usually do that.
The problem with that might be the licensing costs once you use it commercially (eg. at work). IIRC the license prices aren't public, but you're looking at over $10k in any case.
I personally prefer J to K in the APL family of languages. They also have a relatively cheap database, Jd [1]. Individual licenses are $600. Still a bit too much for my data mangling needs. :)
I took the time to learn recutils a long time ago, and it has been the gift that keeps on giving
Sure, it is not as fast as many other formats, but on the other hand it integrates very well into Emacs an org-mode. I manage a large part of my different collections using a combination of both, and the Emacs integration means it is all less than 2 seconds away.
I don't understand why csvkit is listed in the SQL-based utilities section. csvkit is a suite of multiple command-line tools, including csvcut, csvsort, csvgrep, csvjson, csvstat, csvstack, csvjoin, etc. and multiple converters, so is not only csvsql
I'm very glad to see the 'silly' tools there, cut/join/paste/sort/uniq. While I would never build anything 'important' with them, they're an extremely useful tool to have in your toolkit.
If you don’t mind converting the Excel file to CSV, csvkit[0], which is mentioned in the list, has a tool to pipe Excel into CSV for further processing by its sibling tools.
It won’t help if you need to retain anything Excel specific, but I find it very useful to deal with any Excel files that come my way.
Thanks for this link! I frequently have to load CSV files into a database and they are invariably full of errors. People think spitting out CSV is easy, but it's because they don't have to use their product. So every time I write a Perl script and go through various iterations before I find all that's wrong with the file.
[+] [-] chrisweekly|8 years ago|reply
[+] [-] hoistbypetard|8 years ago|reply
[+] [-] dancek|8 years ago|reply
Of course, it would be even better if you could easily tell which of the dozen JSON query tools is the best choice for the task at hand, or which you should code if you only want to ever use one of them.
In fact I'd love if someone would like to share their set of tried-and-true tools. Personally I mostly go with the POSIX tools, plus jq or gawk on occasion (but I have to read their docs every single time...).
[+] [-] tannhaeuser|8 years ago|reply
[1]: http://pubs.opengroup.org/onlinepubs/009695399/utilities/awk...
[+] [-] Lio|8 years ago|reply
One thing I could suggest for the XML list is xmllint. It can be really useful for converting xml to canonical format so you can then use diff to compare it.
E.g. something like diff <(xmllint —c14 first.xml) <(xmllint —c14 second.xml)
I’d love to heat about more command line SOAP tools if anyone can recommend some.
[+] [-] arundelo|8 years ago|reply
[+] [-] IvarTJ|8 years ago|reply
[+] [-] geocar|8 years ago|reply
[1]: http://code.kx.com/q/ref/filenumbers/#load-csv
[2]: http://code.kx.com/q/ref/dotj/
[3]: http://code.kx.com/q/cookbook/splayed-tables/
[+] [-] dancek|8 years ago|reply
I personally prefer J to K in the APL family of languages. They also have a relatively cheap database, Jd [1]. Individual licenses are $600. Still a bit too much for my data mangling needs. :)
[1] http://code.jsoftware.com/wiki/Jd/Index
[+] [-] bjoli|8 years ago|reply
Sure, it is not as fast as many other formats, but on the other hand it integrates very well into Emacs an org-mode. I manage a large part of my different collections using a combination of both, and the Emacs integration means it is all less than 2 seconds away.
[+] [-] jwilk|8 years ago|reply
https://johnkerl.org/miller/doc/file-formats.html#Tabular_JS...
[+] [-] netol|8 years ago|reply
[+] [-] netol|8 years ago|reply
[+] [-] manaktir|8 years ago|reply
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] AdamJacobMuller|8 years ago|reply
[+] [-] badsectoracula|8 years ago|reply
[+] [-] joakinen|8 years ago|reply
[+] [-] stevoski|8 years ago|reply
Edit: I’m looking for a command line tool that allows me to open an Excel file, make a few simple changes, and then save again as an Excel file.
[+] [-] clarkdave|8 years ago|reply
It won’t help if you need to retain anything Excel specific, but I find it very useful to deal with any Excel files that come my way.
[0] https://csvkit.readthedocs.io/en/1.0.3/
[+] [-] gwn7|8 years ago|reply
https://github.com/saulpw/homebrew-vd
[+] [-] geocar|8 years ago|reply
[+] [-] chrisweekly|8 years ago|reply
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] jwdunne|8 years ago|reply
This sheet was formatted like:
MEMBERS ...rows...
ADMINS ...rows...
EXECUTIVE COMMITTEE ...rows...
You could whip up any command line tool you need with that.
[+] [-] paulsutter|8 years ago|reply
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] nol13|8 years ago|reply
http://csvfix.byethost5.com/csvfix15/csvfix.html
[+] [-] zabzonk|8 years ago|reply
[+] [-] forinti|8 years ago|reply
[+] [-] zatkin|8 years ago|reply
[+] [-] jwilk|8 years ago|reply