For me, bash and jq are, literally, the opposite of riding a bicycle. No matter how much time I spend working with them in a given week, a month later I'll have to skim through my bookmarks and Kagi results (and now also ChatGPT) to remember how to do things I was doing easily a month ago.
I also observed this when using most CLI tools… I think it's a common problem for tools you only reach for a couple of times a month or quarter (versus a programming language you're coding in almost every day).
My solution was literally to create an Anki card every time I discover a neat feature that I might not remember but would find useful. I just go through my Anki cards once a day for 10 minutes and it works like a charm. My memory for various CLI tools has drastically improved; I rarely need to reach for Google, man pages, or ChatGPT anymore. I'd recommend spaced repetition for CLI tools.
I get Bash, but for jq I found that my small fusillade of Anki flash cards was more than enough to get a fingertip feel for its syntax. Amazing what 50 flashcards of jq (or awk, or sed, or regexes, or any DSL really) gets you in the long run.
It's more important to understand the possibilities than remember the details. Details can always be quickly looked up, as long as you know what to look for and can conceptualize which tools to combine to achieve a goal.
If I find myself struggling with a task I’ve done a handful of times, I just make a page for it in obsidian with the snippet I need and an explanation of how it works.
I always struggled with understanding jq. Each time, I was just googling things. But it actually makes a lot of sense once you understand the building blocks. I wrote it all down [1], but here is my summary:
jq lets you select elements like it's a JavaScript object using dot notation and array indexing.
jq '.key.subkey.subsubkey'
jq '.key[].subkey[2]'
You can wrap things in array constructors or object constructors to create new lists and objects:
jq '[ .[].key ]'
jq '{key1: .key1, key2: .key2}'
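For instance, a toy example of both constructors (assuming `jq` is installed; the input JSON is made up):

```shell
# collect one field from every element into a new array
echo '[{"name":"a","id":1},{"name":"b","id":2}]' \
  | jq -c '[ .[].name ]'
# -> ["a","b"]

# build a new object that keeps only the keys you want
echo '{"key1":"x","key2":"y","extra":"z"}' \
  | jq -c '{key1: .key1, key2: .key2}'
# -> {"key1":"x","key2":"y"}
```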
You can combine filters with pipes (|) to build complex transformations. Built-ins like map() and select() are useful for transforming arrays.
This query fetches GitHub issues, transforms them into a simplified structure, filters out unlabeled issues, sorts them, and wraps the results in an array - demonstrating how you can chain together jq's query language to wrangle JSON data.

[1]: https://earthly.dev/blog/jq-select/
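The query being described did not survive into the thread; a hedged sketch matching that description (field names assumed from the GitHub issues API, sample data inlined instead of a live fetch) might look like:

```shell
# sample shaped like a GitHub issues API response (assumed fields)
echo '[{"title":"b","labels":[{"name":"bug"}]},
       {"title":"a","labels":[]}]' \
  | jq -c '[ .[]
             | {title: .title, labels: [.labels[].name]}
             | select(.labels | length > 0) ]
           | sort_by(.title)'
# -> [{"title":"b","labels":["bug"]}]
```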
I was curious, so I looked up how it works before reading the summary at the end, and that led me to find another user of the aioli.js jq implementation: https://jiehong.gitlab.io/jq_offline/ (featured on https://news.ycombinator.com/item?id=28627172 two years ago). jqplay.org still sends all the data on every modification, so they could learn from it...
Anyway, this article is neat! Good work!
If I were to nitpick: one of the last examples, the one with `path`, has no explanation and flew over my head (I would have to open the documentation), and a reset button for each example might be nice after messing with one for a bit. But it was a nice play.
Regarding the reset button: I think that's a great suggestion and now it bugs me so much that I can't reset it. I'll add a reset button later tonight when I'm off work.
Regarding the confusing example: Yes, some of the examples are missing explanations (mainly because I spent more than a month on this post and I just did not want to put off putting it out any longer). Sorry haha. I'll try to improve the explanations and add more.
JQ is an insanely powerful language. Just to put to rest any doubts about what it is capable of, here is an implementation of JQ... in JQ itself:
JQ + journald is great too, but 20 years of muscle memory writing bash / python / perl / awk / sql / ruby / JS / CSS selectors / xpath / xmlstarlet one-liners keep getting in my way. I keep long notes on both with examples of common tasks. I still dislike yaml (significant whitespace is my “ick” as the kids say) too much to learn whatever the equivalent is for that and still find CSV/TSV easier to slice and dice at will due to my own personal history.
I’m sure at this point that many ETL jobs in notebooks we run at $BigCo today could be reduced to jq expressions that run 100x faster and use 1/10th the memory.
The ‘nearly’ Turing complete claim is something I wonder about. It feels like jq might have some limitations - transformations it can’t do due to some inherent limitation of how it handles scope or data flow. The esoteric syntax sometimes makes it hard to determine whether what you are attempting is actually possible.
As soon as jq scripts reach a certain level of complexity I break out to writing a node script instead.
And given how rapidly jq scripts acquire complexity, that level is pretty low. One nested lookup, and I’m out.
For whatever reason, jq is one tool whose syntax I simply can never remember. It's ChatGPT every time for me. I just can't remember the specifics of how it differs from JSONPath and JMESPath (used by AWS)... I wish there was a way for every tool to just use JSONPath instead.
The first and foremost thing to know about jq is that it's built on path expressions, so the first thing to learn is how to write path expressions. Fortunately path expressions are easy in jq!
.a       # Get the value of the "a" key in the current input object
.[0]     # Get the value of the first element in the current input array
.a[0]    # Get the value of the first element of the array at the key named "a" in the current input object
# Path expressions chain:
.a[0].b  # Get the value of the "b" key in the first element of the array at "a"
Things get more interesting when you see that `.[]` is the iterator operator, and that you can use it in path expressions.
Things get really interesting when you see that `select(conditional expression)` can be used in path expressions joined with `|`.
Just this can be very useful. It's also useful to know about the magic `path()` function, and `paths`, which I often use to just list all the paths in an input JSON text. Try applying `jq -c paths` to a `kubectl get -o json pods` command's output!
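A quick illustration of both `paths` and `select` on toy input (the JSON here is made up):

```shell
# list every path in a document
echo '{"a":{"b":[10,20]}}' | jq -c 'paths'
# -> ["a"], ["a","b"], ["a","b",0], ["a","b",1] (one per line)

# select() used mid-pipeline to keep only matching elements
echo '[{"name":"web","ready":true},{"name":"db","ready":false}]' \
  | jq -c '.[] | select(.ready) | .name'
# -> "web"
```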
zshbuiltins(1): Unlike parameter assignment statements, typeset's exit status on an assignment that involves a command substitution does not reflect the exit status of the command substitution. Therefore, to test for an error in a command substitution, separate the declaration of the parameter from its initialization.
Nice plugin; I got it and will be using it. Browsing the code, I saw a couple of small errors; not too serious, but some error handling is incorrect. In your `jq_complete()` function, for instance, you have
local query="$(__get_query)"
local ret=$?
Unless the `local` assignment to `query` itself fails, `ret` will always be 0 regardless of the return value of `__get_query`. To fix this, separate the declaration from the assignment:

local query
query="$(__get_query)"
local ret=$?
Nice, especially reading Calzifer’s comment above and remembering how many times I’ve cursed the jq syntax because of quoting issues… another “trick” I’ve been using: for any non-trivial jq filter, stick it in a file (or at least a heredoc) and feed it to jq using -f for much less quote-escaping malarkey.
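A minimal sketch of that `-f` trick (the file name, filter, and sample data are all hypothetical):

```shell
# keep the filter in a file so the shell never sees its quotes
cat > filter.jq <<'EOF'
.items[] | select(.price > 10) | "\(.name): $\(.price)"
EOF

echo '{"items":[{"name":"book","price":15},{"name":"pen","price":2}]}' \
  | jq -r -f filter.jq
# -> book: $15
```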
Nice, but... I'd written something like this before (as a program you pipe to, not autocomplete), and when there's an error, I try to show the error and then the last good output. The reason for this is that when you're typing a complex command you want the JSON visible to guide your thinking; displaying only the error hides it.
The way I did this was to store both the last working query and the last working output; I'd only reuse the output if the last working query was a prefix of the current query. That avoids the awkward case where you are deleting letters from the query, so you need an output further back in history (which I didn't store; it wasn't worth the hassle).

Feature request?
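A minimal shell sketch of that caching idea (all names are hypothetical; the original was a program you pipe to, this only shows the prefix check):

```shell
#!/bin/sh
# Cache the last working query/output; on error, show the error plus the
# last good output, but only when the current query extends the last good one.
json='{"a":{"b":1}}'
last_query=''
last_output=''

run_query() {
  query=$1
  if output=$(printf '%s' "$json" | jq "$query" 2>&1); then
    last_query=$query
    last_output=$output
    printf '%s\n' "$output"
  elif [ "${query#"$last_query"}" != "$query" ]; then
    # last_query is a prefix of query: keep the JSON visible under the error
    printf 'error: %s\n%s\n' "$output" "$last_output"
  else
    printf 'error: %s\n' "$output"
  fi
}

run_query '.a'          # prints the pretty-printed result and caches it
run_query '.a | bogus'  # prints the jq error, then the cached output
```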
For me jq is my epitome of "When faced with a problem a programmer says 'I know I can use X' and now they have two problems"
I continually bounce off the "language/philosophy" of jq in quite embarrassing ways. Every time I go "Ah, I can use this as a reason to learn jq", and half an hour later I've written a Python script to extract the data instead.
x1000 this. I find I apply similar reasoning to awk. I _know_ some people get massive benefits out of using it, I just don't need it often enough to actually pick it up... GPT to the rescue, I suppose.
Great article. Nice to have it interactive. How does it work? Do you have a terminal running somewhere or does it run in the browser?
One thing I noticed, and where I stopped continuing, is that the jump from Filtering Nested Arrays to Flattening Nested JSON Objects is WAAAY too big. Going from a simple filter to triple-nested filters with keywords that had no introduction in a simpler example isn’t working for me.
It seems like jq is getting a nice boost due to how useful it is for getting JSON into and out of OpenAI and other LLM environments that understand jq. The big new release/relaunch shows the project is up and running again, so maybe we'll see even more integration with agent/function type use cases or some pydantic-ish guardrails. Thanks for the bookmark!
If you have trouble remembering jq syntax (or any other weird CLI), I'd recommend increasing the number of lines of history stored in your shell and finding a way (I use FZF) to search through that history.
I do a quick ctrl+r, type jq, and I can find all of my JQ snippets I've used in the past couple of years. If I then type "select" I can find all of the times I've used that function, etc.
I also use it to find while loops, kubectl snippets, environment variables I exported to run a script, etc.
This made me think: If you wanted to make an 'inverted bottom-up' introduction to the suite of Unix command line tools, you could go in the direction of more-to-less-structured text formats and the common tools we use with them quite easily.
1. JSON: `curl` to get interesting JSON APIs, `gron` and `grep` to explore what's inside them, `jq` to process them into interesting formats.
2. CSV: Lots of good choices here. `xsv` is very popular, but I think development ended a while back; I like `csvkit` just because I like tabbing through the options you have there. I've heard good things about `miller`. Or go in the total opposite direction: use Simon Willison's excellent `csvs-to-sqlite` in conjunction with `datasette`, and then take a foray into the many interesting things you can do in SQL.
3. Bespoke text formats: `sed`, `awk`, and possibly even Vim macros reign supreme here, along with the rest of the "standard" Unix text kit. The big benefit of introducing these last is that they work as a superset of many of the previous tools, for added flexibility.
Jq has one of the worst, least intuitive, least self-evident syntaxes ever devised on planet Earth. Bash's if constructs are a walk in the park compared to general jq syntax. And people try to sort out that mess... somehow people always want to climb a mountain when it's in their way, or when someone says it would be an achievement of some sort...
I found jq difficult to use, which is why Oj (https://github.com/ohler55/ojg) is based on JSONPath. There are still a lot of options, but it only takes a couple of help screens to figure out what they are.
Reading through the comments here: some praise jq, others claim it is not possible to actually remember the syntax. There seems to be a consensus that it recalls the complexity of awk, bash, sed... While I appreciate the magic behind jq, both intellectually and as a tool, it is indeed impossible for me to remember a reasonable part of it.
Interestingly, I still remember most Perl 5 syntax, even the crazy stuff, quite vividly, after some 6-7 years of not writing Perl code. I wonder why - perhaps because Perl is not so complex (even the PCRE), and perhaps because one needs jq only now and then, while Perl can be a primary tool for many things. Sadly, Perl is past its prime now, and there is no indication it'll ever make a comeback.
One thing that has helped me write simple/intermediate jq code is this: Imagine what the context is for your filter. Most importantly, update that context at each pipe character '|'.
On an empty command, the context is the top-level of your JSON. As you add filter stages, that context evolves.
(this really requires more explanation and diagrams than I have room for in this margin)
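A small made-up example of tracking the context across pipes (jq allows `#` comments inside the filter):

```shell
echo '{"users":[{"name":"ann","age":34}]}' \
  | jq '
    .users   # context here: the array of users
    | .[]    # context: one user object at a time
    | .name  # context: the name string
  '
# -> "ann"
```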
At some point in every declarative language’s life, so many features get bolted on to make it useful that it loses its declarative nature, at which point you might as well just use a more standard imperative language. In this case, just plain JS.
racl101 | 2 years ago
I will never truly memorize how to use this because it's not my primary goal, nor is it the end product to process data.
Rather, it is a means to a means to a means to an end.
nonlogical | 2 years ago
https://github.com/wader/jqjq
It really is a super cool little, super expressive nearly (if not entirely) turing complete pure functional programming language.
You can:
* Define your own functions and libraries of functions
* Do light statistics
* Drastically reshape JSON data
* Create data indexes as part of your JQ scripts and summarize things
* Take JSON data, mangle it into TSV and pipe into SQLite
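As a taste of the first two bullets, here is a toy sketch with a user-defined `mean` function (input made up):

```shell
# define a helper function, then use it for light statistics
echo '[{"n":2},{"n":4},{"n":9}]' | jq -c '
  def mean: add / length;
  {count: length, mean: (map(.n) | mean)}
'
# -> {"count":3,"mean":5}
```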
And for prototyping, you can use it to tailor the output of APIs to what you need in a pinch, using JQ as a library, especially with something like Python: https://pypi.org/project/jq/
As a part of the library you can compile your expressions down to "byte-code" once and reuse them.
Saying JQ is the best kept secret is an understatement. JQ gets more amazing the deeper you dig into it. Also, it is kind of crazy fast for what it is.
edit: Formatting fixes
reegnz | 2 years ago
As for learning it, it's the same as with any tool: the key is repetition and regular use.
Calzifer | 2 years ago
And when using the raw-output option it helps with the ambiguity between "null" and null.
reegnz | 2 years ago
I find the biggest problem with jq is that the feedback loop is not tight enough. With this jq-repl the expression is evaluated at every keystroke.
rustyminnow | 2 years ago
Let me piggyback to mention the (neo)vim plugin I use for tightening the loop... https://github.com/phelipetls/vim-jqplay
It's great for building large complex queries that will eventually live in scripts, but your zsh plugin seems to hit a real sweet spot of fast feedback for ad-hoc queries too! Huge props!
tejtm | 2 years ago
To that end I wrote a line of jq to emit every structural path from any JSON as a list of jq arguments.
You can use it to make queries or keep track of a documents structure.
https://github.com/TomConlin/json_to_paths
zwischenzug | 2 years ago
https://zwischenzugs.com/2023/06/27/learn-jq-the-hard-way-pa...
JQ really is the best kept secret in data.
jve | 2 years ago
> Aioli is a library for running genomics command-line tools in the browser using WebAssembly. See Who uses biowasm for example use cases.
https://biowasm.com/cdn/v3/jq/1.6
jimmySixDOF | 2 years ago
https://github.com/jqlang/jq/releases/tag/jq-1.7
brap | 2 years ago
The same thing happened to HCL.
benatkin | 2 years ago
I looked at aioli briefly. I didn't see how to reproduce the build.