
Gron – Make JSON Greppable

366 points | capableweb | 5 years ago | github.com

91 comments

[+] theshrike79|5 years ago|reply
From the FAQ, before someone asks the obvious:

Why shouldn't I just use jq?

jq is awesome, and a lot more powerful than gron, but with that power comes complexity. gron aims to make it easier to use the tools you already know, like grep and sed.

gron's primary purpose is to make it easy to find the path to a value in a deeply nested JSON blob when you don't already know the structure; much of jq's power is unlocked only once you know that structure.
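The idea is easy to see in miniature. Here is a toy Python sketch of gron-style flattening (an illustration only, not gron itself, and without gron's escaping rules for awkward keys):

```python
import json

def flatten(value, path="json"):
    """Emit gron-style 'path = value;' lines for a parsed JSON value."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield f"{path} = {json.dumps(value)};"

doc = json.loads('{"commit": {"author": {"name": "Tom Hudson"}}}')
lines = list(flatten(doc))
# 'json.commit.author.name = "Tom Hudson";' is now one greppable line
```

Each emitted line is itself a valid path expression, which is what makes grepping the output immediately useful for discovery.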

[+] sph|5 years ago|reply
In simpler words, as a user: the jq query language [1] is obtuse, obscure, and incredibly hard to learn if you only need it for quick one-liners once in a blue moon. I've tried, believe me, but I should probably spend that much effort learning Chinese instead.

It's just operating at the wrong abstraction level, whereas gron is orders of magnitude easier to understand and _explore_.

1: https://stedolan.github.io/jq/manual/

[+] rwiggins|5 years ago|reply
I regularly use jq to summarize the structure of big JSON blobs, using the snippet written here (I alias it to "jq-structure"): https://github.com/stedolan/jq/issues/243#issuecomment-48470...

For example, against the public AWS IP address JSON document, it produces an output like

    $ curl -s 'https://ip-ranges.amazonaws.com/ip-ranges.json' | jq -r '[path(..)|map(if type=="number" then "[]" else tostring end)|join(".")|split(".[]")|join("[]")]|unique|map("."+.)|.[]'
    .
    .createDate
    .ipv6_prefixes
    .ipv6_prefixes[]
    .ipv6_prefixes[].ipv6_prefix
    .ipv6_prefixes[].network_border_group
    .ipv6_prefixes[].region
    .ipv6_prefixes[].service
    .prefixes
    .prefixes[]
    .prefixes[].ip_prefix
    .prefixes[].network_border_group
    .prefixes[].region
    .prefixes[].service
    .syncToken
This plus some copy/paste has worked pretty well for me.
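The same kind of structure summary can be approximated in a few lines of Python (a rough sketch of the jq one-liner's output, minus the root "." entry it emits):

```python
import json

def paths(value, prefix=""):
    """Yield every distinct path template, using [] for array indices."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield f"{prefix}.{key}"
            yield from paths(child, f"{prefix}.{key}")
    elif isinstance(value, list):
        yield f"{prefix}[]"
        for child in value:
            yield from paths(child, f"{prefix}[]")

doc = json.loads('{"prefixes": [{"region": "us-east-1"}], "syncToken": "1"}')
structure = sorted(set(paths(doc)))
# ['.prefixes', '.prefixes[]', '.prefixes[].region', '.syncToken']
```

Deduplicating with a set is what collapses every array element down to one `[]` template line.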
[+] jeffbee|5 years ago|reply
It feels like finding a deeply nested key in a structured document is a job for XPath. Most people, myself included until recently, overlook that XPath 3.1 operates on JSON.
[+] StavrosK|5 years ago|reply
Can jq do what gron does?
[+] vhanda|5 years ago|reply
I love that this handles big numbers without modifying them unlike `jq` - https://github.com/stedolan/jq/issues/2182

   $ echo "{\"a\": 13911860366432393}" | jq "."
   {
     "a": 13911860366432392
   }

   $ echo "{\"a\": 13911860366432393}" | gron | gron -u
   {
     "a": 13911860366432393
   }
I can now happily uninstall `jq`. I've been burned by it way too many times.
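The loss comes from parsing every number as an IEEE-754 double; integers above 2^53 are not all representable, so the parser lands on the nearest representable neighbour. A quick Python check of the same value (Python's own json module keeps integers exact, as gron does):

```python
import json

n = 13911860366432393  # odd, and larger than 2**53

# A float round-trip cannot represent this integer exactly and lands
# on the nearest representable double, which is what old jq printed:
assert int(float(n)) == 13911860366432392

# Python's json parser keeps integers exact, like gron:
assert json.loads('{"a": 13911860366432393}')["a"] == n
```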
[+] detaro|5 years ago|reply
ouch, I did not know that! thanks for the warning, need to check if my installed version has the fix already.
[+] jrockway|5 years ago|reply
I think you should just invest the hour or so it takes to learn jq. Yes, it's far from a programming-language design marvel. But it covers all of the edge cases, and once you learn it, you can be very productive. (The strategy of "copy-paste a one-liner from Stack Overflow the one time a year I need it" isn't going to work, though.)

I think structured data is so common now that you have to invest in learning tools for processing it. Personally, I invested the time once, and it saves me every single day. In the past, I would have a question like "which port is this Pod listening on", and write something like "kubectl get pod foo -o yaml | grep port -A 3". Usually you get your answer after manually reading through the false positives. But with "jq", you can just drive directly to the correct answer: "kubectl get pod foo -o json | jq '.spec.containers[].ports'"

Maybe it's kind of obtuse, but it's worth your time, I promise.

[+] barrkel|5 years ago|reply
How about a tool that outputs nicely formatted JSON, with every line annotated with a jq expression to access that value?

But then I squint a little bit at the default gron (not ungron) output, and that's actually what I see.

[+] monsieurbanana|5 years ago|reply
But how do you get that '.spec.containers[].ports'?

It seems to me that for your example use case, gron is at least useful to first understand the json structure before making your jq request. And, for simple use cases like this one, enough to replace jq altogether.

[+] bradknowles|5 years ago|reply
With respect, it might have been just an hour for you.

For me, it was hours and hours and hours and days and days wasted with jq, before I found gron.

Not looking back.

[+] inshadows|5 years ago|reply
I don't find this grep-able at all:

    json[0].commit.author.name = "Tom Hudson";
Now I need to escape brackets and dots in regex. Genius!

I have a 5-line (!) jq script that produces this:

    json_0_commit_author_name='Tom Hudson'
This is what I call grep-able. It's also eval-able.

> What if there's a JSON object with both commit and json_commit?

Then I'll use jq to filter it appropriately or change delimiter. The point is ease of use for grep and shell.
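The commenter's jq script isn't shown; here is a rough Python equivalent of the lossy, eval-able flattening they describe (function and variable names are hypothetical, and values containing quotes would need extra escaping for real shell use):

```python
import json

def shellify(value, path="json"):
    """Flatten JSON into shell-assignable NAME='value' lines (lossy)."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from shellify(child, f"{path}_{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from shellify(child, f"{path}_{index}")
    else:
        yield f"{path}='{value}'"

doc = json.loads('[{"commit": {"author": {"name": "Tom Hudson"}}}]')
lines = list(shellify(doc))
# ["json_0_commit_author_name='Tom Hudson'"]
```

The lossiness is visible in the code: object keys, array indices, and underscores inside keys all collapse into the same `_` separator.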

[+] TomNomNom|5 years ago|reply
I think this is a good point. It's definitely hard to grep for certain parts of gron's output, especially where arrays are involved, because of the square brackets. I find that using fgrep/grep -F can help in situations where you don't need regular expressions, though.

It's not an ideal output format for sure, but it does meet some criteria that I considered to be desirable.

Firstly: it's unambiguous. While your suggested format is easier to grep, it is also lossy as you mention. One of my goals with gron was to make the process reversible (i.e. with gron -u), which would not be possible with such a lossy format.

Secondly: it's valid JavaScript. Perhaps that's a minor thing, but it means that the statements are eval-able in either Node.js or in a browser. It's a fairly small thing, but it is something I've used on a few occasions. Using JavaScript syntax also means I didn't need to invent new rules for how things should be done, I could just follow a subset of existing rules.

FWIW, personally I'm usually using gron to help gain a better understanding of the structure of an object; often trying to find where a piece of known data exists, which means grepping for the value rather than the key/path - avoiding many of the problems you mention.

Thanks for your input :) I'd like to see your jq script to help me learn some more about jq!

[+] hiperlink|5 years ago|reply
With `fgrep` and quoting in 's, you don't have to escape anything.
[+] armagon|5 years ago|reply
What is your jq script?
[+] GordonS|5 years ago|reply
While I like what jq lets me do, I actually find it really difficult to use. It's very rare that I manage to use it without consulting the docs, and when I try to do anything remotely complex it often takes ages to figure out.

I very much like the look of gron for the simpler stuff!

[+] mikepurvis|5 years ago|reply
"Or you could create a shell script in your $PATH named ungron or norg to affect all users"

You could also check argv[0] to see if you were called via the `ungron` name. Then it would be as simple as a symlink, which is very easy to add at install/packaging time.

(I know it's fairly broadly known, but this is the "multicall binary" pattern: https://flameeyes.blog/2009/10/19/multicall-binaries/)
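A minimal sketch of that argv[0] dispatch (hypothetical names, not gron's actual source):

```python
import os

def mode_from_argv0(argv0: str) -> str:
    """Choose behaviour based on the name the program was invoked as."""
    name = os.path.basename(argv0)
    return "ungron" if name in ("ungron", "norg") else "gron"

# With this in place, `ln -s gron ungron` at packaging time is all
# that's needed; no wrapper script per alias.
```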

[+] zvr|5 years ago|reply
Please submit this as an issue to the repo.
[+] jbverschoor|5 years ago|reply
So it basically flattens a JSON file into lines of `flattened-key = value`, which makes it easy to just grep.
[+] freedomben|5 years ago|reply
Wow, I am typically hesitant to adopt new tools into my flow. Oftentimes they either don't solve my problem all that much better, or they try to do too much.

This looks perfect. Does one thing and does it well. I will be adopting this :-)

[+] ajholyoake|5 years ago|reply
I achieve something similar with the following function in ~/.jq

  def flatten_tree: [leaf_paths as $path | {"key":$path | join("."), "value": getpath($path)}] | from_entries;
e.g.

  curl -s http://consul.service.consul:8500/v1/catalog/service/brilliant-service | jq -r 'flatten_tree'
I haven't felt the need to devise the reverse transformation yet, but it works great for grepping blobs of unknown structure
[+] georgecalm|5 years ago|reply
If you'd like to use something like this in your own APIs to let your clients filter requests or on the CLI (as is the intention with gron), consider giving "json-mask" a try (you'll need Node.js installed):

  $ echo '{"user": {"name": "Sam", "age": 40}}' | npx json-mask "user/age"
  {"user":{"age":40}}
or (from the first gron example; the results are identical)

  $ gron "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | fgrep "commit.author" | gron --ungron
  $ curl "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | npx json-mask "commit/author"
If you've ever used Google APIs' `fields=` query param you already know how to use json-mask; it's super simple:

  a,b,c - comma-separated list will select multiple fields
  a/b/c - path will select a field from its parent
  a(b,c) - sub-selection will select many fields from a parent
  a/*/c - the star * wildcard will select all items in a field
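The simplest of those forms, the `a/b/c` path, can be sketched in Python to show the semantics (a simplified illustration only; the real json-mask also handles commas, parentheses, and wildcards):

```python
import json

def mask(value, path):
    """Select one nested field by an 'a/b/c'-style path and rebuild
    the enclosing structure around it (simplified sketch)."""
    parts = path.split("/")
    selected = value
    for part in parts:          # walk down to the leaf
        selected = selected[part]
    for part in reversed(parts):  # re-wrap it in its parent keys
        selected = {part: selected}
    return selected

doc = json.loads('{"user": {"name": "Sam", "age": 40}}')
result = mask(doc, "user/age")
# {"user": {"age": 40}}
```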
[+] zomglings|5 years ago|reply
This is a fantastic idea.

I have installed gron on all my development machines.

Will probably use it heavily when working with awscli. I'm not conversant enough in the jq query language to not have to look things up when writing even somewhat complex scripts. And I don't want to learn awscli's custom query syntax. :)

Thought at first that it might be possible to replicate gron's functionality with some magic composition of jq, xargs, and grep, but that was before I understood the full awesomeness of gron: piping through grep or sed maintains gron context, so you can still ungron later.

Nice work, thank you!

[+] juancampa|5 years ago|reply
Very nice! I don't like that it can also make network requests, though. It's a potential security hole and completely unnecessary, given that we already have curl and pipes for that.
[+] earthboundkid|5 years ago|reply
I use this all the time when working with new APIs.
[+] pointernil|5 years ago|reply
1. Is there a name/"standard" for the format gron is transforming json into?

2. Thesis: jq is cumbersome when used on a json input of serious size/complexity because upfront knowledge of the structure of the json is needed to formulate correct search queries. Gron supports that "uninformed search" use-case much better. Prove me wrong ;)

[+] TomNomNom|5 years ago|reply
1. There isn't really a name for it, but it's a subset of JavaScript and the grammar is available here specified in EBNF, with some railroad diagrams to aid understanding: https://tomnomnom.github.io/gron/

2. That's pretty much exactly why I wrote the tool :)

[+] aston|5 years ago|reply
gron outputs JavaScript!
[+] jandrese|5 years ago|reply
Is this better than the old solution of `json_pp < jsonfile | grep 'pattern'`?

While that's only useful for picking out specific named keys without context, that's often good enough to get the job done. Added bonus is that json_pp and grep are usually installed by default so you don't have to install anything.