Show HN: Catj – A new way to display JSON files

[+] jolmg|6 years ago|reply

I was curious if this could be doable with jq, and apparently it is:

  jq -j '
    [
      [
        paths(scalars)
        | map(
          if type == "number"
          then "[" + tostring + "]"
          else "." + .
          end
        ) | join("")
      ],
      [
        .. | select(scalars) | @json
      ]
    ]
    | transpose
    | map(join(" = ") + "\n")
    | join("") 
  '

EDIT: Got the string quoting and escaping.

EDIT 2: For those who want to save this script, you can put just the jq code in an executable file with the shebang:

  #!/usr/bin/jq -jf
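
For readers without jq at hand, roughly the same flattening can be sketched in Python. This is only an approximation of the filter above, not a port of catj; the name `flatten` and the sample document are made up for illustration, and empty objects or arrays produce no output here.

```python
import json


def flatten(value, path=""):
    """Yield (path, scalar) pairs for every leaf of a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield path, value


# Hypothetical sample input, loosely modelled on the movie example.
doc = json.loads(
    '{"movie": {"name": "Interstellar", "cast": ["Anne Hathaway", "Bill Irwin"]}}'
)

for path, leaf in flatten(doc):
    # json.dumps plays the role of @json: it quotes and escapes strings.
    print(f"{path} = {json.dumps(leaf)}")
```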

[+] etxm|6 years ago|reply

Holy moly, your jq skills are savage.

[+] pstuart|6 years ago|reply

I was expecting some mention of jq for this post, and you did not disappoint. Thank you for the script -- it works great and I'm adding it to my collection.

[+] phiresky|6 years ago|reply

Would probably be shorter if you used `tostream`

[+] alinspired|6 years ago|reply

Following https://github.com/stedolan/jq/issues/243 I commonly use https://github.com/joelpurra/har-dulcify/blob/master/src/uti... to explore unfamiliar JSON, e.g.:

  $ docker inspect 620f55df9177| structure.sh |grep -i addr
   .[].NetworkSettings.GlobalIPv6Address
   .[].NetworkSettings.IPAddress
   .[].NetworkSettings.LinkLocalIPv6Address
   .[].NetworkSettings.MacAddress
   .[].NetworkSettings.Networks.bridge.GlobalIPv6Address
   .[].NetworkSettings.Networks.bridge.IPAddress
   .[].NetworkSettings.Networks.bridge.MacAddress
  
  $ docker inspect 620f55df9177| jq .[].NetworkSettings.IPAddress
   "192.168.0.2"

[+] soheilpro|6 years ago|reply

That's awesome work. The only problem is that it does not properly handle keys which are not valid JS identifiers (like 1foo, @foo, foo-bar, etc.).
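
One possible way to handle such keys is to fall back to bracket notation whenever a key does not look like an identifier. The sketch below assumes a simplified, ASCII-only identifier test (real JS identifiers allow more characters), and `accessor` is a hypothetical helper name, not part of the script above.

```python
import json
import re

# Simplified identifier test: ASCII letters, digits, "_" and "$",
# not starting with a digit. This is an assumption, not the full JS rule.
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def accessor(key):
    """Return `.key` for identifier-like keys, otherwise `["key"]`."""
    if IDENTIFIER.fullmatch(key):
        return "." + key
    # json.dumps quotes and escapes the key like a JSON string.
    return "[" + json.dumps(key) + "]"


for key in ["foo", "1foo", "@foo", "foo-bar"]:
    print(accessor(key))
```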

[+] kfrzcode|6 years ago|reply

  jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Unix shell quoting issues?) at <top-level>, line 3:
  jq -j '      
  jq: 1 compile error

This, uh, doesn't work for me on jq-1.5.1.

[+] snek|6 years ago|reply

IIRC passing options in a hashbang isn't POSIX compatible

[+] avidal|6 years ago|reply

Have you seen gron[0]? It's similar: it flattens JSON documents to make them easily greppable. But it can also revert (i.e., ungron), so you can pipe JSON to gron, grep -v to remove some data, then ungron to get it back to JSON.

[0] https://github.com/tomnomnom/gron
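
The flatten, filter, rebuild round trip can be sketched in a few lines of Python. This is not how gron itself works internally; `flatten` and `unflatten` are hypothetical names, paths are kept as tuples rather than text, and gaps left in arrays by filtering are simply closed up, which is a simplifying assumption.

```python
def flatten(value, path=()):
    """Yield (path, leaf) pairs; empty containers count as leaves."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            yield from flatten(child, path + (key,))
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from flatten(child, path + (index,))
    else:
        yield path, value


def unflatten(pairs):
    """Rebuild nested data from (path, leaf) pairs with non-empty paths."""
    root = {}
    for path, leaf in pairs:
        node = root
        for step in path[:-1]:
            node = node.setdefault(step, {})
        node[path[-1]] = leaf
    return _listify(root)


def _listify(node):
    """Turn dicts keyed only by integers back into lists, in index order."""
    if not isinstance(node, dict):
        return node
    if node and all(isinstance(key, int) for key in node):
        return [_listify(node[key]) for key in sorted(node)]
    return {key: _listify(child) for key, child in node.items()}


doc = {"menu": {"id": "file", "items": [{"value": "Open"}, {"value": "Close"}]}}

# Equivalent in spirit to `grep -v Close` between the two steps.
kept = [(path, leaf) for path, leaf in flatten(doc) if leaf != "Close"]
print(unflatten(kept))
```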

[+] Fnoord|6 years ago|reply

Actively developed and written in Go.

The project linked to is from 2014, with the last update in 2015, and it is on NPM...

What is left to say? Thank you!

[+] wooby|6 years ago|reply

I haven't seen that one, but it looks very similar to the one I use: https://github.com/micha/json-table

[+] krick|6 years ago|reply

That's actually a nice lifehack. Much simpler than jq. Unfortunately, it would be harder to express all the kinds of logical conditions that jq allows (even if not that intuitively).

It still feels like there must be something in between, some way to query JSON more naturally than with jq, yet with enough power.

[+] inferiorhuman|6 years ago|reply

What does grep + gron give you over jq?

[+] news_hacker|6 years ago|reply

Love the greppability and reconstructability. This should be the submission.

[+] twp|6 years ago|reply

I wrote a similar tool:

https://github.com/twpayne/flatjson

The flat format is great for diffs:

  --- testdata/a.json
  +++ testdata/b.json
  @@ -1,5 +1,6 @@
   root = {};
   root.menu = {};
  +root.menu.disabled = true;
   root.menu.id = "file";
   root.menu.popup = {};
   root.menu.popup.menuitem = [];
  @@ -9,8 +10,5 @@
   root.menu.popup.menuitem[1] = {};
   root.menu.popup.menuitem[1].onclick = "OpenDoc()";
   root.menu.popup.menuitem[1].value = "Open";
  -root.menu.popup.menuitem[2] = {};
  -root.menu.popup.menuitem[2].onclick = "CloseDoc()";
  -root.menu.popup.menuitem[2].value = "Close";
  -root.menu.value = "File";
  +root.menu.value = "File menu";

[+] chrismorgan|6 years ago|reply

There is actually a standard around writing paths into JSON objects: JSON Pointer, https://tools.ietf.org/html/rfc6901. It’s straightforward, and avoids ambiguity between separator and key name by simple replacement, e.g. `/foo/bar~0baz~1quux` looks up a key named "foo", then a key named "bar~baz/quux" inside it. It’s not particularly widely used, but I’ve come across it in a few places over the years (it’s not a common thing to need to do), and probably most recently JMAP uses it for backreferences.

(I haven’t run it, but a skim of the code suggests that this tool will turn `{"foo.bar": "baz", "foo": {"bar": "baz"}}` into `["foo.bar"] = "baz"` and `.foo.bar = "baz"`, resolving the separator ambiguity in a pretty JavaScripty way.)
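
To make the escaping rule concrete, here is a minimal sketch of JSON Pointer resolution in Python, assuming the document has already been parsed with `json.loads`. The function name `resolve` and the sample data are invented for illustration, and the special `-` array token from the RFC is not handled.

```python
def resolve(document, pointer):
    """Follow an RFC 6901 JSON Pointer through parsed JSON data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    node = document
    for token in pointer[1:].split("/"):
        # Decode ~1 before ~0, so that "~01" becomes "~1" and not "/".
        token = token.replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(token)]
        else:
            node = node[token]
    return node


doc = {"foo": {"bar~baz/quux": 42, "list": ["a", "b"]}}
print(resolve(doc, "/foo/bar~0baz~1quux"))
print(resolve(doc, "/foo/list/1"))
```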

[+] yason|6 years ago|reply

When XML became popular, I think the first thing I wrote was a small Python program to flatten/unflatten XML into per-line entries quite similar to the example output in the article.

The text streams that are processed line-by-line by dozens or hundreds of line-based tools are immensely powerful and universal. It's all Unix heritage and often overlooked by fancy modern designs that more often follow a fashion rather than root themselves in substance.

Surely text streams have their share of limitations like everything else, but in practice you can retrofit nearly anything into line-based text streams and get an immediate productivity multiplier by being able to apply a whole array of established tools to process that data. Proof of that power is that it has been worthwhile to write converters between text and other formats. Not only can you find translators to turn various hierarchical or object-oriented formats into text, but you can even convert a PNG into text and back (with SNG).

Text streams are like roads with lanes. They're ages old, they're pretty good at separating and guiding traffic, and they're somehow suboptimal in several senses yet rarely can anyone point out a single, clear practical improvement on laned roads, not to mention a system for containing traffic flows that is superior to them.

[+] emmelaich|6 years ago|reply

Augeas can do something similar too, and not only for JSON but also for XML and 200+ other config file formats.[0]

  $ augtool -r . -L --transform 'JSON.lns incl /catj-eg.json'  <<< 'print /files/catj-eg.json'
  /files/catj-eg.json
  /files/catj-eg.json/dict
  /files/catj-eg.json/dict/entry = "movie"
  /files/catj-eg.json/dict/entry/dict
  /files/catj-eg.json/dict/entry/dict/entry[1] = "name"
  /files/catj-eg.json/dict/entry/dict/entry[1]/string = "Interstellar"
  /files/catj-eg.json/dict/entry/dict/entry[2] = "year"
  /files/catj-eg.json/dict/entry/dict/entry[2]/number = "2014"
  /files/catj-eg.json/dict/entry/dict/entry[3] = "is_released"
  /files/catj-eg.json/dict/entry/dict/entry[3]/const = "true"
  /files/catj-eg.json/dict/entry/dict/entry[4] = "director"
  /files/catj-eg.json/dict/entry/dict/entry[4]/string = "Christopher Nolan"
  /files/catj-eg.json/dict/entry/dict/entry[5] = "cast"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[1] = "Matthew McConaughey"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[2] = "Anne Hathaway"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[3] = "Jessica Chastain"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[4] = "Bill Irwin"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[5] = "Ellen Burstyn"
  /files/catj-eg.json/dict/entry/dict/entry[5]/array/string[6] = "Michael Caine"

[0]

  $ ls  .../share/augeas/lenses/dist/|wc
        221     221    2867

[+] tedivm|6 years ago|reply

For exploring large files I released a program called "JSONSmash" that runs a shell which lets you browse the data as if it was a filesystem.

https://blog.tedivm.com/open-source/2017/05/introducing-json...

[+] rahimnathwani|6 years ago|reply

This looks cool! I'm curious: did you consider using FUSE for this, instead of taking a shell? Using FUSE might make installation harder for less technical users, but the upside is that they could use their choice of file manager (shell commands, curses-based or GUI).

As you've already implemented most relevant commands (cd, ls, cat), it would probably be easy to make a FUSE version using fs-fuse / fuse-bindings

[+] ficklepickle|6 years ago|reply

What an interesting idea! On my last contract, I had to deal with 100 MB+ JSON responses from a barely-documented API. This would have come in handy when I was figuring it out.

I've used JSONExplorer for this purpose, but it is web based and doesn't handle files this large.

Extending the filesystem metaphor to JSON data and re-using the same commands strikes me as a great idea.

Did another project inspire you, or did you come up with the concept yourself?

Have you done a Show HN yet?

[+] hokus|6 years ago|reply

This is something between a joke and a thought experiment.

    function cason(x) {
      switch (x[0]) {
        case "movie": switch (x[1]) {
          case "name"       : return "Interstellar";
          case "year"       : return 2014;
          case "is_released": return true;
          case "director"   : return "Christopher Nolan";
          case "cast": switch (x[2]) {
            case 0: return "Matthew McConaughey";
            case 1: return "Anne Hathaway";
            case 2: return "Jessica Chastain";
            case 3: return "Bill Irwin";
            case 4: return "Ellen Burstyn";
            case 5: return "Michael Caine";
          }
        }
      }
    }

[+] geofft|6 years ago|reply

Well, that's kind of the inverse of writing switch statements like

    def license(kernel):
        return {"Linux": "GPL",
                "FreeBSD": "BSD",
                "NT": "Proprietary"}[kernel]

[+] kara_jade|6 years ago|reply

You can do this easily with the json_tree function from SQLite's JSON1 extension. It's given as an example in the documentation:

https://sqlite.org/json1.html#jtree

  SELECT big.rowid, fullkey, value
    FROM big, json_tree(big.json)
   WHERE json_tree.type NOT IN ('object','array');

[+] ComputerGuru|6 years ago|reply

And with the fileio sqlite extension, you can even directly query files (and their contents, and directories recursively too, no less) from SQL.

[+] nfoz|6 years ago|reply

That's nice, I like that.

Related: if you want something more CSV-style, see JSON Lines, a.k.a. "newline-delimited JSON".

http://jsonlines.org/
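
As a small illustration of why the format suits line-based tools, each line can be parsed on its own. The sketch below uses only the standard library; `read_json_lines` is a hypothetical helper and the sample rows are made up.

```python
import json


def read_json_lines(text):
    """Parse newline-delimited JSON: one independent JSON value per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Made-up sample: a header row followed by data rows, CSV-style.
sample = (
    '["Name", "Session", "Score"]\n'
    '["Gilbert", "2013", 24]\n'
    '["Alexa", "2013", 29]\n'
)

header, *rows = read_json_lines(sample)
for row in rows:
    print(dict(zip(header, row)))
```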

[+] emmelaich|6 years ago|reply

I don't see what jsonlines has over yaml; in fact the jsonlines examples presented there are almost trivially converted to yaml. e.g. the first example is valid yaml if you add a '- ' to each line.

And JSON is (almost) a perfect subset of yaml.

I've been using CSV lately. Its bad reputation is overstated.

What I like is that it's far more compact than yaml or json and trivially pulled into sqlite for ad-hoc queries.

[+] limsup|6 years ago|reply

Try it with deno:

deno install catj https://deno.land/std/examples/catjson.ts --allow-read

[+] ravinizme|6 years ago|reply

Similar to python-jsonpipe (8 years ago).

https://github.com/zacharyvoase/jsonpipe

It includes 'jsonunpipe'.

So you could grep part of the JSON and still get a JSON back.

  $ echo '{"a": 1, "b": 2}' | jsonpipe | grep b | jsonunpipe
   {"b": 2}

[+] bborud|6 years ago|reply

Oh no. This looks like the horrible config format a colleague of mine invented at Yahoo when making an absolutely horrific config system. This brings back bad memories.

This may look cute but it is horrific when dealing with large configs and you have to reconstruct all the structure in your head.

Also, when you have a format that nests using brackets, braces and parentheses, you can get help from the editor. This format does not give you that.

I'm not a huge fan of JSON (and the above mentioned format was invented because none of us were fans of XML at the time), but it turns out that both XML and JSON are actually easier to work with in practice than this format. Not least because there is ample tooling for JSON (and XML).

The lesson I learnt: I may hate XML (or in this case JSON), but finding an alternative that is better is not easy.

[+] geofft|6 years ago|reply

I think the idea is not that this is for storage or editing, but just for querying - you should keep your data in JSON, but if you want to find where something is, do `catj foo.json | grep something` and it'll tell you all the paths in the document where you can find the string "something". The intent is not to open catj format in a text editor, or to use the output of catj for anything other than ad-hoc purposes.

[+] flying_sheep|6 years ago|reply

https://github.com/stedolan/jq/issues/243

jq can also do the same thing, with more flexibility. And it is possible to combine it with a bash alias to make it indistinguishable from catj.

[+] jolmg|6 years ago|reply

> combine with bash alias

... or you know, you could put the jq script in an executable file and add a shebang like

  #!/usr/bin/jq -jf

or

  #!/usr/bin/jq -rf

In my opinion, aliases should mostly be used to add default options only. Not really to insert whole scripts into them.

[+] sdegutis|6 years ago|reply

This is really similar to the format that AWS uses to represent recursive structures (arrays, maps) as a single array of key-value pairs for their APIs. Your catj could potentially be used to create that if ever working with the raw API directly instead of an SDK.

[+] QuadrupleA|6 years ago|reply

Please don't write JSON like the example - it's like putting 4 files in a hierarchy of 20 folders. Way over-structured.

That said, if you're stuck dealing with bad JSON like this with low signal to noise this is a decent way to redisplay it.

[+] danschumann|6 years ago|reply

Okay I made one with javascript ( go to https://underscorejs.org/ and open console )

EDIT: how do I markup code on HN?

[+] ComputerGuru|6 years ago|reply

> how do I markup code

Indent each line with four spaces. Please please keep line length very short (under forty?) as HN's pre tags are absolutely not mobile friendly and can trash the entire page.

Edit: actually it seems to at least scroll within the comment div on overflow now, that's a huge improvement!

[+] not_kurt_godel|6 years ago|reply

Hm, essentially JSON->Properties file. Cool. I wrote a little script the other day and decided Properties format was a pleasant way to define the config, maybe this could dovetail with similar future endeavors.

[+] blablabla123|6 years ago|reply

That's really smart; in fact this was, maybe until now, the only reason for me to resort to CSV. (With pandas it's, by the way, really easy to flatten JSON into CSV.) JSON is such a nice format, but the tools are really not there yet. I guess this should then make it possible to combine it with line-based tools like head, tail, sort, uniq etc.

[+] venthur|6 years ago|reply

Looks a lot like ye olde gron: https://github.com/tomnomnom/gron

Here's a Python implementation of gron https://github.com/venthur/python-gron

120 comments