
Ask HN: What do you use for basic data analysis, visualization, and graphing?

47 points| transitivebs | 3 years ago | reply

I often find myself with some JSON data that I want to visualize. I usually end up converting to CSV, uploading to a Google Sheet, and manually creating charts. BUT this is really time consuming and I find Google Sheets charts pretty difficult and painful for quickly exploring different views of the original data.

So what do you use for this type of thing?

I know python has lots of good utils for data wrangling & graphing, but I'd prefer a solution which is: no-code, gives me a bunch of common graph views I can quickly choose between, and that "just works" 99% of the time.

Thanks!

54 comments

[+] n8henrie|3 years ago|reply
Consider adding "no code" to the title somewhere to save a click for the hundreds of people that are planning to cheerfully suggest jupyter + pandas / matplotlib / Altair / seaborn / R / etc etc.
[+] transitivebs|3 years ago|reply
Good call; I realized that after the fact and unfortunately can't edit the title anymore.
[+] wanderingmind|3 years ago|reply
You can use Superset[0]. It's a Flask app that can connect to databases, read CSV and JSON, and create good plots

[0] https://superset.apache.org/

[+] transitivebs|3 years ago|reply
Looks very promising.

I think I'm looking for the AI-powered equivalent of this that's one level of abstraction higher. Apache projects are obviously super high quality, but I want to offload the cognitive load of thinking about the graph specifics to an ML algo that "just works" for the majority of use cases (and is tweakable after the fact).

[+] flat-pluto|3 years ago|reply
I know you said you wanted a no-code solution, but in case you don't get a satisfactory answer, try this out.

Earlier today there was a Show HN post[1] which showed how to visualize a Pandas dataframe (can come from CSV, JSON, whatever). I tried it for basic tasks and it is pretty good. It's minimal code (<5 lines) - just reading the JSON and calling pygwalker, e.g. in a Google Colab environment[2]. Something like this:

    import pandas as pd
    import pygwalker as pyg  # pip install pygwalker

    df = pd.read_json('(unknown).json')  # pd.read_csv works the same way for CSV
    gwalker = pyg.walk(df)  # opens an interactive drag-and-drop exploration UI
Should be decent for most basic use-cases.

[1] - https://news.ycombinator.com/item?id=34869244

[2] - https://colab.research.google.com/

[+] ghiculescu|3 years ago|reply
If you can connect it to the data source, Metabase is exactly what you want. https://www.metabase.com/
[+] epgui|3 years ago|reply
Personally I find Metabase lacks flexibility, but then again I think that’s just the predicament you’re stuck with in no-code/low-code.
[+] ReDeiPirati|3 years ago|reply
This! As long as you can connect the data source, you can get quite far with this tool. It's probably the one that has the quickest learning curve.
[+] djbusby|3 years ago|reply
I love metabase. I'd also suggest to take a look at Apache Superset.
[+] MilStdJunkie|3 years ago|reply
Not sure if this is "no code" exactly, but I use these from inside of Asciidoc every day, and I haven't seen them mentioned here yet. Asciidoc directives inside a graph block are processed before the graph, so you can get conditional graphs in the output, or use the include directive to fetch outside graphics.

== Vega and Vega-Lite

Site:: https://vega.github.io/vega/

Sandbox:: https://vega.github.io/editor/#/examples/vega/airport-connec...

== PlantUML

Sandbox:: https://plantuml-editor.kkeisuke.dev/

Language Specs:: https://plantuml.com/sitemap-language-specification

[+] rvrst|3 years ago|reply
Marple[0] is a pretty awesome tool for quick visualization and browsing through data.

[0] https://www.marpledata.com/

[+] transitivebs|3 years ago|reply
Looks very promising.

They seem to imply it's only for time-series data, but I like their marketing & UX so far. So many of the projects people link to are probably awesome, but if you don't nail the UX / DX, people bounce really quickly.

Thanks!

[+] employee42|3 years ago|reply
Can't believe someone hasn't suggested Grafana[0] yet. It sounds perfect for your needs (although there is some coding required to make the queries).

[0] https://grafana.com/

[+] aeontech|3 years ago|reply
Datasette seems like it might be a good fit?

https://datasette.io/

[+] cldellow|3 years ago|reply
OP said they "find Google Sheets charts pretty difficult and painful for quickly exploring different views of the original data"

And that they want "a bunch of common graph views I can quickly choose between"

That doesn't sound like Datasette to me, although I'd be happy to be wrong -- how would you recommend someone achieve this in Datasette?

[+] __mharrison__|3 years ago|reply
I use pandas. I'm pretty biased, but I generally prefer to create things programmatically rather than with drag-and-drop tooling, especially if I'll need to do it again in the future.

(I just made a course covering visualization w/ Pandas, Seaborn, Excel, Tableau, and a few others. My takeaway is that unless your data is good, you will need some preprocessing. Also, making good visualizations and tweaking them is difficult with both code and no-code tooling. You need to figure out how to do the last 20% of things (if you are even able to) in either one.)
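Not from the course, but a minimal sketch of the programmatic workflow described above, using made-up sample data (pandas' `.plot` accessor wraps matplotlib):

```python
import pandas as pd

# Hypothetical sample data standing in for the OP's JSON.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar"],
    "visits": [120, 180, 150],
})

# One line per chart type: .plot.bar, .plot.line, .plot.scatter, etc.
ax = df.plot.bar(x="month", y="visits", title="Monthly visits")
```

The upside of this over drag-and-drop: the preprocessing and the chart live in the same script, so rerunning it on next month's data is free.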

[+] gooseyman|3 years ago|reply
What is the name of your new course? I’m struggling to determine if it’s effective pandas or one of the others.
[+] b_mc2|3 years ago|reply
If I'm not trying to build a very specific graph or chart and am just exploring data, I usually use either Rawgraphs or Sqliteviz. Rawgraphs is nice if you just want to swap visualizations out with smaller data as-is; Sqliteviz seems to handle much larger datasets and lets you use SQL if you want to change the result set. Both seem to keep data local, and I know Sqliteviz works offline; Rawgraphs might too.

https://www.rawgraphs.io/

https://sqliteviz.com/

[+] ellisv|3 years ago|reply
I use R and ggplot2 for most plotting/visualization but I’d recommend something like Apache Superset if you want no-code (although setup is still required).
[+] shubhamjain|3 years ago|reply
Shameless plug: This is exactly the problem that I'm trying to solve with my app, TextQuery [1]. Creating even a basic graph means dealing with multiple tools. I wanted to create a simple app where you can import all common types of data, run SQL over it, and visualize it quickly.

[1]: https://textquery.app/

[+] transitivebs|3 years ago|reply
Awesome; glad I helped validate the pain point && always happy to try and help fellow indie hackers.

btw adding a screenshot to the home page of the goal UX would help 1000x even if it's just a design mockup.

For my use case, I'd want to drag & drop a JSON file.

Thanks!

[+] iamcreasy|3 years ago|reply
You can write a Trino REST client for a JSON endpoint, which will allow you to query the endpoint using SQL; the resulting table can then be pushed to any number of applications for visualization.
[+] acomjean|3 years ago|reply
RStudio is a pretty nice user interface to the R language. Datasets can be browsed like spreadsheet tables, and ggplot2 is a great graphing tool. It's used a lot in science by non-programmers.

There's a pretty decent online book too: R for Data Science

https://r4ds.had.co.nz/

[+] ies7|3 years ago|reply
Since you're already using Google Sheets, the fastest no-code dashboard available should be Google Data Studio.
[+] transitivebs|3 years ago|reply
Ahhh; didn't know about this. I guess it's called Looker Studio now? Will give it a try. Thanks!
[+] geophph|3 years ago|reply
I might suggest pandas + Plotly Express. Not no-code, and dependent on your data structure, but if you can get it into a tidy data frame, Plotly Express will let you easily switch between different chart types and styles from there.