
Show HN: Transform Data Without Programming

114 points | hermitcrab | 6 years ago | easydatatransform.com

61 comments

[+] hermitcrab|6 years ago|reply
Easy Data Transform is a tool to help you quickly and easily clean, merge, dedupe and analyze table and list data, without any programming.

It is aimed at professionals who have data to transform, but aren't programmers or data science professionals.

Use cases include:

* making a list of all the people in mailing list A that are not in mailing list B

* filtering a log file

* joining two spreadsheets

* renaming, reordering and adding/deleting columns in a table

* reformatting dates

* de-duplicating a postal mailing list
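
For readers who do write code, the first and last use cases above amount to a set difference plus de-duplication. Below is a minimal Python sketch of that idea. It is not how Easy Data Transform works internally; the function names `normalize` and `only_in_a` and the sample addresses are hypothetical, and matching on trimmed, lower-cased text is an assumption about what counts as "the same" entry.

```python
def normalize(email: str) -> str:
    """Build a comparison key: trimmed and lower-cased (an assumed matching rule)."""
    return email.strip().lower()


def only_in_a(list_a: list[str], list_b: list[str]) -> list[str]:
    """Return entries of list_a that are absent from list_b, without duplicates,
    keeping the order in which they first appear in list_a."""
    seen = {normalize(e) for e in list_b}
    result = []
    for email in list_a:
        key = normalize(email)
        if key not in seen:
            seen.add(key)  # also drops later duplicates within list_a
            result.append(email.strip())
    return result


# Hypothetical sample data
list_a = ["ann@example.com", "Bob@Example.com ", "cat@example.com", "ann@example.com"]
list_b = ["bob@example.com"]
print(only_in_a(list_a, list_b))  # ['ann@example.com', 'cat@example.com']
```

Real postal lists usually need fuzzier matching (abbreviations, spelling variants), which this sketch does not attempt.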

It is desktop software for Windows and Mac, so there is no latency and you don't have to upload sensitive data to a third party server.

At some point we plan to start charging. But the current beta is free until the end of November. And there may be another free beta after that.

We would love to get some feedback. Particularly from people using it to solve real world problems.

[+] blondin|6 years ago|reply
well, i know i can just download the app to see what it looks like but i can't right now so would be nice to see some screenshots of the app on the website.
[+] stevoski|6 years ago|reply
I love this product and wish I had created it.

Here’s a podcast interview I recently did with OP about his product:

https://bootstrapped.fm/2019/10/04/109-andy-brice-founder-of...

Worth listening to if you are interested in the decisions that go into creating, designing, naming, doing usability testing and promoting a desktop app like Easy Data Transform.

[+] harryf|6 years ago|reply
Cool tool. Tried it on an annoying dataset I know well. Three specific requests:

#1. The "Show First 10 Rows" dropdown... nice here would be "Show First 10 MOST FREQUENT Rows" ... helps get a view of the distribution of values

#2. A "Map" transformation - you can give it a list of input values and a list of one or more output values to which the inputs should be mapped. E.g. input values might be "New York", "Peekskill" and "Middletown" which map to "New York State" which can be placed in a new column (like the "If" transform)

#3. Finally because it's Hackernews... a "Function" transform allowing something like a Javascript function to be applied to a column, the output put in another column

[+] hermitcrab|6 years ago|reply
>#1. The "Show First 10 Rows" dropdown... nice here would be "Show First 10 MOST FREQUENT Rows" ... helps get a view of the distribution of values

You should be able to do this with a pivot, then a sort. But pivot doesn't work with non-numeric values at present. Next release!
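
For comparison, the "pivot, then sort" idea corresponds to counting how often each value occurs and sorting by that count. Here is a rough Python sketch of the equivalent, using the standard library's `collections.Counter`. The helper name `most_frequent` and the sample column are made up for illustration.

```python
from collections import Counter


def most_frequent(values, n=10):
    """Return the n most common values with their counts, most frequent first."""
    return Counter(values).most_common(n)


# Hypothetical column of values
cities = ["Leeds", "York", "Leeds", "Bath", "York", "Leeds"]
for value, count in most_frequent(cities, n=2):
    print(value, count)
# Leeds 3
# York 2
```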

>#2. A "Map" transformation - you can give it a list of input values and a list of one or more output values to which the inputs should be mapped. E.g. input values might be "New York", "Peekskill" and "Middletown" which map to "New York State" which can be placed in a new column (like the "If" transform)

You could do that with 'IF'. But I guess that could be a bit verbose and I should perhaps offer a 'Lookup' transform as well. The table lookup has the advantage that the lookup table can be created/modified by Easy Data Transform.
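
To make the idea concrete, a table lookup of this kind could behave roughly like the Python sketch below. This only illustrates the concept and is not an existing or planned Easy Data Transform feature. The name `add_mapped_column`, the column names and the sample rows are all hypothetical, and leaving unmatched values blank is an assumed default.

```python
import csv
import io

# Lookup table: input value -> output value (taken from the example in the request)
LOOKUP = {
    "New York": "New York State",
    "Peekskill": "New York State",
    "Middletown": "New York State",
}


def add_mapped_column(rows, source, target, lookup, default=""):
    """Yield a copy of each row with column `target` filled from `lookup`,
    keyed on the value in column `source`; unmatched values get `default`."""
    for row in rows:
        out = dict(row)
        out[target] = lookup.get(row[source], default)
        yield out


# Hypothetical CSV input
data = io.StringIO("name,city\nAnn,Peekskill\nBob,Boston\n")
rows = list(add_mapped_column(csv.DictReader(data), "city", "region", LOOKUP))
for row in rows:
    print(row)
```

Because the lookup is just a two-column table, it could itself be loaded from a CSV file instead of being hard-coded.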

>#3. Finally because it's Hackernews... a "Function" transform allowing something like a Javascript function to be applied to a column, the output put in another column

Yes, an option to have some sort of scriptable transform would be very useful (even if it is slightly at odds with the "without programming" positioning). I personally loathe JavaScript, but I guess it would be easier to embed than, say, Python or Lua.

Thanks for the feedback.

[+] jcadam|6 years ago|reply
I imagine you could do much of this with NiFi - https://nifi.apache.org/, though if your needs are simple, something like this would definitely be much easier to deal with.
[+] hermitcrab|6 years ago|reply
Looking at the NiFi website, I assume it is aimed at IT professionals and has a steep learning curve.

Easy Data Transform is aimed at people who don't have the skills, time or inclination to take on something like NiFi (which is most people!).

The aim with Easy Data Transform is that someone can use it to transform their data within a few minutes of first seeing it.

[+] k__|6 years ago|reply
I don't know how good this is, but from that page it looks like the "no-code required" approach for data transformation I saw as part of many solutions.

When I write data-transformation code, I always have the feeling that it's often too inter-connected and an approach, like the one this tool follows, would be nicer.

Somehow only the core idea of using these connected nodes is good; the rest of the UI is too clunky, so I drop down to "real" code again for some nodes, and sooner or later the mixing of nodes and code becomes too cumbersome and I drop down to "real" code for everything.

[+] hermitcrab|6 years ago|reply
If you can code a solution, then you probably aren't in the core market for this product.

But, perhaps one day in the future, I might be able to add a script or plugin node, so you can add your own custom transforms.

[+] fastbeef|6 years ago|reply
I love these kinds of tools and this looks very useful indeed, but something about your page triggers my spidey sense. A pricing page or at least a hint at a business model would make that go away. Right now it feels like you’re just trying to get me to run a binary on my computer.

Edit: not saying you’re shady, just that it has a vibe of being shady :)

[+] hermitcrab|6 years ago|reply
It is free while we are in beta. Because I hope that is going to result in more feedback. Also I don't feel comfortable charging for something that isn't quite production quality yet. But the plan is to charge an annual sub after it comes out of beta (price undecided).

I can see that might trigger some people to think it is of dubious provenance. Maybe I should put the above on a 'Buy' page?

BTW the software is digitally signed (and notarized on Mac) and we've been selling software online since 2005. http://oryxdigital.com/

[+] RenRav|6 years ago|reply
Pretty cool, reminds me of how Advanced Renamer handles batch renaming filenames through a visual stack of methods, like sorting, regex replacing, trimming, renumbering, etc. I think that's a really useful thing. There are lots of other weird online formatting tools I've seen over the years that perform things like this, but the experience is pretty poor. I will probably recommend this to my dad.
[+] nevf|6 years ago|reply
Congrats on releasing this. I had a quick look at the site and the manual and couldn't see a list of file formats that can be used. For example can I use it with JSON or XML files?
[+] hermitcrab|6 years ago|reply
Not yet. Currently it can read delimited text (e.g. CSV) and XLS(X)* and write delimited text. But I do plan to add other input/output formats, depending on feedback.

I need to think a bit about how to flatten an XML/JSON doc into a table and then turn it back into an XML/JSON doc.
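
One common way to approach the flattening half, offered only as a sketch and not as the approach the product will take, is to turn each nested path into a column name. The Python below assumes dotted paths with list positions as numeric segments; the function name `flatten` and the sample record are hypothetical.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into a single dict keyed by dotted paths.
    List elements use their index as the path segment."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        return {prefix: obj}  # scalar leaf: one column
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


# Hypothetical JSON record
record = json.loads('{"name": "Ann", "address": {"city": "Leeds", "tags": ["a", "b"]}}')
print(flatten(record))
# {'name': 'Ann', 'address.city': 'Leeds', 'address.tags.0': 'a', 'address.tags.1': 'b'}
```

Going back the other way is the harder part. With this naming scheme a key that itself contains a dot, or a dict key that looks like a number, cannot be told apart from nesting or a list index, and empty objects and lists disappear entirely. A round trip would therefore need extra conventions.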

(*XLS(X) output currently only works on Windows, because it uses ActiveX. But I plan to have XLS(X) input/output on Windows and Mac at some point.)

[+] nexuist|6 years ago|reply
This is a good idea. Honestly surprised it's not SaaS - although I get the privacy aspect.
[+] hermitcrab|6 years ago|reply
I don't see any advantage to making this a SaaS. It would just result in more latency and potential privacy issues.

It is true that a desktop system may not be suitable for transforming million-row datasets or for processing that runs 24x7 - but that is not the market we are aiming for.

[+] GordonS|6 years ago|reply
I think the benefit of a SaaS in this case would be:

1. Users always work with the latest version, so you only have 1 version to support

2. It would make monthly pricing an easier sell

But I think there are some downsides here, with an app that is solely about data:

1. If users' data has to flow through it, there are privacy, GDPR and intellectual property concerns (for both the SaaS vendor and customers)

2. Latency, since you're going to have to upload data

3. Possibly issues with bandwidth fees (I think most clouds only charge for egress bandwidth, but users are still going to want to download the processed data)

4. Monthly pricing is a big turn-off for a large segment

[+] softwarelimits|6 years ago|reply
Why are you not offering a Linux Version?
[+] hermitcrab|6 years ago|reply
It is written in C++/Qt, so it wouldn't be that hard to add a Linux version. But I'm not sure that the market that this is aimed at uses Linux in any appreciable numbers. You are the first to ask!

Also building binaries for Linux is a pain. Which distributions to support?

[+] Havoc|6 years ago|reply
Looks like Alteryx basically?
[+] hermitcrab|6 years ago|reply
I don't know a lot about Alteryx. But I understand it is enterprise oriented and much more expensive than the price point we are aiming for.
[+] enriquto|6 years ago|reply
I love these kinds of programming tools, but do not understand the terminology. Using programming tools has always been called, eeer, "programming"? Is there something that I'm missing here? What's the point of saying "no programming" when you are, precisely, programming?
[+] hermitcrab|6 years ago|reply
I don't think it is programming in any real sense. There are no variables and no loops. So I don't think it is Turing Complete. ;0)
[+] tonyedgecombe|6 years ago|reply
Presumably because the end user of the product doesn't need to be a programmer.