Ask HN: Is this data startup idea viable?

10 points | kemvi | 14 years ago

Sometimes I want to play around with some data but don't want to overcome the barrier of downloading it, cleaning it, and writing parsing code.

Our vision is to build a startup that makes it easier to do just that. We store the data, you use a webapp to play around with it. So, in a matter of a minute, you could load a time series of Google's stock price, one of Apple's stock price, normalize each by the value of the Dow, and plot on top of them a time series of the consumer price index for urban consumers. Or you could generate some quick scatter plots to see if there's a correlation.
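
As a rough illustration of the kind of manipulation described above (not the prototype's actual code), here is a minimal Python sketch. The numbers are made-up placeholders rather than real market data, and the helper names `normalize` and `pearson` are hypothetical.

```python
# Made-up placeholder values, NOT real prices or index levels.
goog = [500.0, 510.0, 525.0, 540.0]
aapl = [380.0, 390.0, 385.0, 400.0]
dow = [11000.0, 11100.0, 11050.0, 11200.0]


def normalize(series, index):
    """Divide each observation by the index value at the same position."""
    return [s / i for s, i in zip(series, index)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Express each stock relative to the Dow, then check how the two move together.
goog_rel = normalize(goog, dow)
aapl_rel = normalize(aapl, dow)
r = pearson(goog_rel, aapl_rel)
```

The point of the service would be that none of this, nor the downloading and cleaning that comes before it, has to be written by hand.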

The UI model is based on dataflow programming (like Orange, Weka, or RapidMiner). If you don't know what that is, it looks like this:

http://mines.humanoriented.com/classes/2010/fall/csci568/portfolio_exports/sphilip/images/sample_workflow_border.png

You create "widgets" that do something, like loading data, or making a scatter plot, and you link them up to do things with them.
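
For readers who think in code, the widget idea can be sketched roughly as follows. This is a toy model under my own assumptions, not how Orange, Weka, RapidMiner, or the prototype are implemented; the `Widget` class and the sample values are hypothetical.

```python
class Widget:
    """A node that applies a function to the outputs of its upstream widgets."""

    def __init__(self, func, *inputs):
        self.func = func      # what this widget does
        self.inputs = inputs  # widgets linked into this one

    def run(self):
        # Evaluate upstream widgets first, then apply this widget's function.
        return self.func(*(w.run() for w in self.inputs))


# Two "load data" widgets (placeholder values), a "ratio" widget that
# combines them, and a "summary" widget linked to the ratio.
load_a = Widget(lambda: [2.0, 4.0, 6.0])
load_b = Widget(lambda: [1.0, 2.0, 3.0])
ratio = Widget(lambda a, b: [x / y for x, y in zip(a, b)], load_a, load_b)
summary = Widget(lambda xs: sum(xs) / len(xs), ratio)

result = summary.run()  # mean of [2.0, 2.0, 2.0] -> 2.0
```

In the webapp, the links between widgets would be drawn in the UI instead of passed as constructor arguments.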

There's a splash page up at www.kemvi.com.

There's also a working prototype. If you're interested in testing it out, please email [email protected].

9 comments

revorad | 14 years ago
Be sure to read about these startups which failed trying to do something similar:

Verifiable: http://stuartroseman.com/post/619953720/out-with-the-old-bus...

http://eagereyes.org/blog/2010/end-of-verifiable-com

Swivel: http://eagereyes.org/criticism/the-rise-and-fall-of-swivel

I guess the takeaway is that the business model is probably B2B, and you need to fulfill a real need.

kemvi | 14 years ago
Great, thanks! I knew of Timetric, mentioned below, but not of these. I suppose there's a strong selection bias.

The business model probably should be B2B: I imagine the users would be companies that need to do business intelligence, journalists, universities, and research organizations.

dmk23 | 14 years ago
Seems like you are offering the same data manipulation people currently do with either desktop software or custom-coded server apps. Your value here is turning it into a web service and making it easier to use. There is a very strong case for doing just that.

What I would recommend is to connect the dots and clearly position your service as a cloud alternative to desktop / server data manipulation packages. Get very explicit about it and target similar customers. Do not limit yourself to just data on the web; allow file uploads too. Make a feature-light free version and charge for a premium one, priced by data volume, number of manipulations, or something like that.

I think there is a lot of room for a cloud-based statistical service to undercut traditional players in this market.

kemvi | 14 years ago
The freemium model is exactly what I was thinking. And, yes: RapidMiner, Weka, etc. are all on the desktop. Thanks for the motivation.

trussi | 14 years ago
Take a look at Tableau. It's super expensive and desktop-based. But it allows for some very powerful, customized data analysis.

Build a SaaS version of Tableau and you have a good starting point.

You still need to find the problem you are solving. Try to find one or two niches, like university labs or government research facilities.

Personally, I'd find the problem this solves well, then actually solve that problem (instead of helping others solve that problem). Take credit card interest rates; instead of building a tool tailored to credit card issuers or resellers, build a site that provides the data (using your tool) in a value-added way.

It's like you're creating your itch, so you can scratch it with your idea. A bit backwards, but crazy enough it might just work.

tylerneylon | 14 years ago
Yes, I think it is, although I see a few challenges.

It is a huge amount of work to integrate disparate data sets and build a system that can efficiently handle many customers' requests.

It may be difficult to show customers how they can actually use it. Comparing two stocks isn't cool. You know what's cool? Comparing all stocks with realtime updates.

For the second problem, I mean that users won't use their imagination much to figure out what your product does. You have to spell it out in concrete, actually useful examples. This is a challenge between engineer-thinking, where general functionality is king, and user-thinking, where one particular goal and the path of least thinking/work is king.

ig1 | 14 years ago
Have you seen http://timetric.com/? They do something along those lines, although with less manipulation.

kemvi | 14 years ago
I had heard of these guys, but hadn't seen their site in months. Sounds like they're doing something pretty similar, which is exciting, because it means they've tested the business model.

rorrr | 14 years ago
There are tons of data sets out there; most are in great shape and trivial to parse.

I guess maybe you need to provide some better examples of what your service will be. Right now it all sounds like something I can do in Google Docs spreadsheets or Yahoo Pipes.

And how will you make money?