npalli|12 years ago

It was confusing to figure out what this service is supposed to do; I had to look up the documentation. In summary, from what I can gather:

1. It doesn't do any speech recognition (speech -> text), so I'm not sure why they put Siri in the title. It's also not clear how they could 'hijack' the text from Siri to do this analysis. The ASR engines they mention (CMU, OpenEars) have pretty poor accuracy compared to Siri or Google Voice.
2. They appear to do some form of text normalization/correction; again, it's not clear how they do it.
3. The actual service they provide is a form of named entity recognition (confusingly named "intent", which clashes with the Android intent mechanism in their examples).
4. They also let you define your own entities to match. You can train them using a drop-down menu. I'm not sure how you could train hundreds of examples using point and click.

Is this different from Alchemy (or many others) because it's open source(?) http://www.alchemyapi.com/products/features/entity-extractio...

Given this service is for developers with an interest in NLP, it would have been good if they didn't hide behind a snow-job title like "Siri as a service".

ar7hur|12 years ago

> 1. It doesn't do any speech recognition (speech -> text), so I'm not sure why they put Siri in the title. It's also not clear how they could 'hijack' the text from Siri to do this analysis. The ASR engines they mention (CMU, OpenEars) have pretty poor accuracy compared to Siri or Google Voice.

Currently most Wit users use Google or Nuance with great success. You can even use Android's offline speech rec.

That said, CMU and OpenEars work well, as long as you provide them with good language models (which you can't do if you're hacking together a quick project). Our plan is for Wit to automatically generate the right language models from your instance configuration.

> 2. They appear to do some form of text normalization/correction; again, it's not clear how they do it. 3. The actual service they provide is a form of named entity recognition (confusingly named "intent", which clashes with the Android intent mechanism in their examples).

We abstract the full NLP stack for the developer. How we do it is not really what matters to our developers, as long as it works :) In fact, we use a combination of many different NLP and machine learning techniques.

> 4. They also let you define your own entities to match. You can train them using a drop-down menu. I'm not sure how you could train hundreds of examples using point and click.

You don't need to train hundreds of examples. Plus, our users are not NLP/ML experts, and they prefer a graphical UI. It's true that it could still be more efficient; we have good features on the roadmap for that :)

> Is this different from Alchemy (or many others) because it's open source(?) http://www.alchemyapi.com/products/features/entity-extractio....

Alchemy is great as a set of NLP tools, some of them quite academic, but it wasn't designed from scratch to solve the problem we're trying to solve: enabling the mass of developers to easily add a natural language interface to their apps.

aquark|12 years ago

I'm running this at home and it works great for adding custom actions to Siri.

film42|12 years ago

Maybe I'm out of line, but if you're planning on tearing into what's wrong with something, try to offer something positive as well. Your feedback was great and constructive; it's just not very nice.

I'll be honest: HN has a tendency (myself included) to have a first natural reaction of "how can I criticize this?" But just because something isn't faster than enterprise, or not as scalable, or not made in your framework of choice doesn't mean it's worthless. I think this project is amazing. Great job, and I can't wait to see this mature.

jasonkester|12 years ago

Is there any way to view this page with the effects turned off? With all the text constantly appearing and disappearing, I haven't yet made it to the end of a sentence, and therefore can't form an opinion about it.

I think there was a picture of a robot on the screen for a few seconds, but that's all I remember.

Would disabling JavaScript do the trick?

blandinw|12 years ago

EDIT: All animations (except "What we do") should be disabled. Please email me at [email protected] if you still have issues.

itry|12 years ago

Same here. I find "Siri as a service" an interesting project, but not interesting enough to cope with a page that blends content in and out and makes my head spin.

MasterScrat|12 years ago

You should make it clearer that you don't actually handle voice recognition. When I read "Developers use Wit to easily build a voice interface for their app," I expect you to handle things from start to finish.

Also, let me try it! It's frustrating because the UI looks like you can experiment, but it's only an animated demo (or am I missing something?). In particular, the mic logo is used to record on Google, and here it doesn't seem to do anything.

ar7hur|12 years ago

> You should make it clearer that you don't actually handle voice recognition.

You're right, we'll make it clearer on the landing page. A full out-of-the-box integration with some voice recognition engines (we love CMU Sphinx, open source) is on our roadmap.

> Also, let me try it!

We purposely didn't provide an "end-user" demo (something that would look like chatting with Siri) because we want to focus first on the developer experience, when they configure Wit to understand their own end-users' intents. You can request an invite and try this in less than 5 minutes.

lutusp|12 years ago

The word "Siri" doesn't belong in the title or the article, unless a Trabant advertisement has the right to mention Mercedes-Benz in its promotional text. The project does a primitive kind of voice recognition, but it doesn't use Siri.

On this topic, I invite people to try out my non-prototype, non-project toy that uses Google's support for HTML5 speech recognition. It's pretty funny how wrong things go when you try to say something even a bit out of the ordinary:

http://arachnoid.com/speech_to_text

If I say, "Now is the time for all good men to come to the aid of their country," an old teletype test sentence, the Google recognizer always nails it. If I say, "I hit an uncharted rock and my boat is being repaired," things go hilariously wrong, and differently every time.

mrosethompson|12 years ago

For instance: "In most cases, before beginning to listen, the browser will ask permission to monitor your microphone."

Came out as: "In Las Cruces, f listen, permission to monitor your microphone."

xauronx|12 years ago

I like the concept a lot, and I'm going to have to read more about it. One thing I'm unclear about is whether this does voice->text, or whether the developer does that and Wit handles translating the text into actions.

Just a heads up: Get Started on the pricing page does nothing. The natural progression for me is home page -> pricing -> OK, looks good, let's get started.

ar7hur|12 years ago

Wit takes the output of the voice recognition engine as input. It's quite robust to voice recognition errors. Most devs use Google's engine or the open source CMU Sphinx engine.

Fixed the Get Started link, thanks!

endlessvoid94|12 years ago

This is amazingly timely for me: I've been building my own version of Jarvis using speakeasy-nlp (a Node NLP library) and Chrome's built-in support for HTML5 webkitSpeechRecognition:

https://github.com/dpaola2/jarvis (work in progress)

I absolutely would love a better NLP API. Please let me in!

ar7hur|12 years ago

You should be able to sign up now.

ar7hur|12 years ago

Hey everyone, Wit guy here. We've been working on Wit for the past few months, and we think it's time to get your feedback. I'm happy to answer any questions you have.

Bringing Natural Language Understanding to the masses of developers is hard, and we still have a lot of work ahead of us. Please don't hesitate to reach out to us!

post_break|12 years ago

Interesting! Nothing happened after I registered with my GitHub account using Opera, though.

unknown|12 years ago

[deleted]

chrislomax|12 years ago

Nice concept. I just came back here to let you know that I don't know what is happening on that page, but I left it open about 45 minutes ago and noticed my fan kicked in a lot. It was that page I left open. It ended up taking 25% CPU. You didn't work on the iTunes software, did you??

Only messing, it was taking a lot of CPU though.

blandinw|12 years ago

Working on a fix now.

MasterScrat|12 years ago

I don't know if people will remember it and be receptive to this touch, but I like it.

ragebol|12 years ago

I also wonder how this compares to http://www.maluuba.com/ ?

ar7hur|12 years ago

Wit is 100% open and flexible: you can create any intent you need for your app, and you're not limited to a static set of domains/actions.

EDIT: @ragebol, we are very interested in ROS and robotics; don't hesitate to get in touch with me at arthur at wit dot ai. In the future, we would like to provide an off-the-shelf human/robot communication module for developers.

blandinw|12 years ago

Can you email me your GitHub username at [email protected] to make sure we got your request? Thanks!

fjabre|12 years ago

Your message isn't clear. AFAIK there is no official way to interact with Siri or Google voice rec.

It seems like Wit will take text that has already been translated from a user's voice and make it easily accessible to my application, but how does Wit access the text generated from a Siri request in the first place, for example? Does Wit have some other way of getting at data that has already been converted from voice to text by Siri, Google, or some other speech-to-text engine?

ar7hur|12 years ago

> AFAIK there is no official way to interact with Siri or Google voice rec.

Actually, there are ways. On Android devices, voice rec is available to devs (even offline, if the user enabled it!). We have a simple tutorial about how to integrate on Android: https://wit.ai/docs/android-tutorial

Right now on iOS you have two options (neither involves Siri, which is kept closed by Apple):

1/ Do the voice rec server-side (Siri does that)
2/ Use OpenEars to do it client-side

Server-side, you have many voice rec options, including the open source CMU Sphinx.

Providing a fully integrated solution with speech rec out of the box is on our roadmap.

drakaal|12 years ago

Except Stremor has a query language, so you don't have to do anywhere near as much heavy lifting.

ar7hur|12 years ago

Looks like you focus on search, summary, entity, and sentiment extraction with a rule-based approach.

Wit's focus is to power human/machine interfaces, and our priority is to provide developers with a 100% configurable solution, with no prior assumptions about their domain. And we don't believe in rules; we chose a machine learning approach.

rch|12 years ago

This would be great for open source projects, but I feel like I would trip over a very large pile of patents if I tried to build a product around it. I don't have any relevant experience myself, though, so it's just a feeling.

radley|12 years ago

The pricing model doesn't scale realistically and would require a subscription service for users. An app with 1M+ installs could make 1M+ calls per day, making this service $24k/month.

ar7hur|12 years ago

Meanwhile, you can decide to share your configuration data and get Wit for free (à la GitHub) :)

tonydiv|12 years ago

I would be wary of using the GitHub Octocat mascot, though. I believe Octocat is protected under copyright.

blandinw|12 years ago

After reading http://octodex.github.com/faq.html, we thought it was okay to use this image, given that we advertise and reference GitHub a lot, we integrate heavily with it, and we love Octocat!

Do you think we should remove it?

Thanks!

hackula1|12 years ago

The landing page looks like a minefield of legal issues. Marketing everything explicitly with references to Siri is asking for a lawsuit from Apple.

ar7hur|12 years ago

We share the same vision that voice will become the key human/machine interface, especially for the upcoming generation of wearable devices, home automation, etc.

I don't know if Ask Ziggy is 100% self-service for developers. That's a key requirement for us.
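The integration pattern discussed throughout the thread — a speech engine (Google, CMU Sphinx, OpenEars) produces a transcript, and a Wit-style NLU service turns that text into an intent plus entities — can be sketched as below. This is a minimal illustration, not Wit's actual API: the response shape, field names, and confidence threshold are all assumptions made up for this example.

```python
import json

# Hypothetical response from a Wit-style NLU service after the app submits
# the transcript produced by a speech recognition engine. The "outcome"
# structure and its field names are illustrative assumptions, not the real
# Wit API.
EXAMPLE_RESPONSE = json.dumps({
    "msg_body": "wake me up at 7am tomorrow",
    "outcome": {
        "intent": "set_alarm",
        "entities": {"time": "7am", "date": "tomorrow"},
        "confidence": 0.87,
    },
})

def parse_nlu_response(raw, min_confidence=0.5):
    """Extract (intent, entities) from a JSON NLU response.

    Returns (None, {}) when the service is not confident enough, so the
    app can fall back to asking the user again -- useful because the input
    transcript may already contain speech recognition errors.
    """
    outcome = json.loads(raw)["outcome"]
    if outcome.get("confidence", 0.0) < min_confidence:
        return None, {}
    return outcome["intent"], outcome["entities"]

intent, entities = parse_nlu_response(EXAMPLE_RESPONSE)
print(intent, entities)  # set_alarm {'time': '7am', 'date': 'tomorrow'}
```

The app itself never does NLP here: it only forwards text and dispatches on the returned intent name, which is the division of labor ar7hur describes (speech rec handled by an engine of your choice, language understanding handled by the service).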