Any plans for other languages and locales? I immediately noticed the temperature in F in the example about the weather in Lima. I think everybody there uses C, with the exception of American tourists :-) Seriously, it looks like a great product. It might even return too much data in the JSON. I wonder how to take advantage of all of that if I don't know what people are going to ask. They're going to ask silly questions just for fun even if I have a vertical app (example: a mortgage calculator), because this is not a web form with constrained input fields but free-form input. The numbers I get back in the answer could be unrelated to mortgages. Do you have examples of best practices? Maybe just write and speak the answer? Thanks.
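One way a vertical app can cope with off-topic queries is to whitelist the fields it knows how to use and fall back to speaking the answer for everything else. A minimal sketch of that idea — the JSON shape (`NativeData`, `SpokenResponse`) and field names are invented for illustration, not any real Houndify schema:

```python
# Hypothetical response shape; the real API's JSON will differ.
MORTGAGE_FIELDS = {"principal", "rate_percent", "term_months", "monthly_payment"}

def handle_response(response: dict) -> dict:
    """Keep only fields a mortgage app understands; otherwise just speak the answer."""
    data = response.get("NativeData", {})
    relevant = {k: v for k, v in data.items() if k in MORTGAGE_FIELDS}
    if relevant:
        return {"action": "fill_form", "fields": relevant}
    # Off-topic query ("what's the weather in Lima?"): fall back to text-to-speech.
    return {"action": "speak", "text": response.get("SpokenResponse", "")}

on_topic = {"NativeData": {"principal": 250000, "rate_percent": 3.5},
            "SpokenResponse": "Your principal is 250,000 dollars."}
off_topic = {"NativeData": {"temperature_f": 64},
             "SpokenResponse": "It is 64 degrees in Lima."}
```

The point is that the app never has to anticipate every silly question — anything outside its whitelist degrades gracefully to a spoken reply.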
Nice observation. Sadly, localization is an afterthought for a lot of developers. I am also curious to see how they handle other languages and locales, since I'm interested in learning how to use these kinds of systems.
The video is only 240p and quite shaky. As it was published by SoundHound Inc. itself, is this a marketing technique to make it look more amateurish?
Such low latency means the demo was done over Wi-Fi in the SoundHound building - especially if the speech recognition runs on the server side. Or which speech recognition software does that demo app use? Nuance software running on the client? Android 5 voice recognition isn't that fast.
After owning Echo, Roku and Fire TV, I'm super-bullish on voice commands finally being ready for prime time. It's a terrific interface for home audio, TV and car audio.
I've gotta think Apple will open up Siri to app developers sooner than later.
Definitely. I've been using voice commands in Android for about 5 years now (since ~2010) and I've consistently been shocked at how incredibly efficient an interface it is. The number of capabilities hooked up to voice control has only been increasing since then and it's been great.
I think voice with a screen is interesting, but voice alone can be difficult. What is the last voice-controlled IVR (phone system) that was awesome to interact with? I think it takes a combination of voice and something that can be confirmed with another "button", or something you can touch or push to confirm or cancel what you've "asked" it to do.
I think it can augment things well, but not be the prime time star.
It's already really easy to get fast, efficient access to large data sets, so I don't see much value in that. What is not fast, efficient, and easy is transforming natural language queries into computationally actionable ones.
I would find more value as a developer if, when given a natural language query, it returned a structured query. Then I could tweak the query to conform to whatever data retrieval API I wanted.
I don't think what I'm asking for has to be mutually exclusive with what they're currently offering. Give me the option to have houndify do some or all of the work for me.
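What this commenter is asking for could look like the following: a structured, intermediate representation of the query that the developer rewrites for their own retrieval API. The `parsed` dict below is a purely hypothetical example of such an intermediate form — its field names are invented, not any real Houndify output — with SQL as one possible target:

```python
# Hypothetical structured query a service might return instead of a final answer.
parsed = {
    "intent": "find_hotels",
    "filters": [("city", "=", "Seattle"), ("stars", ">=", 4)],
    "sort": ("price", "asc"),
    "limit": 3,
}

def to_sql(q: dict) -> str:
    """Rewrite the structured query for one particular retrieval API (here: SQL)."""
    where = " AND ".join(f"{col} {op} :{col}" for col, op, _ in q["filters"])
    col, direction = q["sort"]
    table = q["intent"].removeprefix("find_")  # "find_hotels" -> "hotels"
    return (f"SELECT * FROM {table} WHERE {where} "
            f"ORDER BY {col} {direction.upper()} LIMIT {q['limit']}")
```

Because the intermediate form is plain data, the same `parsed` dict could just as easily be rewritten into an Elasticsearch query or a REST call — that retargetability is the value being asked for.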
I am one of the developers for houndify.com, so I can answer this question for you!
We actually have an API endpoint dedicated to doing this for you. At the moment we have a concept of "domains", where developers use a proprietary language to help Hound understand topics. Using our API, you could technically do this yourself, and add functionality that doesn't currently exist on the platform.
You could use the hotel domain and get back a ton of pre-formatted data, or you could just get back speech-to-text, or you could specify hooks you want to take action on. I'm not a developer on the actual voice API itself, so I'm not the most informed, but perhaps that answers your question?
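The three integration styles described here — pre-formatted domain data, bare speech-to-text, and developer-registered hooks — can be sketched as a simple dispatcher. All of the key names (`CommandKind`, `HotelResults`, `Transcript`) and the hook mechanism are invented for illustration; the real Houndify schema may look quite different:

```python
# Hypothetical registry of developer-defined hooks for commands the
# platform doesn't cover natively.
hooks = {}

def hook(command_kind):
    """Decorator: register a handler for a given (hypothetical) command kind."""
    def register(fn):
        hooks[command_kind] = fn
        return fn
    return register

@hook("PlayPodcast")
def play_podcast(resp):
    return f"playing {resp['podcast']}"

def dispatch(resp: dict):
    kind = resp.get("CommandKind")
    if kind in hooks:                  # 3) developer hook takes the action
        return hooks[kind](resp)
    if "HotelResults" in resp:         # 1) rich, pre-formatted domain data
        return resp["HotelResults"]
    return resp.get("Transcript", "")  # 2) plain speech-to-text fallback
```

A client would call `dispatch` on each JSON response, so adding a new capability is just registering another hook.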
I've been using pocketsphinx with this neat Ruby gem[1]. It's really easy to use but has low accuracy (understands me correctly maybe half the time). I'm curious to see if Houndify does any better!
There is clearly a knowledge graph coupled with this in addition to the speech recognition. Sorry, "meaning" recognition. I feel like there is an opportunity to connect the deep knowledge graph of Wolfram Alpha -- or maybe Wolfram dropped the ball by not exposing their graph in a more usable way.
I wonder if it is based on the Freebase.com knowledge graph, which Google discontinued last month: http://www.freebase.com/ (and IBM recently bought the Blekko web search and knowledge graph engine as a Freebase replacement to power IBM Watson).
Does this require a network connection? I'd love to start adding speech-to-text interfaces to my apps, but most of the stuff I work on needs to be able to work without the network, and most of the speech-to-text engines these days are SaaS products in some form or another.
It's the complexity of the queries, and the contextual awareness, that makes it impressive. But yes, my immediate thought was that either Android speech-to-text or the Google speech API plugged into Wolfram Alpha might create a (much simpler, but also much easier) version of this.
That's insanely fast for compound natural language queries. I'm impressed.
Houndify looks interesting.
[1] https://github.com/watsonbox/pocketsphinx-ruby
Bug: scrolling down the page is very sluggish, using Chrome on Xubuntu 15.04.