
Curated list of speech and natural language processing resources

101 points | sebg | 10 years ago | github.com

27 comments

[+] greatthanks|10 years ago|reply
These curated lists popping up all over the place seem to indicate a need for pre-Google-style AltaVista/Yahoo portals.
[+] arafalov|10 years ago|reply
Curation is always the next step after an explosion of content. Yahoo was curation of the whole internet. Then it got too hard. Now, we have enough content in tiny sub-niches to need curation at that level. I definitely see the need for curation of resources around the topic I am interested in (Apache Solr).

Unfortunately, I haven't seen a good software platform that actually lets you build a good curation site. The ones that exist want you to build the content for them. I want one I can run/own/brand on my own. I suspect there might be some in the library space though (haven't searched _very_ hard yet).

[+] melling|10 years ago|reply
There always seemed to be a need for dedicated lists. Rather than curate, I'm trying to build a dedicated mini "search engine" for Swift/iOS Resources: http://www.h4labs.com/dev/ios/swift.html

The Internet contains so much information on any given topic that if you have a question, it probably has already been answered. If we could build better search engines, we could learn anything in a fraction of the time.

[+] ilurk|10 years ago|reply
Indeed I've also noticed that.

The web had chaotic growth in its first decades, but now it looks as if, on one hand, the larger websites have killed the smaller ones, and on the other, it has grown so large that search is no longer enough.

You need organization.

(sorry for the offtopic)

[+] nl|10 years ago|reply
https://github.com/facebook/MemNN should be in the language modelling (or Deep Learning) part. I'll give them a pass because it was only released a couple of days ago.

The original Word2Vec[1] is missing too. While Gensim and GloVe are nice, Word2Vec still outperforms them both in some circumstances.

Surely there is a good LSTM language modelling project somewhere too? I can't think of one off the top of my head though. There's some code in Keras[2], but maybe Karpathy's char-rnn[3] would be better because of the documentation.

[1] https://code.google.com/p/word2vec/

[2] https://github.com/fchollet/keras/blob/master/examples/lstm_...

[3] https://github.com/karpathy/char-rnn
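
Whatever tool produces the vectors, they tend to be compared the same way: cosine similarity. A toy sketch (the 4-dimensional vectors below are made up for illustration; real word2vec embeddings typically have 100-300 dimensions):

```python
import math

# Made-up toy embeddings, purely for illustration.
vectors = {
    "king":  [0.8, 0.6, 0.1, 0.2],
    "queen": [0.7, 0.7, 0.1, 0.3],
    "apple": [0.1, 0.2, 0.9, 0.8],
}

def cosine(u, v):
    """Cosine similarity: the standard way to compare word embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# With well-trained vectors, "king" lands closer to "queen" than to "apple".
print(cosine(vectors["king"], vectors["queen"]) >
      cosine(vectors["king"], vectors["apple"]))
```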

[+] michael_h|10 years ago|reply
LSTM --> right now, Torch 7 and Theano are receiving the bulk of the attention.
[+] gbrits|10 years ago|reply
Consider a speech-to-structured-search-app in a limited domain, like a specialized siri/google now. For example something like a real estate search assistant with possible questions like: "what new 2 bedroom apartments have become available in Capitol Hill, Seattle this week?"

Perhaps naively, it seems a big part of deducing meaning could be done with ordinary dictionary lookups on terms like 'bedroom', 'apartments', "Capitol Hill", "Seattle", etc.

Is this indeed naive, or is this 'dictionary lookup' technique part of the bag of tricks actually used? If so, any good references on using it in combination with the other techniques described here?

Highly interested in this topic, but looking for a nice introduction to get used to the terminology of the field.
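
For what it's worth, the lookup idea above can be sketched in a few lines. Matching against term lists ("gazetteers") is a real and common technique in entity recognition, though the slot names and term lists here are invented for the example:

```python
# Hypothetical gazetteers: slot name -> known surface forms and their values.
GAZETTEERS = {
    "bedrooms":      {"1 bedroom": 1, "2 bedroom": 2, "3 bedroom": 3},
    "property_type": {"apartment": "apartment", "apartments": "apartment",
                      "house": "house", "condo": "condo"},
    "neighborhood":  {"capitol hill": "Capitol Hill", "ballard": "Ballard"},
    "city":          {"seattle": "Seattle", "portland": "Portland"},
}

def extract_slots(question):
    """Scan the question for known terms and build a structured query."""
    q = question.lower()
    return {slot: value
            for slot, terms in GAZETTEERS.items()
            for term, value in terms.items()
            if term in q}

query = extract_slots(
    "what new 2 bedroom apartments have become available "
    "in Capitol Hill, Seattle this week?")
print(query)
```

The obvious limits are ambiguity (is "Washington" a city, a state, or a president?) and phrasing the gazetteer never anticipated, which is where the statistical techniques elsewhere in this thread come in.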

[+] amirouche|10 years ago|reply
This is called Question Answering (QA). "bedroom" and "apartments" are different kinds of entities from "Capitol Hill" and "Seattle". You could do as you say, trying to understand the question based on some of the words that appear, using statistics. This is a "bag of words" approach.

The general idea of NLP is no different from general computer science, i.e. 1) narrow the problem, 2) solve it, 3) try to solve a bigger problem.

The tower of sentence structure in NLP is:

- bag of words

- part-of-speech + named entity tagging

- dependency tagging/framing

- semantic tagging

The idea is to create templates for the most common questions. Then you parse a question, recognizing named entities like "Capitol Hill" and "Seattle" and common nouns like "apartment", and resolve it against a template. It's not an ordinary dictionary hash lookup, since a given template has several "keys". The value of the dictionary is the correct search method. It reminds me of multiple method dispatch that supports dispatch by value.
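
A minimal sketch of that "several keys, dispatch by value" idea, with invented slot names and handlers:

```python
# Hypothetical search handlers, one per question template.
def search_rentals(slots):
    return f"rental search in {slots['neighborhood']}, {slots['city']}"

def search_sales(slots):
    return f"sale search in {slots['city']}"

# Templates keyed by the *set* of slots they require -- dispatch on
# several keys at once, not a single-key dictionary lookup.
TEMPLATES = {
    frozenset({"bedrooms", "property_type", "neighborhood", "city"}): search_rentals,
    frozenset({"property_type", "city"}): search_sales,
}

def dispatch(slots):
    """Pick the most specific template whose required slots are all present."""
    candidates = [req for req in TEMPLATES if req <= set(slots)]
    if not candidates:
        return None  # unrecognized question: fall back to asking the user
    best = max(candidates, key=len)
    return TEMPLATES[best](slots)

slots = {"bedrooms": 2, "property_type": "apartment",
         "neighborhood": "Capitol Hill", "city": "Seattle"}
print(dispatch(slots))
```

Returning `None` for an unmatched slot set is where the "ask the user for confirmation" behavior described below would hook in.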

Also something to take into account: in the "assistant" example you give, the assistant can ask for confirmation. You don't explicitly state that you are looking to "rent" something, so the system might not recognize the question, but just guess that you're talking about renting because it's the most popular search around Capitol Hill, Seattle. You can implement a "suggest this question" feature that feeds back into the "question dispatch" algorithm so it recognizes the question later.

This is mostly a Dynamic Programming approach. Advanced NLP pipelines use logic, probabilistic programming, graph theory or all of them ;)

The other big problems of NLP are:

- summary generation

- automatic translation

Important to note is that, like other such systems, it must be goal driven. You can start from the goal and go backward, inferring the previous steps, or start from the initial data and go forward. Again, it's very important to simplify: factorize by recognizing patterns. That's the main idea in theories of mind.

Have a look at this SO question[1] where I try to fully explain a QA example. The Coursera NLP course is a good start.

OpenCog doesn't deal solely with NLP but gives an example of what a modern artificial cognitive assistant can be made of.

Beware that NLP is a bit of a rabbit hole.

[1] http://stackoverflow.com/questions/32432719/is-there-any-nlp...

[+] m_eiman|10 years ago|reply
About the "Text-to-Speech" section there, I was really impressed with the updated Swedish "Alva" voice in OSX El Capitan: it correctly pronounces "tomten" in different ways in the first and second occurrence in this example:

say -v Alva "Tomten dricker julmust på tomten"

"Tomten" can mean either "Santa Claus" or "the yard"/"the plot" depending on context, and apparently they're able to detect this properly.

[+] motdiem|10 years ago|reply
OS X makes progress with every release on this front. I typically test it with a few tricky French sentences (think "les poules du couvent couvent", where the noun "couvent" and the verb "couvent" are spelled alike but pronounced differently) and it seems to improve, but it's hard to say from the outside what gets better in the model ("Mes fils ont cassé mes fils", where "fils" means both "sons" and "threads", still fails for instance, but that one seems harder to detect to me).
[+] mohn|10 years ago|reply
I'm glad to see the CMU pronouncing dictionary in there. It was instrumental when I wrote a web app[1] to generate Spoonerisms[2] (my apologies for the UI and the fact that I haven't yet removed the more obscure words, especially obscure homophones, from my cmudict subset).

The cmudict isn't under the text-to-speech subheading in this list, but I think the folks at Carnegie Mellon may have considered text-to-speech applications, like a talking GPS navigator, when they compiled the dictionary. I recall the cmudict containing lots of US city names.

[1] https://spoonerizer.appspot.com/

[2] https://en.wikipedia.org/wiki/Spoonerism

[+] ZanyProgrammer|10 years ago|reply
Also missing: TextBlob, which was featured on the HN front page recently.
[+] maresca|10 years ago|reply
Could anyone point me to some sentiment analysis frameworks and/or update the list to include some?