Joe did not buy a car today.
He was in a buying mood.
But all cars were too expensive.
Why didn't Joe buy a car?
Answer: buying mood
I think I have seen similar systems for decades now. I thought we would be further along by now.
I have tried for 10 or 20 minutes now, but I can't find any evidence that it has much sense of syntax:
Paul gives a coin to Joe.
Who received a coin?
Answer: Paul
All it seems to do is extract candidates for "who", "what", "where" etc. So it seems to figure out correctly that "Paul" is a potential answer for "Who".
No matter how I rephrase the "Who" question, I always get "Paul" as the answer. "Who? Paul!", "Who is a martian? Paul!", "Who won the summer olympics? Paul", "Who got a coin from the other guy? Paul!"
Same for "what" questions:
Gold can not be carried in a bag. Silver can.
What can be carried in a bag?
Answer: Gold
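The failure mode described above can be illustrated with a toy sketch (this is emphatically not the demo's actual code, and the passage, lexicons, and function are all made up for illustration): a QA heuristic that ignores syntax entirely, maps the question word to an entity type, and returns the first matching candidate from the passage. Such a system answers "Paul" to every "who" question, no matter how the question is phrased.

```python
# Toy illustration of answering by candidate extraction alone,
# with no syntactic analysis of the question or the passage.

PASSAGE = "Paul gives a coin to Joe."

# Crude lexicons standing in for a named-entity tagger (made up).
PERSON_NAMES = {"Paul", "Joe"}
THING_NOUNS = {"coin", "bag", "gold", "silver", "car"}

def naive_answer(passage: str, question: str) -> str:
    """'who' -> first person mentioned, 'what' -> first thing mentioned.
    The verb and its arguments are never inspected."""
    tokens = passage.replace(".", "").split()
    q = question.lower()
    if q.startswith("who"):
        candidates = [t for t in tokens if t in PERSON_NAMES]
    elif q.startswith("what"):
        candidates = [t for t in tokens if t.lower() in THING_NOUNS]
    else:
        candidates = tokens
    return candidates[0] if candidates else "?"

# Every rephrasing of the "who" question yields the same answer,
# because word order in the question is never used.
for question in ["Who received a coin?", "Who is a martian?",
                 "Who got a coin from the other guy?"]:
    print(question, "->", naive_answer(PASSAGE, question))
# Each line prints "-> Paul".
```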
Sadly, the NLP world is full of hot air. I've seen so many companies get funding for complete "written by a 12-year-old" dogshit "industry-leading IP", it's not even funny anymore.
The hype has gone down and some are actually doing great work, but 90% of the people who say they do NLP/AI stuff don't even fundamentally understand what NLP/AI is.
All of the above require fairly complex world knowledge as well as an explicit representation of a scene. There is minimal leverage for lexical distributional statistics in these cases—arguably the one thing we have had major success in using (e.g. building vector space word representations, like Word2Vec; finding the highest probability parse tree for an utterance).
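The "vector space word representations" mentioned above boil down to representing each word as a vector and comparing words by cosine similarity. A minimal sketch, using made-up 3-d toy vectors rather than real learned Word2Vec embeddings (which are trained from co-occurrence statistics and typically have hundreds of dimensions):

```python
import math

# Toy "word vectors", invented for illustration only.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "coin":  [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

print(cosine(vectors["king"], vectors["queen"]))  # close to 1.0
print(cosine(vectors["king"], vectors["coin"]))   # much lower
```

This kind of pairwise similarity is exactly the "lexical distributional statistics" leverage the comment refers to; it says nothing about scenes or world knowledge, which is why the demo's QA examples are hard.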
The difference with newer NN-based systems is that they are trained end-to-end and learn the syntax and some form of "reasoning" themselves. Check out Memory Networks, by Facebook, for example (two components, one for "reasoning" and one for storing long-term data; quite impressive).
Now, it's still an area of active research... and I'm not sure what "state-of-the-art" means for this library; somebody said that it ranks 27th on some commonly used dataset.
According to the website they use the BiDAF model, which as a single model does not produce state-of-the-art results on the SQuAD benchmark. It is ranked 27th here: https://rajpurkar.github.io/SQuAD-explorer/
This is very brittle: it works really well on the pre-canned examples, but it seems tightly tied to their vocabulary. It doesn't handle something as simple as:
'the patient had no pain but did have nausea'
Doesn't yield anything helpful on semantic role labeling, and didn't even parse on machine comprehension. If I vary it to ask 'did the patient have pain?', the answer is 'nausea'.
CoreNLP provides much more useful analysis of the phrase structure and dependencies.
In "Adversarial Examples for Evaluating Reading Comprehension Systems" https://arxiv.org/abs/1707.07328, it was found that adding a single distracting sentence can lower the F1 score of BiDAF (the model used in the demo here) from 75.5% to 34.3% on SQuAD. By comparison, human performance only drops from 92.6% to 89.2%.
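For context on those numbers: SQuAD's F1 metric scores a predicted answer span by its token overlap with the gold answer. A simplified sketch (the official evaluation script additionally normalizes case, punctuation, and articles before comparing; this version only lowercases):

```python
from collections import Counter

def squad_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(squad_f1("in the bag", "the bag"))  # 0.8: partial credit for overlap
print(squad_f1("nausea", "no pain"))      # 0.0: a distracted answer scores nothing
```

Because partial overlap still earns credit, a drop from 75.5 to 34.3 F1 means the adversarial sentence is pulling the model's answer span almost entirely away from the gold answer.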
Different set of tasks. spaCy is focused on bread-and-butter tasks like tokenization, part-of-speech tagging, and dependency parsing (not to say that these are easy, but they are things people have been working on for a long time). AllenNLP seems focused on distributing relatively recent neural models (from the last few years) for more complex language understanding, like labeling semantic roles (agents, patients, etc.) and identifying textual entailments (i.e., mining facts from a sentence). It is not great at these tasks, because this is very difficult and a very active area of ongoing research.
TekMol | 8 years ago
galenko | 8 years ago
glup | 8 years ago
senatorobama | 8 years ago
halflings | 8 years ago
msamwald | 8 years ago
rubyfan | 8 years ago
make3 | 8 years ago
mamp | 8 years ago
sanxiyn | 8 years ago
andrew3726 | 8 years ago
vbuwivbiu | 8 years ago
"what is the fifth word in that sentence?"
Answer: squid
strin | 8 years ago
wyldfire | 8 years ago
glup | 8 years ago