MLnick | 15 years ago

I wonder what exactly a "human-based" algorithm is. How do you avoid using stats, NLP, ML, and/or network-based approaches at that scale?

elblanco|15 years ago

A very large part of their strategy is crowdsourcing named entity extraction. Basically, as you read a report, you highlight and tag all the entities of interest. This yields incredibly high-quality entities and relationships, mapped onto an enterprise ontology.

The problem is that it scales incredibly poorly. Imagine highlighting every person, phone number, location, etc. in a document, then linking them all together manually. Now imagine having a few million documents to do that on. It's been an enormous problem at every site where I've seen it deployed.
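
To put rough numbers on that, here is a minimal sketch of the tag-then-link workflow. It is not the product's actual data model or API: `Entity`, `tag`, `candidate_links`, and `manual_actions` are hypothetical names, the sample report is made up, and the 10-entities-per-document figure is an assumption chosen only to show how the work grows.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Entity:
    kind: str   # e.g. "person", "phone", "location"
    text: str   # the span the reader highlighted
    start: int  # character offset of the span in the document


def tag(document: str, kind: str, text: str) -> Entity:
    """Record one manual highlight (raises ValueError if the span is absent)."""
    return Entity(kind, text, document.index(text))


def candidate_links(entities):
    """Every pair of entities a reader might have to consider linking by hand."""
    return list(combinations(entities, 2))


def manual_actions(docs: int, entities_per_doc: int) -> int:
    """Worst case: one highlight per entity plus one decision per entity pair."""
    pairs = entities_per_doc * (entities_per_doc - 1) // 2
    return docs * (entities_per_doc + pairs)


# Made-up report text, for illustration only.
report = "Alice called 555-0100 from Springfield."
entities = [
    tag(report, "person", "Alice"),
    tag(report, "phone", "555-0100"),
    tag(report, "location", "Springfield"),
]
pairs = candidate_links(entities)

print(len(pairs))                     # 3 pairs for 3 entities
print(manual_actions(1_000_000, 10))  # 55000000 under the assumed figures
```

The highlights grow linearly with document length, but the pairs to consider grow quadratically, which is why the linking step is the part that hurts. In practice a reader would only link a fraction of the pairs, so treat the total as an upper bound under these assumptions.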

jauer|15 years ago

When I was playing with it, it seemed like a really handy way for a human to sift through a lot of data. Possibly that's where the human-based spin is coming from.

If you want to try it, they have a training instance online at https://www.optradestop.com/