Extracting Meaning from Millions of Pages

75 points | jaybol | 14 years ago | technologyreview.com

11 comments

lazyjeff | 14 years ago

I work for the professor from the article (but not on TextRunner).

We're working on extracting meaning from reviews as well: http://revminer.com/

At the moment, it only has reviews of Seattle places (restaurants, hotels, etc.), but we're moving it to mobile. It's written using node.js and socket.io; I'd be interested in hearing any feedback.

agotterer | 14 years ago

Is it also open source?

andreasvc | 14 years ago

I'd say you're extracting information, not meaning.

acak | 14 years ago

From the article: "For example, to find the names of people who are CEOs within millions of documents, you'd first need to train the software with other examples, such as 'Steve Jobs is CEO of Apple, Sheryl Sandberg is CEO of Facebook.'"

Sheryl Sandberg? Deliberate or honest mistake? :-]

antimora | 14 years ago

Looks like the directory index was left open. http://textrunner.cs.washington.edu/

timr | 14 years ago

It's open source. Just download it:

http://reverb.cs.washington.edu/

mark_l_watson | 14 years ago

Awesome: code released under the GPL, with several data sets. Good to see this project (which has been under development for a long time) releasing technology for other people to use.

abhaga | 14 years ago

Read the Web at CMU is a similar system: http://rtw.ml.cmu.edu/rtw/

DallaRosa | 14 years ago

Hasn't this been out for, like, a long time?