Pattern - Web Mining Python lib

183 points | interro | 13 years ago | github.com

More info here: http://www.clips.ua.ac.be/pages/pattern

14 comments

languagehacker|13 years ago
Really cool library. I'm excited to take it for a spin! I liked that there was some work done already for Wikipedia. But as a note to people who want to work with Wikipedia data, it's not very hard to abstract your stuff to work with most wikis based on the MediaWiki platform. I've added a pull request to this project that also supports using the hundreds of thousands of wikis on Wikia (https://github.com/clips/pattern/pull/17).
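To illustrate the MediaWiki point, here is a minimal sketch of what "abstracting" can look like: keep the API endpoint as a parameter instead of hard-coding Wikipedia. This is not Pattern's API and not the code from the pull request; the function name `mediawiki_query_url` and the `example.wikia.com` endpoint are hypothetical, and the query parameters are the standard MediaWiki Action API ones for fetching a page's wikitext.

```python
from urllib.parse import urlencode

# Wikipedia's Action API endpoint.
WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"
# Hypothetical endpoint for some other MediaWiki-based wiki (an assumption,
# check where the wiki you target actually serves api.php).
EXAMPLE_WIKI_API = "https://example.wikia.com/api.php"


def mediawiki_query_url(api_base, title):
    """Build a URL that asks a MediaWiki wiki for the wikitext of one page.

    Only api_base differs between wikis; the query parameters are the same.
    """
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


# The same function serves both wikis; only the endpoint changes.
wikipedia_url = mediawiki_query_url(WIKIPEDIA_API, "Web mining")
other_url = mediawiki_query_url(EXAMPLE_WIKI_API, "Main Page")
```

The sketch only builds the request URL; fetching and parsing the JSON response is left out so it runs without network access.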
knes|13 years ago
I'm a big fan of data mining, so I'll make sure to take it out for a spin :) And it's by fellow Belgians, nice!
salimmadjd|13 years ago
This is awesome! Any plans to add other sites, like Amazon, Yelp, TripAdvisor, etc.?
stevejohnson|13 years ago
NB: screen-scraping Yelp is against the TOS and you'll get shut down pretty fast if you try it.
enjo|13 years ago
Better yet: is there a well-defined structure for other folks to add that stuff?
mkumm|13 years ago
This looks pretty interesting. I will give it a go.