I plan a deeper dive into text mining this year and am looking for suggestions on the best resources. A friend suggested Text Mining by Weiss et al.: http://www.springer.com/computer/database+management+%26+information+retrieval/book/978-0-387-95433-2
What would you suggest?
helwr | 16 years ago
Information Retrieval (Manning)
Text Compression (Bell)
Natural Language Processing (Manning)
Natural Language Understanding (Allen)
Speech and Language Processing (Jurafsky)
The Text Mining Handbook (Sanger)
Statistical Machine Translation (Koehn)
Data-Intensive Text Processing with MapReduce (Lin)
Algorithms on strings (Gusfield)
Jewels of Stringology (Crochemore)
Regular Expressions (Friedl), also: http://swtch.com/~rsc/regexp/regexp1.html and automata theory (Hopcroft)
Practical Text Mining with Perl (Bilisoly)
Natural Language Processing with Python (Bird)
Computational Linguistics (Hausser)
Syntactic Structures (Chomsky)
Also check out these links: http://measuringmeasures.blogspot.com/2010/01/learning-about...
http://measuringmeasures.com/blog/2010/3/12/learning-about-m...
http://www.cs.technion.ac.il/~gabr/resources/resources.html
dejv | 16 years ago
The first two lectures are a great introduction to this topic; the third is also related, but not necessary.
If you want to dive deeper into more advanced material, I recommend looking at conditional random fields, which are more or less the state of the art in this field right now.
Great tutorial: http://www.cs.umass.edu/~mccallum/papers/crf-tutorial.pdf
Wiki entry: http://en.wikipedia.org/wiki/Conditional_random_field
mindcrime | 16 years ago
Mining The Talk: http://www.amazon.com/Mining-Talk-Unlocking-Unstructured-Inf...
Text Mining Application Programming: http://www.amazon.com/Text-Mining-Application-Programming/dp...
Introduction to Information Retrieval (available freely online): http://nlp.stanford.edu/IR-book/information-retrieval-book.h...
vark | 16 years ago
- Data Mining Book by Jiawei Han et al
- Managing Gigabytes by Witten et al
- Hypertext Mining book by Chakrabarti