ChrisFulstow's comments

ChrisFulstow | 15 years ago | on: Full Text Search in Mongo

Tokenization was a simple string split on whitespace, with no stemming. It was quite a large Mongo dataset, so only a fraction of the index and data would've been in memory; it could easily have been quicker for a smaller dataset living entirely in memory. For me, one of the benefits of Lucene is its powerful built-in query parsing, tokenization, analyzers, and so on.
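For illustration, a minimal sketch of that kind of tokenization (names here are hypothetical, not from the original experiment): a plain lowercase split on whitespace, with no stemming, so inflected forms like "searching" and "search" remain distinct terms.

```python
def tokenize(text):
    """Naive tokenizer: lowercase, split on whitespace, no stemming."""
    return text.lower().split()

tokens = tokenize("Full Text Search in Mongo")
# Without a stemmer, "searching" and "search" index as different terms,
# which is one reason an analyzer pipeline like Lucene's helps.
```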

ChrisFulstow | 15 years ago | on: Full Text Search in Mongo

This isn't really supposed to be a proper full-text index feature; instead, it builds a rudimentary inverted index using a string array property. MongoDB can create (multikey) indexes over array properties, which is very cool and improves performance to an extent. But even with an index, this approach to full-text search was much slower for me than an equivalent search against the same data in Lucene.
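A rough sketch of the approach described above, with MongoDB simulated by an in-memory list so it's self-contained (the field name `_keywords` and the helper names are illustrative, not from the original): each document stores its distinct tokens in an array property, and a search is a membership test against that array, which is what a multikey index would accelerate.

```python
def index_document(doc):
    """Store the document's distinct lowercase tokens in an array field."""
    doc["_keywords"] = sorted(set(doc["body"].lower().split()))
    return doc

# Stand-in for a MongoDB collection with a multikey index on _keywords.
docs = [
    index_document({"_id": 1, "body": "full text search in Mongo"}),
    index_document({"_id": 2, "body": "Lucene query parsing"}),
]

def find(keyword):
    # Roughly equivalent to db.docs.find({"_keywords": keyword}).
    return [d["_id"] for d in docs if keyword in d["_keywords"]]
```

This only supports exact single-term matches; phrase queries, ranking, and stemming are where a dedicated engine like Lucene pulls ahead.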

I'd love to see a MongoDB component that replicates data from the oplog to a dedicated full-text store like Lucene or Solr.