top | item 41199200

mcbetz | 1 year ago

Has anyone successfully worked with non-English text with FTS5 in SQLite? I could not find any reference for German, for example, and the default stemming does not seem to work properly (based on some short tests).

sgbeal|1 year ago

> Has anyone successfully worked with non-English text with FTS5 in SQLite? I could not find any reference for German, for example,

We use it in the Fossil SCM project and users have reported success with Chinese and Russian, so it presumably works fine with any European/Germanic language.

> and the default stemming does not seem to work properly (given some short tests).

The Porter Stemmer is documented as only being useful for English.
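
For anyone who wants to check this locally, here is a minimal sketch using Python's built-in sqlite3 module. It assumes your SQLite build has FTS5 compiled in, which is common but not guaranteed. The helper `search` and the sample sentences are made up for illustration. It compares the `porter` tokenizer (wrapping `unicode61`) on an English and a German inflection, and then shows what plain `unicode61` does with an umlaut.

```python
import sqlite3

def search(tokenize, docs, query):
    """Index docs in a throwaway FTS5 table and return the bodies matching query."""
    con = sqlite3.connect(":memory:")
    # The tokenizer spec is part of the table definition, so it is interpolated.
    con.execute(f"CREATE VIRTUAL TABLE t USING fts5(body, tokenize = '{tokenize}')")
    con.executemany("INSERT INTO t(body) VALUES (?)", [(d,) for d in docs])
    # Double-quote the term so FTS5 parses it as a plain string, not query syntax.
    rows = con.execute("SELECT body FROM t WHERE t MATCH ?", (f'"{query}"',)).fetchall()
    con.close()
    return [r[0] for r in rows]

# English: the Porter stemmer reduces "running" and "run" to the same stem.
english = search("porter unicode61", ["The dogs were running"], "run")

# German: "Haus" and "Häuser" do not end up with a common stem.
german = search("porter unicode61", ["Das Haus ist alt"], "Häuser")

# No stemming: unicode61 still case-folds and strips diacritics by default.
folded = search("unicode61", ["Die Häuser sind alt"], "hauser")

print(english, german, folded)
```

So tokenizing and diacritic folding work for German text out of the box, but matching inflected forms needs a stemmer that knows the language.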

djhn|1 year ago

It has pretty much the same support for other languages as most text mining tools and Elasticsearch via the snowball stemmer: https://github.com/abiliojr/fts5-snowball

It should work well for German; I’m using it with Nordic languages.