top | item 8383196

llllllllllll | 11 years ago

It can be done quickly and accurately. Check out Peter Norvig's implementation of the Viterbi algorithm for text segmentation.

Segmentation is typically used for languages like Thai and Khmer, whose scripts don't mark word boundaries. I don't know the ins and outs of German word compounding, but it should work for breaking apart compounds too, given a word list of fully inflected (conjugated/declined) forms as input.
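
To make the idea concrete, here is a rough sketch of a Viterbi-style segmenter over a unigram word list. This is not Norvig's code; the word counts are made up for illustration, and the names (COUNTS, segment, log_prob) are my own. A real run would load counts from a large corpus or dictionary.

```python
import math

# Hypothetical toy counts, for illustration only. A real word list would
# come from a corpus and include the inflected forms you expect to see.
COUNTS = {
    "donau": 5, "dampf": 8, "schiff": 20, "fahrt": 15,
    "stau": 10, "becken": 10, "staub": 10, "ecken": 1,
}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = max(len(w) for w in COUNTS)


def log_prob(word):
    """Unigram log-probability of a known word."""
    return math.log(COUNTS[word] / TOTAL)


def segment(text):
    """Return the most probable split of text into known words, or None."""
    n = len(text)
    best = [-math.inf] * (n + 1)  # best[i]: best log-prob of text[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        # Only look back as far as the longest known word.
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            if word in COUNTS and best[start] > -math.inf:
                score = best[start] + log_prob(word)
                if score > best[end]:
                    best[end] = score
                    back[end] = start
    if best[n] == -math.inf:
        return None  # no way to cover the text with known words
    # Walk the back-pointers from the end to recover the words.
    words = []
    end = n
    while end > 0:
        start = back[end]
        words.append(text[start:end])
        end = start
    return words[::-1]


print(segment("donaudampfschifffahrt"))
print(segment("staubecken"))
```

With these toy counts, "staubecken" comes out as stau + becken rather than staub + ecken, because the first split has the higher product of word probabilities. One caveat: a plain unigram model tends to prefer fewer, longer words, so if the full compound is itself in the word list it will usually be returned unsplit. For compound splitting you would probably want the list to hold only the component words.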
