reubenbond | 4 years ago

The OPUS OpenSubtitles corpus was very useful when I was creating this Chinese-English dictionary app: https://github.com/ReubenBond/HanBaoBao. The tool that builds the dictionary database aggregates several sources. One of them is Chinese subtitles, which it processes for word frequency; those frequencies inform the most likely cuts when performing word segmentation.
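
To illustrate the idea of letting word frequency pick the most likely cuts, here is a minimal sketch in Python. It assumes a simple unigram model solved with dynamic programming, which may well differ from what HanBaoBao actually does. The function names (`build_log_probs`, `segment`), the `max_len` and `unknown_penalty` parameters, and the toy counts are all hypothetical, not taken from the app or the corpus.

```python
import math


def build_log_probs(freq):
    """Turn raw word counts into log-probabilities (unigram model)."""
    total = sum(freq.values())
    return {word: math.log(count / total) for word, count in freq.items()}


def segment(text, log_probs, max_len=4, unknown_penalty=-20.0):
    """Split text into the word sequence with the highest total log-probability.

    max_len and unknown_penalty are illustrative assumptions: candidate words
    are at most max_len characters long, and a single character missing from
    the frequency table is allowed at a fixed low score so that every input
    can still be segmented.
    """
    n = len(text)
    best = [float("-inf")] * (n + 1)  # best[i]: best score for text[:i]
    back = [0] * (n + 1)              # back[i]: start index of the last word
    best[0] = 0.0

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in log_probs:
                score = log_probs[word]
            elif len(word) == 1:
                score = unknown_penalty
            else:
                continue  # unknown multi-character span: not a candidate
            candidate = best[start] + score
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Walk the back-pointers from the end to recover the chosen cuts.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]


# Toy counts standing in for frequencies gathered from subtitles (made up).
toy_counts = {"我": 50, "喜欢": 30, "喜": 2, "欢": 1, "中文": 20, "中": 10, "文": 5}
toy_log_probs = build_log_probs(toy_counts)
print(segment("我喜欢中文", toy_log_probs))
```

With these toy counts, the frequent two-character words score higher than the same characters cut individually, so the segmenter prefers them. A real frequency table would come from counting words across the subtitle text.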
