(no title)

keyboard_smash | 2 years ago

As with any language, there’s gonna be a lot of context-dependent words and phrases with multiple meanings that are hard to segment, parse, or translate correctly. Tools like DeepL or Google Translate take segmentation and grammar probabilities into account (or use ICU libraries), but that’s much harder to do when all you have to work with is ligatures and basic font-engine features.

The classic example is 大麻煩 - is it 大|麻煩 (a big inconvenience), or is it 大麻|煩 (marijuana annoyance)? Is 粉絲 “fan” or “vermicelli”? Is 早唞! “good night!” or “go fuck yourself!”?
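
A rough sketch of what probability-based segmentation looks like for the 大麻煩 case. This is not how DeepL, Google Translate, or ICU actually do it. The dictionary and its probabilities are invented for illustration, and the function names are hypothetical. It enumerates every split the toy dictionary allows and picks the one with the highest summed log-probability.

```python
import math

# Toy unigram probabilities. These numbers are made up for illustration
# and do not come from any real corpus or library.
TOY_PROBS = {
    "大": 0.05,
    "麻煩": 0.01,
    "大麻": 0.0005,
    "煩": 0.002,
    "麻": 0.001,
}


def segmentations(text, vocab):
    """Yield every way to split `text` into words found in `vocab`."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in vocab:
            for rest in segmentations(text[end:], vocab):
                yield [head] + rest


def score(words, probs):
    """Sum of log-probabilities, i.e. a naive unigram model."""
    return sum(math.log(probs[w]) for w in words)


def best_segmentation(text, probs):
    """Return the candidate split with the highest unigram score."""
    return max(segmentations(text, probs), key=lambda ws: score(ws, probs))


candidates = list(segmentations("大麻煩", TOY_PROBS))
best = best_segmentation("大麻煩", TOY_PROBS)
print(candidates)
print(best)
```

With these toy numbers the 大|麻煩 reading wins, but only because the dictionary says 麻煩 is common. A unigram model like this has no idea about the surrounding sentence, which is exactly the context a font engine doesn't have either.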


dumbotron|2 years ago

Marijuana literally means "big numb" in Chinese? Or "big horse" if I mispronounce it?

keyboard_smash|2 years ago

Haha, yeah, works that way for Cantonese and Mandarin. 大麻 daai6 maa4/dà má (marijuana, literally ‘big numb’) vs 大馬 daai6 maa5/dà mǎ (big horse).

Bonus: 大媽 daai6 maa1/dà mā (auntie or father’s elder brother’s wife, literally ‘big mother’).

eloisius|2 years ago

麻 isn’t always numb. It’s also a generic term for all kinds of flax or fibrous plants like hemp. For example, sesame is 芝麻.