Show HN: My AI writing assistant for Chinese
74 points | dwilding | 2 years ago | unscrambler.dpw.me
Really, the fundamental benefit of the tool is that it encourages me to exercise the writing muscle, rather than defaulting to translating from English.
My goal was to build something that is fast, relatively inexpensive, and not prone to misleading people. After some experimentation, I found that pairing GPT-3.5 with Microsoft machine translation works great! I'm using GPT-3.5 to generate an English version of the input, then Microsoft machine translation to translate the English version into Chinese. I've written a blog post about how I arrived at this approach: https://maybecoding.bearblog.dev/my-attempt-at-an-ai-writing...
The tool is pretty basic from a technical standpoint. There's a PHP backend that handles the API calls to OpenAI and Microsoft. The frontend is plain HTML with a bit of JS for interactivity - but the whole thing works fine without JS.
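For readers who want to picture the two-stage flow described above, here is a minimal Python sketch. It is not the actual PHP backend: the function `unscramble`, the `TextStep` type, and the two stand-in steps are hypothetical names introduced for illustration, and the real calls to OpenAI and Microsoft are replaced with stubs so the example runs offline.

```python
from typing import Callable, Dict

# Hypothetical type: any function that takes text and returns text.
TextStep = Callable[[str], str]


def unscramble(draft: str, to_english: TextStep, to_chinese: TextStep) -> Dict[str, str]:
    """Run a rough Chinese draft through the two stages from the post."""
    draft = draft.strip()
    if not draft:
        raise ValueError("draft is empty")
    english = to_english(draft)    # stage 1: an LLM (GPT-3.5 in the post) writes an English version
    chinese = to_chinese(english)  # stage 2: machine translation turns the English into Chinese
    return {"draft": draft, "english": english, "chinese": chinese}


# Stand-in steps (assumptions) so the sketch needs no network access or API keys.
def fake_llm(text: str) -> str:
    return "I want to drink fruit juice."


def fake_mt(text: str) -> str:
    return "我想喝果汁。"


result = unscramble("我 want 喝果汁", fake_llm, fake_mt)
print(result["english"])
print(result["chinese"])
```

Passing the two steps in as plain functions keeps the pipeline itself independent of any particular provider, so either stage could be swapped for a different model.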
Hopefully other folks find this useful too. I'm interested to know whether I could have approached this problem in a different/better way. Thanks!
danielscrubs|2 years ago
dwilding|2 years ago
GaggiX|2 years ago
I'm curious whether Claude models (now that version 3 has been released) work well with Chinese too (beyond Microsoft Translator). Also, there are several good open source bilingual models like Qwen.
dwilding|2 years ago
According to my wife (a Chinese native speaker), GPT-3.5 is bad at outputting Chinese but GPT-4 does a good job
matthewfelgate|2 years ago
Quite often I find myself in that situation where I know most of a sentence but not how to put it together.
dwilding|2 years ago
I’ve found it very helpful for staying in the zone and getting better at “thinking” in Chinese
If you start learning Chinese again in the future, I hope you find it useful too!
zhifenzhu|2 years ago
bulubulu|2 years ago
dwilding|2 years ago
So in the end I decided that I can ask GPT-4 to explain alternative structures based on the translation I get
Ideally, I would like to find a reliable way to directly transform my first attempt at an expression into a native-level expression, without necessarily routing via English. It didn’t seem doable with the LLMs I've tried so far, but there are others I can try, and maybe some training would help too - as you say
BohuTANG|2 years ago
服了u: Correctly translates to "I give up on you." Unscramble: Incorrectly interpreted as "You impress me."
- [1] https://chat.openai.com/g/g-tI0XLZxuR-fu-liao-u
dwilding|2 years ago
unignorant|2 years ago
I've found creative writing in a target language is great for learning.
dwilding|2 years ago
I also use GPT-4 for explaining the meaning of sentences in more detail (as in JimDabell’s comment). Often my questions are like “how would a native speaker say this colloquially” - I’ve found it really valuable to be able to have a back-and-forth on why something works the way it does
iansinnott|2 years ago
dwilding|2 years ago
JimDabell|2 years ago
For instance:
> > I’m learning Mandarin, can you break this sentence into individual hanzi and explain it? 果汁很好喝
> Certainly! Let's break down the sentence "果汁很好喝" into individual hanzi (Chinese characters) and explain what each character and the whole sentence mean.
> - 果 (guǒ) - Fruit
> - 汁 (zhī) - Juice
> - 很 (hěn) - Very; quite
> - 好 (hǎo) - Good
> - 喝 (hē) - to drink
> So, when you put these characters together:
> - 果汁 (guǒ zhī) means "fruit juice."
> - 很 (hěn) is an adverb that intensifies the adjective that follows. When used before an adjective, it indicates a high degree of something, commonly translated as "very" or "quite" in English.
> - 好喝 (hǎo hē) combines "good" (好 hǎo) and "to drink" (喝 hē), which together mean "tasty" or "good to drink."
> So, the sentence 果汁很好喝 translates to "The fruit juice is very tasty" or more directly, "Fruit juice is very good to drink." It's a simple sentence structure expressing that the speaker finds the fruit juice to be quite delicious.
It does idioms too:
> > I’m learning Mandarin, can you break this sentence into individual hanzi and explain it? 吃醋
> Sure, I can help with that. The sentence "吃醋" consists of two characters: "吃" and "醋".
> - 吃 (chī): This character means "to eat" or "to consume".
> - 醋 (cù): This character means "vinegar".
> However, when put together as "吃醋", the phrase doesn't literally mean "eat vinegar". In Chinese, "吃醋" is an idiomatic expression that means to be jealous or envious, usually in the context of a romantic relationship. It can be translated to feeling "sour" over someone's interaction with others, hence the metaphor of "eating vinegar". This phrase showcases how Mandarin often uses vivid imagery to encapsulate emotions and situations.
Even going down to a locally running Mistral 7b gives pretty good results, although it sometimes makes minor mistakes or doesn’t give as thorough an explanation as the larger models. Still very useful for my stage though, where I’m mostly just trying to remember something that isn’t quite fully memorised yet, or exploring what’s adjacent to my current knowledge.
alisonatwork|2 years ago
It just does a dumb greedy search to try to find the longest chain of characters for the translation, but that turns out to work more often than not:
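As a rough illustration of the greedy longest-match idea (not the commenter's actual tool), here is a small Python sketch. The function name `greedy_segment`, the `max_len` parameter, and the toy dictionary are assumptions made for the example; the glosses are borrowed from the 果汁很好喝 breakdown earlier in the thread.

```python
def greedy_segment(text, dictionary, max_len=4):
    """Split text into the longest dictionary entries, scanning left to right."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1  # a space never belongs to an entry, so it forces a break
            continue
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if size == 1 or chunk in dictionary:
                # Unknown single characters are kept with a "?" gloss.
                tokens.append((chunk, dictionary.get(chunk, "?")))
                i += size
                break
    return tokens


# Toy dictionary (assumption), far smaller than any real word list.
toy_dict = {
    "果汁": "fruit juice",
    "果": "fruit",
    "汁": "juice",
    "很": "very",
    "好": "good",
    "喝": "to drink",
    "好喝": "tasty",
}

for word, gloss in greedy_segment("果汁很好喝", toy_dict):
    print(word, gloss)
```

With this toy dictionary, inserting a space, as in `"果 汁很好喝"`, makes the function look up 果 and 汁 separately instead of matching 果汁 as one word.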
I wanted to add different strategies on -s like dict(1), but I never got round to it because it was easy enough to just force breaking the word by adding spaces:
Now that I live in Taiwan it might be neat to update it to output traditional characters, but to be honest I think simplified is easier to read and learn for foreigners.
dwilding|2 years ago
One feature I use a lot is the clipboard reader, which lets you step through Chinese text character by character and shows you the meaning of each character. It can be faster than using an LLM, but you don’t get the whole-sentence meaning or the ability to ask follow-up questions like you can with an LLM. But it’s a great app all round (some features are paid)
[1] https://www.pleco.com/
peterburkimsher|2 years ago
modeless|2 years ago
abdusco|2 years ago
dwilding|2 years ago
gfiorav|2 years ago
dwilding|2 years ago