bisonbear | 2 months ago

As a fellow Mandarin learner, this is super cool! It intuitively makes a lot of sense for the "full immersion" component of language learning. I love seeing exciting uses of AI like this instead of just more slop generation :)

I haven't dug into the GitHub repo yet, but I'm curious whether by "guided decoding" you mean logit bias (which I use) or actual token blocking. I'd be interested to know how this works technically.

(shameless self-plug) I've actually been solving a similar problem for Mandarin learning, but from the comprehensible input side rather than the dictionary side:

https://koucai.chat - basically AI Mandarin penpals that write at your level

My approach uses logit bias to generate n+1 comprehensible input, essentially by artificially raising the probability of the tokens that correspond to the user's vocabulary. Notably, I didn't add a "regeneration loop" (otherwise there would be no +1 in n+1), but I think it's a good idea.
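
For anyone unfamiliar with the distinction, here is a minimal sketch of logit bias versus token blocking on toy logits. It is not the koucai.chat code or the code from the repo under discussion: the function names, the bias value of 2.0, and the token ids are all hypothetical, and a real setup would apply this to a model's output logits at each decoding step.

```python
import math

def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bias_logits(logits, known_ids, bias=2.0):
    """Soft steering: add a constant to the logits of known-vocabulary tokens.

    Unknown tokens stay possible, just less likely, which is what leaves
    room for the "+1" in n+1. The default bias is an arbitrary assumption.
    """
    return [x + bias if i in known_ids else x for i, x in enumerate(logits)]

def block_logits(logits, allowed_ids):
    """Hard constraint: tokens outside the allowed set get probability zero."""
    if not any(i in allowed_ids for i in range(len(logits))):
        raise ValueError("at least one token must be allowed")
    return [x if i in allowed_ids else float("-inf") for i, x in enumerate(logits)]

# Toy example: four tokens with equal logits; ids 0 and 1 are "known" words.
toy_logits = [1.0, 1.0, 1.0, 1.0]
known = {0, 1}
soft_probs = softmax(bias_logits(toy_logits, known))
hard_probs = softmax(block_logits(toy_logits, known))
```

With the soft version the unknown tokens keep a small but nonzero probability, while the hard version removes them entirely, so the learner would never see a new word.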

I'm really curious about the grammar issues you mentioned. I also experimented with the idea of an AI-enhanced dictionary (the free Chinese-English dictionary I have lacks good examples) but found that the generated output didn't meet my quality standards. Have you found any models that handle measure words reliably?
