zserge | 2 years ago
Would it be possible to train an LLM from scratch that speaks Toki Pona? A 120-word dictionary over a reduced alphabet would mean a tiny number of possible tokens, so in theory the model could be smaller than the ones used in the TinyStories experiment (which used simplified, almost childlike English). Maybe even a local machine would be enough to train it. I wonder whether a large enough Toki Pona dataset exists, or whether there is a sensible way to synthesize one? I'm no expert in LLMs or Toki Pona, though.
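
A back-of-the-envelope sketch of the vocabulary and model-size argument, in Python. The tokenizer split, the helper names, and the model dimensions (d=64, 2 layers, 256-token context) are assumptions picked for illustration, not a tested configuration; the ~120-137 word count depends on which Toki Pona word list you use.

```python
import re

def tokenize(text: str) -> list[str]:
    # Toki Pona uses a 14-letter Latin alphabet and whitespace-separated
    # words, so a simple word/punctuation split is enough.
    return re.findall(r"[A-Za-z]+|[.,!?:]", text)

def build_vocab(corpus: str) -> dict[str, int]:
    # The dictionary is closed (~120-137 words depending on the word list),
    # so even with punctuation and special tokens a word-level vocabulary
    # stays around 128-200 entries.
    specials = ["<pad>", "<bos>", "<eos>", "<unk>"]
    tokens = sorted(set(tokenize(corpus)))
    return {tok: i for i, tok in enumerate(specials + tokens)}

def gpt_param_count(vocab: int, d: int, n_layer: int, n_ctx: int) -> int:
    # Rough decoder-only transformer count: token + position embeddings,
    # plus ~12*d^2 per layer (4*d^2 for attention projections, 8*d^2 for
    # an MLP with 4x expansion), ignoring biases and layer norms, with the
    # output head tied to the embedding matrix.
    return vocab * d + n_ctx * d + n_layer * 12 * d * d

corpus = "toki! toki pona li pona tawa mi."
print(len(build_vocab(corpus)), "tokens in this toy vocabulary")

# With a full ~128-token vocabulary and modest dimensions, the model sits
# well under a million parameters -- plausibly laptop-trainable:
print(gpt_param_count(vocab=128, d=64, n_layer=2, n_ctx=256))  # 122880
```

The striking part is that the token embeddings, which dominate the parameter count in large-vocabulary models, shrink to almost nothing here (128 x 64 = 8K parameters), so nearly the whole budget goes to the transformer layers themselves.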