top | item 40320951

I've experimented with 30 or so models so far. My general finding is that closed-source models like Claude have GPT-isms, while open-source models have somewhat less of a default tone, but their ability to understand existing worlds is directly tied to how many tokens they were trained on.

Since existing worlds are (currently) where most of the stories are set, it's worth using a closed-source model and wrangling its issues with dialogue.

To its credit, though, Llama 3 is the first OSS model trained on enough tokens to not feel lost in most worlds, so I've started routing some traffic to it for free users.

The output format the site uses is also really hard for most models to follow without fine-tuning, but fine-tuning then causes them to pick up the vocabulary of whichever model's outputs they were fine-tuned on, which is a bit unfortunate.
