top | item 45947099

(no title)

dtho | 3 months ago

Each token affects the probabilities of subsequent tokens. Let's say you want the model to produce Python code, and you are using a grammar to force JSON output. The model wasn't trained on JSON-serialized Python code. It was trained on normal Python code with real newlines. Wouldn't forcing JSON impair output quality in this case?

discuss

No comments yet.