But only for the whole generation. So if you want to constrain output one token at a time (as you would to force it to follow a grammar), you have to make a fresh call for each token and request only one token per call, which is more or less impractical if you want true guarantees. A few months ago I built this anyway to suss out how much more expensive it was [1].

[1] https://github.com/newhouseb/clownfish#so-how-do-i-use-this-...
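To make the cost concrete, here is a rough sketch of that per-token loop. It is not the code from the linked repo. `fake_model`, `allowed_next`, `VOCAB`, and the toy grammar are all hypothetical stand-ins: `fake_model` plays the role of one remote call that accepts a bias map and returns exactly one token, and the grammar just forces a fixed five-token sequence.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"a"', ":", "1", "hello"]


def fake_model(prefix: List[str], logit_bias: Dict[str, float]) -> str:
    """Stand-in for ONE remote call that returns a single token.

    The unbiased "model" always prefers "hello", so any grammar-conforming
    output below is due to the bias alone.
    """
    base = {"hello": 5.0, "{": 1.0, "}": 1.0, '"a"': 1.0, ":": 1.0, "1": 1.0}
    scores = {tok: s + logit_bias.get(tok, 0.0) for tok, s in base.items()}
    return max(scores, key=scores.get)


def allowed_next(prefix: List[str]) -> Set[str]:
    """Toy grammar (assumption): the only valid output is { "a" : 1 }."""
    target = ["{", '"a"', ":", "1", "}"]
    return {target[len(prefix)]} if len(prefix) < len(target) else set()


def constrained_generate(
    model: Callable[[List[str], Dict[str, float]], str],
    grammar: Callable[[List[str]], Set[str]],
) -> Tuple[List[str], int]:
    """Generate one token per call, recomputing the bias before every call."""
    prefix: List[str] = []
    calls = 0
    while True:
        allowed = grammar(prefix)
        if not allowed:  # grammar says the output is complete
            break
        # Strongly penalize every token the grammar forbids at this position.
        bias = {tok: (0.0 if tok in allowed else -100.0) for tok in VOCAB}
        prefix.append(model(prefix, bias))
        calls += 1  # one full round trip per generated token
    return prefix, calls


tokens, calls = constrained_generate(fake_model, allowed_next)
print(tokens, calls)
```

The point is the `calls` counter: the number of round trips equals the number of output tokens, and against a real API each of those calls would resend the entire prefix.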