newhouseb | 2 years ago
> let's say we had a grammar that had a key "healthy" with values "very_unhealthy" or "moderately_healthy." For broccoli, the LLM might intend to say "very_healthy" and choose "very" but then be pigeonholed into saying "very_unhealthy" because it's the only valid completion according to the grammar.
That said, you can use beam search to more or less solve this problem by evaluating the joint probability of all tokens in each branch of the grammar and picking the one with the highest probability (you might need some more nuance for free-form strings where the LLM can do whatever it wants and be "valid").
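To make the joint-probability idea concrete, here is a minimal sketch, not anyone's actual implementation. It scores every branch exhaustively, which is what beam search approximates when there are too many branches to enumerate. The tokenization, the `toy_logprob` function, and the probabilities in `TOY_PROBS` are all made up for illustration; a real setup would read per-token log-probabilities from the model.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for a model call: returns log P(token | prefix).
LogProbFn = Callable[[Sequence[str], str], float]

# Made-up conditional probabilities, keyed by (prefix, next token).
# The model "wants" to say "very" (it is thinking "very_healthy"),
# but strongly dislikes "_unhealthy" as the continuation.
TOY_PROBS = {
    ((), "very"): 0.6,
    ((), "moderately"): 0.4,
    (("very",), "_unhealthy"): 0.05,
    (("moderately",), "_healthy"): 0.9,
}


def toy_logprob(prefix: Sequence[str], token: str) -> float:
    """Look up the toy probability; unseen pairs get a tiny floor value."""
    return math.log(TOY_PROBS.get((tuple(prefix), token), 1e-9))


def branch_logprob(tokens: List[str], logprob_fn: LogProbFn) -> float:
    """Joint log-probability of a branch: sum of per-token log-probs."""
    return sum(logprob_fn(tokens[:i], tok) for i, tok in enumerate(tokens))


def pick_branch(branches: Dict[str, List[str]], logprob_fn: LogProbFn) -> str:
    """Return the grammar branch with the highest joint probability."""
    return max(branches, key=lambda name: branch_logprob(branches[name], logprob_fn))


# The two values the grammar allows, split into (assumed) tokens.
BRANCHES = {
    "very_unhealthy": ["very", "_unhealthy"],
    "moderately_healthy": ["moderately", "_healthy"],
}

# Greedy decoding commits to the most likely first token and gets stuck.
greedy_first = max(["very", "moderately"], key=lambda t: toy_logprob([], t))
best = pick_branch(BRANCHES, toy_logprob)
print(greedy_first, best)
```

With these toy numbers, greedy decoding commits to "very" (0.6 vs 0.4), while the joint score prefers "moderately_healthy" (0.4 × 0.9 = 0.36 vs 0.6 × 0.05 = 0.03). Free-form strings have no finite branch list to enumerate, which is where the extra nuance comes in.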
IanCal | 2 years ago
My gut feeling is that taking the output and fixing it if it's broken would give a better result; you could even then completely limit the output to only valid JSON. For your example, if it wrote "very_healthy" and was given an error message explaining that this wasn't an option and that it had to choose from "very_unhealthy" or "moderately_healthy", I would expect a halfway decent model to pick "moderately_healthy".
This has the benefit of allowing you to use a more powerful model for reasoning (like GPT-4) and a local model, where you can do this kind of token probability manipulation, just for fixing the data.
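A rough sketch of that validate-then-fix loop, under some assumptions: the "healthy" key and its two allowed values come from the example above, while `validate`, `repair`, and `toy_fixer` are hypothetical names. `toy_fixer` stands in for the second (local, possibly grammar-constrained) model and just returns a canned answer here.

```python
import json
from typing import Callable, Optional

# Allowed values for the "healthy" key, taken from the example grammar.
ALLOWED = ("moderately_healthy", "very_unhealthy")


def validate(raw: str) -> Optional[str]:
    """Return an error message for the model, or None if raw is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict) or "healthy" not in data:
        return 'missing key "healthy"'
    if data["healthy"] not in ALLOWED:
        options = " or ".join(f'"{v}"' for v in ALLOWED)
        return f'"healthy" must be {options}, got {data["healthy"]!r}'
    return None


def repair(raw: str, fix_fn: Callable[[str, str], str], max_attempts: int = 3) -> str:
    """Feed validation errors back to fix_fn until the output validates."""
    error = validate(raw)
    attempts = 0
    while error is not None and attempts < max_attempts:
        raw = fix_fn(raw, error)  # the fixer sees the bad output and the error
        error = validate(raw)
        attempts += 1
    if error is not None:
        raise ValueError(f"still invalid after {max_attempts} attempts: {error}")
    return raw


def toy_fixer(raw: str, error: str) -> str:
    """Hypothetical stand-in for the repair model; a real one would be prompted with raw and error."""
    return '{"healthy": "moderately_healthy"}'


fixed = repair('{"healthy": "very_healthy"}', toy_fixer)
print(fixed)
```

The first model never needs to expose token probabilities in this setup; only the fixer does, and only if you want to constrain its output to the grammar as well.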