(no title)
yaantc | 1 year ago
No, that would be very inefficient. At each token generation step, the LLM provides a likelihood for all the defined token based on the past context. The structured output is defined by a grammar, which defines the legal tokens for the next step. You can then take the intersection of both (ignore any token not allowed by the grammar), and then select among the authorized token based on the LLM likelihood for them in the usual way. So it's a direct constraint, and it's efficient.
__loam|1 year ago