item 36860195

Better LLM response format – A simple trick reduces costs and response time

7 points | livshitz | 2 years ago | betterprogramming.pub

6 comments


livshitz | 2 years ago

I've composed a post that could be of interest to those of you working with GPT (or any other LLM) and seeking JSON as an output. Here's a simple trick that can help reduce expenses and improve response times.

loueed | 2 years ago

Interesting post, I've not used YAML outputs as of yet. When using GPT-3.5 for JSON, I found that requesting minified JSON reduces the token count by a significant amount. In the example you mention, the month object minified is 28 tokens vs 96 tokens formatted. It actually beats the 50 tokens returned from YAML.

It seems like the main issue is the whitespace and indentation that YAML requires, unlike JSON.
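The minification point is easy to demonstrate with the standard library alone. A small sketch (the `month` object here is a made-up stand-in for the one discussed in the article, and character counts are only a rough proxy for token counts — for real numbers, run the strings through a tokenizer such as tiktoken for your target model):

```python
import json

# Hypothetical "month"-style object, standing in for the article's example.
month = {
    "month": "July",
    "days": [
        {"day": 1, "weekday": "Saturday"},
        {"day": 2, "weekday": "Sunday"},
    ],
}

# Formatted JSON vs minified JSON: separators=(",", ":") drops all
# optional whitespace, which is where most of the savings come from.
pretty = json.dumps(month, indent=2)
minified = json.dumps(month, separators=(",", ":"))

# Hand-written YAML equivalent, so the sketch stays dependency-free
# (PyYAML's yaml.dump would emit something similar).
yaml_text = """month: July
days:
- day: 1
  weekday: Saturday
- day: 2
  weekday: Sunday
"""

print(len(pretty), len(minified), len(yaml_text))
```

Both JSON strings parse back to the identical object, so minifying is a free win on the request/response size.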

keskival | 2 years ago

I wonder how well LLMs understand YAML Schema format.

I have found providing JSON Schema to them to be an excellent way to reduce their improvisation in their outputs intended for machine consumption.
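One common way to do this is to embed the schema directly in the prompt. A minimal, dependency-free sketch (the schema, the question, and the `build_prompt` helper are all hypothetical illustrations, not from the article; in production you would also validate the model's reply, e.g. with the `jsonschema` package):

```python
import json

# Hypothetical JSON Schema describing the shape we want back from the LLM.
schema = {
    "type": "object",
    "properties": {
        "month": {"type": "string"},
        "days": {"type": "array", "items": {"type": "integer"}},
    },
    "required": ["month", "days"],
}

def build_prompt(question: str) -> str:
    # Append the schema verbatim so the model knows the exact output shape.
    return (
        f"{question}\n\n"
        "Respond with JSON only, matching this JSON Schema exactly:\n"
        f"{json.dumps(schema)}"
    )

prompt = build_prompt("List the weekend days of July 2023.")
```

The same trick should work with a YAML Schema, but as the comment notes, it's less clear how well models have internalized that format.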

villgax | 2 years ago

Might make more sense when invoking a 3rd-party API, but for self-run LLMs, TypeChat w/ JSON is just fine instead of adapting to YAML across your stack.

livshitz | 2 years ago

You shouldn't adapt YAML across your stack. Instead of parsing a JSON string (the output from the LLM) into an object, I'm suggesting you parse the YAML string into an object. The article suggests that this LLM -> YAML -> object flow will be more beneficial in terms of time and cost.
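In other words, only the parsing step at the boundary changes. In a real application you would call `yaml.safe_load` from PyYAML; to keep this sketch dependency-free, it uses a tiny hand-rolled parser for the flat `key: value` subset of YAML (the `llm_reply` string is a made-up example):

```python
def parse_flat_yaml(text: str) -> dict:
    """Parse the flat 'key: value' subset of YAML into a dict.

    Illustration only -- real code should use PyYAML's yaml.safe_load,
    which handles nesting, lists, and type coercion.
    """
    obj = {}
    for line in text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        obj[key.strip()] = value.strip()
    return obj

# Hypothetical YAML returned by the LLM instead of JSON.
llm_reply = """name: Ada
role: engineer
"""

# Same kind of object you'd get from json.loads on a JSON reply.
record = parse_flat_yaml(llm_reply)
```

The rest of the stack keeps consuming plain objects/dicts; only the one `json.loads` call becomes a YAML load.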