(no title)
wjessup | 2 years ago
This little library will generate multiple draft responses and then use a second model to judge the answers and pick a winner, which is then returned to the user. Google's Bard uses this same approach.
With this library you can apply the pattern to gpt-3.5 and gpt-4.
Drafts are generated in parallel and all drafts are evaluated with a single prompt.
This will use a lot of tokens. For example, with 3 drafts you are already at ~3x the tokens of a single call; you then feed all three drafts into the judge prompt and pay for its response, so the total is more than 7x.
Streamlit demo: https://theoremone-gptgladiator-streamlit-ui-5ljwmm.streamli...
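The pattern described above (parallel drafts, one judge call) can be sketched like this. This is not the library's actual API; `fake_generate` and `fake_judge` are hypothetical stand-ins for the model calls, so the sketch runs without API keys.

```python
import concurrent.futures

def draft_and_judge(prompt, generate, judge, n_drafts=3):
    """Generate n_drafts candidate answers in parallel, then ask a
    second model (judge) to pick a winner in a single evaluation call."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_drafts) as pool:
        drafts = list(pool.map(generate, [prompt] * n_drafts))
    # Packing every draft into one judge prompt keeps evaluation to a single call.
    judge_prompt = "\n\n".join(
        f"Draft {i + 1}:\n{d}" for i, d in enumerate(drafts)
    )
    winner_index = judge(prompt, judge_prompt)  # judge returns a 0-based index
    return drafts[winner_index]

# Stand-in "models" (hypothetical) so the sketch runs locally.
def fake_generate(prompt):
    return f"answer to: {prompt}"

def fake_judge(prompt, drafts_text):
    return 0  # always picks the first draft

print(draft_and_judge("What is 2+2?", fake_generate, fake_judge))
```

In a real setup, `generate` and `judge` would wrap gpt-3.5/gpt-4 chat completion calls; the thread pool matches the "drafts are generated in parallel" behavior since those calls are I/O-bound.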
abhijoysar|2 years ago
Looking forward to exploring the Streamlit demo and seeing the Gladiator package in action. Keep up the great work!
ehq|2 years ago
This is not just useful for reducing hallucinations or improving reliability in general; you can also make the criteria for selecting the winning draft as precise and specific as you want, which is something you can't control with Bard either. You could also extend this idea by having another model extract and combine the best aspects of each draft, and so on.
This pattern also seems particularly well suited to cases where the LLM's output has to be precise to be useful, such as writing code.
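The combine-the-best-aspects extension could look something like the following. Again a minimal sketch, not anything from the library: `fake_generate` and `fake_combine` are hypothetical stand-ins for the model calls.

```python
def draft_and_combine(prompt, generate, combine, n_drafts=3):
    """Variant of the draft pattern: instead of picking one winner,
    a second model merges the strongest parts of every draft."""
    drafts = [generate(prompt) for _ in range(n_drafts)]
    drafts_text = "\n\n".join(
        f"Draft {i + 1}:\n{d}" for i, d in enumerate(drafts)
    )
    # One synthesis call sees all drafts at once, like the single judge prompt.
    return combine(prompt, drafts_text)

# Stand-in "models" (hypothetical) so the sketch runs locally.
def fake_generate(prompt):
    return f"answer to: {prompt}"

def fake_combine(prompt, drafts_text):
    # Stands in for an LLM call that synthesizes a merged answer.
    return f"merged answer to: {prompt}"

print(draft_and_combine("Write a sort function", fake_generate, fake_combine))
```

Note this costs even more tokens than judging, since the combine response is a full answer rather than a short verdict.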