item 40237004 (no title) | 1 point | remilouf | 1 year ago

remilouf | 1 year ago
LLM evaluations are very sensitive to the details of the prompt's structure. This post shows how using structured generation reduces both the variance of the results and the shifts in ranking.