

tezza | 14 days ago

Thank you all! We needed further data points.

Comparing one-shot results is a foolish way to evaluate a statistical process like LLM answers. We need multiple samples.

For https://generative-ai.review I take at least three samples of output. This often yields very different results, even from the same query.

e.g. https://generative-ai.review/2025/11/gpt-image-1-mini-vs-gpt...
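
To make the multiple-samples point concrete, here is a rough sketch in Python of how one might score the same prompt several times and look at the spread instead of trusting a single run. Everything named here is an assumption for illustration: `make_fake_model` is a hypothetical stand-in that returns a random score, not a real model call or anything the site actually uses, and `sample_scores` / `summarize` are made-up helper names.

```python
import random
import statistics
from typing import Callable, Dict, List


def make_fake_model(seed: int) -> Callable[[str], float]:
    """Hypothetical stand-in for a stochastic model plus a scoring step."""
    rng = random.Random(seed)

    def generate(prompt: str) -> float:
        # Assumption: each run yields a quality score in [0, 1) that
        # varies from run to run, even for the identical prompt.
        return rng.random()

    return generate


def sample_scores(generate: Callable[[str], float], prompt: str, n: int = 3) -> List[float]:
    """Run the same prompt n times and collect one score per run."""
    if n < 2:
        raise ValueError("need at least two samples to see any spread")
    return [generate(prompt) for _ in range(n)]


def summarize(scores: List[float]) -> Dict[str, float]:
    """Report the spread across runs, not just one number."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation
        "min": min(scores),
        "max": max(scores),
    }


if __name__ == "__main__":
    scores = sample_scores(make_fake_model(seed=0), "same query every time", n=3)
    print(summarize(scores))
```

With a real model you would swap `make_fake_model` for your own call and scoring function. Three samples is the minimum mentioned above; a larger `n` gives a steadier estimate of the spread.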
