top | item 45718237

uaas | 4 months ago

I am curious, what’s the point of re-running these interactions on a UI?

muzani|4 months ago

Reproduction, I suppose. I would like the same things as OP too.

LLM outputs are qualitative, so they can't really be scored automatically, and prompt changes tend to multiply bugs: a tweak can solve one problem but introduce a new one. It's more practical just to re-run the interactions and check them manually.
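To make the "re-run and eyeball it" idea concrete, here is a minimal sketch of a replay helper. It assumes you have saved earlier interactions as input/output pairs; `replay`, `show_diff`, and `stub_model` are hypothetical names made up for illustration, and `stub_model` only stands in for whatever real model call you would use. It does not score anything, it just flags which outputs changed so a person can review them.

```python
import difflib
from typing import Callable, Dict, List


def replay(cases: List[Dict[str, str]],
           run_model: Callable[[str], str]) -> List[Dict[str, str]]:
    """Re-run each saved input and pair the old output with the new one."""
    results = []
    for case in cases:
        new_output = run_model(case["input"])
        results.append({
            "input": case["input"],
            "old": case["output"],
            "new": new_output,
            # Only flags a difference; judging quality is left to a human.
            "changed": "yes" if new_output != case["output"] else "no",
        })
    return results


def show_diff(result: Dict[str, str]) -> str:
    """Return a unified diff of the old and new output for manual review."""
    diff = difflib.unified_diff(
        result["old"].splitlines(),
        result["new"].splitlines(),
        fromfile="old",
        tofile="new",
        lineterm="",
    )
    return "\n".join(diff)


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call (assumption, not a real API)."""
    return prompt.upper()


if __name__ == "__main__":
    # Saved interactions from an earlier session (made-up example data).
    saved = [
        {"input": "hello", "output": "HELLO"},
        {"input": "bye", "output": "Bye!"},
    ]
    for r in replay(saved, stub_model):
        if r["changed"] == "yes":
            print(f"input: {r['input']}")
            print(show_diff(r))
```

The point of the design is that the tool only narrows down what to look at; deciding whether a changed output is better or worse stays manual, which matches the comment above.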