top | item 44264208

spion | 8 months ago

Indeed. Which is why I think the only way to really evaluate the progress of LLMs is to curate your own personal set of failure examples that you don't share with anyone else, and to only run them through APIs that provide some sort of no-data-retention and no-training guarantee.
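For what it's worth, a minimal sketch of what such a private harness could look like. Everything here is an assumption for illustration: the case format (`id`, `prompt`, `expected`), the substring scoring rule, and the names `load_cases`, `run_private_evals`, and `pass_rate` are all hypothetical. `ask_model` is a placeholder for whatever function you write to call a provider under a no-retention, no-training agreement; no real API is shown.

```python
import json
from typing import Callable


def load_cases(path: str) -> list[dict]:
    """Load private cases from a local JSON file (hypothetical format:
    a list of objects with "id", "prompt", and "expected" keys)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def run_private_evals(
    cases: list[dict], ask_model: Callable[[str], str]
) -> dict[str, bool]:
    """Send each prompt to ask_model and record pass/fail per case id.

    ask_model is supplied by you; it should wrap an API that promises
    no data retention and no training on inputs (assumption).
    """
    results: dict[str, bool] = {}
    for case in cases:
        answer = ask_model(case["prompt"])
        # Assumed scoring rule: pass if the expected text appears in the
        # answer, ignoring case. Real failure cases may need stricter checks.
        results[case["id"]] = case["expected"].lower() in answer.lower()
    return results


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of cases that passed; 0.0 for an empty result set."""
    return sum(results.values()) / len(results) if results else 0.0
```

Keeping the cases in a local file that never goes into a public repo, and rerunning the same file against each new model release, gives you a pass rate you can compare over time without the examples leaking into anyone's training set.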
