top | item 37685391

(no title)

mufasachan | 2 years ago

One explanation: https://x.com/yampeleg/status/1707127722743325106?s=46&t=Cxa...

I'm curious what he means by "semi-automated system for detecting benchmark leaks", though.

brucethemoose2|2 years ago

AFAIK, such tests just feed the model chopped-up bits of the evaluation data as raw strings at zero temperature. If the model completes them verbatim, that data is probably in the training dataset.
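
A rough sketch of that kind of check, in case it helps. This is only my guess at the general idea, not the system from the tweet. `complete` is a hypothetical callable wrapping whatever model you use and is assumed to decode greedily (temperature 0); `verbatim_leak_rate`, `make_toy_model`, and the `prefix_frac` / `min_suffix_chars` parameters are names made up for illustration.

```python
from typing import Callable, Iterable


def verbatim_leak_rate(
    samples: Iterable[str],
    complete: Callable[[str], str],
    prefix_frac: float = 0.5,
    min_suffix_chars: int = 20,
) -> float:
    """Fraction of samples whose held-back tail the model reproduces verbatim."""
    hits = 0
    total = 0
    for text in samples:
        # Chop the benchmark item: show the model the head, hide the tail.
        cut = int(len(text) * prefix_frac)
        prefix, suffix = text[:cut], text[cut:]
        if len(suffix) < min_suffix_chars:
            continue  # too short to tell memorization from a lucky guess
        total += 1
        # Assumption: `complete` returns the greedy (temperature 0) continuation.
        if complete(prefix).startswith(suffix):
            hits += 1
    return hits / total if total else 0.0


def make_toy_model(memorized: list[str]) -> Callable[[str], str]:
    """Stand-in for a real model: continues any prefix of a memorized string."""
    def complete(prompt: str) -> str:
        for doc in memorized:
            if doc.startswith(prompt):
                return doc[len(prompt):]
        return ""  # nothing memorized, so no verbatim continuation
    return complete


if __name__ == "__main__":
    benchmark = [
        "The quick brown fox jumps over the lazy dog near the river bank.",
        "What is the capital of France? The answer is Paris, of course.",
    ]
    leaked_model = make_toy_model([benchmark[0]])
    print(verbatim_leak_rate(benchmark, leaked_model))
```

With a real model you would swap the toy for a call to your inference API. An exact-match check like this is strict; a fuzzy comparison (edit distance or token overlap) would catch near-verbatim recall, at the cost of more false positives.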