IvanAchlaqullah | 1 year ago
Wait, people still use this benchmark? I hear there's a huge flaw in it.
For example, fine-tuning a model on 4chan makes it score better on TruthfulQA. It becomes very offensive afterwards, though, for obvious reasons. See GPT-4chan [1].
thomashop|1 year ago
wongarsu|1 year ago
andy99|1 year ago
nurumaik|1 year ago
>very offensive
Any cons?
hoseja|1 year ago
andai|1 year ago