lunixbochs | 2 years ago
Are you using https://github.com/EleutherAI/lm-evaluation-harness?
lhl | 2 years ago
Yeah, although it looks like it currently has some issues with coqa: https://github.com/EleutherAI/lm-evaluation-harness/issues/2...
There's also the bigscience fork, but I ran into even more problems (although I didn't try too hard) https://github.com/bigscience-workshop/lm-evaluation-harness
And there's https://github.com/EleutherAI/lm-eval2/ (not sure if it's just starting over w/ a new repo or what?) but it has limited tests available