They're using GPTQ -- here's the paper: https://arxiv.org/abs/2210.17323 . The authors benchmarked two model families across a wide range of parameter counts.
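(For context on what GPTQ actually optimizes: it quantizes each layer's weights in one shot so that the layer's output on a small calibration set is preserved. The sketch below is plain NumPy and only illustrates the round-to-nearest baseline plus the reconstruction error GPTQ drives down; the real algorithm additionally uses approximate second-order updates to compensate remaining weights. All names and sizes here are illustrative, not from the paper.)

  # Simplified illustration of the layer-wise objective GPTQ minimizes:
  # find quantized weights W_q with small ||W X - W_q X||^2, given
  # calibration activations X. Not the actual GPTQ algorithm.
  import numpy as np

  def quantize_rtn(W: np.ndarray, bits: int = 4) -> np.ndarray:
      """Round-to-nearest quantization with a per-output-channel scale."""
      qmax = 2 ** (bits - 1) - 1
      scale = np.abs(W).max(axis=1, keepdims=True) / qmax
      scale[scale == 0] = 1.0
      return np.round(W / scale) * scale

  def layer_reconstruction_error(W, W_q, X) -> float:
      """||W X - W_q X||^2 -- the quantity GPTQ minimizes per layer."""
      return float(np.sum((W @ X - W_q @ X) ** 2))

  # Toy example: a 16x64 "layer" and 128 calibration samples.
  rng = np.random.default_rng(0)
  W = rng.normal(size=(16, 64))
  X = rng.normal(size=(64, 128))
  W_q = quantize_rtn(W, bits=4)
  print("4-bit RTN reconstruction error:", layer_reconstruction_error(W, W_q, X))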
ddren|3 years ago
1: https://github.com/qwopqwop200/GPTQ-for-LLaMa
nshm|3 years ago
So we have numbers on PTB: original perplexity 8.79, quantized 9.68 -- already 10% worse. And the PPL is reported per token, I suppose? Because word-level PPL on PTB should be around 20, not less than 10.
Any numbers on more complex tasks then, like QA?
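(Rough sketch of the per-token vs. per-word arithmetic, assuming the reported numbers are per subword token: word-level PPL normalizes the same total negative log-likelihood by the word count instead of the token count, so with roughly 1.3 tokens per word a token PPL near 9.7 lands around the ~20 word PPL usually quoted for PTB. The token/word counts below are illustrative, not from the article.)

  import math

  def word_ppl_from_token_ppl(token_ppl: float, n_tokens: int, n_words: int) -> float:
      # Same total NLL, different normalization:
      # PPL_word = exp(total_nll / n_words) = PPL_token ** (n_tokens / n_words)
      total_nll = math.log(token_ppl) * n_tokens
      return math.exp(total_nll / n_words)

  # ~1.3 subword tokens per word maps a token PPL of 9.68 to roughly 19:
  print(word_ppl_from_token_ppl(9.68, n_tokens=130, n_words=100))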
summarity|3 years ago
sottol|3 years ago
ddren|3 years ago