Not that I know of for this study. At least for the specific scope of torchao, we want to make it easier for researchers to create new quantization algorithms in Python and have those algorithms run fast. You can see a lot of those algorithms here: https://github.com/pytorch/ao/tree/main/torchao/prototype

So, for example, we can accelerate AWQ and GPTQ by using a fast int4 kernel called tinygemm.
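
To give a rough idea of the kind of data an int4 weight kernel works with, here is a minimal sketch of group-wise 4-bit weight quantization in plain Python. It is illustrative only: the function names and the group size are made up for this example, and it does not reflect torchao's API or tinygemm's actual packing layout.

```python
# Illustrative sketch: group-wise asymmetric int4 weight quantization.
# The names and group size here are hypothetical, not torchao or tinygemm APIs.

def quantize_int4_groupwise(weights, group_size=4):
    """Map a flat list of floats to 4-bit codes (0..15), with one scale and one minimum per group."""
    codes, scales, mins = [], [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # 4 bits give 16 levels, so 15 steps span the group's range.
        # Fall back to 1.0 when the group is constant to avoid dividing by zero.
        scale = (hi - lo) / 15 or 1.0
        scales.append(scale)
        mins.append(lo)
        # Round to the nearest level and clamp into the 4-bit range.
        codes.extend(min(15, max(0, round((w - lo) / scale))) for w in group)
    return codes, scales, mins


def dequantize_int4_groupwise(codes, scales, mins, group_size=4):
    """Reconstruct approximate floats from the codes and the per-group scale and minimum."""
    return [
        mins[i // group_size] + c * scales[i // group_size]
        for i, c in enumerate(codes)
    ]


if __name__ == "__main__":
    w = [0.0, 0.5, 1.0, 1.5, -2.0, -1.0, 0.0, 1.0]
    codes, scales, mins = quantize_int4_groupwise(w)
    print(codes)
    print(dequantize_int4_groupwise(codes, scales, mins))
```

Algorithms like AWQ and GPTQ differ in how they choose the quantized values, but the result is still low-bit weights plus per-group metadata, which is why a single fast int4 kernel can serve both.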
