top | item 35140467

rjb7731 | 3 years ago

Inference on the Gradio demo seems pretty slow, about 250 seconds per request. Maybe I'm just too used to the 4-bit quantized version now, ha!

sebzim4500|3 years ago

I'm sure it's partially the HN hug of death.