I'm afraid that, unlike proprietary APIs and Petals, this system can't be used for single-batch inference of 175B models at interactive speeds - which is what you actually need to run ChatGPT and other interactive LM apps. See https://news.ycombinator.com/item?id=34874976