top | item 34223938

madisonmay | 3 years ago

Interestingly, it sounds like offloading could be made quite efficient in a batch setting if you primarily care about throughput rather than latency. Though I guess latency is quite important for most current LLM applications.
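
A rough back-of-envelope sketch of the trade-off, as a toy cost model rather than a measurement of any real system. The function name `offload_step` and every number in it (layer count, per-layer load time, per-sequence compute time) are made-up assumptions, and it assumes weight transfer and compute do not overlap:

```python
def offload_step(batch_size, n_layers=32, load_s=0.05, compute_s=0.001):
    """Toy cost model for one decoding step with offloaded weights.

    Hypothetical assumptions: each layer's weights are loaded once per step
    (load_s seconds, independent of batch size), then applied to every
    sequence in the batch (compute_s seconds per sequence), with no overlap
    between transfer and compute.
    """
    step_latency = n_layers * (load_s + batch_size * compute_s)  # seconds per token step
    throughput = batch_size / step_latency  # tokens generated per second, whole batch
    return step_latency, throughput


for b in (1, 8, 64, 512):
    latency, tput = offload_step(b)
    print(f"batch={b:4d}  step latency={latency:6.2f}s  throughput={tput:7.2f} tokens/s")
```

In this model the fixed cost of moving weights is amortized across the batch, so throughput keeps climbing with batch size, while the time any single request waits per token only gets worse. That is fine for offline batch jobs and painful for anything interactive.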
