madisonmay | 3 years ago

Interestingly it sounds like offloading could be made quite efficient in a batch setting if you primarily care about throughput rather than latency. Though I guess for most current LLM applications latency is quite important.
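To make the throughput-versus-latency point concrete, here is a rough toy model, not a measurement of any real system. It assumes each step pays a fixed cost to stream offloaded weights in, plus a per-sequence compute cost, with no overlap between the two. The function name `offload_cost` and the default timings are hypothetical values picked purely for illustration.

```python
def offload_cost(batch_size, transfer_s=1.0, compute_s_per_seq=0.01):
    """Toy model of one step with offloaded weights.

    transfer_s: hypothetical fixed time to stream the weights in, paid once per step.
    compute_s_per_seq: hypothetical compute time added by each sequence in the batch.
    Returns (step latency in seconds, throughput in sequences per second).
    """
    step_latency = transfer_s + batch_size * compute_s_per_seq
    throughput = batch_size / step_latency
    return step_latency, throughput


if __name__ == "__main__":
    for b in (1, 8, 64):
        latency, tput = offload_cost(b)
        print(f"batch={b:3d}  latency={latency:.2f}s  throughput={tput:.2f} seq/s")
```

Under these assumptions the fixed transfer cost is shared across the whole batch, so throughput climbs with batch size while every individual request still waits at least the full transfer time. That is roughly why batching would help a throughput-oriented job much more than a latency-sensitive one.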