Nevin1901 | 2 years ago
How fast are cold starts, and how do you compare against other GPU providers (RunPod, Modal, etc.)?

xena | 2 years ago
The slowest part is loading weights into VRAM, in my experience. I haven't done benchmarking on that. What kind of benchmark would you like to see?

ipsum2 | 2 years ago
I would like to see time to first inference for typical models (llama-7b first token, SDXL 1 step, etc.).
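The benchmark ipsum2 asks for, time to first inference after a cold start, splits naturally into two phases: weight loading and the first forward pass. A minimal sketch of such a harness follows; the function names and the toy NumPy "model" are purely illustrative stand-ins (not any provider's API), meant to be swapped for real model-loading and inference code:

```python
import time
import numpy as np

def time_to_first_inference(load_fn, infer_fn):
    """Time a cold start, split into weight-load and first-inference phases."""
    t0 = time.perf_counter()
    model = load_fn()          # phase 1: load weights (the slow part per xena)
    t1 = time.perf_counter()
    infer_fn(model)            # phase 2: first forward pass
    t2 = time.perf_counter()
    return {
        "load_s": t1 - t0,
        "first_infer_s": t2 - t1,
        "total_s": t2 - t0,
    }

# Toy stand-ins so the sketch runs anywhere; real use would load e.g.
# llama-7b and measure time to the first generated token.
def load_toy_model():
    return np.random.rand(1024, 1024)  # pretend these are model weights

def run_toy_inference(weights):
    return weights @ np.random.rand(1024)

timings = time_to_first_inference(load_toy_model, run_toy_inference)
print(timings)
```

For a fair comparison across providers, the same harness would be run from a fully cold instance on each, since container scheduling and image pull time happen before `load_fn` even starts.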