(no title)
hershey890 | 1 year ago
Transferring ALL the data over every time is obviously misusing the GPU.
If you've ever tried running a model that's too large for your GPU, you'll have experienced how slow this is: the model has to be pulled in piece by piece for a single inference pass.
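A rough back-of-the-envelope sketch of why this is so slow: if the weights don't fit in VRAM, every forward pass has to re-stream the whole model across the host-GPU link. The numbers below (model size, link bandwidth) are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope: cost of re-streaming weights over the host-GPU
# link on every forward pass. All numbers are illustrative assumptions.

def transfer_seconds(model_bytes: float, link_bytes_per_s: float) -> float:
    """Seconds to move the whole model across the link once."""
    return model_bytes / link_bytes_per_s

params = 70e9                     # assumed 70B-parameter model
bytes_per_param = 2               # fp16 weights
link_bw = 32e9                    # ~PCIe 4.0 x16 effective bandwidth (assumed)

model_bytes = params * bytes_per_param
per_pass = transfer_seconds(model_bytes, link_bw)
print(f"~{per_pass:.1f} s of pure transfer per forward pass")
```

With these assumed numbers, transfer alone adds over four seconds per pass, which is why keeping the weights resident (or offloading only the layers that don't fit) matters so much.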