top | item 37468499

krallistic | 2 years ago

"Inference" - getting the predictions out of the model. While training you need to run: Input -> Model -> Output (Prediction) - Compare with True Output (Label) -> Backpropagation of Loss through the Model. Which can highly batched & pipelined. (And you have to batch to train in any reasonable amount of times, and GPUs shine in batch regime)

When a single user request comes in, you just want the prediction for that single input, so no backpropagation and no batching. Which is more CPU friendly.
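The training-vs-inference distinction the comment draws can be sketched with a toy linear model in plain NumPy. All names here (`train_step`, `infer`, the weight shapes) are illustrative assumptions, not anything from the thread; the point is only that the training step needs labels, gradients, and a batch, while inference is a single forward pass.

```python
import numpy as np

# Hypothetical tiny linear model: y = x @ W (names are illustrative).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 1))

def train_step(x_batch, y_batch, W, lr=0.1):
    """Training: forward pass, loss vs. labels, backprop, weight update.
    x_batch has shape (batch_size, 4) -- many examples at once."""
    pred = x_batch @ W                          # forward: Input -> Model -> Output
    err = pred - y_batch                        # compare with true output (label)
    loss = np.mean(err ** 2)                    # scalar MSE loss
    grad = 2 * x_batch.T @ err / len(x_batch)   # backpropagation of the loss
    W -= lr * grad                              # in-place gradient update
    return loss

def infer(x, W):
    """Inference: forward pass only -- no labels, no gradients, batch of 1."""
    return x @ W

# Batched training on synthetic data with known true weights.
x = rng.normal(size=(32, 4))
y = x @ np.array([[1.0], [2.0], [-1.0], [0.5]])
for _ in range(500):
    loss = train_step(x, y, W)

# Serving a single user request: one input, one prediction.
single = rng.normal(size=(1, 4))
prediction = infer(single, W)
```

The batched matrix multiply in `train_step` is exactly the kind of work GPUs are built for; `infer` on one row is a tiny dot product that a CPU handles comfortably.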

syockit | 2 years ago

Wow, now I learned something new. So even though statistics and machine learning overlap a lot, a word as simple as "inference" has totally different meanings. In statistics, it usually refers to determining the influence of an input in a multi-input model. Getting predictions is simply called prediction.