hasperdi | 17 days ago

Why distill if you can run the full model yourself, or through another inference provider?

Quantization is the better approach in most cases, unless you want to, for instance, create hybrid models, i.e. distill from several different sources.
