item 46177206
Reubend | 2 months ago
It would be REALLY cool to see this same technique applied to a much more recent OSS model distillation. For example, Mistral 3 14B would be a great target. How efficient can we get inference there?