mluo | 1 year ago
Quantization has a very big impact on small models: scores can drop by as much as 10% on AIME. Our model does best in bfloat16 ;) Come check out our repo at: https://github.com/agentica-project/deepscaler