Apriel-H1: Towards Efficient Enterprise Reasoning Models (arxiv.org)
1 point | guiriduro | 2 months ago | 1 comment

guiriduro | 2 months ago
Apriel-H1-15b-Thinker-SFT uses incremental distillation from Apriel-Nemotron-15B-Thinker, selectively replacing less critical attention layers with linear Mamba blocks to reduce computational complexity while preserving reasoning quality.
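
To make the "selectively replacing less critical attention layers" idea concrete, here is a rough sketch of how a layer plan could be built. It is only an illustration, not the paper's method: the importance scores, the function names `plan_hybrid_layers` and `incremental_plans`, and the step size are all assumptions, and the comment above does not say how layer criticality is actually measured.

```python
def plan_hybrid_layers(importance, num_to_replace):
    """Return a per-layer mixer plan: 'attention' or 'mamba'.

    `importance` is a hypothetical list of per-layer scores
    (higher means the attention layer matters more).
    """
    if not 0 <= num_to_replace <= len(importance):
        raise ValueError("num_to_replace must be between 0 and the layer count")
    # Rank layer indices from least to most important (ties broken by index).
    ranked = sorted(range(len(importance)), key=lambda i: (importance[i], i))
    to_replace = set(ranked[:num_to_replace])
    return [
        "mamba" if i in to_replace else "attention"
        for i in range(len(importance))
    ]


def incremental_plans(importance, step, total):
    """Yield successive plans, swapping `step` more layers each round.

    In an incremental distillation setup, each plan would be the student
    for one round, distilled from the teacher before more layers are swapped.
    """
    replaced = 0
    while replaced < total:
        replaced = min(replaced + step, total)
        yield plan_hybrid_layers(importance, replaced)


if __name__ == "__main__":
    scores = [0.9, 0.2, 0.5, 0.1]  # made-up scores for a 4-layer toy model
    for plan in incremental_plans(scores, step=1, total=2):
        print(plan)
```

The least important layers are swapped first, so each round starts from a student that is already close to the teacher and the layers that matter most keep full attention.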