-_- | 5 months ago

ART is also great, though since it's built on top of Unsloth it's geared towards single GPU QLoRA training. We use 8 H100s as a standard, so we can handle larger models and full-parameter fine-tunes.


omneity | 5 months ago

Interesting. Do you have benchmarks comparing FFT vs QLoRA for RL?

ag8 | 5 months ago

We should publish some; the high-order effect seems to be that LoRAs significantly hurt small-model performance relative to FFT, with less of an effect on large models. This is maybe because large models have more built-in skills, so a LoRA suffices to elicit an existing skill, whereas for small models you need to do more actual learning (holding the number of parameter updates constant). In general I think it's better to get a performant small model with FFT than a performant large model with a large LoRA, which is why we default to FFT, but I agree we should publish more details here.
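The "number of parameter updates" point can be sketched with quick arithmetic (my own illustration, not from the thread): a rank-r LoRA on a d_in × d_out weight matrix trains r·(d_in + d_out) parameters instead of d_in·d_out, so a fixed-rank adapter touches only a small slice of each matrix compared with full fine-tuning.

```python
# Back-of-envelope comparison of trainable parameters per weight matrix.
# The layer sizes below are illustrative, not from any specific model.

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for a rank-`rank` LoRA adapter (A: d_in x r, B: r x d_out)."""
    return rank * (d_in + d_out)

def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when fully fine-tuning the same matrix."""
    return d_in * d_out

# Example: a 4096x4096 projection (7B-class model) vs a 1024x1024
# projection (small model), both with a rank-16 LoRA.
for d in (4096, 1024):
    frac = lora_params(d, d, 16) / full_params(d, d)
    print(f"d={d}: rank-16 LoRA trains {frac:.2%} of the matrix's parameters")
```

Note this is per-matrix arithmetic only; it says nothing about which parameters matter, just how few a LoRA actually updates relative to FFT.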