I just skimmed LoRA+ and DoRA, and I see no reason why the two improvements couldn't go hand in hand. LoRA+ seems to be about training efficiency, while DoRA seems to be about improving the ability to actually learn, making it significantly more robust. I still have questions, though, about how the LoRA+ improvements would apply to DoRA's magnitude vector.
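To make the question concrete, here is a rough sketch of how the combination might look at the optimizer level. It assumes that LoRA+ amounts to giving the B matrix a larger learning rate than the A matrix by a fixed ratio, and that DoRA adds a separate magnitude parameter on top of the low-rank pair. The parameter names (`lora_A`, `lora_B`, `magnitude`), the function `build_param_groups`, and the default ratio are all hypothetical choices for illustration. The learning rate for the magnitude vector is exactly the open question, so the sketch simply defaults it to the base rate and exposes it as a knob.

```python
def build_param_groups(param_names, base_lr=1e-4, lr_ratio=16.0, magnitude_lr=None):
    """Group adapter parameter names by the learning rate they would receive.

    Hypothetical sketch: names containing "lora_B" get base_lr * lr_ratio
    (the LoRA+ idea), names containing "lora_A" get base_lr, and names
    containing "magnitude" get magnitude_lr. Anything else is skipped,
    i.e. treated as frozen.
    """
    if magnitude_lr is None:
        # Assumption: with no better answer, the magnitude vector uses the base rate.
        magnitude_lr = base_lr

    groups = {
        "lora_A": {"lr": base_lr, "params": []},
        "lora_B": {"lr": base_lr * lr_ratio, "params": []},
        "magnitude": {"lr": magnitude_lr, "params": []},
    }
    for name in param_names:
        if "lora_B" in name:
            groups["lora_B"]["params"].append(name)
        elif "lora_A" in name:
            groups["lora_A"]["params"].append(name)
        elif "magnitude" in name:
            groups["magnitude"]["params"].append(name)
    return groups


# Example with made-up parameter names for a single adapted layer.
names = [
    "layer0.q_proj.weight",     # frozen base weight, ignored
    "layer0.q_proj.lora_A",
    "layer0.q_proj.lora_B",
    "layer0.q_proj.magnitude",
]
groups = build_param_groups(names)
for key, group in groups.items():
    print(key, group["lr"], group["params"])
```

With a real framework, each group would hold the parameter tensors rather than names and be passed to the optimizer as per-parameter options. Whether the magnitude vector should follow A, follow B, or get its own rate is something this sketch leaves open.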