top | item 39810544


hallqv | 1 year ago

This discussion is so dumb - finetuning a base model costs ~$1 with LoRA/QLoRA and can yield the same performance as GPT-4, but at 1/100 of the cost per token.

What Bloomberg did for $10M was not finetuning.


simonw|1 year ago

"finetuning a base model costs ~$1 with LoRA/QLoRA and can yield the same performance as GPT-4, but at 1/100 of the cost per token"

That's a big claim - can you back that up with any examples?

Implicated|1 year ago

I had opened a new tab back when this comment was just a few minutes old, hoping that when I came back there would be some really great blog post linked with the details on the sorcery.

hallqv|1 year ago

https://arxiv.org/pdf/2402.00841.pdf