That advice makes sense if we're talking about 800B+ parameter models that require a gigantic investment of capital and time. But for models that fit on a consumer GPU, you're leaving chips on the table if you don't take advantage of training / fine-tuning. It's just too easy and too powerful to pass up.
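For a sense of how low the barrier is, here's a rough sketch of a LoRA fine-tune using the Hugging Face peft/trl stack. The model name, dataset, and hyperparameters here are illustrative placeholders, not recommendations; swap in whatever fits your task and VRAM budget.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Tiny slice of a public dataset, just to show the moving parts.
dataset = load_dataset("imdb", split="train[:1%]")

# LoRA adapter: trains a few million parameters instead of the full model,
# which is what makes this feasible on a single consumer GPU.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections only
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # illustrative pick; small enough for ~8 GB VRAM
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        output_dir="lora-out",
        dataset_text_field="text",
        max_steps=100,              # toy run; bump this for real training
        per_device_train_batch_size=2,
    ),
)
trainer.train()
trainer.save_model("lora-out")  # saves just the small adapter weights
```

That's the whole thing: a config, a trainer, and a `train()` call, and the saved adapter is a few tens of megabytes rather than a full model checkpoint.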