arugulum | 2 years ago

Yep, I understood that you were using it informally; I was just trying to keep things informative for other folks reading too.

swyx|2 years ago

there frankly needs to be a paper calling this out tho, because at this point there are a bunch of industry models following “llama laws” and nobody’s really done the research; it’s all monkey see, monkey do

arugulum|2 years ago

But what would they be calling out?

If industry groups want to run a training run based on the configurations of a well-performing model, I don't see anything wrong with that. Now, if they were to claim that what they are doing is somehow "optimal", then there would be something to criticize.