top | item 41955535

grandma_tea | 1 year ago

> Preferred by anyone who's actually using and modifying the work.

> ...fine tuning is preferred by everyone

How do you know this? Did you take a survey? When? What if preferences change or there is no consensus?

> The only people I've seen who've asserted otherwise are random commenters on the internet who don't really understand the tech.

There are lots of things that can be done with the training set that don't involve retraining the entire model from scratch. As a random example, I could perform a statistical analysis over a portion of the training set and find a series of vectors in token-space that could be used to steer the model. Something like this can be done without access to the training data, but does it work better? We don't know because it hasn't been tried yet.
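One concrete version of the idea above is difference-of-means activation steering: collect hidden-state activations for two contrasting subsets of data, take the mean difference as a direction, and add it to the hidden state at inference time. The sketch below is a hedged toy illustration, not anyone's actual implementation; the activations are random stand-ins, and `apply_steering` and `alpha` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in hidden-state activations (batch, hidden_dim) that would, in a real
# setting, be captured from a model on two contrasting slices of the data.
hidden_dim = 8
acts_concept = rng.normal(loc=1.0, scale=0.1, size=(32, hidden_dim))
acts_baseline = rng.normal(loc=0.0, scale=0.1, size=(32, hidden_dim))

# Difference-of-means steering vector: points from "baseline" toward "concept".
steer = acts_concept.mean(axis=0) - acts_baseline.mean(axis=0)
steer /= np.linalg.norm(steer)  # unit-normalize the direction

def apply_steering(hidden_state, vector, alpha=2.0):
    """Nudge a hidden state along the steering direction at inference time."""
    return hidden_state + alpha * vector

h = rng.normal(size=hidden_dim)
h_steered = apply_steering(h, steer)

# Steering strictly increases alignment with the concept direction when alpha > 0.
print(np.dot(h_steered, steer) > np.dot(h, steer))  # True
```

Whether this kind of statistics-over-the-training-set approach beats steering derived purely from the released weights is exactly the open question the comment raises.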

But none of that really matters, because what we're discussing is the philosophy of open source. I think it's a really bad take to say that something is open source because it's in a "preferred" format.

lolinder | 1 year ago

> I think it's a really bad take to say that something is open source because it's in a "preferred" format.

Preferred form and under a free license. Llama isn't open source, but that's because the license has restrictions.

As for whether it's a bad take that the preferred form matters: take it up with the GPL, I'm just using their definition:

> The “source code” for a work means the preferred form of the work for making modifications to it.

fsflover | 1 year ago

Today, the weights may indeed be the preferred format, due to the cost of retraining. Are you going to change the definition tomorrow, when the cost drops?