top | item 40059213


It's a different animal. In general, you cannot reproduce a model even if you have all the training data. There are too many sources of randomness, and nobody keeps track of them. Even the order in which training samples are fed in is drawn at random from the dataset. This has some interesting consequences. Given a model and a dataset, it is impossible to say whether the model was trained on exactly that data. All we can say, in some cases, is that some pieces of that data were used in training. A model can also be 'watermarked' in a way that is hard to detect and that survives quantization and fine-tuning.

So you cannot have a reproducible model that is 'open source' in the strict sense of the term.
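To make the point about sample order concrete, here is a toy sketch, not anything resembling a real training pipeline. It fits a one-variable linear model with plain SGD; the `train` function, the synthetic dataset, and all hyperparameters are made up for illustration. The only thing that changes between runs is the shuffle seed, and the final weights still differ.

```python
import random

def train(data, shuffle_seed, epochs=5, lr=0.05):
    """Fit y = w*x + b with plain SGD.

    Hypothetical toy setup: the only thing shuffle_seed controls
    is the order in which samples are visited in each epoch.
    """
    rng = random.Random(shuffle_seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)  # random sample order, as in typical training loops
        for i in order:
            x, y = data[i]
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# Fixed synthetic dataset: y = 2x + 1 plus a little noise.
data_rng = random.Random(0)
xs = [data_rng.uniform(-1, 1) for _ in range(50)]
data = [(x, 2.0 * x + 1.0 + data_rng.gauss(0, 0.1)) for x in xs]

same_a = train(data, shuffle_seed=1)
same_b = train(data, shuffle_seed=1)
other = train(data, shuffle_seed=2)

print("seed 1:", same_a)
print("seed 1 again:", same_b)  # identical only because the seed was recorded
print("seed 2:", other)         # same data, different weights
```

In a real run the shuffle seed is only one of many such factors, so having the dataset alone is not enough to get the same weights back.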
