item 38934640

mgreg | 2 years ago

I very much appreciate that the authors not only published their code (https://github.com/llm-random/llm-random) but also included the dataset they used (C4, available on Hugging Face: https://huggingface.co/datasets/c4), along with the training process and hyperparameters, so others can replicate and build on their work. The only thing really missing is the weights, which would be nice to have on Hugging Face as well.

swells34 | 2 years ago

It's very confusing to me that you are praising the authors of a published scientific paper for almost making their work reproducible.

chaxor | 2 years ago

If we had proper data version control, wherein the git commit hash was tied directly to the output data hash and hosted on IPFS (and the build system checked IPFS for cached artifacts the way make checks local files), then it would be absolutely reproducible.

And the wonderful thing is, everyone who cloned this repo and ran it would be serving the NN weights.

But alas, this unfortunately hasn't been done yet.
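The idea above (tie a git commit hash to the content hash of the data it produced, and skip rebuilding when the hash still matches) can be sketched without any IPFS dependency. This is a minimal illustration, not an existing tool: the function names, the JSON manifest format, and the use of SHA-256 in place of a real IPFS CID are all my own assumptions.

```python
import hashlib
import json
from pathlib import Path

def artifact_digest(path: Path) -> str:
    # Content-address an artifact by its SHA-256 digest.
    # (IPFS CIDs are likewise derived from content hashes.)
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def record_provenance(commit: str, artifact: Path, manifest: Path) -> dict:
    # Tie a git commit hash to the digest of the data that commit produced.
    entry = {
        "commit": commit,
        "artifact": artifact.name,
        "sha256": artifact_digest(artifact),
    }
    manifest.write_text(json.dumps(entry, indent=2))
    return entry

def is_cached(manifest: Path, artifact: Path) -> bool:
    # Make-style cache check: the artifact is valid only while its
    # digest still matches the one recorded in the manifest.
    if not (manifest.exists() and artifact.exists()):
        return False
    entry = json.loads(manifest.read_text())
    return entry["sha256"] == artifact_digest(artifact)
```

A build step would call `is_cached` first and only retrain (or fetch from IPFS) on a miss; any change to the artifact, or a manifest recorded under a different commit's output, invalidates the cache.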

astrange | 2 years ago

That's not what confusing means.

_ea1k | 2 years ago

The weights aren't needed to make it reproducible. The code and training data are what's needed. Hopefully, if you used those, you'd ultimately reach the same result.

jakderrida | 2 years ago

It's a sad world where our standards are that low. But they are that low for good reasons.

mgreg | 2 years ago

I understand where you're coming from, but what they provided DOES make their work reproducible. You can use the data, source code, and recipe to train the model and obtain the weights.

It would be nice if they also provided the weights so the model could be USABLE without the effort or expertise required to train it.

We (I think) would all like to see more _truly_ open models (not just the source code) that enable collaboration in the community.