
reedciccio | 9 months ago

Can you point at the research that says that the training process of an LLM at least the size of OLMo or Pythia is deterministic?


timschmidt | 9 months ago

Can you point to something that says it's not? The only source of non-determinism I've read about affecting LLM training is floating-point error, which is well understood and easy enough to work around.
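
To make the "worked around" part concrete, here is a minimal sketch of the settings usually cited for pinning down a single-machine PyTorch run; the helper name make_deterministic is illustrative, and the exact knobs vary with framework, CUDA version, and hardware:

    import os
    import random

    import numpy as np
    import torch

    def make_deterministic(seed: int = 0) -> None:
        # Seed every RNG the training loop might touch.
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)

        # PyTorch requires this cuBLAS workspace setting for deterministic
        # matmul reductions on CUDA 10.2+; set it before the first CUDA call.
        os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

        # Fail loudly if an op has no deterministic implementation, and stop
        # cuDNN from autotuning its way onto different kernels each run.
        torch.use_deterministic_algorithms(True)
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True

Whether settings like these carry over to multi-node runs at OLMo or Pythia scale is, of course, the question being argued here.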

reedciccio | 9 months ago

Search more: there is a lot of literature discussing how hard reproducibility is for GenAI/LLMs/deep learning, how far we are from solving it even for trivial or small models (let alone for beasts the size of the most powerful ones), and even whether the whole exercise is pointless.