cactusfrog | 7 months ago

This is really interesting. I think force fields in molecular dynamics have undergone a similar NN revolution: you train a NN on the output of expensive calculations, replacing the expensive function with a cheap one. Could you train a small language model with a big one?

lossolo | 7 months ago

> Could you train a small language model with a big one?

Yes, it's called distillation.
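
For anyone curious what that looks like in practice, here is a minimal sketch of one common form of the loss: the small "student" model is trained to match the big "teacher" model's softened output distribution. Everything here is illustrative and assumed rather than taken from any particular library: the function names, the temperature value, and the toy logits are all made up, and real setups usually combine this term with an ordinary cross-entropy loss on the ground-truth labels.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    Hypothetical helper: a higher temperature flattens both distributions,
    so the student also learns from the teacher's low-probability outputs.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target)
    log_q = log_softmax(student_logits, temperature)  # student (prediction)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

# Toy logits over a 3-token vocabulary (made-up numbers).
teacher = [2.0, 1.0, 0.1]
matching_student = [2.0, 1.0, 0.1]
mismatched_student = [0.1, 1.0, 2.0]

loss_match = distillation_loss(matching_student, teacher)
loss_mismatch = distillation_loss(mismatched_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as they diverge, so minimizing it over many inputs pushes the cheap model toward the expensive one's behavior, much like fitting a NN force field to expensive reference calculations.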