MrLeap | 8 months ago
This made me laugh.
You seem like you may know something I've been curious about.
I'm a shader author these days and haven't been a data scientist for a while, so that's going to distort my vocab.
Say you've got a trained neural network living in a 512x512 structured buffer. It's doing great, but you get a new video card with more memory, so you can afford to migrate it to a 1024x1024 buffer. Is the state-of-the-art approach to retrain on the same data with a bigger parameter count, or are there other methods that smear the old weights over a larger space to get a leg up? Does anything like this accelerate training time?
... can you upsample a language model like you can a low-res anime profile picture? I wonder what the made-up words would be like.
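For concreteness, the naive thing I'm picturing is just image-style interpolation on the weight buffer. A minimal PyTorch sketch (the sizes are the ones above, the tensor contents are random stand-ins, and whether this gives a useful initialization is exactly what I'm asking):

```python
# Naive "smear the old weights over a larger space": treat the
# 512x512 weight buffer like an image and bilinearly upsample it.
import torch
import torch.nn.functional as F

old_weights = torch.randn(512, 512)  # stand-in for the trained buffer

# interpolate() with bilinear mode expects a 4D (batch, channels,
# height, width) tensor, so add two leading singleton dimensions.
new_weights = F.interpolate(
    old_weights[None, None],   # shape (1, 1, 512, 512)
    size=(1024, 1024),
    mode="bilinear",
    align_corners=False,
).squeeze()                    # back to (1024, 1024)

print(new_weights.shape)  # torch.Size([1024, 1024])
```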
kouteiheika | 8 months ago
Yes, there are methods like this; see for example depth up-scaling, where you grow a trained model by duplicating some of its layers and then continue training, which gets you to a bigger model far more cheaply than training it from scratch[1] (rough sketch below the links). You have to be careful about the "same data" part though; ideally you want to train only once on unique data[2], since excessive duplication can harm the performance of the model[3], although if you have limited data a couple of training epochs might be safe and can actually improve the model[4].
[1] -- https://arxiv.org/abs/2312.15166
[2] -- https://arxiv.org/abs/1906.06669
[3] -- https://arxiv.org/abs/2205.10487
[4] -- https://galactica.org/static/paper.pdf
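To make the depth up-scaling from [1] concrete, here's roughly what the layer-duplication step amounts to. This is an illustrative sketch with made-up layer types and sizes; the real recipe operates on transformer blocks (e.g. 32 layers -> 48 by overlapping the first 24 and last 24) and is followed by continued pretraining:

```python
# Sketch of depth up-scaling: grow a trained model by copying its
# layers instead of re-initializing, then continue training.
import copy
import torch.nn as nn

def depth_upscale(layers: nn.ModuleList, keep: int) -> nn.ModuleList:
    """Stack the first `keep` and last `keep` layers of a trained
    model into a deeper one; the copies overlap in the middle when
    2 * keep > len(layers)."""
    top = [copy.deepcopy(layer) for layer in layers[:keep]]
    bottom = [copy.deepcopy(layer) for layer in layers[-keep:]]
    return nn.ModuleList(top + bottom)

# Toy stand-in for a trained 32-layer model (real models use
# transformer blocks, not bare Linear layers).
trained = nn.ModuleList(nn.Linear(64, 64) for _ in range(32))
bigger = depth_upscale(trained, keep=24)  # 32 layers -> 48 layers
print(len(bigger))  # 48
```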