thatguysaguy | 4 months ago
This doesn't have an explicit diffusion tie-in, but Savinov et al. at DeepMind figured out that doing two steps at training time and randomizing the masking probability is enough to get it to work reasonably well.
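For anyone who wants to see the "randomizing the masking probability" part concretely, here is a minimal sketch. It is not taken from the paper: the uniform distribution over rates, the `[MASK]` string, and the function name `random_rate_mask` are all assumptions made for illustration. The two-step part (feeding the model's own predictions back in for a second pass) needs an actual model and is not shown.

```python
import random

MASK = "[MASK]"  # assumption: placeholder mask token, not from the paper


def random_rate_mask(tokens, rng):
    """Mask tokens using a rate drawn fresh for this one example.

    Hypothetical helper: the rate is sampled from Uniform(0, 1), which is
    an assumption, and each token is then masked independently at that rate.
    """
    rate = rng.random()
    masked = [MASK if rng.random() < rate else tok for tok in tokens]
    return masked, rate


rng = random.Random(0)  # seeded only so repeated runs match
tokens = "the quick brown fox jumps over the lazy dog".split()
masked, rate = random_rate_mask(tokens, rng)
print(f"rate={rate:.2f}", " ".join(masked))
```

Because the rate changes per example, the model sees everything from lightly masked to almost fully masked inputs, rather than one fixed fraction.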
thatjoeoverthr|4 months ago
https://joecooper.me/blog/crosstalk/
I’ve still got a few ideas to try, though, so I’m not done having fun with it.
Anon84|4 months ago
"The [MASK]" "The quick [MASK]" etc
binarymax|4 months ago