michaelscott | 10 days ago
I'm not sure I'm up to date on the latest diffusion work, but I'm genuinely curious: how do you see diffusion models making LLMs more deterministic? Diffusion models usually work by sampling too, and the transformer architecture seems better suited to long-context problems than diffusion.
LoganDark | 10 days ago
michaelscott | 10 days ago