top | item 47087199

cubefox | 9 days ago

This doesn't mention the main drawback of diffusion language models, which is also the main reason nobody is using them: they score significantly lower on benchmarks than autoregressive models of similar size.
