abeppu|10 days ago
Diffusion model papers are always interesting to read, but I keep feeling they need some mechanism to insert or delete tokens.
In the example in the figure in this post, once it has fixed "British munchkin cats _ _ and ..." you _can't_ get to "British munchkin cats are a new and controversial breed." because there's not the right number of tokens between "cats" and "and".
In a coding context, if your model samples a paren or a comma or something which is entirely plausible at that position, it can still close off an expansion which would be syntactically correct.
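To make the fixed-length point concrete, here's a minimal sketch (the `fill_masks` helper and the template are invented for illustration): a masked-diffusion-style decoder can only replace mask slots in place, so the token count between already-fixed tokens is locked in.

```python
# Hypothetical sketch: masked-diffusion decoding fills a fixed-length
# template. It can replace MASK slots but never change the token count.
MASK = "_"

def fill_masks(template, proposals):
    """Fill each MASK slot, left to right, with a proposed token."""
    out = list(template)
    it = iter(proposals)
    for i, tok in enumerate(out):
        if tok == MASK:
            out[i] = next(it)
    return out

# Only two slots exist between "cats" and "and", so a five-token
# continuation like "are a new and controversial" simply cannot fit.
template = ["British", "munchkin", "cats", MASK, MASK, "and"]
print(fill_masks(template, ["are", "new"]))
```

Whatever two tokens the model samples, the sentence shape is already committed; no amount of further denoising can make room for more.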
kazinator|10 days ago
Once you get to "British cats <next-token-here>" you can't get to "British munchkin cats <next-token-here>"; the tokens to the left are done and dusted.
It's kind of a feature. Diffusion is used for images, right? It's like saying, once the image of a door has started to form right next to a kitchen counter, it cannot insert a refrigerator there any more. Well, maybe it doesn't "want to" because that layout is already settled by that time.
crystal_revenge|10 days ago
Furthermore, you're applying the logic of AR LLMs to diffusion models. AR LLMs only model the probability of the next token (a chain of conditional probabilities), while diffusion LLMs model the probability of the entire output at once. Because of this, token structures that lead to invalid outputs should be extremely low probability if the model is properly trained.
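The distinction can be shown with a toy two-token model (the vocabulary and probabilities here are invented for illustration): an AR model samples from the conditionals left to right, while a well-trained diffusion model would target the same joint distribution, so globally invalid sequences get low joint mass.

```python
# Toy illustration of the chain rule: p(x1, x2) = p(x1) * p(x2 | x1).
# An AR model only ever works with p(x1) and p(x2 | x1); a diffusion
# model is trained to match the joint p(x1, x2) directly.
p1 = {"a": 0.6, "b": 0.4}                 # p(x1)
p2 = {"a": {"a": 0.1, "b": 0.9},          # p(x2 | x1 = "a")
      "b": {"a": 0.5, "b": 0.5}}          # p(x2 | x1 = "b")

joint = {(x1, x2): p1[x1] * p2[x1][x2]
         for x1 in p1 for x2 in p1}

# The joint is a proper distribution, and e.g.
# p("a", "b") = 0.6 * 0.9 ≈ 0.54.
assert abs(sum(joint.values()) - 1.0) < 1e-9
```

A sequence that is "locally plausible at each position" but globally inconsistent is exactly the kind of thing the joint view can penalize even when each conditional looks fine.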
cubefox|8 days ago
However, I believe this would "only" be able to insert tokens, not to delete tokens it mistakenly produced earlier. (The deletion in the title refers to the reverse process during training, where tokens are progressively deleted rather than masked.)