top | item 22294548

(no title)

octbash | 6 years ago

One is a language generation model, the other is a fill-in-the-blank model. The two sound similar, but in practice the objectives are different enough that the models learn different things. The "bi-directional" aspect of BERT-type models matters in particular: they see the context on both sides of the blank, whereas a generation model only sees what came before the word it is predicting.
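To make the difference concrete, here is a rough sketch in plain Python of what the training examples look like under each objective. The helper names (`causal_examples`, `masked_examples`) and the toy sentence are made up for illustration and are not any library's API. It also hides exactly one token per example to keep things simple, whereas real BERT-style training masks a random subset of tokens.

```python
def causal_examples(tokens):
    """Generation objective: predict each token from the tokens before it."""
    # The context is always a prefix, so the model never sees the future.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_examples(tokens, mask="[MASK]"):
    """Fill-in-the-blank objective: hide one token, predict it from both sides."""
    # The context keeps everything to the left AND right of the blank.
    return [
        (tokens[:i] + [mask] + tokens[i + 1:], tokens[i])
        for i in range(len(tokens))
    ]


tokens = ["the", "cat", "sat", "down"]

print("generation (left context only):")
for context, target in causal_examples(tokens):
    print("  ", context, "->", target)

print("fill-in-the-blank (context on both sides):")
for context, target in masked_examples(tokens):
    print("  ", context, "->", target)
```

In the first case the model predicting "cat" only ever sees "the"; in the second it also sees "sat down". That is the "bi-directional" part, and it is why the two setups end up learning different things even though both are "predict the missing word".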
