stephantul | 28 days ago

The compression algorithm is very similar to a greedy (longest-match-first) subword tokenizer, the kind used by WordPiece in BERT and other older language models, which has since lost ground to BPE.
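
For anyone who hasn't seen one, here is a minimal sketch of greedy longest-match subword tokenization, roughly in the style of BERT's WordPiece. The function name `greedy_tokenize`, the toy vocabulary, and the `##` continuation prefix are my own illustrative choices, and this is not a claim about how the compression algorithm under discussion is implemented.

```python
def greedy_tokenize(word, vocab, unk="[UNK]", cont="##"):
    """Split `word` into subwords by repeatedly taking the longest
    vocabulary entry that matches at the current position."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while end > start:
            piece = word[start:end]
            if start > 0:
                # Non-initial pieces carry a continuation prefix.
                piece = cont + piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            # No vocabulary entry matches here: give up on the whole word.
            return [unk]
        tokens.append(match)
        start = end
    return tokens


# Hypothetical toy vocabulary, for illustration only.
vocab = {"un", "##aff", "##able", "##a", "##ff"}
print(greedy_tokenize("unaffable", vocab))  # ['un', '##aff', '##able']
```

The key difference from BPE is that nothing is merged at tokenization time: the vocabulary is fixed, and each step simply commits to the longest match without backtracking.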
