stephantul | 28 days ago

The compression algorithm is very similar to the greedy longest-match subword tokenizer used in BERT and other older language models, an approach that has since become less popular than BPE.
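For anyone who hasn't seen the greedy approach, here is a minimal sketch of longest-match-first subword splitting in the style BERT's tokenizer uses. The function name `greedy_tokenize`, the toy vocabulary, and the `##` continuation prefix handling are my own illustrative assumptions, not code from the linked project or from any particular library.

```python
def greedy_tokenize(word, vocab, unk="[UNK]", prefix="##"):
    """Split `word` into the longest vocabulary pieces, scanning left to right."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it appears in the vocabulary.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                # Non-initial pieces carry a continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position, so the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, for illustration only.
toy_vocab = {"un", "##aff", "##able", "##a", "##ff", "a"}
print(greedy_tokenize("unaffable", toy_vocab))  # ['un', '##aff', '##able']
```

The point of the comparison is that each step commits to the longest match available at the current position and never backtracks, whereas BPE builds tokens by applying learned merge rules in order.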