item 41356267 (no title)
haensi | 1 year ago
Very interesting to see an LLM with weights and the code base. They also talk about tokenizer fertility in the HF model card [1].

[1]: https://huggingface.co/Aleph-Alpha/Pharia-1-LLM-7B-control
jamesblonde | 1 year ago
"Tokenizer fertility is a metric used to evaluate tokenizer performance and measures a tokenizer’s ability to represent text, calculated by dividing the number of tokens in a text (after tokenizing) by the number of words in that same text"
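For anyone who wants to see the arithmetic, here is a rough sketch of that definition. It assumes "words" means whitespace-separated chunks, which the quote does not specify, and `toy_tokenize` is a made-up stand-in that cuts words into fixed-size pieces. It is not the Pharia tokenizer or any real library's API.

```python
from typing import Callable, List


def fertility(text: str, tokenize: Callable[[str], List[str]]) -> float:
    """Tokens per word: number of tokens divided by number of words.

    Assumption: a "word" is a whitespace-separated chunk of the text.
    """
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    return len(tokenize(text)) / len(words)


def toy_tokenize(text: str, piece_len: int = 4) -> List[str]:
    """Hypothetical tokenizer: cut each word into chunks of at most piece_len chars."""
    return [
        word[i:i + piece_len]
        for word in text.split()
        for i in range(0, len(word), piece_len)
    ]


sample = "Tokenizer fertility measures tokens per word"
score = fertility(sample, toy_tokenize)  # 12 tokens over 6 words
print(f"{score:.2f}")  # prints 2.00
```

A fertility of 1.0 would mean one token per word on average; higher values mean the tokenizer splits words into more pieces. To measure a real tokenizer you would pass its own tokenize function in place of `toy_tokenize`.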