top | item 41356267

haensi | 1 year ago

Very interesting to see an LLM released with both its weights and its code base. They also discuss tokenizer fertility in the HF model card [1].

[1]: https://huggingface.co/Aleph-Alpha/Pharia-1-LLM-7B-control

jamesblonde|1 year ago

"Tokenizer fertility is a metric used to evaluate tokenizer performance and measures a tokenizer’s ability to represent text, calculated by dividing the number of tokens in a text (after tokenizing) by the number of words in that same text"
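
To make that definition concrete, here is a rough Python sketch of the ratio (tokens divided by words). It is not taken from the model card: the `fertility` helper and the `toy_chunk_tokenizer` stand-in are hypothetical names, and counting words by splitting on whitespace is an assumption, since the quote does not say how words are counted.

```python
from typing import Callable, List


def fertility(text: str, tokenize: Callable[[str], List[str]]) -> float:
    """Return tokens per word, counting words by whitespace split (an assumption)."""
    words = text.split()
    if not words:
        raise ValueError("text contains no words")
    return len(tokenize(text)) / len(words)


def toy_chunk_tokenizer(text: str, size: int = 4) -> List[str]:
    """Hypothetical stand-in tokenizer: cut each word into chunks of up to `size` characters."""
    return [w[i:i + size] for w in text.split() for i in range(0, len(w), size)]


sample = "Tokenizer fertility measures tokens per word"
score = fertility(sample, toy_chunk_tokenizer)
print(f"fertility: {score:.2f}")
```

Any real tokenizer can be dropped in by passing a function that maps a string to a list of tokens. A fertility of 1.0 would mean one token per word on average, and higher values mean the tokenizer splits words into more pieces.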