item 40334925

kacperlukawski | 1 year ago

Why is that an issue? Training the tokenizer seems much more straightforward than training the model, since it is based on the statistics of the input data. I guess it may take a while for massive datasets, but is it really infeasible to calculate the frequencies at a larger scale?
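For context, the "statistics of the input data" the comment refers to is essentially pair-frequency counting, as in byte-pair encoding (BPE). Below is a minimal sketch of BPE training, assuming a whitespace-tokenized toy corpus and a hypothetical `train_bpe` helper; real tokenizer libraries add byte-level handling, special tokens, and parallelized counting, but the core loop is just repeated frequency tallies and merges:

```python
from collections import Counter

def train_bpe(corpus, num_merges):
    """Learn `num_merges` merge rules from a whitespace-split corpus."""
    # Represent each word as a tuple of symbols (initially characters).
    words = Counter(tuple(word) for word in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count frequencies of adjacent symbol pairs across the corpus.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the most frequent pair as a new merged symbol everywhere.
        merged = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] += freq
        words = merged
    return merges

merges = train_bpe("low low low lower lowest", 3)
```

Each pass over the corpus is embarrassingly parallel (the pair counts from shards just sum), which supports the point that scaling this up is mostly an engineering question rather than a fundamental obstacle.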
