Thanks for sharing; I wasn't aware of this. I'm having trouble seeing how it differs from the byte-pair encoding (BPE) algorithm. More importantly, though, how can it run in linear time when it's recursive and the count of every pair has to be tallied again after each merge?
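To make the concern concrete, here is a minimal sketch of the naive merge loop I have in mind. It is an assumption about how a straightforward BPE-style implementation would look, not the algorithm from the link; the names `count_pairs`, `merge_pair`, and `naive_bpe` are made up for illustration.

```python
from collections import Counter


def count_pairs(seq):
    """Tally every adjacent pair in one pass over the sequence: O(len(seq))."""
    return Counter(zip(seq, seq[1:]))


def merge_pair(seq, pair, new_symbol):
    """Replace non-overlapping occurrences of `pair`, scanning left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def naive_bpe(seq, num_merges):
    """Naive loop (hypothetical): recount all pairs after every merge.

    Each iteration costs O(len(seq)), so the total is roughly
    O(num_merges * len(seq)) rather than linear in the input.
    """
    seq = list(seq)
    merges = []
    for _ in range(num_merges):
        counts = count_pairs(seq)  # the full re-tally the question is about
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:  # nothing repeats, so merging gains nothing
            break
        seq = merge_pair(seq, pair, pair[0] + pair[1])
        merges.append(pair)
    return seq, merges


if __name__ == "__main__":
    print(naive_bpe("abababab", 2))
```

A linear-time variant would presumably have to avoid the full recount, for instance by updating only the counts of pairs adjacent to each replaced occurrence. Whether the linked algorithm does that is exactly what I'm asking.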
yorwba|1 year ago
alexandermorgan|1 year ago