entilzha | 1 year ago
Good description! Maybe what the parent got mixed up on is that an alternate way to view this is as trying to chunk bytes so that each chunk carries roughly similar information. E.g., we initially tried a bunch of patching schemes, such as keeping a running total of entropy until the total exceeds a threshold, but ended up finding that simple things worked better.
I'll see if we can add more information about the small CNN in the next update to the arXiv paper.
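For readers who want something concrete, here is a rough sketch of the running-total scheme mentioned above. It is an illustration only, not the paper's code: the function name `patch_by_entropy_budget`, the threshold value, and the assumption that per-byte entropies have already been computed by some small model are all hypothetical.

```python
from typing import List, Sequence


def patch_by_entropy_budget(
    entropies: Sequence[float], threshold: float
) -> List[List[int]]:
    """Group byte positions into patches of roughly similar total entropy.

    A patch is closed as soon as the running sum of its per-byte
    entropies exceeds `threshold` (a hypothetical tuning knob).
    """
    patches: List[List[int]] = []
    current: List[int] = []
    total = 0.0
    for i, h in enumerate(entropies):
        current.append(i)
        total += h
        if total > threshold:
            # Budget exceeded: close this patch and start a new one.
            patches.append(current)
            current = []
            total = 0.0
    if current:
        # Keep any trailing bytes that never filled the budget.
        patches.append(current)
    return patches


example = patch_by_entropy_budget([0.5, 0.5, 0.5, 2.0, 0.1, 0.1], threshold=1.0)
```

Low-entropy bytes get grouped into long patches while a single surprising byte can end up in a patch of its own, which is the "roughly similar information per chunk" view.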
cschmidt|1 year ago
https://aclanthology.org/Y03-1017/ https://aclanthology.org/I05-1009/ https://aclanthology.org/P06-2056/
These use exactly the same approach: start a new segment wherever the entropy goes up compared to the previous byte.
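A minimal sketch of that rule, under the assumption that `entropies[i]` is some model's uncertainty when predicting byte `i` from the bytes before it. The helper name `segment_on_entropy_rise` is made up for illustration and is not taken from the linked papers.

```python
from typing import List, Sequence


def segment_on_entropy_rise(text: bytes, entropies: Sequence[float]) -> List[bytes]:
    """Split `text` before every byte whose entropy is higher than the
    entropy of the byte immediately preceding it."""
    if len(text) != len(entropies):
        raise ValueError("need exactly one entropy value per byte")
    segments: List[bytes] = []
    start = 0
    for i in range(1, len(text)):
        if entropies[i] > entropies[i - 1]:
            # Entropy went up: byte i begins a new segment.
            segments.append(text[start:i])
            start = i
    if text:
        segments.append(text[start:])
    return segments


# Toy values: uncertainty falls inside a word and jumps at the next word's start.
example_segments = segment_on_entropy_rise(
    b"thecat", [3.0, 1.0, 0.5, 2.5, 1.0, 0.4]
)
```

With real entropies the curve is noisier than in this toy input, so a practical variant would probably require the rise to exceed some margin before splitting.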
ted_dunning|1 year ago
https://dspace.mit.edu/handle/1721.1/7191?show=full
entilzha|1 year ago
psb217|1 year ago
yorwba|1 year ago