top | item 43967879

entilzha | 9 months ago

Great to see our paper here again! Since the paper's release, we've also released model weights for anyone interested in building on top of it: https://huggingface.co/facebook/blt. We also added HF Hub code to easily load the model: https://github.com/facebookresearch/blt?tab=readme-ov-file#l....

accassar | 9 months ago

The thing that stood out for me was the use of n-gram hashes as an additional feature set. My understanding is that this is typically used as a positional feature.

Is this a limitation of the byte patches in that the positional information needs to be augmented?