top | item 39482925


throwaway19423 | 2 years ago

I am confused how all these things are able to interoperate. Are the creators of these models following the same IO for their models? Won't the tokenizer or token embedder be different? I am genuinely confused by how the same code works for so many different models.


brucethemoose2 | 2 years ago

It's complicated, but basically because most are llama architecture. Meta all but set the standard for open source LLMs when they released llama1, and anyone trying to deviate from it has run into trouble because their models don't work with the hyper-optimized llama runtimes.

Also, there's a lot of magic going on behind the scenes: GGUF and Hugging Face format models ship configs describing their architecture and hyperparameters, and the libraries that load them read those configs to build the right model. There are different tokenizers, but they mostly follow the same standards, and the tokenizer definition ships alongside the weights too.
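A minimal sketch of how that config-driven loading lets one codebase serve many models. The dicts and builder functions below are hypothetical stand-ins, not the real transformers or llama.cpp internals; the point is just the dispatch-on-declared-architecture pattern:

```python
# Each checkpoint ships a config (config.json in Hugging Face repos,
# key/value metadata in GGUF) naming its architecture and hyperparameters.
# The runtime reads that config and builds the matching model class.
# These configs and builders are illustrative, not real library code.

CONFIGS = {
    "llama-7b":   {"model_type": "llama",   "hidden_size": 4096},
    "mistral-7b": {"model_type": "mistral", "hidden_size": 4096},
}

# Hypothetical builders for each architecture family.
def build_llama(cfg):
    return ("LlamaModel", cfg["hidden_size"])

def build_mistral(cfg):
    return ("MistralModel", cfg["hidden_size"])

# Registry mapping the declared model_type to a constructor --
# the same pattern real loaders use to dispatch on the config.
REGISTRY = {"llama": build_llama, "mistral": build_mistral}

def load(name):
    cfg = CONFIGS[name]                      # in reality: parse config.json / GGUF metadata
    return REGISTRY[cfg["model_type"]](cfg)  # dispatch on the declared architecture

print(load("llama-7b"))    # ('LlamaModel', 4096)
print(load("mistral-7b"))  # ('MistralModel', 4096)
```

Since most open models declare a llama-family architecture (or a close variant like mistral), the same few registry entries cover nearly everything, which is why identical code runs so many different checkpoints.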