item 38406896


closetnerd | 2 years ago

My understanding is that they are all still transformers. The tweaks are more about quantization, about generalizing over data more efficiently (so fewer parameters are required), and about improvements to the training data/process itself.

Otherwise, I'd like to know specifically what's better/improved between the models themselves.
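To make the quantization point concrete, here is a minimal sketch of symmetric per-tensor int8 post-training quantization. This is an illustrative toy, not any particular model's actual scheme; the function names and the single-scale design are assumptions for the example.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Map float32 weights to int8 using one per-tensor scale (toy scheme)."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from int8 values and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32
print(w.nbytes // q.nbytes)  # 4
# rounding error per weight is bounded by half the scale
print(float(np.abs(w - w_hat).max()) <= scale)  # True
```

The memory saving (4x here) is the usual motivation; real schemes (per-channel scales, GPTQ, 4-bit formats) refine this basic idea to keep accuracy.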

