mluo | 1 year ago

It's simply because the model is small (1.5B parameters), which makes it sensitive to weight perturbations.
