top | item 42722932

(no title)

tigershark | 1 year ago

The largest model they used has only 760M parameters, and it outperforms models an order of magnitude larger.
