item 42722932 (no title)

tigershark | 1 year ago
The biggest model that they have used has only 760M parameters, and it outperforms models 1 order of magnitude larger.

NotAnOtter | 1 year ago
Gah dmn