Data might be the limiting factor for current transformer architectures, but there's no reason to believe it's a fundamental limit for language models in general (e.g. human brains are "trained" on orders of magnitude less data and still generally outperform any model available today).
hnaccount_rng | 7 months ago