hansonw | 1 year ago
What Algorithms can Transformers Learn? A Study in Length Generalization https://arxiv.org/abs/2310.16028
shawntan | 1 year ago
With both empirical and theoretical support, I think it's pretty clear this is a limitation.