CuriousJ | 1 year ago

This paper shows that a chunk size of 200-800 is ideal; above that range, the model starts getting confused or distracted. https://arxiv.org/pdf/2406.14497
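For anyone who wants to try capping chunk size, here is a minimal sketch. It makes some assumptions: the comment does not say what unit the 200-800 range is in, so this counts whitespace-separated words as a rough stand-in for tokens, and `chunk_words` and `max_words` are hypothetical names, not anything taken from the paper.

```python
def chunk_words(text, max_words=800):
    """Split text into consecutive chunks of at most max_words words.

    Assumption: words are whitespace-separated and serve only as a rough
    proxy for tokens; a real pipeline would count with its model's tokenizer.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in strides of max_words.
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    chunks = chunk_words("word " * 2000, max_words=800)
    print([len(c.split()) for c in chunks])
```

This splits purely by count, so a chunk can end mid-sentence; splitting on paragraph or function boundaries first and only then applying the cap is a common refinement.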

zaptrem | 1 year ago

Makes sense. Thanks!