top | item 44900337

sadiq | 6 months ago

You might find https://arxiv.org/abs/2401.17377v3 interesting.

JPLeRouzic|6 months ago

Only if you have access to corporate-level hardware:

"It took us 48 hours to build the suffix array for RedPajama on a single node with 128 CPUs and 1TiB RAM"

protomikron|6 months ago

It's okay-ish. Considering that 64G to 128G of RAM is available to high-end (nerd) consumers, you're only off by a factor of 5 (if we can squeeze out a little more performance).

That is pretty astonishing in my opinion.
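
For reference, the RAM gap alone can be worked out directly. This is a rough sketch, assuming "64G" and "128G" mean GiB and ignoring the differences in CPU count and build time; the names `paper_ram` and `ram_gap` are made up for illustration:

```python
# Binary units, assuming the thread's "G" means GiB.
TIB = 2**40
GIB = 2**30

# RAM of the single node quoted from the paper: 1 TiB.
paper_ram = 1 * TIB


def ram_gap(consumer_gib: int) -> float:
    """How many times more RAM the quoted node has than a consumer machine."""
    return paper_ram / (consumer_gib * GIB)


for gib in (64, 128):
    print(f"{gib} GiB -> {ram_gap(gib):.0f}x less RAM than the 1 TiB node")
```

On memory alone the gap comes out at 8x to 16x, so getting down to a factor of 5 would depend on the extra performance mentioned above.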