top | item 43642865

Zvez | 10 months ago

calling everything 'for AI' is the new standard

>if you're reading from, like, big Parquet files, that probably means lots of random reads

and it also usually means that you shouldn't use S3 in the first place for workloads like this, because random reads against S3 are usually very inefficient compared to a distributed FS. Unless you have some prefetch/cache layer, you will get both bad timings and higher costs
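
To illustrate what a prefetch/cache layer can look like, here is a minimal sketch, assuming a hypothetical `fetch(offset, length)` callable that stands in for one ranged request against the blob store. The class name `BlockCache` and its parameters are made up for illustration and are not from any library. It rounds each small random read up to large aligned blocks and keeps recent blocks in an LRU, so nearby reads reuse one request instead of issuing many.

```python
from collections import OrderedDict
from typing import Callable


class BlockCache:
    """Read-through LRU cache serving small random reads from large aligned blocks."""

    def __init__(self, fetch: Callable[[int, int], bytes],
                 block_size: int = 8 * 1024 * 1024, max_blocks: int = 16):
        self.fetch = fetch            # hypothetical: one ranged request per call
        self.block_size = block_size  # size of each aligned block
        self.max_blocks = max_blocks  # LRU capacity, in blocks
        self.blocks = OrderedDict()   # block index -> bytes
        self.requests = 0             # number of backend requests issued

    def _block(self, index: int) -> bytes:
        if index in self.blocks:
            self.blocks.move_to_end(index)  # mark as most recently used
            return self.blocks[index]
        data = self.fetch(index * self.block_size, self.block_size)
        self.requests += 1
        self.blocks[index] = data
        if len(self.blocks) > self.max_blocks:
            self.blocks.popitem(last=False)  # evict least recently used
        return data

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        end = offset + length
        while offset < end:
            index, start = divmod(offset, self.block_size)
            chunk = self._block(index)[start:start + (end - offset)]
            if not chunk:
                break  # read ran past the end of the object
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

The block size is the main knob: larger blocks mean fewer requests but more bytes transferred that may never be used, so whether this helps depends on how clustered the reads are.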

CobrastanJorji | 10 months ago

But a distributed FS is far more expensive than cloud blob storage would be, and I can't imagine most workloads would need the features of a POSIX filesystem.