(no title)
Zvez | 10 months ago
>if you're reading from, like, big Parquet files, that probably means lots of random reads
And it usually also means you shouldn't be using S3 for workloads like this in the first place, because they tend to be very inefficient on S3 compared to a distributed fs. Unless you have some prefetch/cache layer in front of it, you'll get both bad timings and higher costs.
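To make the prefetch/cache point concrete, here is a minimal sketch of what such a layer could look like. It assumes a hypothetical `fetch(offset, length)` callable standing in for one ranged GET against the object store. `BlockCache`, the block size, and the LRU limit are all illustrative choices, not anything a particular library provides.

```python
from collections import OrderedDict


class BlockCache:
    """Serve small random reads from fixed-size blocks kept in an LRU cache."""

    def __init__(self, fetch, size, block_size=8 * 1024 * 1024, max_blocks=16):
        self.fetch = fetch            # hypothetical: fetch(offset, length) -> bytes
        self.size = size              # total object size in bytes
        self.block_size = block_size
        self.max_blocks = max_blocks
        self.blocks = OrderedDict()   # block index -> bytes, oldest first
        self.requests = 0             # number of backend fetches issued

    def _block(self, index):
        if index in self.blocks:
            self.blocks.move_to_end(index)   # mark as most recently used
            return self.blocks[index]
        start = index * self.block_size
        length = min(self.block_size, self.size - start)
        data = self.fetch(start, length)     # one large read instead of many small ones
        self.requests += 1
        self.blocks[index] = data
        if len(self.blocks) > self.max_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used block
        return data

    def read(self, offset, length):
        end = min(offset + length, self.size)
        out = bytearray()
        while offset < end:
            index, within = divmod(offset, self.block_size)
            chunk = self._block(index)[within:within + (end - offset)]
            if not chunk:                    # backend returned less than expected
                break
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

Nearby reads then hit the same cached block, so many small random reads collapse into a few large requests. Whether that actually helps depends on how clustered the access pattern is: with truly scattered reads you mostly pay for fetching bytes you never use.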
CobrastanJorji | 10 months ago