top | item 40937272


The question is why someone would need to search through terabytes of data.

If you are not Google Cloud, with workers standing by to stream all the data across x workers in parallel, I would enforce useful limits on queries, and for broad searches I would add a background system.

Start your query, then come back later or get streaming results.
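To make the "start it, come back later or stream" idea concrete, here is a rough Python sketch. Everything in it is made up for illustration: the `BackgroundSearch` class, the `max_chunks` limit, and the assumption that the data is already split into in-memory chunks of lines. A real system would read from disk or from storage nodes instead.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel marking the end of the result stream


class BackgroundSearch:
    """Hypothetical background job: scans chunks in parallel and streams matches."""

    def __init__(self, chunks, needle, max_workers=4, max_chunks=1000):
        # Enforce a limit so a single broad query cannot tie up all workers.
        if len(chunks) > max_chunks:
            raise ValueError("query too broad: narrow it or raise max_chunks")
        self._chunks = chunks
        self._needle = needle
        self._max_workers = max_workers
        self._results = queue.Queue()
        # The search starts immediately and runs in the background.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _scan(self, chunk):
        # Plain substring match; each match is published as soon as it is found.
        for line in chunk:
            if self._needle in line:
                self._results.put(line)

    def _run(self):
        try:
            with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
                # list() forces all scans to finish and surfaces worker errors.
                list(pool.map(self._scan, self._chunks))
        finally:
            self._results.put(_DONE)

    def stream(self):
        """Yield matches as they arrive; returns once the search is finished."""
        while True:
            item = self._results.get()
            if item is _DONE:
                # Put the sentinel back so later calls also terminate.
                self._results.put(_DONE)
                return
            yield item

    def wait(self):
        """Come back later: block until done, then return the remaining matches."""
        self._thread.join()
        return list(self.stream())
```

You would call `BackgroundSearch(chunks, "error")` to kick off the query, then either loop over `job.stream()` to see hits as they come in or call `job.wait()` later. Results arrive in whatever order the workers find them, so do not rely on ordering.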

On the other hand, if not too many people are constantly searching in parallel and you go with data pods like Backblaze's, just add a little more CPU and memory and use the CPUs of the data pods for parallelization. That should still be much cheaper than putting it on S3 or the cloud.
