sqlcook | 10 years ago
On a 100-node cluster I had roughly 500 GB on each node. This was not a single index but multiple indexes, with roughly 8 shards per index per node. Getting the shard count right is pretty important.
I did not manually control document routing (it was hard given the type of data I was ingesting), so it was set to auto, and during the load I observed hotspots in the cluster (you have to look at the BULK thread/queue length): some nodes were getting bursts of docs while others were idle. Roughly 40-50% of the nodes in the cluster were underutilized, and maybe 5-10% had hotspots from time to time.
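For anyone wanting to spot those hotspots: a quick sketch of how you could watch the per-node bulk thread pool with the cat API (assumes a cluster reachable at localhost:9200; column/pool names vary by ES version, e.g. on ES 5+ the pool is called "write").

```shell
# Per-node bulk thread pool: active threads, queued requests, rejections.
# A few nodes with a deep queue while the rest sit at 0 = routing hotspot.
curl -s 'localhost:9200/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected'
```

Rejections climbing on a handful of nodes is the usual symptom of auto routing sending bursts to the same shards.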
Also, depending on what you use to push data in (I used the ES Hadoop plugin), you have to account for shard segment merges, which literally pause ingest for a brief moment while segments in a given shard are merged. You have to set retry to -1 (infinite) and the retry delay to something like a second or two, otherwise you will end up with dropped documents.
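In ES-Hadoop terms, the retry knobs above map to the bulk write settings, roughly like this (a config sketch; a negative retry count means retry indefinitely per the elasticsearch-hadoop docs):

```
es.batch.write.retry.count = -1
es.batch.write.retry.wait  = 2s
```

With the default (a small finite retry count), a long merge pause can exhaust the retries and the plugin silently drops that batch.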
sandGorgon | 10 years ago