fizwhiz | 3 years ago

Can you walk me through how waiting some amount of time actually improves their guarantee? It's not like they're using Google's TrueTime API, which provides bounds on clock uncertainty. How is "unnecessary lag" avoided between producers and consumers?

dikei|3 years ago

> Clock skew across different sensors: Sensors might be located across different datacenters, computers, and networks, so their clocks might not be synchronized to the millisecond.

We have to assume that although the sensors' clocks are not synchronized to the millisecond, they are still accurate to a certain degree (say ~1 second). Then, factoring in the time for data to travel from the sensors to the collector, if you wait 5 seconds, most events will have arrived and been stored in the database. Events that arrive after that watermark are probably safe to discard, or have become irrelevant. Waiting also solves the out-of-order problem, since you can sort the events again before reading them.
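To make the watermark idea concrete, here is a minimal sketch in Python. It is not the article's implementation; the names `Event`, `drain_ready`, `accept`, and `WAIT_SECONDS` are made up for illustration, and the 5-second wait is just the example value from above.

```python
from dataclasses import dataclass

WAIT_SECONDS = 5.0  # assumed wait, taken from the example above


@dataclass(frozen=True)
class Event:
    sensor_id: str
    event_time: float  # timestamp assigned by the sensor's own clock


def drain_ready(buffer, now, wait=WAIT_SECONDS):
    """Split buffered events at the watermark (now - wait).

    Events at or before the watermark are returned sorted by event_time,
    which repairs out-of-order arrival. Newer events stay pending.
    """
    watermark = now - wait
    ready = sorted(
        (e for e in buffer if e.event_time <= watermark),
        key=lambda e: e.event_time,
    )
    pending = [e for e in buffer if e.event_time > watermark]
    return ready, pending


def accept(event, last_watermark):
    """Reject events that show up after their window was already drained."""
    return event.event_time > last_watermark


# Events arrived out of order; the collector's clock reads 10.
buffer = [Event("a", 3.0), Event("b", 1.0), Event("c", 8.0)]
ready, pending = drain_ready(buffer, now=10.0)
print([e.sensor_id for e in ready])    # ['b', 'a']
print([e.sensor_id for e in pending])  # ['c']
```

The trade-off is the one described above: a longer wait catches more stragglers but delays every reader by that much.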

> Implementation4 cons: Producers and consumers must have synchronized clocks (up to a certain resolution)

So let's say the consumer's clock is behind the producer's by 5 seconds, i.e. when the producer thinks the time is 10, the consumer thinks the time is only 5. In this case, even if the producer has already written `insert_time = 10`, the consumer will only read up to `insert_time = 5`, and there's an extra 5 seconds' worth of lag on top of the wait introduced by the algorithm.
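The arithmetic can be written out as a small sketch. This is only a model under my own assumptions: both clocks tick at the same rate, `skew` is the producer's clock minus the consumer's clock, and the consumer reads rows with `insert_time <= consumer_now - wait`. The function name `visibility_delay` is hypothetical.

```python
def visibility_delay(wait, skew):
    """Real seconds between a row being written and the consumer reading it.

    wait: the algorithm's deliberate wait, in seconds.
    skew: producer clock minus consumer clock, in seconds
          (positive means the consumer's clock is behind).
    """
    # The row carries insert_time = producer_now. The consumer reads it once
    # consumer_now - wait >= insert_time, and consumer_now = producer_now - skew,
    # so the consumer has to advance by wait + skew seconds.
    return max(0.0, wait + skew)


print(visibility_delay(wait=0.0, skew=5.0))   # 5.0, the skew alone
print(visibility_delay(wait=5.0, skew=5.0))   # 10.0, the wait plus the skew
print(visibility_delay(wait=5.0, skew=-2.0))  # 3.0, the consumer is ahead
```

The last case is the dangerous direction in this model: a consumer whose clock runs ahead effectively shortens the wait, so late events are more likely to be missed.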