top | item 9286676

mjpt777 | 11 years ago

Having re-read the definitions, I can see what you take from them. I think it comes down to what you consider systemic progress. Given how precise the definitions are, I have likely misinterpreted this. Is the system the algorithm itself, or is it the system the algorithm lives in? I assumed the latter, which may well be a mistake.

Each thread under the algorithm can perform its actions in a finite number of steps without ever blocking. This means the producers can continue to do other work. The consumer can continue to consume from other log buffers without being blocked, and it completes in a finite number of steps. If a producer is killed mid-operation, then no further progress can be made on that log buffer. If this is considered blocking, then the algorithm is blocking and therefore not wait-free. For this to happen, the producer would need to be killed by another, malicious thread.

What is clear is that this algorithm gives the best latency profile of all the measured messaging systems and the highest throughput. I now have the challenge of finding a name that best describes its behaviour.

danbruc | 11 years ago

Glad to see that we reached consensus. As mentioned before, wait-freedom has nice progress guarantees but does not necessarily provide the best possible latency or throughput, because of the overhead associated with guaranteeing progress for all live threads. I also finally realized that you are the speaker; I didn't make the connection before.

And something I wanted to mention before but forgot: there is a problem not only if the responsible writer fails while trying to rotate the buffers, but also if another writer simply tries to write before the buffer rotation has completed. There is really not much you can do in this case besides retrying until you succeed. But this also means that a writer may have bad luck: every time it looks, a buffer rotation (not necessarily the same one) is in progress, causing the writer to starve.
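
To make the retry scenario concrete, here is a minimal single-threaded sketch. `ToyLogBuffer`, `offer_with_retry` and the "ticks" model of a rotation are all hypothetical and only stand in for the behaviour described here; they are not the actual implementation. The offer never blocks, it just reports failure while a rotation is pending, and the writer can do nothing except try again.

```python
class ToyLogBuffer:
    """Hypothetical stand-in for a log buffer; not the real implementation."""

    def __init__(self, rotation_ticks):
        # Number of failed offers before the pretend rotation completes.
        self.rotation_ticks_left = rotation_ticks
        self.messages = []

    def offer(self, message):
        """Non-blocking: returns False immediately while a rotation is in progress."""
        if self.rotation_ticks_left > 0:
            # Pretend the rotating thread made one step of progress.
            self.rotation_ticks_left -= 1
            return False
        self.messages.append(message)
        return True


def offer_with_retry(buffer, message, max_attempts):
    """Retry until the offer succeeds; return attempts used, or None if starved."""
    for attempt in range(1, max_attempts + 1):
        if buffer.offer(message):
            return attempt
    return None


if __name__ == "__main__":
    print(offer_with_retry(ToyLogBuffer(rotation_ticks=3), "msg", max_attempts=10))
    print(offer_with_retry(ToyLogBuffer(rotation_ticks=5), "msg", max_attempts=3))
```

The second call shows the unlucky case: if a rotation is in progress on every attempt, the writer gives up without ever having been blocked.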

mjpt777 | 11 years ago

While one thread is performing the rotation, other producers return right away from the offer. Starvation is only possible if, each time the same thread retries, the buffer has advanced to the next rotation. When adding 100 byte messages to a 128 MB buffer, that is a chance of less than 1 in a million on each rotation. For this to continue, the probabilities have to be multiplied together for the number of times you expect it to happen. So for ultimate starvation that gets crazy very very quickly ;-) Do you see it as more likely than that?
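
A rough sketch of that arithmetic, under assumptions that are mine rather than measured: every message slot is equally likely to be the one that lands on a rotation, rotations are independent of each other, and per-message framing overhead is ignored (which would lower the message count per buffer). The function name is made up for illustration.

```python
def starvation_probability(message_bytes, buffer_bytes, consecutive_rotations):
    """Chance that one writer lands on a rotation `consecutive_rotations` times in a row.

    Assumes independent rotations and a uniform chance per message slot;
    framing overhead is ignored.
    """
    messages_per_buffer = buffer_bytes // message_bytes
    per_rotation = 1.0 / messages_per_buffer
    # Independent events: multiply the per-rotation probability by itself.
    return per_rotation ** consecutive_rotations


MESSAGE_BYTES = 100
BUFFER_BYTES = 128 * 1024 * 1024  # 128 MB

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, starvation_probability(MESSAGE_BYTES, BUFFER_BYTES, k))
```

Under these assumptions a single collision is already below 1 in a million, and each further consecutive collision multiplies in another factor of roughly the same size.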