top | item 46863181

(no title)

mprovost | 27 days ago

FASP uses forward error correction instead of retransmission. Rather than waiting for something to go missing at the other end and sending it again, it calculates parity and transmits slightly more data up front, with enough redundancy that the receiving end can reconstruct what was lost. This is basically how all storage systems work, not just Weka: you calculate enough parity to reconstruct the missing data when a drive fails. The more disks you have, the smaller the parity overhead. Object storage like S3 does this on a massive scale. With a network transfer you typically only need a few percent of overhead, unless the link is really lossy like Wi-Fi, in which case standards like 802.11n already do FEC for you to reduce retransmissions at the TCP layer.
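
To make the parity idea concrete, here is a minimal sketch using a single XOR parity block, the simplest possible erasure code. It is only an illustration: it is not FASP's, Weka's, or S3's actual scheme, and the function names (`encode`, `recover`, `parity_overhead`) are made up for this example. Real systems typically use stronger codes that survive more than one loss.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical helper)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(received):
    """Rebuild the data blocks from a stripe in which at most one block is None."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("a single parity block can repair only one lost block")
    repaired = list(received)
    if missing:
        # XOR of every surviving block equals the block that was lost.
        present = [block for block in received if block is not None]
        repaired[missing[0]] = xor_blocks(present)
    return repaired[:-1]  # drop the parity block, keep the data


def parity_overhead(k):
    """Extra data sent per unit of payload when one parity block covers k data blocks."""
    return 1 / k


blocks = [b"abcd", b"efgh", b"ijkl"]
stripe = encode(blocks)
stripe[1] = None  # simulate a dropped packet or a failed drive
assert recover(stripe) == blocks
```

With this toy scheme the overhead is one block per stripe, so `parity_overhead(4)` is 25% while `parity_overhead(20)` is 5%, which is the "more disks, smaller overhead" effect described above.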

adolph|26 days ago

In RDMA, are the NICs able to perform the reconstruction, or does that use a different mechanism to avoid the CPU?

mprovost|26 days ago

Usually RDMA runs over a network that is supposed to be lossless, but it does have checksums to detect corruption and recovers with retransmission. InfiniBand NICs handle all of that.
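
For contrast with the parity approach, here is a rough software sketch of detect-and-retransmit. It is an illustration only: it uses CRC-32 from Python's `zlib`, and the names `frame`, `check`, and `send_reliably`, along with the `channel` callback, are assumptions for this example. It does not represent the InfiniBand wire format, and a real NIC does this in hardware rather than in a loop like this.

```python
import zlib


def frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the payload (hypothetical framing)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check(packet: bytes):
    """Return the payload if its checksum matches, otherwise None."""
    payload, crc = packet[:-4], packet[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == crc:
        return payload
    return None


def send_reliably(payload: bytes, channel, max_attempts: int = 5):
    """Send through `channel`, resending until the receiver's check passes.

    `channel` is a stand-in for the link: it takes the packet bytes and
    returns whatever arrives on the other side.
    """
    for attempt in range(1, max_attempts + 1):
        received = check(channel(frame(payload)))
        if received is not None:
            return received, attempt
    raise RuntimeError("gave up after repeated corruption")
```

Nothing extra is sent up front here; the cost is paid only when corruption is actually detected, as an extra round trip. That trade-off makes sense on a fabric that is expected to be nearly lossless.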