How do you deal with adversarial/byzantine updates that attempt to degrade performance or even install a backdoor? Do you use plain averaging, or some other aggregation algorithm like Multi-Krum?
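For reference, the Multi-Krum aggregation mentioned here (Blanchard et al., 2017) can be sketched in a few lines of numpy. This is an illustrative implementation, not code from the project under discussion; the function name and parameters are my own:

```python
import numpy as np

def multi_krum(updates, f, m):
    """Multi-Krum: average the m updates with the smallest Krum scores.

    An update's score is the sum of squared distances to its
    n - f - 2 nearest neighbours, where f is the assumed number
    of Byzantine workers among n total.
    """
    n = len(updates)
    assert n >= 2 * f + 3, "Krum requires n >= 2f + 3"
    flat = np.stack([u.ravel() for u in updates])
    # pairwise squared Euclidean distances between all updates
    dists = np.sum((flat[:, None, :] - flat[None, :, :]) ** 2, axis=-1)
    scores = []
    for i in range(n):
        d = np.delete(dists[i], i)  # distances to the other updates
        d.sort()
        scores.append(d[: n - f - 2].sum())  # closest n - f - 2 neighbours
    selected = np.argsort(scores)[:m]
    return flat[selected].mean(axis=0).reshape(updates[0].shape)
```

An outlier update far from the honest cluster gets a large score and is excluded from the average.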
For now, the only separation we have is that each worker is responsible for its own weights, since network security has not been our top priority. Still, we've been thinking about adding some security measures like proof-of-work for each node and detection of anomalous inputs/gradients (or simply NaN values). Right now we're running experiments on internal hardware, but before a public launch we'll make sure that malicious participants won't put everybody else's work to waste :)
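The NaN/anomalous-gradient check described above could look something like this. A minimal sketch, assuming flat numpy gradients; the norm threshold is an illustrative placeholder that would in practice be derived from statistics of recent honest updates:

```python
import numpy as np

def is_anomalous(grad, max_norm=1e3):
    """Flag gradients that contain NaN/Inf or are suspiciously large.

    max_norm is an assumed, illustrative threshold.
    """
    if not np.all(np.isfinite(grad)):
        return True
    return np.linalg.norm(grad) > max_norm

def filter_updates(updates, max_norm=1e3):
    """Keep only the updates that pass the sanity check."""
    return [g for g in updates if not is_anomalous(g, max_norm)]
```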
This is also what I was thinking about. Since fabricating bad updates requires no GPU work, unlike honest computation, the model could degrade quickly without some measures against adversarial nodes.
A draft solution would be for the central server to measure the quality of each update and drop the ones that don't perform well. This could work reasonably well, since inference is much cheaper than gradient computation.
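The filtering idea sketched in this comment — tentatively apply each update, check it on held-out data, keep only the ones that don't hurt — could look like this. A hypothetical sketch: `validate` stands in for a cheap inference pass on a validation set, and the SGD-style application of updates is an assumption:

```python
import numpy as np

def accept_good_updates(weights, updates, validate, lr=0.1):
    """Keep only updates that don't make the validation loss worse.

    validate(w) -> float is a stand-in for cheap inference on held-out data.
    """
    baseline = validate(weights)
    return [u for u in updates
            if validate(weights - lr * u) <= baseline]
```

Since each check is a forward pass rather than a backward pass, the server's cost per update stays well below what honest workers spend computing gradients.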
mryab|5 years ago
ouromoros|5 years ago