guites|2 years ago
I've been working on a project for over a year that uses Flower to train CV models on medical data.
One aspect that comes up again and again is how we can prove to our clients that no unnecessary data is being shared over the network.
Do you have any tips on solving that particular problem? I.e., proving that nothing apart from model weights is being transferred to the centralized server?
Thanks a lot for the project.
edit: Just to clarify I am aware of differential privacy, I'm talking more on a "how to convince a medical institution that we are not sending its images over the network" level.
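One practical piece of that "convincing" story is a client-side audit trail: serialize exactly the payload that is about to leave the machine, hash it, and log its shapes and byte count so the institution's own staff can verify that only weight tensors (and nothing image-sized or image-shaped) go out. A minimal sketch, assuming the update is a list of NumPy arrays (the function name and log format are hypothetical, not part of Flower):

```python
import hashlib
import json

import numpy as np


def audit_outbound_update(weights, log_path="outbound_audit.log"):
    """Serialize exactly what will leave the machine, hash it, and append
    an audit record so every outbound payload can be inspected later.

    `weights` is assumed to be a list of NumPy arrays (one per layer).
    """
    payload = b"".join(np.ascontiguousarray(w).tobytes() for w in weights)
    record = {
        "num_tensors": len(weights),
        "shapes": [list(w.shape) for w in weights],
        "payload_bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


# Example: two small "layers" stand in for a real model update.
record = audit_outbound_update([np.zeros((3, 3)), np.ones(4)])
```

Pairing a log like this with network-level capture (e.g. recording traffic on the client and comparing byte counts against the logged payload sizes) gives the institution something concrete to check, rather than just a promise.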
danieljanes|2 years ago
One approach to increase transparency on the client side (and build trust with the organization where the Flower client is deployed) is to integrate a review step that asks someone to confirm the update that gets sent back to the server.
On top of that, you should definitely use differential privacy. To quote Andrew Trask here: "friends don't let friends use FL without DP". Other approaches like Secure Aggregation can also help, depending on what kind of exposure your clients are concerned about.
My general take is that the best way to solve for transparency and trust is to tackle it on multiple layers of the stack.
guites|2 years ago
I'll be looking into secure aggregation, as I'm not fully aware of how it works. As of now we rely on differential privacy only.
Thanks!
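For anyone else in the same spot, the core idea behind secure aggregation is simple to illustrate: each pair of clients derives a shared mask from a common seed; one adds it, the other subtracts it, so the masks cancel in the server-side sum while each individual masked update reveals nothing on its own. A toy sketch (real protocols like Bonawitz et al.'s use key agreement and handle dropouts; everything here is illustrative):

```python
import numpy as np


def pairwise_masks(client_ids, shape, seed_fn):
    """Toy illustration of secure aggregation's masking idea.

    For each ordered pair (i, j), both clients derive the same mask from a
    shared seed; client i adds it and client j subtracts it, so all masks
    cancel when the server sums the masked updates.
    """
    masks = {cid: np.zeros(shape) for cid in client_ids}
    for i in client_ids:
        for j in client_ids:
            if i < j:
                shared = np.random.default_rng(seed_fn(i, j)).normal(size=shape)
                masks[i] += shared
                masks[j] -= shared
    return masks


clients = ["a", "b", "c"]
updates = {cid: np.full((2,), float(k)) for k, cid in enumerate(clients, 1)}
masks = pairwise_masks(clients, (2,), seed_fn=lambda i, j: hash((i, j)) % 2**32)

# Each client sends only its masked update to the server.
masked = {cid: updates[cid] + masks[cid] for cid in clients}

# The server sums the masked updates; the pairwise masks cancel exactly,
# recovering the true sum without seeing any individual update in the clear.
total = sum(masked.values())
```

Note this complements rather than replaces DP: secure aggregation hides individual updates from the server, while DP bounds what the aggregate itself can leak.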
jorgeili|2 years ago
I'm trying to apply federated learning to the medical domain too, and I'm trying to define the best "stack" that guarantees privacy and compliance with regulations like the GDPR.