item 40616648

Ask HN: Privacy-Aware Inferencing for LLMs

1 point | manili | 1 year ago

Hello,

One of the biggest challenges with cloud-based inferencing for LLMs is keeping user data private. Is it possible to use both local and cloud machines together to solve this?

For example, could we run the first and last layers of an LLM on a local machine to protect data privacy, and use the cloud for the rest to speed things up? We could fine-tune the first and last layers locally to change the weights and keep them away from the cloud.
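A minimal sketch of the proposed split, using a toy numpy MLP in place of a real transformer (all layer sizes and the weight layout are hypothetical, chosen only to illustrate the partitioning):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 4-layer MLP standing in for an LLM (hypothetical sizes).
# Layers 0 and 3 stay on the local machine; 1 and 2 go to the cloud.
dims = [8, 16, 16, 16, 4]
weights = [rng.standard_normal((dims[i], dims[i + 1])) for i in range(4)]

def relu(x):
    return np.maximum(x, 0.0)

def local_first(x):
    # First layer runs locally; only its activations leave the machine.
    return relu(x @ weights[0])

def cloud_middle(h):
    # Cloud sees hidden states, never the raw input or the edge weights.
    for w in weights[1:3]:
        h = relu(h @ w)
    return h

def local_last(h):
    # Last layer runs locally, so final outputs also stay private.
    return h @ weights[3]

x = rng.standard_normal(8)          # private user input
y = local_last(cloud_middle(local_first(x)))

# Sanity check: the split pipeline matches a monolithic forward pass.
h = x
for i, w in enumerate(weights):
    h = h @ w if i == 3 else relu(h @ w)
assert np.allclose(y, h)
```

The split itself is mechanically trivial; the open question (raised in the comments below) is what the hidden states handed to the cloud still reveal about the input.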

Please let me know if there's any ongoing research using such an approach for privacy-aware inferencing.

Thank you.

2 comments


koutetsu | 1 year ago

I had a similar idea some time ago but didn't implement it due to its complexity and the need to juggle different parameters. On top of that, I don't think there are any guarantees when it comes to privacy. Your users will have to trust that no mistake will be made in handling the raw data or the preprocessed data, and that no malicious actor will be able to access the original weights.

You should instead try looking into Homomorphic Encryption:

https://huggingface.co/blog/encrypted-llm

It is resource intensive and slower but it serves your purpose better, in my opinion.
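For intuition about what homomorphic encryption buys you, here is a toy Paillier sketch (my own illustration, not from the linked post). Note the hedges: Paillier is only *additively* homomorphic, whereas the linked post uses TFHE, a fully homomorphic scheme; and the tiny fixed primes here are for demonstration only, not security:

```python
import math
import random

# Toy Paillier cryptosystem (additively homomorphic).
# WARNING: tiny fixed primes for illustration only -- NOT secure.
p, q = 10007, 10009
n = p * q
n2 = n * n
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)          # valid because we use g = n + 1

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    # L(x) = (x - 1) // n, then multiply by mu mod n
    return ((pow(c, lam, n2) - 1) // n * mu) % n

def add_encrypted(c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % n2

a, b = 1234, 5678
ca, cb = encrypt(a), encrypt(b)
# A server can compute on ca, cb without ever seeing a or b.
assert decrypt(add_encrypted(ca, cb)) == a + b
```

The point is that the server computes on ciphertexts and learns nothing about the plaintexts, which is exactly the guarantee the layer-splitting scheme cannot offer; the cost is the resource overhead mentioned above.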

manili | 1 year ago

Thanks @koutetsu,

I know about FHE and TFHE, but as you said, they need a lot of computational resources.

Ignoring the training process and just looking at inference, what are the "technical" drawbacks of this idea? If the first and last layers of the network run on the local machine, how could a malicious cloud reverse engineer the inputs and discover the raw data?
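One commonly discussed risk (a sketch under strong assumptions, not a definitive attack): if the local layers were fine-tuned from a public base model, the private weights may stay close enough to the public ones that a malicious cloud can invert the hidden states it receives by nearest-neighbor matching against the public embeddings. All sizes and the perturbation scale below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)
vocab, d = 100, 32

# Public base embeddings, which the attacker also has (hypothetical setup).
base_embed = rng.standard_normal((vocab, d))
# Assumption: local fine-tuning only perturbs the public base slightly.
private_embed = base_embed + 0.05 * rng.standard_normal((vocab, d))

secret_tokens = rng.integers(0, vocab, size=10)   # the user's private input
hidden = private_embed[secret_tokens]             # what the cloud receives

# Malicious cloud: nearest-neighbor match against the PUBLIC embeddings.
recovered = (hidden @ base_embed.T).argmax(axis=1)
recovery_rate = (recovered == secret_tokens).mean()
# With a small fine-tuning delta, recovery is typically near-perfect.
```

Nonlinearities and deeper local stacks make inversion harder than this toy case, but the general lesson holds: hiding the edge weights alone is obfuscation, not a cryptographic guarantee.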