top | item 45838336


macrael | 3 months ago

Howdy, head of Eng at confident.security here, so excited to see this out there.

I'm not sure I understand what you mean by inference provider here? The inference workload is not shipped off the compute node to e.g. OpenAI once it's been decrypted; it runs directly on the compute machine, on open-source models loaded there. Those machines cryptographically attest to the software they are running, proving, ultimately, that there is no software logging sensitive info off the machine, and that the machine is locked down, with no SSH access.
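Roughly, the client-side half of that promise can be sketched like this (my own illustration, not the actual OpenPCC client; the names and hashes are invented): the client refuses to send a prompt to any compute node whose attested software measurement isn't a known, auditable release.

```python
# Illustrative sketch, not the real protocol: a client only talks to
# compute nodes attesting to software on its allowlist of audited releases.

KNOWN_RELEASES = {
    "a3f1c9",  # hypothetical measurement of an audited inference image
}

def should_send(attested_measurement: str) -> bool:
    """Send prompts only to nodes attesting to known, audited software."""
    return attested_measurement in KNOWN_RELEASES
```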

This is how Apple's PCC does it as well: clients of the system will not even send requests to compute nodes that aren't making these promises, and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.

The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.


bjackman|3 months ago

> no one, not even people operating the inference hardware

You need to be careful with these claims IMO. I am not involved directly in CoCo, so my understanding lacks nuance, but after https://tee.fail I came to understand that basically no hardware actually considers physical attacks in scope for its threat model?

The Ars Technica coverage of that publication has some pretty yikes contrasts between quotes from people making claims like yours, and the actual reality of the hardware features.

https://arstechnica.com/security/2025/10/new-physical-attack...

My current understanding of the guarantees here is:

- even if you completely pwn the inference operator, steal all root keys etc, you can't steal their customers' data as a remote attacker

- as a small cabal of arbitrarily privileged employees of the operator, you can't steal the customers' data without a very high risk of getting caught

- BUT, if the operator systematically conspires to steal the customers' data, they can. If the state wants the data and is willing to spend money on getting it, it's theirs.

macrael|3 months ago

I'm happy to be careful; you're right that we are relying on TEEs and vTPMs as roots of trust here, and TEEs have been compromised by attackers with physical access.

This is actually part of why we think it's so important to have the non-targetability part of the security stack as well: even if someone were to physically compromise some machines at a cloud provider, there would be no way for them to reliably route a target's requests to those machines.
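The non-targetability idea can be sketched in a few lines (my simplified illustration, not the paper's mechanism): if the client, not the operator, picks a compute node uniformly at random from the attested pool, an attacker who has physically compromised k of n machines only sees a given request with probability k/n, and cannot steer a specific user onto a compromised box.

```python
import secrets

def pick_node(attested_nodes: list[str]) -> str:
    # secrets.choice: cryptographically strong randomness, so the
    # operator cannot predict or influence which node is chosen.
    return secrets.choice(attested_nodes)
```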

michaelt|3 months ago

> I came to understand that basically there's no HW that actually considers physical attacks in scope for their threat model?

Xbox, PlayStation, and some smartphone activation locks.

Of course, you may note those products have certain things in common...

jiveturkey|3 months ago

> The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.

That cannot be met, period. Your assumptions around physical protections are invalid, or at least incorrect. It works for Apple (well enough) because of the high trust we place in their own physical controls, and the market incentive to protect that at all costs.

> This is how Apple's PCC does it as well [...] and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.

Just based on my recollection, and I'm not going to have a new look at it to validate what I'm saying here: with PCC, no, you can't actually do that. With PCC you do get an attestation, but there isn't actually a "confidential compute" aspect where an attestation you can trust proves that is what is running. You have to trust Apple at that lowest layer of the "attestation trust chain".

I feel like with your bold misunderstandings you are really believing your own hype. Apple can do that, sure, but a new challenger cannot. And I mean your web page doesn't even have an "about us" section.

dcliu|3 months ago

That's a strong claim for not looking into it at all.

From a brief glance at the white paper, it looks like they are using a TEE, which would mean that the root of trust is the hardware chip vendor (e.g. Intel). Then it is possible for confidentiality guarantees to work if you can trust the vendor of the software that is running. That's the whole purpose of a TEE.

macrael|3 months ago

Apple actually attests to signatures of every single binary they install on their machines, before soft booting into a mode where no further executables can be installed: https://security.apple.com/documentation/private-cloud-compu...

We don't _quite_ have the funding to build out our own custom OS to match that level of attestation, so we settled for attesting to a hash of every file on the booted VM instead.
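"A hash of every file on the booted VM" can be pictured like this (my own rough sketch, not their exact pipeline): walk the filesystem, hash each file, then fold the sorted (path, digest) pairs into a single measurement that can be attested to and compared against a published release.

```python
import hashlib
from pathlib import Path

def measure_tree(root: str) -> str:
    """Deterministic measurement of a file tree: changes to any file,
    or any added/removed file, change the final digest."""
    entries = []
    for p in sorted(Path(root).rglob("*")):
        if p.is_file():
            digest = hashlib.sha256(p.read_bytes()).hexdigest()
            entries.append(f"{p}:{digest}")
    return hashlib.sha256("\n".join(entries).encode()).hexdigest()
```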

ryanMVP|3 months ago

Thanks for the reply! By "inference provider" I meant someone operating a ComputeNode. I initially skimmed the paper, but I've now read more closely and see that we're trying to get guarantees that even a malicious operator is unable to e.g. exfiltrate prompt plaintext.

Despite recent news of vulnerabilities, I do think that hardware-root-of-trust will eventually be a great tool for verifiable security.

A couple follow-up questions:

1. For the ComputeNode to be verifiable by the client, does this require that the operator makes all source code running on the machine publicly available?

2. After a client validates a ComputeNode's attestation bundle and sends an encrypted prompt, is the client guaranteed that only the ComputeNode running in its attested state can decrypt the prompt? Section 2.5.5 of the whitepaper mentions expiring old attestation bundles, so I wonder if this is to protect against a malicious operator presenting an attestation bundle that doesn't match what's actually running on the ComputeNode.

macrael|3 months ago

Great questions!

1. The mechanics of the protocol are that a client checks that the software attested to has been released on a transparency log. dm-verity is what enforces that the hashes of the booted filesystem on the compute node match what was built, so those hashes are what get put on the transparency log, with a link to the deployed image that matches them. The point of the transparency log is that anyone can then go inspect the code related to that release to confirm that it isn't maliciously logging. So if you don't publish the code for your compute nodes, then the fact of it being on the log isn't really useful.

So I think the answer is yes, to be compliant with OpenPCC you would need to publish the code for your compute nodes, though the client can't actually technically check that for you.
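The client-side check described above might look something like this (the log entry shape here is invented for illustration): accept an attested measurement only if it appears in the public append-only log, so any release a node attests to is one the whole world can go audit.

```python
# Hypothetical sketch of the transparency-log check; field names are
# my own, not from the OpenPCC spec.

def measurement_is_published(measurement: str, log_entries: list[dict]) -> bool:
    """True iff the attested filesystem measurement appears in the log."""
    return any(e.get("filesystem_hash") == measurement for e in log_entries)
```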

2. Absolutely yes. The client encrypts its prompt to a public key specific to a single compute node (well, technically it will encrypt the prompt N times for N specific compute nodes), where the private half of that key is resident only in the vTPM; the machine itself has no access to it. If the machine were swapped for another one or rebooted, it would be impossible for that computer to decrypt the prompt. The fact that the private key is in the vTPM is part of the attestation bundle, so you can't fake it.
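The "encrypt the prompt N times for N specific compute nodes" fan-out can be sketched structurally like this. To be clear, this is toy crypto purely for illustration: a real client would use proper public-key (HPKE-style) encryption to the key living in each node's vTPM, not this XOR stream.

```python
import hashlib
import secrets

def toy_seal(prompt: bytes, node_key: bytes) -> bytes:
    """Toy stand-in for sealing a prompt to one node's key. NOT secure;
    a real implementation would use an HPKE library with the node's
    vTPM-resident public key."""
    nonce = secrets.token_bytes(16)
    stream = b""
    counter = 0
    while len(stream) < len(prompt):
        block = node_key + nonce + counter.to_bytes(4, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    ciphertext = bytes(a ^ b for a, b in zip(prompt, stream))
    return nonce + ciphertext

def seal_for_pool(prompt: bytes, node_keys: list[bytes]) -> list[bytes]:
    # One sealed copy per candidate node: only the node holding the
    # matching key can open its copy.
    return [toy_seal(prompt, k) for k in node_keys]
```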