Open Source Implementation of Apple's Private Compute Cloud

433 points | adam_gyroscope | 3 months ago | github.com

102 comments

ryanMVP|3 months ago

Reading the whitepaper, the inference provider still has the ability to access the prompt and response plaintext. This scheme does seem to guarantee that plaintext cannot be read for all other parties (e.g. the API router), and that the client's identity is hidden and cannot be associated with their request. Perhaps the precise privacy guarantees and allowances should be summarized in the readme.

With that in mind, does this scheme offer any advantage over the much simpler setup of a user sending an inference request:

- directly to an inference provider (no API router middleman)

- that accepts anonymous crypto payments (I believe such things exist)

- using a VPN to mask their IP?

macrael|3 months ago

Howdy, head of Eng at confident.security here, so excited to see this out there.

I'm not sure I understand what you mean by inference provider here? The inference workload is not shipped off the compute node to e.g. OpenAI once it's been decrypted; it's running directly on the compute machine, on open source models loaded there. Those machines are cryptographically attesting to the software they are running, proving, ultimately, that there is no software logging sensitive info off the machine, and that the machine is locked down, with no SSH access.

This is how Apple's PCC does it as well: clients of the system will not even send requests to compute nodes that aren't making these promises, and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.

The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.

Terretta|3 months ago

> the inference provider still has the ability to access the prompt and response plaintext

Folks may underestimate the difficulty of providing compute that the provider “cannot”* access to reveal even at gunpoint.

BYOK does cover most of it, but oh look, you brought me and my code your key, thanks… Apple's approach, and certain other systems such as AWS's Nitro Enclaves, aim at this last step of the problem:

- https://security.apple.com/documentation/private-cloud-compu...

- https://aws.amazon.com/confidential-computing/

NCC Group verified AWS's approach and found:

1. There is no mechanism for a cloud service provider employee to log in to the underlying host.

2. No administrative API can access customer content on the underlying host.

3. There is no mechanism for a cloud service provider employee to access customer content stored on instance storage and encrypted EBS volumes.

4. There is no mechanism for a cloud service provider employee to access encrypted data transmitted over the network.

5. Access to administrative APIs always requires authentication and authorization.

6. Access to administrative APIs is always logged.

7. Hosts can only run tested and signed software that is deployed by an authenticated and authorized deployment service. No cloud service provider employee can deploy code directly onto hosts.

- https://aws.amazon.com/blogs/compute/aws-nitro-system-gets-i...

Points 1 and 2 are more unusual than 3-7.

Folks who enjoy taking things apart to understand them can hack at Apple's here:

https://security.apple.com/blog/pcc-security-research/

* Except by, say, withdrawing the system (see Apple in the UK) so users have to use something less secure, observably changing the system, or other transparency tripwires.

immibis|3 months ago

It's probably illegal for a business to take anonymous cryptocurrency payments in the EU. Businesses are allowed to take traceable payments only, or else it's money laundering.

With the caveat that it's not clear what precisely is illegal about these payments and to what level it's illegal. It might be that a business isn't allowed to have any at all, or isn't allowed to use them for business, or can use them for business but can't exchange them for normal currency, or can do all that but has to check their customer's passport and fill out reams of paperwork.

https://bitcoinblog.de/2025/05/05/eu-to-ban-trading-of-priva...

anon721656321|3 months ago

at that point, it seems easier to run a slightly worse model locally. (or on a rented server)

rasengan|3 months ago

We are introducing Verifiably Private AI [1] which actually solves all of the issues you mention. Everything across the entire chain is verifiably private (or in other words, transparent to the user in such a way they can verify what is running across the entire architecture).

[1] https://ai.vp.net/

derpsteb|3 months ago

I was part of a team that does the same thing. Admittedly as a paid service, but with source availability and meaningful attestation.

Service: https://www.privatemode.ai/
Code: https://github.com/edgelesssys/privatemode-public

jmort|3 months ago

OpenPCC is Apache 2.0 without a CLA to prevent rugpulls, whereas Edgeless is BSL.

m1ghtym0|3 months ago

Exactly, attestation is what matters. Excluding the inference provider from the prompt is the USP here. Privatemode can do that via an attestation chain (source code -> reproducible build -> TEE attestation report) + code/stack that ensures isolation (Kata/CoCo, runtime policy).

saurik|3 months ago

Yes: "provably" private... unless you have $1000 for a logic analyzer and a steady hand to solder together a fake DDR module.

https://news.ycombinator.com/item?id=45746753

Lord-Jobo|3 months ago

well, also indefinite time and physical access.

rossjudson|3 months ago

GCP can and does live migrate confidential VMs between machines. Which of the 50k machines in a cluster were you going to attach your analyzer to?

kiwicopple|3 months ago

impressive work jmo - thanks for open sourcing this (and OSI-compliant)

we are working on a challenge which is somewhat like a homomorphic encryption problem - I'm wondering if OpenPCC could help in some way?

When developing websites/apps, developers generally use logs to debug production issues. However, with wearables, logs can be a privacy issue: imagine some AR glasses logging visual data (like someone's face). Would OpenPCC help extract/clean/anonymize this sort of data for developers to help with their debugging?

jmort|3 months ago

Yep, you could run an anonymization workload inside the OpenPCC compute node. We target inference as the "workload", but it's really just an attested HTTP server you can't see inside. So, in this case your client (the wearable) would send its data first through OpenPCC to a server that runs some anonymization process.

If it's possible to anonymize on the wearable, that would be simpler.
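A minimal sketch of what such an anonymization workload might do inside the attested node, assuming a simple field-redaction policy: the event schema and field names are invented for illustration, and real visual data would obviously need far more than key filtering.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sensitive lists event fields that must never leave the attested node.
// (Hypothetical field names for this sketch.)
var sensitive = map[string]bool{
	"face_embedding": true,
	"gps":            true,
	"device_id":      true,
}

// anonymize returns a copy of a JSON debug event with sensitive fields
// removed, leaving only what developers need for debugging.
func anonymize(raw []byte) ([]byte, error) {
	var event map[string]any
	if err := json.Unmarshal(raw, &event); err != nil {
		return nil, err
	}
	for k := range event {
		if sensitive[k] {
			delete(event, k)
		}
	}
	return json.Marshal(event)
}

func main() {
	in := []byte(`{"error":"render timeout","gps":"52.5,13.4","device_id":"abc"}`)
	out, _ := anonymize(in)
	fmt.Println(string(out)) // {"error":"render timeout"}
}
```

Because the node is attested, a developer reading the cleaned logs can verify that only this redaction code ever saw the raw events.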

The challenge is defining what the anonymizer must "do" to be perfect.

As an aside, IMO homomorphic encryption (still) isn't ready...

wferrell|3 months ago

Really nice release. Excited to see this out in the wild and hopeful more companies leverage this for better end user privacy.

DeveloperOne|3 months ago

Glad to see Golang here. Go will surpass Python in the AI field, mark my words.

jabedude|3 months ago

Where is the compute node source code?

utopiah|3 months ago

That's nice... in theory. Like it could be cool, and useful... but like what would I actually run on it if I'm not a spammer?

Edit: reminds me of federated learning and FlowerLLM (training only AFAIR, not inference), like... yes, nice, I ALWAYS applaud any way to disentangle from proprietary software and walled gardens... but like what for? What actual usage?

utopiah|3 months ago

Gimme an actual example instead of downvoting, help me learn.

Edit on that too: makes me think of OpenAI Whisper as a service via /e/OS and supposedly anonymous proxying (by mixing), namely running STT remotely. That would be an actual potential usage... but IMHO that's low-end enough to be run locally. So I'm still looking for an application here.

nixpulvis|3 months ago

Thought this was going to be about Orchard from the title.

MangoToupe|3 months ago

@dang can we modify the title to acknowledge that it's specific to chatbots? The title reads like this is about generic compute, and the content is emphatically not about generic compute.

I realize this is just bad branding by apple but it's still hella confusing.

jmort|3 months ago

It does work generically. Like Apple, we initially targeted inference, but under the hood it's just an anonymous, attested HTTP server wrapper. The ComputeNode can run an arbitrary workload.

pjmlp|3 months ago

[deleted]

kreetx|3 months ago

I read this and your reply to the sibling; you seem to have a reputation for being sensible - what are you trying to say? If someone re-implements or reverses a service, then it doesn't need to be in the same language.

mlnj|3 months ago

It is an implementation. As long as it behaves the same...