FrasiertheLion's comments

FrasiertheLion | 4 days ago | on: Privacy-preserving age and identity verification via anonymous credentials

Most people outside of a narrow set of cryptography engineers are unfamiliar with the term anonymous credentials, while age and identity verification are two privacy-invasive requirements that are being heavily discussed and rapidly being written into laws lately. The post's intro discusses both quite heavily, and they form the author's entire motivation for writing the post.

The central question the post attempts to answer is "The problem for today is: how do we live in a world with routine age-verification and human identification, without completely abandoning our privacy?"

My rephrase is an attempt to surface that, compared to the dry and academic title that will get overlooked. I think this is a very important topic these days where we are rapidly ceding are privacy to at best, confused and at worst, malicious regulations.

FrasiertheLion | 4 days ago | on: Elevated Errors in Claude.ai

AI has normalized single 9's of availability, even for non-AI companies such as Github that have to rapidly adapt to AI aided scaleups in patterns of use. Understandably, because GPU capacity is pre-allocated months to years in advance, in large discrete chunks to either inference or training, with a modest buffer that exists mainly so you can cannibalize experimental research jobs during spikes. It's just not financially viable to have spades of reserve capacity. These days in particular when supply chains are already under great strain and we're starting to be bottlenecked on chip production. And if they got around it by serving a quantized or otherwise ablated model (a common strategy in some instances), all the new people would be disappointed and it would damage trust.

Less 9's are a reasonable tradeoff for the ability to ship AI to everyone I suppose. That's one way to prove the technology isn't reliable enough to be shipped into autonomous kill chains just yet lol.

FrasiertheLion | 10 days ago | on: OpenAI, the US government and Persona built an identity surveillance machine

Recent paper by Nicholas Carlinini and others really showcases how little it takes to deanonymize users across platforms with LLMs: https://arxiv.org/abs/2602.16800

FrasiertheLion | 13 days ago | on: How an inference provider can prove they're not serving a quantized model

Yes, it is a TLS certificate generated by the enclave on boot (the code responsible for doing this is open source and the attestation is also included in the certificate so you can check that this is exactly what’s happening). We go into more detail in our attestation verification docs here: https://docs.tinfoil.sh/verification/attestation-architectur...

FrasiertheLion | 14 days ago | on: How an inference provider can prove they're not serving a quantized model

jashulma above has a great link: https://news.ycombinator.com/item?id=47105315

FrasiertheLion | 14 days ago | on: How an inference provider can prove they're not serving a quantized model

The disk isn’t client owned, but anyone can run modelwrap on any device and reproduce the root measurement that is being attested against.

FrasiertheLion | 14 days ago | on: How an inference provider can prove they're not serving a quantized model

When the enclave boots, two things happen:

1. An HPKE (https://www.rfc-editor.org/rfc/rfc9180.html ) key is generated. This is the key that encrypts communication to the model.

2. The enclave is provisioned a certificate

The certificate is embedded with the HPKE key accessible only inside the enclave. The code for all this is open source and part of the measurement that is being checked against by the client.

So if the provider attempts to send a different attestation or even route to a different enclave, this client side check would fail.

FrasiertheLion | 14 days ago | on: How an inference provider can prove they're not serving a quantized model

Yes, we use Intel TDX/AMD SEV-SNP with H200/B200 GPUs configured to run in Nvidia Confidential Computing mode

FrasiertheLion | 14 days ago | on: How an inference provider can prove they're not serving a quantized model

There’s a few components that are necessary to make it work:

1. The provider open sources the code running in the enclave and pins the measurement to a transparency log such as Sigstore

2. On each connection, the client SDK fetches the measurement of the code actually running (through a process known as remote attestation)

3. The client checks that the measurement that the provider claimed to be running exactly matches the one fetched at runtime.

We explain this more in a previous blog: https://tinfoil.sh/blog/2025-01-13-how-tinfoil-builds-trust

FrasiertheLion | 14 days ago | on: How an inference provider can prove they're not serving a quantized model

The committed weights are open source and pinned to a transparency log, along with the full system image running in the enclave.

At runtime, the client SDK (also open source: https://docs.tinfoil.sh/sdk/overview) fetches the pinned measurement from Sigstore, and compares it to the attestation from the running enclave, and checks that they’re equal. This previous blog explains it in more detail: https://tinfoil.sh/blog/2025-01-13-how-tinfoil-builds-trust

FrasiertheLion | 14 days ago | on: How an inference provider can prove they're not serving a quantized model

The verification is not happening locally only. The client SDKs fetch the measurement of the weights (+ system software, inference engine) that are pinned to Sigstore, then grabs the same measurement (aka remote attestation of the full, public system image) from the running enclave, and checks that the two are exactly equal. Our previous blog explains this in more detail: https://tinfoil.sh/blog/2025-01-13-how-tinfoil-builds-trust

Sorry it wasn’t clear from the post!

FrasiertheLion | 2 months ago | on: GPT-5.2

Try elevenlabs

FrasiertheLion | 3 months ago | on: Personal blogs are back, should niche blogs be next?

I would argue personal blogs are back and Substack is the medium of choice this time around

FrasiertheLion | 9 months ago | on: Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Ha we didn't think of that, thanks for the tip!

FrasiertheLion | 9 months ago | on: Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Ollama does heavily quantize models and has a very short context window by default, but this has not been my experience with unquantized, full context versions of Llama3.3 70B and particularly, Deepseek R1, and that is reflected in the benchmarks. For instance I used Deepseek R1 671B as my daily driver for several months, and it was at par with o1 and unquestionably better than GPT-4o (o3 is certainly better than all but typically we've seen opensource models catch up within 6-9 months).

Please shoot me an email at [email protected], would love to work through your use cases.

FrasiertheLion | 9 months ago | on: Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Yeah not the case for FHE. But yes, not practically viable. We would be happy to switch as soon as it is.

FrasiertheLion | 9 months ago | on: Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

We’re already using vllm as our inference server for our standard models. We can run whatever inference server for custom deployments.

FrasiertheLion | 9 months ago | on: Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Aye comrade, shot you an email :)

FrasiertheLion | 9 months ago | on: Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

That's the best part, you don't. You only need to trust NVIDIA and AMD/Intel. Modulo difficult to mount physical attacks and side channels, which we wrote more about here: https://tinfoil.sh/blog/2025-05-15-side-channels

FrasiertheLion | 9 months ago | on: Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Yes, in the US right now. We don't run our own datacenters, though we sometimes consider it in a moment of frustration when the provider is not able to get the correct hardware configuration and firmware versions. Currently renting bare metal servers from neoclouds. We can't use hyperscalers because we need bare metal access to the machine.