A primer on confidential AI

Modern systems already encrypt sensitive data in two places. Encryption at rest protects data sitting in storage. Encryption in transit protects data flowing across the network. Confidential AI needs a third guarantee: protection for data in use, one that holds even against the operator running the inference. That is what a Trusted Execution Environment (TEE) delivers, and it is the foundation Shroud is built on.

The hardware Shroud uses for this is Intel TDX (Trust Domain Extensions). A TDX virtual machine runs on a normal datacentre server, but its memory is encrypted and integrity-checked by the CPU itself. The hypervisor, the host kernel, and the operator administering the box cannot read what runs inside. A program running in the TDX VM can also ask the CPU to produce a signed measurement of its own boot image — an attestation quote — that anyone holding the right public keys can verify.

This is what makes "data in use" tractable. With encryption at rest and in transit alone, the operator could still read every prompt as it flows through the inference server. A TEE closes that exposure window: the prompt is decrypted only inside hardware that the operator cannot inspect, and the same hardware proves to your client that the right code is running there before the client sends a single byte of the prompt.

The blind-relay threat model

Shroud's gateway sits between your application and the inference workers. It handles authentication, billing, and rate limiting — but for confidential inference it is a blind relay. The Cocoon SDK opens an end-to-end encrypted channel that terminates inside the TEE, not at the gateway. The gateway forwards encrypted blobs and a small amount of metadata, but it cannot decrypt them.

Concretely, the operator running the gateway sees:

  • Which API key made the call and how many tokens it consumed.

  • Which model was requested and which Cocoon network served it.

  • Encrypted ciphertext, public-key handshake material, and per-message nonces.

The operator does not see the prompt, the response, the system message, function arguments, retrieved documents, or any other content of the session. Those bytes are sealed end-to-end between your client and the TEE. A signed usage attestation also lets your client confirm that the token counts the gateway billed for were not altered after the fact.
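
The split between visible metadata and sealed content can be pictured with a small sketch. Everything named here (RelayEnvelope, Gateway, the field names, and the nonce-then-ciphertext framing) is a hypothetical illustration, not the Cocoon message format, which is specified on the Cocoon wire protocol page. The point it makes is structural: the relay holds no session key, so it can meter and forward but has nothing to decrypt with.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelayEnvelope:
    """What a blind relay can observe about one message (hypothetical layout)."""
    api_key_id: str               # which API key made the call
    model: str                    # which model was requested
    network: str                  # which Cocoon network served it
    handshake_public_key: bytes   # public key-exchange material
    nonce: bytes                  # per-message nonce
    ciphertext: bytes             # sealed payload, opaque to the gateway


@dataclass
class Gateway:
    """Toy relay: counts calls per key and forwards bytes. It has no session key."""
    calls_by_key: dict = field(default_factory=dict)

    def relay(self, envelope: RelayEnvelope) -> bytes:
        # Metering and rate limiting use metadata only.
        count = self.calls_by_key.get(envelope.api_key_id, 0)
        self.calls_by_key[envelope.api_key_id] = count + 1
        # Forward the sealed bytes unchanged. There is no decrypt step,
        # because the key material needed for one never reaches the relay.
        return envelope.nonce + envelope.ciphertext
```

In this toy model the only inputs the gateway ever touches are the fields in the list above; the prompt and response exist only as the opaque ciphertext field.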

What attestation gives you

Attestation is the bridge between hardware guarantees and what your code can check. On every Cocoon session the SDK receives a fresh TDX quote that proves three things:

  1. The connection terminates inside an Intel TDX VM, not on a plain server pretending to be one.

  2. The VM booted a specific image — measured by the CPU and recoverable from the quote — that matches the open-source code Shroud publishes.

  3. The session public key the SDK is encrypting against was minted inside that VM, so the operator cannot man-in-the-middle the handshake by substituting its own keypair.

If any of those checks fails, the SDK aborts the connection before any prompt is sent. That is the verifiability claim in one paragraph: the operator cannot read your prompts without breaking the TDX seal, and your client refuses to talk to a Shroud deployment that has been tampered with.
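
The three checks can be sketched as a single fail-closed function. This is a minimal illustration under stated assumptions, not the SDK's code: ParsedQuote, AttestationError, and check_quote are hypothetical names, the quote is assumed to be already parsed and its signature chain already verified by a separate step, and the key binding is assumed to be a SHA-256 digest of the session public key placed at the start of the quote's report data. The actual construction is described in How attestation works.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedQuote:
    """Fields a client might extract from an already-parsed quote (hypothetical)."""
    tee_type: str                # e.g. "tdx"
    signature_chain_valid: bool  # outcome of a separate signature verification step
    measurement: bytes           # boot image measurement recovered from the quote
    report_data: bytes           # data the VM bound into the quote


class AttestationError(Exception):
    """Raised when a quote fails any check; the caller must not send a prompt."""


def check_quote(quote, session_public_key, allowed_measurements):
    # Check 1: the quote comes from a genuine TDX VM.
    if quote.tee_type != "tdx" or not quote.signature_chain_valid:
        raise AttestationError("not a valid TDX quote")
    # Check 2: the VM booted an image the client's policy accepts.
    if quote.measurement not in allowed_measurements:
        raise AttestationError("image measurement not in allowlist")
    # Check 3: the session key was bound into the quote (assumed SHA-256 binding).
    expected = hashlib.sha256(session_public_key).digest()
    if not hmac.compare_digest(quote.report_data[:len(expected)], expected):
        raise AttestationError("session key is not bound to this quote")
```

Raising on the first failure mirrors the abort-before-sending behaviour described above: there is no code path that returns normally with a check skipped.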

For the deeper walk-through — what the two TEEs in a Cocoon session each prove, how the image hash is computed, and how reproducible builds tie the quote back to source code — see How attestation works.

Where this fits

This page is the conceptual entry point. From here:

  • How attestation works — the two-quote model, measured boot, reproducible builds, and what the SDK verifies on every session.

  • Cocoon wire protocol — the cryptographic construction in detail: ECDH key exchange, AES-256-GCM session encryption, the binary message format, and the signed usage report.

  • Verification paths — three runnable workflows: default policy, custom allowlist, reproduce the build from source.

  • Threat model — what attestation protects against and what it does not.

  • Cocoon — confidential inference — the architecture diagram and endpoint reference.

Last modified: 08 May 2026