Overview
Shroud is a verifiable confidential AI platform. Send prompts and receive completions through an OpenAI-compatible API, with a Trusted Execution Environment (TEE) on the inference side that your client can cryptographically verify before any data leaves it.
Who it's for
Teams that can't send user data to a public LLM provider for legal, regulatory, or competitive reasons.
Agent builders who need a verifiable record proving which model ran on which enclave image.
Developers who want OpenAI-style developer experience with a privacy posture they can audit, not just trust.
Two paths in
The same API key, the same model catalog, two different privacy guarantees depending on which transport you pick.
Path | When to use | What it gives you |
|---|---|---|
OpenAI-compatible HTTP at | Quick start, drop-in for any OpenAI client (Python | TEE-hosted inference, TLS to the gateway. The gateway sees prompts in clear so it can route, rate-limit, and bill. |
Cocoon SDK over WebSocket at | Confidential inference where the platform operator must not see plaintext. | End-to-end ECDH + AES-256-GCM encryption between client and TEE, with an Intel TDX attestation quote verified before the first byte of plaintext is sent. |
You can start on the HTTP path and migrate to the SDK when confidentiality becomes a requirement. See OpenAI-compatible API for the HTTP path and Confidential inference for the end-to-end-encrypted path.
60-second proof
For end-to-end encrypted, attested inference pick the Go or TypeScript SDK and connect to wss://shroud.us/v1/cocoon/stream.
Surfaces
OpenAI-compatible API at
/v1/chat/completions— chat completions with optional SSE streaming. Drop-in for any OpenAI client.Cocoon SDKs — Go and TypeScript clients that add E2E encryption, TEE attestation verification, and signed usage reports on top of the inference path.
MCP server at
/mcp— agents (Claude Desktop, Cursor, Cline, LangChain) invoke confidential inference and other tools through the Model Context Protocol.JSON-RPC 2.0 at
/rpc— the same tool surface for non-MCP integrations.Midnight integration — the gateway also fronts the Midnight blockchain (RPC proxy, explorer, agent tools) for partners building privacy-preserving on-chain agents.
Verifiability
The product's defining property is that "confidential" is verifiable, not just promised:
TEE attestation — every Cocoon SDK session terminates inside a TEE whose ephemeral public key is bound to a fresh Intel TDX attestation quote. The SDK verifies the quote against an image-hash allowlist before sending any prompt; on failure the connection is refused.
End-to-end encryption — Ed25519 → X25519 ECDH key exchange and per-session AES-256-GCM encryption. The gateway operator and any network observer see only encrypted blobs and billing metadata.
Signed usage reports — token counts come back signed by the TEE private key, so the gateway can't tamper with the data that drives billing.
See Confidential inference for the architecture and Wire protocol for the cryptographic construction.
SDKs
SDK | Language | Package |
|---|---|---|
Go |
| |
TypeScript / JavaScript |
|
Architecture
The Cocoon SDK encrypts payloads end-to-end past the gateway: the gateway is a blind proxy for inference traffic, routing encrypted blobs between the SDK and cocoon-bridge without access to plaintext. The OpenAI-compatible HTTP path takes the same path but terminates TLS at the gateway in cleartext — drop-in OpenAI compatibility in exchange for surrendering end-to-end confidentiality.
Both TEEs (cocoon-bridge and the upstream Cocoon proxy) issue their own TDX attestation quotes; the Cocoon SDK verifies the cocoon-bridge quote on every session and exposes the Cocoon proxy quote so callers who want defence-in-depth can verify it independently. See How attestation works for the full trust chain. Both paths bill on token counts the TEE reports.
Authentication and billing
API keys are scoped per environment (shroud_dev_…, shroud_stage_…, shroud_prod_…), with per-key rate limits and Credit Unit (CU) budgets. Stripe, TON, and Telegram Stars are accepted.
See Authentication for plan limits and key management.
Quick links
Quickstart — first call in under five minutes.
Authentication — keys, plans, rate limits, CU.
Confidential inference — how Cocoon works.
Wire protocol — cryptographic details.
OpenAI-compatible API — chat completions reference.