Cocoon Wire Protocol

This page documents the WebSocket message format used for Cocoon confidential inference. Understanding the wire protocol is useful for implementing custom clients or debugging integration issues.

Connection

WebSocket /v1/cocoon/stream
Authorization: Bearer shroud_prod_...

Message Flow

Client                          Server (TEE)
  │                                │
  │──── 1. Init ──────────────────►│
  │     (Ed25519 pubkey + model)   │
  │                                │
  │◄─── 2. Session ────────────────│
  │     (Ed25519 pubkey + quote)   │
  │                                │
  │  [ECDH key exchange happens]   │
  │                                │
  │──── 3. Encrypted Request ─────►│
  │     (AES-256-GCM ciphertext)   │
  │                                │
  │◄─── 4. Encrypted Chunks ───────│
  │     (streaming response)       │
  │                                │
  │◄─── 5. Done Detail ────────────│
  │     (encrypted usage + sig)    │
  │                                │
  │◄─── 6. Done ───────────────────│
  │     (final usage summary)      │
  │                                │

Message Types

1. Init (Client → Server)

The first message the client sends once the WebSocket connection is established:

{
  "sdk_public_key": "<base64 Ed25519 public key, 32 bytes>",
  "model": "Qwen/Qwen3-32B",
  "shroud": {
    "disclose": ["total_tokens", "model", "prompt_tokens"]
  }
}

Field	Type	Required	Description
`sdk_public_key`	string	Yes	Base64-encoded Ed25519 public key (32 bytes)
`model`	string	Yes	Model ID to use for inference
`shroud.disclose`	string[]	No	Requested disclosure fields
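
As an illustration of the field table above, here is a minimal sketch of building the Init message. The helper name `build_init` is hypothetical, and the use of the standard (not URL-safe) base64 alphabet is an assumption; the page only says "base64".

```python
import base64
import json


def build_init(public_key: bytes, model: str, disclose=None) -> str:
    """Serialize an Init message; `public_key` is the raw 32-byte Ed25519 key."""
    if len(public_key) != 32:
        raise ValueError("Ed25519 public keys are exactly 32 bytes")
    msg = {
        # Assumption: standard base64 alphabet with padding.
        "sdk_public_key": base64.b64encode(public_key).decode("ascii"),
        "model": model,
    }
    if disclose:  # optional field: omit it entirely when nothing is requested
        msg["shroud"] = {"disclose": list(disclose)}
    return json.dumps(msg)
```

The result is a JSON string that would be sent as the first WebSocket text frame.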

2. Session (Server → Client)

The server responds with its public key and a TEE attestation quote:

{
  "type": "session",
  "public_key": "<base64 Ed25519 public key>",
  "attestation_quote": "<base64 TDX quote>",
  "proxy_public_key": "<base64 Ed25519 public key>",
  "effective_disclose": ["total_tokens", "model"]
}

Field	Type	Description
`type`	string	Always `"session"`
`public_key`	string	Base64 TEE Ed25519 public key
`attestation_quote`	string	Base64 TDX attestation quote
`proxy_public_key`	string	Base64 cocoon proxy public key (optional)
`effective_disclose`	string[]	Negotiated disclosure fields

After receiving this message, both sides perform ECDH:

Convert Ed25519 keys to X25519 (Edwards → Montgomery curve)
Compute shared secret: secret = X25519(my_private, their_public)
Derive AES key: key = SHA-256(secret)
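
The conversion and key-derivation steps above can be sketched in pure Python. This is an illustration only: the function names are hypothetical, the private-key conversion assumes the Ed25519 private key is the usual 32-byte seed, and the X25519 scalar multiplication itself is left to a vetted cryptography library.

```python
import hashlib

P = 2**255 - 19  # field prime shared by Ed25519 and Curve25519


def ed25519_pub_to_x25519(ed_pub: bytes) -> bytes:
    """Map the Edwards y-coordinate to the Montgomery u: u = (1 + y) / (1 - y)."""
    if len(ed_pub) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    y = int.from_bytes(ed_pub, "little") & ((1 << 255) - 1)  # drop the x sign bit
    u = (1 + y) * pow(1 - y, P - 2, P) % P  # modular inverse via Fermat
    return u.to_bytes(32, "little")


def ed25519_seed_to_x25519(seed: bytes) -> bytes:
    """Assumed convention: X25519 scalar = clamp(SHA-512(seed)[:32])."""
    h = bytearray(hashlib.sha512(seed).digest()[:32])
    h[0] &= 248
    h[31] &= 127
    h[31] |= 64
    return bytes(h)


def derive_aes_key(shared_secret: bytes) -> bytes:
    """AES-256 key = SHA-256(shared secret), as described above."""
    return hashlib.sha256(shared_secret).digest()
```

A quick sanity check on the public-key mapping: the Ed25519 base point (y = 4/5) maps to u = 9, the Curve25519 base point.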

3. Encrypted Request (Client → Server)

The inference request, encrypted with the shared AES-256-GCM key:

{
  "payload": "<base64 ciphertext>",
  "nonce": "<base64 12-byte nonce>"
}

The plaintext (before encryption) is a JSON object:

{
  "model": "Qwen/Qwen3-32B",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "max_tokens": 256,
  "stream": true
}
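
A minimal sketch of wrapping a request in the payload/nonce envelope follows. The AES-256-GCM operation is injected as a hypothetical `seal(nonce, plaintext)` callback so the example stays standard-library only. The callback is assumed to return the ciphertext with the 16-byte tag appended, matching the Encryption Details section.

```python
import base64
import json
import secrets
from typing import Callable

# Hypothetical callback: seal(nonce, plaintext) -> ciphertext || 16-byte tag
Seal = Callable[[bytes, bytes], bytes]


def build_encrypted_request(request: dict, seal: Seal) -> str:
    """Serialize, encrypt, and wrap a request in the payload/nonce envelope."""
    nonce = secrets.token_bytes(12)  # fresh random nonce for every message
    plaintext = json.dumps(request).encode("utf-8")
    sealed = seal(nonce, plaintext)
    return json.dumps({
        "payload": base64.b64encode(sealed).decode("ascii"),
        "nonce": base64.b64encode(nonce).decode("ascii"),
    })
```

With the third-party `cryptography` package, one possible `seal` is `lambda n, p: AESGCM(key).encrypt(n, p, None)`, where `None` reflects the absence of AAD.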

4. Chunk (Server → Client)

Encrypted response chunks streamed as they are generated:

{
  "type": "chunk",
  "payload": "<base64 ciphertext>",
  "nonce": "<base64 12-byte nonce>"
}

The decrypted payload contains OpenAI-compatible SSE data:

data: {"choices":[{"delta":{"content":"Hello"},"index":0}]}

data: {"choices":[{"delta":{"content":"!"},"index":0}]}

Each SSE chunk may contain:

choices[0].delta.content — generated text
choices[0].delta.reasoning_content — chain-of-thought reasoning
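
Once a chunk has been decrypted, the SSE text can be split into deltas roughly as follows. This is a sketch: `extract_deltas` is a hypothetical helper, and skipping a `[DONE]` marker is an assumption borrowed from OpenAI-style streams, not something this page specifies.

```python
import json


def extract_deltas(sse_text: str):
    """Yield (content, reasoning_content) pairs from decrypted SSE text."""
    for line in sse_text.splitlines():
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        data = line[len("data:"):].strip()
        if not data or data == "[DONE]":  # assumed OpenAI-style end marker
            continue
        event = json.loads(data)
        for choice in event.get("choices", []):
            delta = choice.get("delta", {})
            # Either field may be absent in a given chunk, hence .get().
            yield delta.get("content"), delta.get("reasoning_content")
```

Joining the non-empty `content` values in arrival order reconstructs the generated text.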

5. Done Detail (Server → Client)

Encrypted usage data with attestation signature:

{
  "type": "done_detail",
  "payload": "<base64 ciphertext>",
  "nonce": "<base64 12-byte nonce>"
}

Decrypted payload:

{
  "usage": {
    "prompt_tokens": 15,
    "cached_tokens": 0,
    "completion_tokens": 42,
    "reasoning_tokens": 0,
    "total_tokens": 57,
    "model": "Qwen/Qwen3-32B",
    "proxy_start_time": 1705312200,
    "proxy_end_time": 1705312205
  },
  "attestation": {
    "usage_hash": "<hex SHA-256 of usage JSON>",
    "signature": "<base64 Ed25519 signature>",
    "session_id": "sess_abc123"
  }
}

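The attestation is verified by:

Computing SHA-256(raw_usage_json_bytes) over the usage JSON exactly as received, without re-serializing it
Comparing the hex digest with usage_hash
Verifying Ed25519.verify(signature, SHA-256(raw_usage_json_bytes), tee_public_key), where tee_public_key is the public_key from the Session message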
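
The verification steps above might look like this in code. The Ed25519 check is passed in as a hypothetical `verify(signature, message, public_key)` callback. The sketch also assumes the signature covers the raw 32-byte digest (not its hex form) and that the caller has kept the usage JSON bytes exactly as they appeared in the decrypted payload.

```python
import base64
import hashlib
import hmac
from typing import Callable

# Hypothetical callback: verify(signature, message, public_key) -> bool
Verify = Callable[[bytes, bytes, bytes], bool]


def verify_usage(raw_usage: bytes, attestation: dict, tee_pub: bytes, verify: Verify) -> bool:
    """Check usage_hash, then the Ed25519 signature over the same digest."""
    digest = hashlib.sha256(raw_usage).digest()
    claimed = attestation["usage_hash"].lower()
    if not hmac.compare_digest(digest.hex(), claimed):
        return False  # usage bytes do not match the attested hash
    signature = base64.b64decode(attestation["signature"])
    # Assumption: the signed message is the raw 32-byte digest.
    return verify(signature, digest, tee_pub)
```

Re-serializing the parsed usage object before hashing can change whitespace or key order, so the hash should be taken over the original bytes.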

6. Done (Server → Client)

Final message indicating stream completion:

{
  "type": "done",
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 42,
    "total_tokens": 57
  }
}

The usage field in done is an unattested fallback; prefer the attested usage from done_detail whenever it is available and verifies.
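
One way a client could apply this preference is sketched below. `final_usage` is a hypothetical helper, and its `attested` argument is assumed to be the done_detail usage only after its attestation has been verified (otherwise `None`).

```python
from typing import Optional, Tuple


def final_usage(attested: Optional[dict], fallback: Optional[dict]) -> Tuple[dict, bool]:
    """Return (usage, is_attested), preferring verified done_detail usage."""
    if attested is not None:
        return attested, True
    # No verified done_detail usage: fall back to the plain done summary.
    return (fallback or {}), False
```

Returning the `is_attested` flag lets callers decide whether unattested numbers are acceptable for their purposes.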

Error (Server → Client)

An error message can arrive at any point in the flow:

{
  "type": "error",
  "error": "model not found",
  "code": "MODEL_NOT_FOUND"
}
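
Because an error can replace any expected server message, a client may want a single decoding step that checks the type field first. The sketch below is one possible shape; `CocoonError` and `parse_server_message` are hypothetical names, and rejecting unknown types is a design choice rather than documented behavior.

```python
import json


class CocoonError(Exception):
    """Raised when the server sends a message with type "error"."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


# Server message types documented on this page.
KNOWN_TYPES = {"session", "chunk", "done_detail", "done"}


def parse_server_message(raw: str) -> dict:
    """Decode one server frame, raising on errors and unknown types."""
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind == "error":
        raise CocoonError(msg.get("code", "UNKNOWN"), msg.get("error", ""))
    if kind not in KNOWN_TYPES:
        raise ValueError(f"unexpected message type: {kind!r}")
    return msg
```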

Encryption Details

Key Exchange

Step	Operation
1	Client generates Ed25519 keypair
2	Server generates Ed25519 keypair
3	Both convert Ed25519 → X25519
4	Shared secret = `X25519(my_priv, their_pub)`
5	AES key = `SHA-256(shared_secret)`

Per-Message Encryption

Parameter	Value
Algorithm	AES-256-GCM
Key size	256 bits (from ECDH)
Nonce size	12 bytes (random per message)
Auth tag	16 bytes (appended to ciphertext)
AAD	None

Attestation Quote Binding

The TDX attestation quote's report_data field (64 bytes) contains:

report_data = SHA-512(tee_ed25519_public_key)

This binds the TEE's public key to the hardware attestation, proving the key was generated inside the attested TEE instance.
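
A client-side check of this binding can be sketched as follows. It assumes the hash is taken over the raw 32-byte public key (after base64 decoding), which this page does not state explicitly; `key_is_bound` is a hypothetical name.

```python
import hashlib
import hmac


def key_is_bound(report_data: bytes, tee_public_key: bytes) -> bool:
    """True when report_data equals SHA-512 of the TEE's Ed25519 public key."""
    if len(report_data) != 64:
        return False
    # Assumption: the digest covers the raw key bytes, not the base64 string.
    expected = hashlib.sha512(tee_public_key).digest()
    return hmac.compare_digest(report_data, expected)
```

This check only ties the key to the quote; the quote's own signature chain still has to be validated separately.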

TDX Quote Format

The SDK supports Intel TDX quote versions 3, 4, and 5:

Offset	Size	Field
0	2	Version (3, 4, or 5)
2	2	Attestation key type
4	4	TEE type (0x81 = TDX)
48	584	TD Report body
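
The header fields in the table can be read with `struct`. This sketch follows the offsets above for every supported version and assumes little-endian integers, as Intel's quote structures use; `parse_quote` is a hypothetical helper.

```python
import struct

HEADER_SIZE = 48       # TD Report body starts at this offset
TD_REPORT_SIZE = 584   # size of the TD Report body
TEE_TYPE_TDX = 0x81


def parse_quote(quote: bytes):
    """Return (version, attestation_key_type, td_report_body) from a raw quote."""
    if len(quote) < HEADER_SIZE + TD_REPORT_SIZE:
        raise ValueError("quote is too short")
    # uint16 version, uint16 attestation key type, uint32 TEE type
    version, key_type, tee_type = struct.unpack_from("<HHI", quote, 0)
    if version not in (3, 4, 5):
        raise ValueError(f"unsupported quote version: {version}")
    if tee_type != TEE_TYPE_TDX:
        raise ValueError(f"not a TDX quote: TEE type {tee_type:#x}")
    return version, key_type, quote[HEADER_SIZE:HEADER_SIZE + TD_REPORT_SIZE]
```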

TD Report Body Fields (used for image hash)

Field	Size	Description
MRTD	48 bytes	Measurement of initial TD contents
MR_CONFIG_ID	48 bytes	Configuration ID
MR_OWNER	48 bytes	Owner measurement
MR_OWNER_CONFIG	48 bytes	Owner configuration
RTMR[0..3]	4 × 48 bytes	Runtime measurements
Report Data	64 bytes	Custom data (contains key hash)

Image Hash Computation

image_hash = SHA-256(
  MRTD ||
  MR_CONFIG_ID ||
  MR_OWNER ||
  MR_OWNER_CONFIG ||
  RTMR[0] || RTMR[1] || RTMR[2] || RTMR[3] ||
  zeros[64]
)

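The image hash identifies the code and configuration measured inside the TEE. The trailing zeros[64] appears to take the place of the Report Data field, so the hash does not vary with the per-instance key hash.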
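
The formula above translates directly to a few lines of Python. In this sketch the measurement fields are passed in already extracted from the TD Report body, and returning the digest as a hex string is an assumption.

```python
import hashlib


def image_hash(mrtd, mr_config_id, mr_owner, mr_owner_config, rtmrs) -> str:
    """SHA-256 over the measurement registers followed by 64 zero bytes."""
    fields = [mrtd, mr_config_id, mr_owner, mr_owner_config, *rtmrs]
    if len(rtmrs) != 4 or any(len(f) != 48 for f in fields):
        raise ValueError("expected eight 48-byte measurement fields")
    h = hashlib.sha256()
    for field in fields:
        h.update(field)
    h.update(bytes(64))  # zeros[64] from the formula above
    return h.hexdigest()  # assumption: hex encoding of the digest
```

A client would typically compare the result against a list of known-good image hashes published for the expected TEE build.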

Last modified: 08 May 2026