Cocoon TypeScript SDK

Overview

The TypeScript SDK provides end-to-end encrypted Cocoon inference for Node.js (>= 18) and modern browsers. It performs an X25519 key exchange with the in-TEE proxy, encrypts every prompt and response chunk with AES-GCM under the session key, verifies the proxy's TDX attestation against a built-in allow-list of measurements, and surfaces TEE-signed token usage at the end of each request. Streaming responses are exposed as an async iterable of Chunk values; the Usage object on the stream carries an Ed25519 signature the SDK validates locally.

The SDK has no retry behaviour and no client-side rate limiting. Callers are expected to wrap calls in their own backoff. See Retry policy below.

Installation

npm install @alphatoncapital/shroud-sdk

For Node.js, also install a WebSocket implementation:

npm install ws

Migration from OpenAI

The SDK does not expose an OpenAI-compatible surface. If you have existing code talking to the OpenAI SDK and want to keep that shape, point your OpenAI client at the Shroud gateway's /v1/chat/completions endpoint — the gateway accepts the OpenAI request body and returns OpenAI responses. See Migrate from OpenAI and the OpenAI-compatible API reference.

Use the Cocoon TypeScript SDK when you need end-to-end encryption between your process and the TEE, attestation verification, and TEE-signed usage receipts. The HTTP path terminates TLS at the gateway and does not provide any of those properties.

Quick start

import { CocoonClient } from '@alphatoncapital/shroud-sdk';

const client = new CocoonClient('wss://shroud.us', {
  apiKey: 'shroud_prod_...',
  modelsPath: '/v1/models',
});

const stream = await client.inference({
  model: 'Qwen/Qwen3-32B',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
if (stream.error) {
  throw stream.error;
}

const usage = stream.usage;
console.log(`\nTokens: ${usage?.totalTokens}`);

Listing models

const models = await client.listModels();
for (const m of models) {
  console.log(`${m.id} (owned by ${m.ownedBy})`);
}

listModels() accepts an optional AbortSignal for per-call cancellation: client.listModels(controller.signal).

Selecting a Cocoon network

The default paths follow the deployment's default_network. To pin the client to a specific network — for example cocoon-classic regardless of the deployment default — set both Cocoon-bound paths to the network-prefixed form:

const client = new CocoonClient('wss://shroud.us', {
  apiKey: 'shroud_prod_...',
  streamPath: '/cocoon-classic/v1/cocoon/stream',
  modelsPath: '/cocoon-classic/v1/models',
});

See Cocoon networks for the full route grid.
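
If you pin networks in more than one place, a small helper keeps the two paths in sync. The following is a minimal sketch: networkPaths is a hypothetical helper, not part of the SDK, and the name check assumes network names are lowercase letters, digits, and hyphens, which the documentation above does not state.

```typescript
// Hypothetical helper (not an SDK export): builds the two network-prefixed
// paths in the form shown above for a given Cocoon network name.
interface NetworkPaths {
  streamPath: string;
  modelsPath: string;
}

function networkPaths(network: string): NetworkPaths {
  // Assumption: a network name is a single lowercase path segment.
  if (!/^[a-z0-9][a-z0-9-]*$/.test(network)) {
    throw new Error(`invalid network name: ${network}`);
  }
  return {
    streamPath: `/${network}/v1/cocoon/stream`,
    modelsPath: `/${network}/v1/models`,
  };
}

const classicPaths = networkPaths('cocoon-classic');
console.log(classicPaths.streamPath); // /cocoon-classic/v1/cocoon/stream
console.log(classicPaths.modelsPath); // /cocoon-classic/v1/models
```

The result can be spread into the client options alongside apiKey.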

Streaming

InferenceStream implements AsyncIterableIterator<Chunk>. Iterate with for await, then read stream.usage and check stream.error:

const stream = await client.inference({
  model: 'Qwen/Qwen3-32B',
  messages,
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
if (stream.error) {
  throw stream.error;
}

const usage = stream.usage;

Always check stream.error after the loop. A network drop mid-stream ends iteration cleanly but surfaces only through stream.error; without the check, a truncated response looks like a complete one.
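
One way to make the check hard to forget is to put the loop and the check in a single helper. The sketch below uses structural stand-in types (ChunkLike, StreamLike) instead of the SDK's own Chunk and InferenceStream so that it runs without the package. collectText and fakeStream are hypothetical names introduced for illustration.

```typescript
// Structural stand-ins for the SDK types, so the sketch runs on its own.
interface ChunkLike {
  content: string;
}

interface StreamLike extends AsyncIterable<ChunkLike> {
  error: Error | null;
}

// Drains the stream and returns the full text, or throws if the stream
// recorded an error (for example a network drop mid-stream).
async function collectText(stream: StreamLike): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    text += chunk.content;
  }
  if (stream.error) {
    throw stream.error;
  }
  return text;
}

// Fake stream for demonstration: yields the parts, ends iteration cleanly,
// and optionally records an error, mimicking the behaviour described above.
function fakeStream(parts: string[], failWith: Error | null = null): StreamLike {
  const stream: StreamLike = {
    error: null,
    async *[Symbol.asyncIterator]() {
      for (const part of parts) {
        yield { content: part };
      }
      stream.error = failWith;
    },
  };
  return stream;
}
```

With this shape, collectText(fakeStream(['Hel', 'lo'])) resolves to the joined text, while a stream that records an error rejects instead of returning a truncated string.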

Selective disclosure

Control which usage fields the TEE reveals to the gateway. By default the TEE returns only token totals; opt in to per-request fields by listing them in disclose.

import { DiscloseFields } from '@alphatoncapital/shroud-sdk';

const stream = await client.inference({
  model: 'Qwen/Qwen3-32B',
  messages: [{ role: 'user', content: 'Hello!' }],
  stream: true,
  disclose: [
    DiscloseFields.TotalTokens,
    DiscloseFields.Model,
    DiscloseFields.PromptTokens,
    DiscloseFields.CompletionTokens,
  ],
});

console.log('Effective disclosure:', stream.effectiveDisclose);

Available fields:

| Constant | Wire value |
| --- | --- |
| DiscloseFields.PromptTokens | prompt_tokens |
| DiscloseFields.CachedTokens | cached_tokens |
| DiscloseFields.CompletionTokens | completion_tokens |
| DiscloseFields.ReasoningTokens | reasoning_tokens |
| DiscloseFields.TotalTokens | total_tokens |
| DiscloseFields.Model | model |
| DiscloseFields.ProxyStartTime | proxy_start_time |
| DiscloseFields.ProxyEndTime | proxy_end_time |
| DiscloseFields.WorkerStartTime | worker_start_time |
| DiscloseFields.WorkerEndTime | worker_end_time |
| DiscloseFields.WorkerDebug | worker_debug |
| DiscloseFields.ProxyDebug | proxy_debug |
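
Because disclosure is negotiated, it can be useful to compare what you asked for with what stream.effectiveDisclose reports. The sketch below assumes both lists are arrays of wire-value strings from the table above. The documentation does not spell out the exact type of effectiveDisclose, so check the SDK's typings before relying on this shape. missingDisclosure is a hypothetical helper, not an SDK export.

```typescript
// Assumption: requested and effective disclosure are arrays of wire values
// such as 'prompt_tokens'. Returns the requested fields that were not granted.
function missingDisclosure(
  requested: readonly string[],
  effective: readonly string[],
): string[] {
  const granted = new Set(effective);
  return requested.filter((field) => !granted.has(field));
}

const requestedFields = ['total_tokens', 'model', 'prompt_tokens', 'completion_tokens'];
const effectiveFields = ['total_tokens', 'model'];
console.log(missingDisclosure(requestedFields, effectiveFields));
// [ 'prompt_tokens', 'completion_tokens' ]
```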

Reasoning content / chat-template overrides

The TS SDK exposes chatTemplateKwargs on InferenceRequest, mirroring the vLLM/sglang chat_template_kwargs convention. The only field today is enableThinking, which the TEE forwards as enable_thinking to the worker. Pass it to opt models with reasoning support into chain-of-thought emission:

const stream = await client.inference({
  model: 'Qwen/Qwen3-32B',
  messages,
  stream: true,
  chatTemplateKwargs: { enableThinking: true },
});

for await (const chunk of stream) {
  if (chunk.reasoningContent) {
    process.stderr.write(chunk.reasoningContent);
  } else {
    process.stdout.write(chunk.content);
  }
}

interface ChatTemplateKwargs {
  enableThinking?: boolean; // wire: enable_thinking
}

The Go SDK does not expose chat_template_kwargs today; see the note in Cocoon Go SDK.

Cancellation

Use AbortSignal per call. The constructor does not accept a client-wide signal; pass one to each operation.

const controller = new AbortController();
setTimeout(() => controller.abort(), 30_000);

const stream = await client.inference(request, controller.signal);
for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}

const models = await client.listModels(controller.signal);
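
Since signals are per call, a per-operation timeout helper can be convenient. This is a minimal sketch: withTimeout is a hypothetical helper, not part of the SDK. It creates a fresh AbortController for each operation and clears the timer once the operation settles, so a finished call never triggers a late abort.

```typescript
// Hypothetical helper (not an SDK export): runs one operation with its own
// AbortController and always clears the timer afterwards.
async function withTimeout<T>(
  ms: number,
  op: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await op(controller.signal);
  } finally {
    clearTimeout(timer); // no stray abort or live timer after completion
  }
}

// Usage with the client from the examples above (commented out so the
// sketch runs without the SDK):
//   const models = await withTimeout(5_000, (signal) => client.listModels(signal));
```

For streaming calls, keep the for await loop inside the callback if the timeout should cover the whole response. This assumes the signal passed to inference() also governs the open stream, as the 30-second example above suggests.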

Attestation verification

By default, the SDK verifies Intel TDX attestation quotes against a built-in allow-list of Cocoon proxy image measurements. When the quote is missing, malformed, carries a measurement that is not on the allow-list, or its report_data does not bind to the proxy public key the gateway returned, inference() throws an AttestationError and the WebSocket is closed before any prompt is sent.

import { CocoonClient, defaultAttestationPolicy } from '@alphatoncapital/shroud-sdk';

// url and key are your WebSocket endpoint and API key.

// Use default policy (recommended)
const defaultClient = new CocoonClient(url, {
  apiKey: key,
  modelsPath: '/v1/models',
});

// Custom policy with additional image hashes
const customClient = new CocoonClient(url, {
  apiKey: key,
  modelsPath: '/v1/models',
  attestationPolicy: {
    allowedCocoonProxyImageHashes: [
      'c4f99569acaa71ae2f6b091b64ff6645b97eb7ab3d8c463dc2d7be752212008d',
      'your_custom_hash',
    ],
  },
});

// Disable verification (not recommended)
const unverifiedClient = new CocoonClient(url, {
  apiKey: key,
  modelsPath: '/v1/models',
  attestationPolicy: null,
});

After the stream completes, usage.verified reports whether the per-usage Ed25519 signature checked out against the session's TEE public key. The SDK does not currently fail closed when verification fails — the stream still yields content and verified === false. Treat unverified usage as untrusted and reject the response in your own code:

const usage = stream.usage;
if (!usage || !usage.verified) {
  throw new Error('cocoon: usage attestation verification failed');
}

For the wire-level details see How attestation works.

Error handling

import {
  SessionError,
  CryptoError,
  AttestationError,
  StreamError,
} from '@alphatoncapital/shroud-sdk';

try {
  const stream = await client.inference(request);
  for await (const chunk of stream) {
    process.stdout.write(chunk.content);
  }
  if (stream.error) {
    throw stream.error;
  }
} catch (err) {
  if (err instanceof AttestationError) {
    console.error('Attestation failed:', err.message);
  } else if (err instanceof SessionError) {
    console.error('Session error:', err.message, err.code);
  } else if (err instanceof CryptoError) {
    console.error('Crypto error:', err.message);
  } else if (err instanceof StreamError) {
    console.error('Stream error:', err.message, err.code);
  }
  throw err;
}

| Error class | When |
| --- | --- |
| SessionError | WebSocket connection or session setup failed |
| CryptoError | Key derivation, encryption, or decryption failed |
| AttestationError | TEE attestation verification failed |
| StreamError | Server returned an error during inference |

For HTTP-status-coded gateway errors (auth, rate limits, CU limits) see Error reference.

Retry policy

The SDK does not retry. Each inference() call performs exactly one WebSocket dial and one session setup; listModels() issues a single fetch. Transient drops, 503 responses, and DNS hiccups all surface to the caller as thrown errors with no retry attempt.

Wrap calls in your own backoff. See the Production guide for the recommended pattern (exponential backoff with jitter, capped retry budget, respect for Retry-After).
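
As a starting point, the sketch below shows one possible shape for such a wrapper: exponential backoff with full jitter and a capped number of attempts. It is not the Production guide's code. withRetry, backoffDelay, and RetryOptions are hypothetical names. The sketch does not parse Retry-After, because this page does not say how the SDK surfaces that header.

```typescript
interface RetryOptions {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // delay ceiling before the second attempt
  maxDelayMs: number;  // cap on any single delay
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Full jitter: a random delay in [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
function backoffDelay(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Runs op until it succeeds, the attempt budget is spent, or isRetryable
// rejects the error. The last error is rethrown unchanged.
async function withRetry<T>(
  op: () => Promise<T>,
  opts: RetryOptions,
  isRetryable: (err: unknown) => boolean = () => true,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      const isLast = attempt === opts.maxAttempts - 1;
      if (isLast || !isRetryable(err)) break;
      await sleep(backoffDelay(attempt, opts));
    }
  }
  throw lastError;
}
```

The isRetryable hook is where you would likely exclude errors that a retry cannot fix, for example by returning false for AttestationError. Retrying a streaming call re-sends the whole request, so decide how to handle output already written before the failure.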

API reference

Constructor

const client = new CocoonClient(baseURL, options?);

baseURL is the Cocoon WebSocket endpoint, typically wss://.... ws:// is accepted for local development. The HTTP-shim endpoints are derived by replacing the scheme: wss:// becomes https:// and ws:// becomes http://.
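
The scheme swap can be illustrated as follows. This is a sketch of the described behaviour under the assumption that wss:// maps to https:// and ws:// maps to http://. The SDK's own implementation may differ, and toHttpUrl is a hypothetical name.

```typescript
// Sketch only: derives an HTTP(S) URL from a WebSocket base URL and a path.
function toHttpUrl(baseURL: string, path: string): string {
  let httpBase: string;
  if (baseURL.startsWith('wss://')) {
    httpBase = 'https://' + baseURL.slice('wss://'.length);
  } else if (baseURL.startsWith('ws://')) {
    httpBase = 'http://' + baseURL.slice('ws://'.length);
  } else {
    throw new Error(`expected a ws:// or wss:// URL, got: ${baseURL}`);
  }
  // Join without doubling the slash between base and path.
  const normalizedPath = path.startsWith('/') ? path : '/' + path;
  return httpBase.replace(/\/+$/, '') + normalizedPath;
}

console.log(toHttpUrl('wss://shroud.us', '/v1/models'));
// https://shroud.us/v1/models
console.log(toHttpUrl('ws://localhost:8080/', '/v1/models'));
// http://localhost:8080/v1/models
```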

Client options

interface CocoonClientOptions {
  apiKey?: string;
  modelsPath?: string;
  streamPath?: string;
  attestationPolicy?: AttestationPolicy | null;
  WebSocket?: unknown;
}

| Option | Type | Description |
| --- | --- | --- |
| apiKey | string | Bearer token for authentication. |
| modelsPath | string | Override the path used by listModels (SDK default is currently /v1/cocoon/models; pass /v1/models explicitly, as in Quick start). |
| streamPath | string | Override the WebSocket inference path (default /v1/cocoon/stream). |
| attestationPolicy | AttestationPolicy \| null | TDX verification policy. Pass null to disable verification (not recommended). |
| WebSocket | unknown | WebSocket constructor for Node.js (pass the ws library). |

AbortSignal cancellation is per call, not per client — see Cancellation.

Client methods

| Method | Returns | Description |
| --- | --- | --- |
| listModels(signal?) | Promise<Model[]> | Fetch available models from the OpenAI-shim. |
| inference(req, signal?) | Promise<InferenceStream> | Open a TEE-encrypted streaming session. |

InferenceRequest

interface InferenceRequest {
  model: string;               // Model ID (e.g. "Qwen/Qwen3-32B")
  messages: Message[];         // Chat messages
  maxTokens?: number;          // Max tokens to generate
  stream?: boolean;            // Enable streaming (default: false)
  disclose?: DiscloseField[];  // Selective disclosure fields
  chatTemplateKwargs?: ChatTemplateKwargs;
}

interface Message {
  role: string;     // "system", "user", or "assistant"
  content: string;
}

InferenceStream

| Property / method | Description |
| --- | --- |
| for await (const chunk of stream) | Iterate over response chunks. |
| stream.usage | TEE-signed token usage after stream completes (null during streaming). |
| stream.effectiveDisclose | Disclosure fields negotiated with the TEE. |
| stream.error | Error that stopped the stream (null on success). |
| stream.close() | Close the WebSocket connection. |

interface Chunk {
  content: string;           // Generated text
  reasoningContent: string;  // Chain-of-thought (if model supports it)
}

Usage

interface Usage {
  promptTokens: number;
  cachedTokens: number;
  completionTokens: number;
  reasoningTokens: number;
  totalTokens: number;
  model: string;
  proxyStartTime?: number;
  proxyEndTime?: number;
  workerStartTime?: number;
  workerEndTime?: number;
  workerDebug?: string;
  proxyDebug?: string;
  attestation?: Attestation;
  verified: boolean;  // true if attestation signature is valid
}

interface Attestation {
  usageHash: string;  // SHA-256 of raw usage JSON
  signature: string;  // Base64 Ed25519 signature from TEE
  sessionId: string;
}

Model

interface Model {
  id: string;
  object: string;
  ownedBy: string;
  activeWorkers: number;
  coefficientMin: number;
  coefficientBucket50: number;
  coefficientMax: number;
}
Last modified: 08 May 2026