The TypeScript SDK provides end-to-end encrypted Cocoon inference for Node.js (>= 18) and modern browsers. It performs an X25519 key exchange with the in-TEE proxy, encrypts every prompt and response chunk with AES-GCM under the session key, verifies the proxy's TDX attestation against a built-in allow-list of measurements, and surfaces TEE-signed token usage at the end of each request. Streaming responses are exposed as an async iterable of Chunk values; the Usage object on the stream carries an Ed25519 signature the SDK validates locally.
The SDK has no retry behaviour and no client-side rate limiting. Callers are expected to wrap calls in their own backoff. See Retry policy below.
Installation
npm install @alphatoncapital/shroud-sdk
For Node.js, also install a WebSocket implementation:
npm install ws
Migration from OpenAI
The SDK does not expose an OpenAI-compatible surface. If you have existing code talking to the OpenAI SDK and want to keep that shape, point your OpenAI client at the Shroud gateway's /v1/chat/completions endpoint — the gateway accepts the OpenAI request body and returns OpenAI responses. See Migrate from OpenAI and the OpenAI-compatible API reference.
Use the Cocoon TypeScript SDK when you need end-to-end encryption between your process and the TEE, attestation verification, and TEE-signed usage receipts. The HTTP path terminates TLS at the gateway and does not provide any of those properties.
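For the HTTP path, the request can be sketched without any SDK at all. The helper below is a minimal illustration, not part of either SDK: `chatCompletionCall`, the gateway base URL, and the API key are hypothetical placeholders, and Bearer authentication is assumed because that is what OpenAI clients send.

```typescript
interface ChatCompletionBody {
  model: string;
  messages: { role: 'system' | 'user' | 'assistant'; content: string }[];
  stream?: boolean;
}

// Hypothetical helper: builds the URL and fetch options for the gateway's
// OpenAI-compatible endpoint. gatewayBase and apiKey are placeholders.
function chatCompletionCall(gatewayBase: string, apiKey: string, body: ChatCompletionBody) {
  // Strip trailing slashes so the path is not doubled.
  const url = gatewayBase.replace(/\/+$/, '') + '/v1/chat/completions';
  const init = {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // Assumption: the gateway accepts the same Bearer scheme OpenAI clients use.
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
  return { url, init };
}

// Usage (not executed here):
// const { url, init } = chatCompletionCall(gatewayBase, apiKey, { model, messages });
// const res = await fetch(url, init);
```

Requests sent this way are protected only by TLS up to the gateway, as described above.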
const models = await client.listModels();
for (const m of models) {
  console.log(`${m.id} (owned by ${m.ownedBy})`);
}
listModels() accepts an optional AbortSignal for per-call cancellation: client.listModels(controller.signal).
Selecting a Cocoon network
The default paths follow the deployment's default_network. To pin the client to a specific network — for example cocoon-classic regardless of the deployment default — set both Cocoon-bound paths to the network-prefixed form:
InferenceStream implements AsyncIterableIterator<Chunk>. Iterate with for await, then read stream.usage and check stream.error:
const stream = await client.inference({
  model: 'Qwen/Qwen3-32B',
  messages,
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
if (stream.error) {
  throw stream.error;
}
const usage = stream.usage;
Always check stream.error after the loop. A network drop mid-stream ends iteration cleanly but surfaces only through stream.error; without the check, a truncated response looks like a complete one.
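One way to make the check hard to forget is to wrap the loop and the error check in a single helper. The sketch below is illustrative only: `collectStream`, `ChunkLike`, and `StreamLike` are hypothetical names, and the interfaces assume nothing beyond the members mentioned above (`chunk.content`, `stream.error`, `stream.usage`).

```typescript
// Structural stand-ins for the SDK types; only the members used in the text
// above are assumed here.
interface ChunkLike {
  content: string;
}
interface StreamLike<U> extends AsyncIterable<ChunkLike> {
  error?: Error | null;
  usage?: U | null;
}

// Hypothetical helper: drains the stream and refuses to return a truncated response.
async function collectStream<U>(
  stream: StreamLike<U>,
): Promise<{ text: string; usage: U | null }> {
  let text = '';
  for await (const chunk of stream) {
    text += chunk.content;
  }
  // Iteration ends cleanly even after a mid-stream drop, so check explicitly.
  if (stream.error) {
    throw stream.error;
  }
  return { text, usage: stream.usage ?? null };
}
```

Because the helper buffers the whole response, it suits callers that do not need incremental output; for live output, keep the explicit loop and check.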
Selective disclosure
Control which usage fields the TEE reveals to the gateway. By default the TEE returns only token totals; opt in to per-request fields by listing them in disclose.
The TS SDK exposes chatTemplateKwargs on InferenceRequest, mirroring the vLLM/sglang chat_template_kwargs convention. The only field today is enableThinking, which the TEE forwards as enable_thinking to the worker. Pass it to opt models with reasoning support into chain-of-thought emission:
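A minimal sketch of such a request follows. The interfaces are local stand-ins rather than the SDK's own types: only `chatTemplateKwargs` and `enableThinking` come from the description above, and the message shape and the prompt are assumptions for illustration.

```typescript
// Local stand-ins for the SDK's request types. Only chatTemplateKwargs and
// enableThinking come from the text above; the message shape is an assumption.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}
interface InferenceRequestSketch {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
  chatTemplateKwargs?: { enableThinking?: boolean };
}

const thinkingRequest: InferenceRequestSketch = {
  model: 'Qwen/Qwen3-32B',
  messages: [{ role: 'user', content: 'Is 9.11 larger than 9.9? Explain briefly.' }],
  stream: true,
  // Forwarded by the TEE to the worker as enable_thinking.
  chatTemplateKwargs: { enableThinking: true },
};

// Usage (not executed here):
// const stream = await client.inference(thinkingRequest);
```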
The Go SDK does not expose chat_template_kwargs today; see the note in Cocoon Go SDK.
Cancellation
Use AbortSignal per call. The constructor does not accept a client-wide signal; pass one to each operation.
const controller = new AbortController();
setTimeout(() => controller.abort(), 30_000);
const stream = await client.inference(request, controller.signal);
for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
const models = await client.listModels(controller.signal);
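The snippet above leaves the 30-second timer running after the call finishes. A small wrapper can scope the timer to the operation and clear it afterwards. This is a sketch, not an SDK feature; `withDeadline` is a hypothetical helper name.

```typescript
// Hypothetical helper: runs `op` with a signal that aborts after `ms`
// milliseconds, and always clears the timer so it cannot fire late or keep
// the process alive.
async function withDeadline<T>(
  ms: number,
  op: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await op(controller.signal);
  } finally {
    clearTimeout(timer);
  }
}

// Usage (not executed here):
// const models = await withDeadline(30_000, (signal) => client.listModels(signal));
```

For streaming calls, run the `for await` loop inside `op` so the deadline covers the whole response rather than only session setup.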
Attestation verification
By default, the SDK verifies Intel TDX attestation quotes against a built-in allow-list of Cocoon proxy image measurements. When the quote is missing, malformed, reports a measurement that is not on the allow-list, or carries report_data that does not bind to the proxy public key the gateway returned, inference() throws an AttestationError and the WebSocket is closed before any prompt is sent.
After the stream completes, usage.verified reports whether the per-usage Ed25519 signature checked out against the session's TEE public key. The SDK does not currently fail closed when verification fails: the stream still yields content, and usage.verified is simply false. Treat unverified usage as untrusted and reject the response in your own code:
const usage = stream.usage;
if (!usage || !usage.verified) {
  throw new Error('cocoon: usage attestation verification failed');
}
import {
  SessionError,
  CryptoError,
  AttestationError,
  StreamError,
} from '@alphatoncapital/shroud-sdk';
try {
  const stream = await client.inference(request);
  for await (const chunk of stream) {
    process.stdout.write(chunk.content);
  }
  if (stream.error) {
    throw stream.error;
  }
} catch (err) {
  if (err instanceof AttestationError) {
    console.error('Attestation failed:', err.message);
  } else if (err instanceof SessionError) {
    console.error('Session error:', err.message, err.code);
  } else if (err instanceof CryptoError) {
    console.error('Crypto error:', err.message);
  } else if (err instanceof StreamError) {
    console.error('Stream error:', err.message, err.code);
  }
  throw err;
}
| Error class | When |
| --- | --- |
| SessionError | WebSocket connection or session setup failed |
| CryptoError | Key derivation, encryption, or decryption failed |
| AttestationError | TEE attestation verification failed |
| StreamError | Server returned an error during inference |
For HTTP-status-coded gateway errors (auth, rate limits, CU limits) see Error reference.
Retry policy
The SDK does not retry. Each inference() call performs exactly one WebSocket dial and one session setup; listModels() issues a single fetch. Transient drops, 503 responses, and DNS hiccups all surface to the caller as thrown errors with no retry attempt.
Wrap calls in your own backoff. See the Production guide for the recommended pattern (exponential backoff with jitter, capped retry budget, respect for Retry-After).
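As a rough illustration of exponential backoff with jitter and a capped retry budget, the sketch below wraps any async operation. It is not the Production guide's implementation: `withBackoff`, `BackoffOptions`, and the default values are hypothetical, and the caller decides which errors are retryable through `shouldRetry`. It does not read Retry-After, because how that header is surfaced is not described here.

```typescript
interface BackoffOptions {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // delay ceiling before the first retry
  maxDelayMs: number;  // upper bound on any single delay
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Full-jitter exponential backoff: each delay is drawn uniformly from
// [0, min(maxDelayMs, baseDelayMs * 2^attempt)).
async function withBackoff<T>(
  op: () => Promise<T>,
  shouldRetry: (err: unknown) => boolean,
  opts: BackoffOptions = { maxAttempts: 4, baseDelayMs: 250, maxDelayMs: 8_000 },
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const lastAttempt = attempt + 1 >= opts.maxAttempts;
      // Give up when the budget is spent or the caller marks the error as fatal.
      if (lastAttempt || !shouldRetry(err)) throw err;
      const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** attempt);
      await sleep(Math.random() * ceiling);
    }
  }
}

// Usage (not executed here):
// const models = await withBackoff(() => client.listModels(), isTransient);
```

A reasonable `shouldRetry` would treat an AttestationError as fatal, since retrying cannot make an untrusted measurement trustworthy. When retrying a streaming call, retry the whole call and discard any partial output already received.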
API reference
Constructor
const client = new CocoonClient(baseURL, options?);
baseURL is the Cocoon WebSocket endpoint, typically wss://.... ws:// is accepted for local development. The HTTP-shim endpoints are derived by replacing the scheme: wss:// becomes https:// and ws:// becomes http://.
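The scheme replacement can be sketched as a small pure function. This illustrates the mapping only and is not the SDK's internal code; `deriveHttpBase` is a hypothetical name, and pairing the secure schemes with each other (wss with https, ws with http) is an assumption.

```typescript
// Hypothetical helper: maps a WebSocket base URL to its HTTP counterpart,
// assuming wss:// pairs with https:// and ws:// with http://.
function deriveHttpBase(baseURL: string): string {
  if (baseURL.startsWith('wss://')) {
    return 'https://' + baseURL.slice('wss://'.length);
  }
  if (baseURL.startsWith('ws://')) {
    return 'http://' + baseURL.slice('ws://'.length);
  }
  throw new Error(`expected a ws:// or wss:// URL, got: ${baseURL}`);
}
```

Host, port, and path are left untouched, so a network-prefixed path carries over unchanged.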