Use WebRTCSession in the browser for WebRTC voice. Mint the credential on the server with Client.generateSessionToken and TRANSPORT_WEBRTC, then pass webrtc_token (LiveKit media-room JWT), url, room, and identity to the client. The response also includes a platform token (session JWT) for other APIs; use webrtc_token for WebRTCSession.connect.

Session token

API reference for the session-token request and response fields.

JavaScript SDK

Package install and SDK entry points.

Connection (message types)

WebSocket Session message catalog. WebRTCSession uses the same event names for signaling except it does not emit response_audio (agent speech is only on the WebRTC audio path into remoteAudioContainer).

Prerequisites

  • @vatel/sdk
  • Organization API key on the server only, agent UUID, WebRTC enabled on your stack

generateSessionToken (WebRTC)

import { Client, TRANSPORT_WEBRTC } from "@vatel/sdk";

const client = new Client({
  baseUrl: process.env.VATEL_BASE_URL ?? "https://api.vatel.ai",
  getToken: () => process.env.VATEL_API_KEY,
});

const { data, status } = await client.generateSessionToken(agentId, {
  transport: TRANSPORT_WEBRTC,
});
On the client, call WebRTCSession.connect with data.webrtc_token as token (the LiveKit media-room JWT), together with data.url and data.room. Do not pass data.token, which is the platform session JWT for WebSocket and other APIs. data.identity is also part of the WebRTC payload. As with any other SDK call, check status and data for failures.
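The server step above can be wrapped in a small helper that checks the result and forwards only the WebRTC fields to the browser. This is a minimal sketch, not part of the SDK: toWebRTCCredential is a hypothetical name, and treating status as an HTTP-style numeric code (2xx means success) is an assumption, so adjust the check to whatever your SDK version returns.

```javascript
// Hypothetical helper (not part of @vatel/sdk): validates a
// generateSessionToken result and keeps only the fields the browser
// needs for WebRTCSession.connect.
function toWebRTCCredential({ data, status }) {
  // Assumption: status is an HTTP-style numeric code.
  if (typeof status !== "number" || status < 200 || status >= 300) {
    throw new Error(`generateSessionToken failed with status ${status}`);
  }
  const required = ["webrtc_token", "url", "room", "identity"];
  const missing = required.filter((key) => !data || !data[key]);
  if (missing.length > 0) {
    throw new Error(`WebRTC credential is missing: ${missing.join(", ")}`);
  }
  // This sketch leaves out data.token (the platform session JWT) because
  // WebRTCSession.connect does not use it.
  const { webrtc_token, url, room, identity } = data;
  return { webrtc_token, url, room, identity };
}
```

A server route could return toWebRTCCredential(await client.generateSessionToken(agentId, { transport: TRANSPORT_WEBRTC })) as JSON, so the organization API key never leaves the server.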

Connect WebRTCSession (client)

import { WebRTCSession } from "@vatel/sdk";

const session = new WebRTCSession({
  remoteAudioContainer: document.getElementById("remote-audio"),
});

await session.connect({
  token: credential.webrtc_token,
  url: credential.url,
  room: credential.room,
});

await session.start();
await session.setMicrophoneEnabled(true);
credential is the data object returned by the server-side generateSessionToken call (webrtc_token, url, room, identity), passed from your server to the browser.
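The connect sequence above can be wrapped so the field mapping lives in one place. This is a minimal sketch under stated assumptions: connectVoice and the fetchCredential callback are hypothetical names, the /api/voice-credential path in the usage comment is a placeholder for your own backend route, and session is any object exposing connect, start, and setMicrophoneEnabled as shown above.

```javascript
// Hypothetical wrapper: loads the credential, then runs the
// connect -> start -> enable-microphone sequence in order.
async function connectVoice(session, fetchCredential) {
  const credential = await fetchCredential();
  if (!credential || !credential.webrtc_token) {
    throw new Error("credential.webrtc_token is missing");
  }
  await session.connect({
    token: credential.webrtc_token, // the LiveKit JWT, not credential.token
    url: credential.url,
    room: credential.room,
  });
  await session.start();
  await session.setMicrophoneEnabled(true);
  return session;
}

// Usage in the browser (endpoint path is a placeholder):
// const session = new WebRTCSession({
//   remoteAudioContainer: document.getElementById("remote-audio"),
// });
// await connectVoice(session, () =>
//   fetch("/api/voice-credential").then((res) => res.json()));
```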

Session events

WebRTCSession uses the same signaling-style events as a WebSocket Session, except that response_audio is never fired: agent audio is delivered only as a remote media stream (attach it with remoteAudioContainer or your own playback). Subscribe with session.on(eventName, (msg) => { ... }). Each msg has type, timestamp, and data (the fields of data depend on type).
| Event | Purpose | Typical msg.data |
| --- | --- | --- |
| session_started | Session is live | id (session identifier) |
| session_ended | Session closed | May be an empty object; treat as end-of-call |
| response_text | Agent text for the current turn | text, turn_id |
| input_audio_transcript | STT of user speech | transcript |
| speech_started | User speech detected (VAD) | emulated: if true, VAD did not fire but a transcript arrived, so start-of-speech is synthetic |
| speech_stopped | User speech segment ended (VAD) | Usually empty |
| interruption | User cut off the agent while it was speaking | Usually empty |
| tool_call | Agent invoked a client tool | toolCallId, toolName, arguments (array of parameter descriptors with optional value) |
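The speech-related events in the table can drive a small piece of UI state, such as a "listening" indicator. This is a minimal sketch: trackUserSpeech is a hypothetical helper that relies only on session.on, and it treats an emulated speech_started as already finished (the transcript has arrived by then), which is an assumption rather than documented behavior.

```javascript
// Hypothetical helper: mirrors speech events into a plain state object.
function trackUserSpeech(session) {
  const state = { userSpeaking: false, lastStartEmulated: false, interruptions: 0 };

  session.on("speech_started", (msg) => {
    const emulated = msg.data?.emulated === true;
    state.lastStartEmulated = emulated;
    // Assumption: a synthetic start means the speech is already over.
    state.userSpeaking = !emulated;
  });

  session.on("speech_stopped", () => {
    state.userSpeaking = false;
  });

  session.on("interruption", () => {
    state.interruptions += 1;
  });

  session.on("session_ended", () => {
    state.userSpeaking = false;
  });

  return state;
}
```

The returned object is mutated in place, so a UI can read state.userSpeaking on each render or wrap the handlers to trigger its own updates.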
For base64 response_audio chunks, use a WebSocket Session instead of WebRTCSession. Reply to tool calls with session.sendToolCallOutput(toolCallId, outputString). The call is async in some builds, so handle a possible rejection (for example with .catch(...)).
session.on("session_started", (msg) => {
  console.log("session", msg.data?.id);
});
session.on("session_ended", () => console.log("ended"));
session.on("response_text", (msg) => console.log("agent:", msg.data?.text));
session.on("input_audio_transcript", (msg) => console.log("you:", msg.data?.transcript));
session.on("speech_started", (msg) => console.log("speech", msg.data?.emulated ? "emulated" : "vad"));
session.on("speech_stopped", () => console.log("speech stopped"));
session.on("interruption", () => console.log("interruption"));
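session.on("tool_call", async (msg) => {
  const id = msg.data?.toolCallId;
  if (!id) return;
  // await inside try/catch works whether the call is sync or returns a promise
  try { await session.sendToolCallOutput(id, "ok"); }
  catch (err) { console.error("tool output failed", err); }
});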
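The tool_call handling above can be generalized into a dispatcher that routes each call to a local handler and always replies with a string. This is a minimal sketch: registerTools and the handlers map are hypothetical, and the assumption that each parameter descriptor carries a name next to its optional value is not confirmed by the table above, so check the message catalog for the exact descriptor shape.

```javascript
// Hypothetical helper: routes tool_call messages to local handlers and
// replies through session.sendToolCallOutput with a string payload.
function registerTools(session, handlers) {
  session.on("tool_call", async (msg) => {
    const { toolCallId, toolName, arguments: params } = msg.data ?? {};
    if (!toolCallId) return;

    let output;
    try {
      if (!Object.prototype.hasOwnProperty.call(handlers, toolName)) {
        throw new Error(`unknown tool: ${toolName}`);
      }
      // Assumption: descriptors look like { name, value? }.
      const args = {};
      for (const p of Array.isArray(params) ? params : []) {
        if (p && p.name !== undefined) args[p.name] = p.value;
      }
      const result = await handlers[toolName](args);
      output = typeof result === "string" ? result : JSON.stringify(result ?? null);
    } catch (err) {
      // Report handler failures to the agent instead of leaving the call open.
      output = JSON.stringify({ error: String(err?.message ?? err) });
    }

    try {
      await session.sendToolCallOutput(toolCallId, output);
    } catch (err) {
      console.error("sendToolCallOutput failed", err);
    }
  });
}
```

For example, registerTools(session, { get_time: () => new Date().toISOString() }) would answer a hypothetical get_time tool with an ISO timestamp string.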

See also

| Topic | Link |
| --- | --- |
| Browser voice demo (Next.js) | Next.js |
| SDK overview | JavaScript SDK |