Turn-taking - Developers

The Turn-taking API runs the timing of an agent in a live chat so it behaves like a person rather than a request/response bot. You feed it the messages as they arrive; it decides whether the agent should speak now or stay silent, and when the agent does reply, it paces that reply out as 1–5 short messages delivered a beat apart — with a typing indicator — over a WebSocket. You drive it with three calls per conversation and listen on one WebSocket:

open_thread — start (or re-open) a chat thread and get a connect_url to stream the agent’s messages.
submit_messages — hand it each batch of inbound messages; it returns a speak / stay_silent decision.
respond — when the decision is speak, submit your agent’s drafted reply; it is paced out and delivered on the thread’s WebSocket.

record_event is an optional fourth call for reporting non-message activity (a user starts or stops typing, edits a message).

The integration loop

A typical client wires turn-taking into a chat UI like this:

1. Open a thread when the conversation starts. Keep the returned thread.id, and open a WebSocket to realtime.connect_url.
2. Stream the agent’s messages from that WebSocket for the life of the thread (see Receiving messages from the agent).
3. On every inbound human message (or small batch), call submit_messages. Keep the turn_epoch it returns.
4. If the decision is stay_silent, do nothing and wait for the next inbound message. If it is speak, have your agent draft a reply and call respond with that draft and the turn_epoch from step 3.
5. The reply is paced into chat messages and pushed to you over the WebSocket; render them as they arrive. Repeat from step 3.

 inbound messages ──▶ submit_messages ──▶ decision
                                            │
                          stay_silent ◀─────┴─────▶ speak
                                                      │
                              your agent drafts a reply
                                                      │
                                                  respond
                                                      │
            paced agent messages  ◀── WebSocket ──────┘
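
Steps 3 to 5 of the loop can be sketched as a single handler. This is a minimal sketch, not an official client: `api` is a hypothetical object whose `submitMessages` and `respond` methods stand in for the two calls, `draftReply` is your own agent, and the field names `decision`, `messages`, and `draft` are assumptions based on the descriptions on this page. Check each action's reference page for the real request and response shapes.

```javascript
// Sketch only: `api` and `draftReply` are hypothetical, caller-supplied stand-ins,
// and the field names `decision`, `messages`, and `draft` are assumed.
async function handleInbound(api, draftReply, threadId, messages) {
  // Step 3: submit the batch and keep the turn_epoch it returns.
  const result = await api.submitMessages({ thread_id: threadId, messages });

  // Step 4: on stay_silent there is nothing to do until the next inbound message.
  if (result.decision !== "speak") {
    return { spoke: false };
  }

  // Step 4: draft a reply and submit it with the same turn_epoch.
  const draft = await draftReply(messages);
  const response = await api.respond({
    thread_id: threadId,
    turn_epoch: result.turn_epoch,
    draft,
  });

  // Step 5 happens on the WebSocket: the paced messages arrive there.
  return { spoke: true, superseded: response.superseded === true };
}
```

Passing `api` and `draftReply` in as arguments keeps the handler independent of any transport, so it can be exercised with fakes before it is wired to real calls.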

Threads

A thread is one conversation and the unit of state. Open it once with open_thread:

Omit thread_id to start a fresh thread — a new id is minted and returned.
Pass a thread_id to re-open a specific thread. The id is an idempotency key: opening an id that doesn’t exist creates it; opening one that does re-issues its connect_url without creating a duplicate (this is the reconnect path). Opening a thread you don’t own reads as absent.

Threads are owner-scoped: they belong to the account whose token opened them, and another account cannot open or read them.
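
The two ways of calling open_thread can be combined into one helper that starts a fresh thread the first time and re-opens the saved one afterwards. This is a sketch under assumptions: `api.openThread` is a hypothetical wrapper around the open_thread call, `storage` is any Map-like store you supply, and the response shape `{ thread: { id }, realtime: { connect_url } }` is assumed from the field names used on this page.

```javascript
// Sketch only: `api.openThread` and the response shape are assumptions.
// `storage` is any object with Map-style get/set (a Map works).
async function openOrResumeThread(api, storage, conversationKey) {
  const savedId = storage.get(conversationKey);

  // Omit thread_id for a fresh thread; pass it to re-open an existing one.
  const body = savedId ? { thread_id: savedId } : {};
  const opened = await api.openThread(body);

  // Remember the id so the next call takes the re-open path.
  storage.set(conversationKey, opened.thread.id);
  return opened;
}
```

Because the id acts as an idempotency key, calling this helper again for the same conversation should hand back a new connect_url for the same thread rather than a duplicate thread.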

Receiving messages from the agent

open_thread returns a realtime grant with a short-lived connect_url. Open a WebSocket to that URL exactly as given — it already carries everything needed to attach to the thread’s channel:

const socket = new WebSocket(thread.realtime.connect_url);
socket.onmessage = (event) => {
  const envelope = JSON.parse(event.data);
  // handle envelope.type
};

Each frame is a JSON envelope:

Field	Type	Description
	`string`	A unique id for this delivery.
`type`	`string`	The event type, one of the events below.
`channel`	`string`	The thread channel the event belongs to, `turn-taking-thread/{thread_id}`.
	`string`	When the event was emitted (ISO 8601).
`data`	`object`	The event payload, shaped by `type` (see below).

The thread channel carries three event types:

`type`	When	`data`
`turn_taking.message`	One naturalized chat message from the agent. A single `respond` emits 1–5 of these, a beat apart.	`{ message_id, thread_id, content, position, sent_at }`
`turn_taking.typing`	The agent’s “… is typing” indicator was toggled. Render it directly.	`{ thread_id, typing }`
`turn_taking.signal`	A behavioural signal about a person in the thread (for example a long silence, or typing without sending). Only sent when behavioural signals are enabled.	`{ thread_id, user_id, kind }`

turn_taking.message.position is the message’s 0-based order within the reply, so you can render a multi-message reply in sequence.
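
One way to handle the three event types is a small dispatcher that keeps a view model for the UI. This is a sketch, not a prescribed client. It uses only the `type` and `data` fields listed above. The grouping rule is an assumption, since the payload carries no reply id: a message with position 0 is taken to start a new reply.

```javascript
// Sketch only. Assumption: a message with position 0 starts a new reply.
function createThreadView() {
  const view = { typing: false, replies: [], signals: [] };

  function handle(envelope) {
    const data = envelope.data;
    switch (envelope.type) {
      case "turn_taking.message":
        if (data.position === 0 || view.replies.length === 0) {
          view.replies.push([]);
        }
        // Store by position so the reply renders in sequence.
        view.replies[view.replies.length - 1][data.position] = data.content;
        break;
      case "turn_taking.typing":
        view.typing = data.typing;
        break;
      case "turn_taking.signal":
        view.signals.push({ userId: data.user_id, kind: data.kind });
        break;
      default:
        break; // ignore event types this client does not know
    }
  }

  return { view, handle };
}
```

To use it, call `handle(JSON.parse(event.data))` from the socket's `onmessage` callback and render from `view`.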

The connect_url is short-lived. If the socket drops or the URL expires, re-open the thread with its thread_id (open_thread) to get a fresh connect_url, then reconnect. Re-opening does not create a new thread or lose state.
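
The reconnect path can be sketched as a loop that re-opens the thread every time the socket closes. The names here are illustrative: `api.openThread` is a hypothetical wrapper around open_thread, the response shape `{ realtime: { connect_url } }` is assumed, and the fixed retry delay is a placeholder policy rather than a documented recommendation.

```javascript
// Sketch only: `api.openThread`, the response shape, and the retry delay are assumptions.
function keepThreadConnected(api, threadId, onEnvelope, options = {}) {
  const createSocket = options.createSocket || ((url) => new WebSocket(url));
  const retryDelayMs = options.retryDelayMs ?? 1000;
  let stopped = false;
  let socket = null;

  async function connect() {
    if (stopped) return;
    // Re-opening with the same thread_id returns a fresh connect_url.
    const opened = await api.openThread({ thread_id: threadId });
    if (stopped) return;
    socket = createSocket(opened.realtime.connect_url);
    socket.onmessage = (event) => onEnvelope(JSON.parse(event.data));
    socket.onclose = scheduleReconnect;
  }

  function scheduleReconnect() {
    if (stopped) return;
    // If re-opening fails, try again after the same delay.
    setTimeout(() => connect().catch(scheduleReconnect), retryDelayMs);
  }

  const ready = connect(); // rejects if the first open fails
  return {
    ready,
    stop() {
      stopped = true;
      if (socket) socket.close();
    },
  };
}
```

The `createSocket` option exists so the logic can be exercised without a network; in a browser the default `new WebSocket(url)` is used.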

Interruptions and `turn_epoch`

In a live chat, people talk over each other. Every submit_messages returns a turn_epoch, a counter for the thread’s current turn. Pass that same turn_epoch back in the matching respond. If a newer batch of messages arrives (a higher turn_epoch) before your draft is submitted, the conversation has moved on and your reply is stale: respond detects this, schedules nothing, bills nothing, and returns superseded: true. Draft against the latest decision and submit promptly.
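
A client can also drop stale drafts locally before calling respond. This is a minimal sketch, assuming turn_epoch is a number that only increases within a thread. `api.respond` is a hypothetical wrapper around the respond call, and the `draft` field name is an assumption.

```javascript
// Sketch only. Assumption: turn_epoch is a number that increases with each new turn.
function createTurnTracker() {
  let latestEpoch = -Infinity;
  return {
    // Call with the turn_epoch from every submit_messages response.
    record(epoch) {
      if (epoch > latestEpoch) latestEpoch = epoch;
    },
    isStale(epoch) {
      return epoch < latestEpoch;
    },
  };
}

// `api.respond` is a hypothetical wrapper around the respond call.
async function respondIfCurrent(api, tracker, threadId, turnEpoch, draft) {
  // A newer batch was already submitted: skip the call entirely.
  if (tracker.isStale(turnEpoch)) {
    return { sent: false, reason: "stale_locally" };
  }
  const result = await api.respond({ thread_id: threadId, turn_epoch: turnEpoch, draft });
  // The server saw a newer turn first: nothing was scheduled or billed.
  if (result.superseded === true) {
    return { sent: false, reason: "superseded" };
  }
  return { sent: true };
}
```

The local check is only an optimization. The server-side check remains the source of truth, so the `superseded` flag still has to be handled.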

Behavioural signals

When you open a thread you can enable behavioural signals by sending an integrations.social_signals block. With it on, turn-taking watches the timing of the conversation and surfaces two extra things:

Per-batch tags on the submit_messages response — short behavioural labels for the messages in that batch (for example ["fast", "comeback"]).
turn_taking.signal events on the WebSocket for activity that isn’t a message (for example a user going silent, or typing without sending).

These signals also inform the speak/stay-silent decision. To feed them, report typing and edits with record_event, and pass client_ts on inbound messages so timing is measured from the client clock. Without the integrations.social_signals block, turn-taking runs standalone: tags is always empty and no turn_taking.signal events are sent.
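
The request-side pieces can be sketched as small helpers. Everything about the shapes here is an assumption: this page does not specify the contents of the integrations.social_signals block (an empty object is used as a placeholder), the client_ts format (ISO 8601 is assumed), or where tags sits in the response (a top-level `tags` array is assumed).

```javascript
// Sketch only: block contents, client_ts format, and the `tags` location are assumptions.

// Add an integrations.social_signals block to an open_thread body.
function withSocialSignals(openThreadBody, socialSignals = {}) {
  return {
    ...openThreadBody,
    integrations: { ...openThreadBody.integrations, social_signals: socialSignals },
  };
}

// Stamp an inbound message with the client clock before submitting it.
function stampClientTs(message, now = new Date()) {
  return { ...message, client_ts: now.toISOString() };
}

// Read the per-batch tags, treating a missing field as "no tags".
function tagsOf(submitResult) {
  return Array.isArray(submitResult.tags) ? submitResult.tags : [];
}
```

For example, `tagsOf({ tags: ["fast", "comeback"] })` returns the two labels, and `tagsOf({})` returns an empty array, which matches the standalone behaviour described above.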

Authentication

Every request authenticates with a bearer token. See Authentication.

Authorization: Bearer <token>
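
As a minimal sketch, the header can be attached to a `fetch` call like this. The POST method and JSON body are assumptions, and the action URL is left to the caller because this page does not give one.

```javascript
// Sketch only: the POST method and JSON body are assumptions.
function authorizedInit(token, body) {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  };
}

// Usage, with a caller-supplied actionUrl (hypothetical):
//   const response = await fetch(actionUrl, authorizedInit(token, { thread_id: "..." }));
```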

Billing

submit_messages (the decision) and respond (the reply) are billable and metered in credits; open_thread and record_event are not. A respond that comes back superseded: true is not billed. See Credits and billing.

Errors

All actions return the standard error envelope. Common status codes across the turn-taking actions:

Status	Code	When
`401`	`UNAUTHORIZED`	The bearer token is missing, invalid, or expired.
`402`	`PAYMENT_REQUIRED`	Your account can’t cover a billable call (`submit_messages`, `respond`). You are not charged.
`403`	`forbidden`	The token is valid but not allowed here.
`422`	`VALIDATION_ERROR`	The body is malformed, or a field is missing or out of range.
`502`	`UPSTREAM_ERROR`	A dependency the request relies on was unavailable. Retry with backoff.
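
The "retry with backoff" advice for `502` can be sketched as a wrapper around any call that returns an object with a `status` field, such as a `fetch` response. The attempt limit and delays are illustrative defaults, not documented values.

```javascript
// Sketch only: maxAttempts and baseDelayMs are illustrative defaults.
async function callWithRetry(call, options = {}) {
  const maxAttempts = options.maxAttempts ?? 4;
  const baseDelayMs = options.baseDelayMs ?? 250;

  for (let attempt = 1; ; attempt++) {
    const response = await call();
    // Only 502 is retried; every other status goes straight back to the caller.
    if (response.status !== 502 || attempt >= maxAttempts) {
      return response;
    }
    // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
    const delayMs = baseDelayMs * 2 ** (attempt - 1);
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
}
```

The other statuses in the table describe problems a retry will not fix (a bad token, insufficient credits, an invalid body), so the wrapper leaves them alone.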

Next

Open a thread: start a conversation and connect.
Submit messages: get a speak / stay-silent decision.
Respond: pace out the agent’s reply.
Record an event: report typing and edits.
