OpenClaw
Make an OpenClaw agent you already run answer the phone in its own context. The @trysaperly/voice-openclaw extension binds Saperly's manual-mode websocket, so only text crosses to your agent — directives out, no audio on your machine.
A connector makes an agent you already run reachable by phone without building anything yourself. It holds a Saperly manual-mode websocket and bridges it to your running agent: the caller's transcribed turn goes in, and your agent's reply comes out as a directive. Because the connection is outbound, your machine needs no public URL. Speech-to-text and text-to-speech stay in-network; no call audio ever reaches your machine.
The @trysaperly/voice-openclaw connector is an OpenClaw extension loaded by the
Gateway. It registers the saperly_voice_reply agent tool and holds the
manual-mode websocket, so each call routes to a stable per-call session
(saperly-voice:<call_id>) and a multi-turn call stays in one conversation.
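As a rough illustration of why a stable per-call key keeps a multi-turn call in one conversation, here is a minimal sketch. Only the saperly-voice:<call_id> key format comes from the text above; the helper names (sessionKey, recordTurn) and the in-memory Map are hypothetical stand-ins, not the extension's actual internals.

```typescript
// Hypothetical sketch: only the "saperly-voice:<call_id>" key format is from the docs.
const SESSION_PREFIX = "saperly-voice";

// Derive the per-call session key from a call id.
function sessionKey(callId: string): string {
  if (callId.length === 0) {
    throw new Error("call_id must be non-empty");
  }
  return `${SESSION_PREFIX}:${callId}`;
}

// In-memory stand-in for whatever conversation store the agent really uses.
const sessions = new Map<string, string[]>();

// Append a caller turn to the session for its call and return the history so far.
function recordTurn(callId: string, text: string): string[] {
  const key = sessionKey(callId);
  const history = sessions.get(key) ?? [];
  history.push(text);
  sessions.set(key, history);
  return history;
}
```

Every turn that carries the same call id resolves to the same key, so later turns see the earlier ones; a different call id starts a fresh history.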
Install
Install the plugin into your gateway from npm, then enable it (the plugin id
is saperly-voice; the npm package is @trysaperly/voice-openclaw):
openclaw plugins install npm:@trysaperly/voice-openclaw
openclaw plugins enable saperly-voice
Configure
Point the plugin at a manual-mode connection by giving it the connection id
and that connection's manual secret. Set them in openclaw.json5 under
plugins.entries.saperly-voice.config, or through the SAPERLY_BASE_URL /
SAPERLY_CONNECTION_ID / SAPERLY_MANUAL_SECRET environment variables
(environment variables take precedence, so the secret can stay out of the
file). Your agent then answers each caller turn by calling the
saperly_voice_reply tool, echoing the turn's request_id.
{
  plugins: {
    enabled: true,
    allow: ["saperly-voice"],
    entries: {
      "saperly-voice": {
        enabled: true,
        config: {
          baseUrl: "https://api.saperly.com",
          connectionId: "conn_123",
          // manualSecret: "mc_…", // prefer SAPERLY_MANUAL_SECRET in env
        },
      },
    },
  },
}
Full config, launch steps, and the reply/directive reference: Voice channels → OpenClaw.
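To make the precedence rule concrete, the following is a minimal sketch of how environment variables could override the file config. The three environment variable names and the three config keys come from this page; the resolveConfig function, its types, and the choice to treat an empty variable as unset are assumptions made for illustration, not the plugin's actual code.

```typescript
// Hypothetical sketch of env-over-file precedence; not the plugin's real implementation.
interface VoiceConfig {
  baseUrl?: string;
  connectionId?: string;
  manualSecret?: string;
}

interface ResolvedVoiceConfig {
  baseUrl: string;
  connectionId: string;
  manualSecret: string;
}

type Env = Record<string, string | undefined>;

function resolveConfig(fileConfig: VoiceConfig, env: Env): ResolvedVoiceConfig {
  // Assumption: an empty environment variable counts as unset and falls back to the file.
  const baseUrl = env["SAPERLY_BASE_URL"] || fileConfig.baseUrl;
  const connectionId = env["SAPERLY_CONNECTION_ID"] || fileConfig.connectionId;
  const manualSecret = env["SAPERLY_MANUAL_SECRET"] || fileConfig.manualSecret;

  if (!baseUrl || !connectionId || !manualSecret) {
    throw new Error("baseUrl, connectionId, and manualSecret are all required");
  }
  return { baseUrl, connectionId, manualSecret };
}
```

In a Node process you would pass process.env as the second argument. Keeping the environment as a parameter also makes the precedence easy to check in isolation.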
Not the OpenClaw voice-call plugin
OpenClaw also ships a separate voice-call plugin that streams raw call audio
to a realtime provider and holds a media socket for the call's duration — which
Saperly deliberately avoids. @trysaperly/voice-openclaw binds the manual-mode
websocket instead: signaling, speech-to-text, and text-to-speech stay
in-network, and only text turns reach your process.
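Because only text turns reach your process, handling a turn amounts to mapping one piece of text to another and echoing the turn's request_id in the saperly_voice_reply call. The sketch below shows that shape. The call_id and request_id names appear on this page; the text field, the CallerTurn and VoiceReplyArgs interfaces, and buildReply are hypothetical, so check the reply/directive reference for the real payloads.

```typescript
// Assumed shape of a transcribed caller turn; "text" is a hypothetical field name.
interface CallerTurn {
  call_id: string;
  request_id: string;
  text: string;
}

// Assumed arguments for the saperly_voice_reply tool.
// Only the requirement to echo request_id is taken from the docs.
interface VoiceReplyArgs {
  request_id: string;
  text: string;
}

// Build the reply for one turn: run the agent's answer function on the caller's
// text and echo the turn's request_id so the reply is matched to that turn.
function buildReply(
  turn: CallerTurn,
  answer: (callerText: string) => string,
): VoiceReplyArgs {
  return { request_id: turn.request_id, text: answer(turn.text) };
}
```

No audio type appears anywhere in these shapes, which is the practical difference from a plugin that holds a media socket.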
Related
- Claude Code connector — the same idea for a Claude Code agent.
- Voice channels — call your own agent and talk to it in its context (the full connector walkthrough).
- Manual mode — the websocket protocol connectors are built on, if you want to roll your own.
MCP
Saperly exposes its tools over the Model Context Protocol so any agent framework can provision numbers, send SMS, and place calls as first-class tools — over Streamable HTTP with a bearer sk_ key.
Claude Code
Make a Claude Code agent answer the phone in its own context. The saperly-voice channel (an MCP server) binds Saperly's manual-mode websocket, so only text crosses to your agent — directives out, no audio on your machine.