Backed by Y Combinator

The API for AI phone calls

Send a prompt. The call happens. JSON comes back.

$0.05/min all-in · 7 lines of code
import callingbox
call = callingbox.calls.create(
    to="+15551234567",
    prompt="Confirm appointment tomorrow 2pm",
    context={ "patient": "Maria Lopez" },
    returns={ "confirmed": "boolean" }
)
call dispatched · ai talks · json returned
+1 (680) 215-4060

Talk to a CallingBox agent right now

or listen to an example call

How it works

You send a request. We make the call.

Specify what you need from the call and trigger it with a single POST request. You get structured JSON back.

1. Request
POST /v1/calls
to = "+1555..."
prompt = "Confirm appt"
returns = { confirmed }

2. Call
Tool call: check_availability(date="Thu")

3. Result
{
"confirmed":
}
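
For readers who prefer raw HTTP over the SDK, the request step above might look roughly like this. It is a minimal sketch using only the Python standard library: the `/v1/calls` path and the `to`, `prompt`, `context`, and `returns` fields come from this page, while the base URL, the API key placeholder, and the bearer-token header are assumptions made for illustration. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical values: the base URL and the bearer-token header are
# assumptions for illustration, not taken from CallingBox documentation.
BASE_URL = "https://api.callingbox.example"
API_KEY = "YOUR_API_KEY"


def build_call_request(to, prompt, returns, context=None):
    """Build (but do not send) a POST /v1/calls request."""
    payload = {"to": to, "prompt": prompt, "returns": returns}
    if context is not None:
        payload["context"] = context
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/calls",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


request = build_call_request(
    to="+15551234567",
    prompt="Confirm appointment tomorrow 2pm",
    returns={"confirmed": "boolean"},
    context={"patient": "Maria Lopez"},
)
print(request.get_method(), request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)`; that step is left out so the sketch runs without network access.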

Examples

What will you automate?

Write the prompt. CallingBox handles the call and returns structured data, a transcript, and a recording.

POST /v1/calls (200 OK)
{
  "to": "+1 (555) 234-8901",
  "prompt": "Confirm appointment tomorrow 2pm",
  "context": { "patient": "Maria Lopez" },
  "returns": { "confirmed": "boolean" }
}
Same endpoint, same SDK, same webhook
0:34
AI: Hi, calling about your 2pm appointment tomorrow with Dr. Chen.

Patient: Yes, I'll be there.

{
"confirmed": true,
"reschedule_to": null
}
POST webhook delivered
200 OK
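
On the receiving side, it can be worth checking the delivered JSON against the `returns` spec before acting on it. The following is a sketch under stated assumptions: the page does not document the webhook payload, so the `result` wrapper key, the set of type names, and the chosen status codes are all hypothetical.

```python
import json

# Assumption: the webhook body carries the extracted fields under a "result"
# key. That key name and the type names below are hypothetical.
TYPE_CHECKS = {
    "boolean": lambda v: isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}


def validate_result(result, returns):
    """Return a list of problems; an empty list means the result matches."""
    if not isinstance(result, dict):
        return ["result is not an object"]
    problems = []
    for field, type_name in returns.items():
        if field not in result:
            problems.append(f"missing field: {field}")
        elif not TYPE_CHECKS[type_name](result[field]):
            problems.append(f"{field} is not a {type_name}")
    return problems


def handle_webhook(raw_body, returns):
    """Parse a webhook body and return (HTTP status, parsed result or None)."""
    try:
        result = json.loads(raw_body)["result"]
    except (ValueError, KeyError, TypeError):
        return 400, None  # malformed body
    if validate_result(result, returns):
        return 422, None  # well-formed JSON, but not the shape we asked for
    return 200, result


body = '{"result": {"confirmed": true, "reschedule_to": null}}'
status, result = handle_webhook(body, {"confirmed": "boolean"})
print(status, result)
```

Fields that were not requested in `returns` (here `reschedule_to`) are passed through untouched rather than rejected.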

Quality

The best voice quality available

[Chart: normalized quality scores across key dimensions. Higher is better (↑): MOS, Turn F1, Recovery. Lower is better (↓): WER, Halluc., Latency (ms).]

Detailed results

Metric                        CallingBox   Self-built   Best commercial   Human
Voice naturalness (MOS ↑)     4.31         3.92         4.08              4.50
Response latency p50 (ms ↓)   340          920          680               ~300
Word error rate (↓)           2.9%         5.2%         4.1%              2.1%
Hallucination rate (↓)        0.9%         8.4%         3.2%              0.5%
Turn-taking F1 (↑)            0.97         0.81         0.89              0.99
Interruption recovery (↑)     92%          54%          71%               96%

Evaluated on 12,400 real phone calls in US English, including noisy environments (SNR 10–25 dB) and diverse accents. MOS rated by 3 independent evaluators per sample (ITU-T P.808). Latency measured end-to-end, including network. Hallucination scored against ground-truth structured outputs. Human baseline from professional call center agents on the same dataset. ★ = best non-human result. 95% CI, p < 0.001 for all CallingBox vs. self-built comparisons.
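
For context on the word error rate row: the page does not say how WER was computed, but the standard definition is the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. A minimal sketch of that standard metric, not of CallingBox's evaluation pipeline:

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word was dropped
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate(
    "confirm the appointment tomorrow",
    "confirm appointment tomorrow",
)
print(f"{wer:.2%}")
```

In this example one of four reference words is dropped, so the rate is 25%. Note that WER can exceed 100% when the hypothesis contains many inserted words.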

Why CallingBox

Building AI phone calls was hard and expensive. Until now.

Without CallingBox

~$0.12/min
1. Server infra: SIP, WebSockets, Media
2. Telephony: Provider, Numbers, Routing
3. Speech-to-text: Streaming, Chunking, VAD
4. Language model: Prompts, State, Tools
5. Text-to-speech: Voices, Encoding, Streaming
6. Orchestration: Turns, Interrupts, Errors
7. Data extraction: Parsing, Validation, Schema
8. Operations: Logs, Alerts, Recovery
8 layers to build, test, and maintain
Multiple vendor contracts
Weeks of engineering

With CallingBox

$0.05/min
callingbox.calls.create(
    to="+1555...",
    prompt="..."
)

That's it.

cheaper
< 500ms latency
7 lines of code
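
To put the two per-minute prices in concrete terms, here is a small worked calculation. The rates are the ones quoted on this page (the self-built figure is approximate); the monthly volume of 10,000 minutes is a hypothetical number chosen only for illustration.

```python
from decimal import Decimal

# Per-minute rates quoted on this page; the self-built one is approximate.
SELF_BUILT_PER_MIN = Decimal("0.12")
CALLINGBOX_PER_MIN = Decimal("0.05")


def monthly_cost(minutes, rate_per_min):
    """Total cost for a given number of call minutes at a flat rate."""
    return minutes * rate_per_min


minutes = 10_000  # hypothetical monthly call volume

self_built = monthly_cost(minutes, SELF_BUILT_PER_MIN)
callingbox = monthly_cost(minutes, CALLINGBOX_PER_MIN)
savings = self_built - callingbox
savings_pct = savings / self_built * 100

print(f"self-built: ${self_built}, CallingBox: ${callingbox}")
print(f"difference: ${savings} ({savings_pct:.0f}% lower)")
```

At that volume the quoted rates work out to $1,200 versus $500 per month, a difference of about 58%. The comparison covers only per-minute cost and leaves out the engineering time listed above.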

Ship it today

Create an account and make your first AI phone call in minutes. No credit card required.

FAQ

Common questions