Vapi vs. Retell — when does each win?
Vapi wins when you need developer control: custom models, custom TTS / STT, custom webhook tools, custom transport. It is best for teams that want to tune every surface. Retell wins when you want faster-to-launch defaults: it is more opinionated, has cleaner out-of-the-box analytics, and is easier for straightforward intake / scheduling. We choose between them during Day-1 scoping, based on workflow complexity, team bandwidth, and compliance requirements.
What kind of latency is achievable?
Sub-second first response is achievable on the tuned build. Typical breakdown: STT (Deepgram Nova-3) 200–300ms, LLM (Claude Sonnet with caching, or GPT-5-mini) 400–700ms, TTS (ElevenLabs Flash or Cartesia) 100–200ms. Total: ~700–1200ms end-to-end per turn. Above 1.5 seconds, callers describe the agent as 'robotic'; we tune explicitly to stay under that threshold.
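The budget arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not part of any vendor SDK: the stage ranges and the 1.5-second threshold come from the answer above, while the names `STAGES_MS`, `turn_budget`, and `headroom_ms` are hypothetical. Network and telephony overhead is not itemized in the breakdown, so it is left out here and shows up only as headroom.

```python
# Per-stage latency ranges in milliseconds, taken from the breakdown above.
STAGES_MS = {
    "stt": (200, 300),
    "llm": (400, 700),
    "tts": (100, 200),
}

# Above this, callers describe the agent as 'robotic'.
ROBOTIC_THRESHOLD_MS = 1500


def turn_budget(stages):
    """Return (best_case, worst_case) by summing each stage's low and high bound."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


def headroom_ms(stages, threshold_ms=ROBOTIC_THRESHOLD_MS):
    """Worst-case margin left under the threshold for un-itemized overhead."""
    _, high = turn_budget(stages)
    return threshold_ms - high
```

With the figures above, the three stages sum to 700–1200ms, leaving 300ms of worst-case headroom under the 1.5-second threshold.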
Does the HIPAA path work with any TTS / STT vendor?
No. HIPAA requires a business associate agreement (BAA) with every vendor that handles PHI in the call path. BAA-covered TTS: ElevenLabs (enterprise tier), Cartesia (enterprise tier). BAA-covered STT: Deepgram, AssemblyAI. BAA-covered LLM: Anthropic via Bedrock, OpenAI via Azure. We coordinate every BAA in the call path; it adds about $1,500 to the engagement. Transcript encryption and PHI redaction are included.
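One way to keep a non-covered vendor from slipping into a HIPAA build is a pre-deploy check against the list above. The sketch below is an assumption about how such a check might look; the vendor identifiers (`"anthropic-bedrock"` and so on) and the function name are hypothetical labels, not real configuration keys from Vapi or Retell.

```python
# Vendors listed above as BAA-covered, keyed by role in the call path.
# The string identifiers are hypothetical labels for illustration.
BAA_COVERED = {
    "tts": {"elevenlabs-enterprise", "cartesia-enterprise"},
    "stt": {"deepgram", "assemblyai"},
    "llm": {"anthropic-bedrock", "openai-azure"},
}


def uncovered_vendors(call_path):
    """Return the (role, vendor) pairs in call_path that lack a listed BAA."""
    return [
        (role, vendor)
        for role, vendor in call_path.items()
        if vendor not in BAA_COVERED.get(role, set())
    ]
```

An empty list means every vendor in the path is on the covered list; any entry returned (for example, an LLM reached through a non-covered route) would block the HIPAA path until it is swapped or a BAA is in place.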
Can the voice agent handle multiple intents?
Yes. The LLM classifier in the agent loop decides per turn what the caller is asking about (book, reschedule, cancel, billing question, support issue, other) and routes accordingly. We wire explicit escalation triggers for 'other': not every caller fits the happy-path workflows, and a warm transfer is better than a robotic 'I don't understand.'
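The per-turn routing described above can be sketched as a lookup with an escalation default. This is a simplified illustration: the intent labels come from the answer, but the handler names, `parse_intent`, and `route` are hypothetical, and the classifier itself (the LLM call) is assumed to have already produced a text label.

```python
from enum import Enum


class Intent(Enum):
    BOOK = "book"
    RESCHEDULE = "reschedule"
    CANCEL = "cancel"
    BILLING = "billing question"
    SUPPORT = "support issue"
    OTHER = "other"


# Hypothetical handler names; a real build would invoke workflow code here.
ROUTES = {
    Intent.BOOK: "booking_flow",
    Intent.RESCHEDULE: "reschedule_flow",
    Intent.CANCEL: "cancel_flow",
    Intent.BILLING: "billing_flow",
    Intent.SUPPORT: "support_flow",
}
ESCALATION = "warm_transfer"


def parse_intent(label):
    """Map a raw classifier label to an Intent, defaulting to OTHER."""
    try:
        return Intent(label.strip().lower())
    except ValueError:
        return Intent.OTHER


def route(label):
    """Pick the handler for this turn; OTHER and unknown labels escalate."""
    return ROUTES.get(parse_intent(label), ESCALATION)
```

Leaving `Intent.OTHER` out of `ROUTES` is deliberate: both an explicit 'other' and any label the classifier invents fall through to the warm transfer rather than to a dead end.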
What does it cost to run a Vapi agent in production?
Per-minute all-in cost: $0.10–$0.30 depending on model tier (Haiku vs. Sonnet vs. Opus), TTS choice (Cartesia cheaper, ElevenLabs richer), and platform fee. A 5-minute call costs roughly $0.50–$1.50, versus $2–$4 for the same call handled by a human agent at $25/hr. Above 500 conversation-minutes/day the economics justify enterprise pricing negotiations; we map that on handoff.
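The cost comparison is simple enough to check directly. The sketch below uses only the rates quoted above; the function names are hypothetical, and the human figure is wage-only, which is an assumption about how the comparison is made (it ignores overhead, benefits, and idle time).

```python
def call_cost(minutes, rate_low=0.10, rate_high=0.30):
    """Return the (low, high) all-in agent cost in dollars for the given minutes."""
    return round(minutes * rate_low, 2), round(minutes * rate_high, 2)


def human_cost(minutes, hourly_wage=25.0):
    """Wage-only cost in dollars of the same minutes handled by a human agent."""
    return round(minutes * hourly_wage / 60, 2)
```

For a 5-minute call this gives $0.50–$1.50 for the agent and about $2.08 in wages for the human, which sits at the low end of the $2–$4 range; the upper end presumably reflects overhead beyond the hourly wage. At the 500 minutes/day mark, `call_cost(500)` puts daily spend at $50–$150.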
What's the typical engagement shape?
Voice Agent Launch ($4,999 / 2 weeks) is the most common — one production voice workflow (intake, qualification, follow-up, or scheduling), live on your phone number, wired to CRM, with analytics, escalation, and HIPAA path if needed. For multi-workflow or custom transport builds, we scope a custom engagement on Day 1.