Web widget, phone (IVR), WhatsApp, or Telegram: one AI agent handles them all. Bring your own API keys from OpenAI, Anthropic, Google, Groq, Inworld AI, or OpenRouter, or use our platform-managed GPU models from $0.015/min.
Trusted by businesses in India and the US · No credit card required
Channels
One AI agent. Four channels. The same intelligent conversation, whether customers click a widget, call a number, or message on WhatsApp or Telegram.
One line of code. Instant voice on your site.
Embed a voice button into any website or web app. Customers click, speak, and get answers — no app download, no phone number. WebRTC delivers crystal-clear audio in the browser.
<script src="https://widget.scandeer.ai/widget.js" data-api-key="YOUR_KEY" data-agent-id="YOUR_AGENT"></script>
Your AI agent answers every call.
Give tenants virtual phone numbers. Inbound calls route to your AI agent via SIP bridge — no traditional IVR menus, just natural conversation. Powered by Twilio and Telnyx.
The channel your customers already use.
Connect your WhatsApp Business Account. When customers send a message or tap your number, your AI agent responds — text or voice. Powered by Meta Cloud API with full voice call support.
Reach users on the fastest-growing messenger.
Connect a Telegram bot in seconds. Your AI agent handles text and voice messages with the same intelligence — inline replies, voice notes, and seamless human escalation.
Real-time contact timeline, full history, AI briefs, and live escalation queue — the command center for your support team.
Raj Kumar
3 calls
Call in progress · 2m 14s
SupportBot · Escalated
Call history · 3
Today · 2:41 PM · 4m 12s
SupportBot — order enquiry → escalated
Yesterday · 11:05 AM · 2m 53s
SupportBot — shipping update
Mar 6 · 4:22 PM · 1m 38s
SalesBot — product inquiry
How it works
Open the chat panel and tell the AI what your agent should do. It creates the agent, attaches knowledge bases, configures tools, sets up channels — all through natural language. 18 tools, zero forms.
Pick from 1,200+ MCP integrations or 400+ n8n workflows. Enable Web widget, Phone, WhatsApp, or Telegram — each channel routes to the same agent.
Drop a single embed into your website. Or buy a phone number. Or link your WhatsApp Business account. The voice pipeline targets a response latency under 800ms.
Multilingual
17 languages across 3 model tiers. Purpose-built Indic STT/TTS models deliver native-quality Hindi, Telugu, Tamil, and Bengali accents — not robotic translations. Every Indian language gets a dedicated model pipeline, not generic Whisper fallbacks.
Indian languages use dedicated Indic models — not English models forced into Hindi. Native accent, natural prosody, regional pronunciation.
BYOK providers (OpenAI, Anthropic, Google, Groq, Inworld AI, OpenRouter) use base compute cost for all tiers. Mix providers for best quality per language.
Platform
Every layer engineered to minimise latency, maximise accuracy, and eliminate SaaS dependency.
Parakeet TDT + Llama 3.3 70B + CosyVoice2 — fully self-hosted, fully streaming. VAD barge-in under 200ms. No SaaS latency taxes.
No keypad menus. Caller describes what they need — AI routes to Sales, Support, or any department, or handles it directly. PSTN/SIP bridge turns any phone number into an AI-powered room.
Connect your WhatsApp Business Account. Text and voice messages handled by the same AI agent. Dominant channel in India, SE Asia, LATAM.
Create automations in n8n, assign them per agent, and let VOX extract parameters from conversation to fire them — CRM, calendar, ticketing, payments — mid-call, automatically.
Tag documents by agent or department. Contextual Retrieval (reported to reduce failed retrievals by 49% versus conventional RAG) injects only relevant chunks per query, so answers are grounded strictly in your content.
Isolated agents, API keys, knowledge bases, analytics and billing per tenant. Team management with roles (admin/agent/viewer). Serve hundreds on one deployment.
~$0.015/min platform-managed GPU vs $0.07–0.10/min for Retell or Vapi. No per-minute SaaS fees. Self-hosted GPU workers with real-time health monitoring.
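To make the per-minute comparison concrete, here is a small arithmetic sketch. The rates are the ones quoted above; the monthly call volume is a made-up figure for illustration, not a benchmark.

```python
# Per-minute rates quoted above (USD).
PLATFORM_RATE = 0.015
COMPARISON_RATES = (0.07, 0.10)

def monthly_cost(minutes: int, rate_per_min: float) -> float:
    """Cost of a month of calls at a flat per-minute rate."""
    return round(minutes * rate_per_min, 2)

minutes = 10_000  # hypothetical monthly call volume
platform = monthly_cost(minutes, PLATFORM_RATE)
low, high = (monthly_cost(minutes, r) for r in COMPARISON_RATES)

print(f"Platform-managed GPU: ${platform:,.2f}")
print(f"At $0.07-0.10/min:    ${low:,.2f} to ${high:,.2f}")
```

At this assumed volume the platform-managed rate works out to $150 per month against $700 to $1,000.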
Use OpenAI, Anthropic, Google, Groq, Inworld AI (200+ models), or OpenRouter (400+ models) — or mix STT/LLM/TTS across providers. Your API key, your bill. No markup.
Platform-managed GPU models for lowest cost, or BYOK from OpenAI, Groq, Google, Anthropic, Inworld AI, and OpenRouter. Pick STT, LLM, and TTS independently per agent.
Contact-centric history panel — like WhatsApp Web for voice calls. Live active calls with elapsed timers, inline transcripts, copy-to-clipboard, full-text search. Real-time via WebSocket.
AI detects when to hand off. The call queues to the right team in real time with full transcript context and a ringback tone. One click lets a human agent take over, with zero hold music and zero cold start.
Tell the AI what you need in plain English. It creates agents, knowledge bases, toolboxes, tags, channels, and groups — all through natural language. No forms, no code, no guesswork.
Assign color-coded tags to documents, workflows, and tools. Per-turn filtering ensures agents only access content relevant to each conversation — no leaking across departments.
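One way to picture per-turn tag filtering is the minimal sketch below. The `Resource` type, the tag names, and the file names are hypothetical and do not reflect the platform's actual data model.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Resource:
    name: str
    kind: str          # "document", "workflow", or "tool"
    tags: frozenset    # tags assigned to this resource

def allowed_for_turn(resources, turn_tags):
    """Keep only resources that share at least one tag with the current turn."""
    wanted = frozenset(turn_tags)
    return [r for r in resources if r.tags & wanted]

# Hypothetical catalog spanning two departments.
catalog = [
    Resource("refund-policy.pdf", "document", frozenset({"support"})),
    Resource("pricing-sheet.pdf", "document", frozenset({"sales"})),
    Resource("create_ticket", "tool", frozenset({"support"})),
]

print([r.name for r in allowed_for_turn(catalog, {"support"})])
# ['refund-policy.pdf', 'create_ticket']
```

A support conversation in this sketch never sees the sales pricing sheet, which is the isolation the tags are meant to provide.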
Multi-model pool with heuristic routing. Fast models handle simple queries, powerful models tackle complex ones. Tier-based fallback, health tracking, and tag-driven routing hints — automatic cost optimization.
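As an illustration of heuristic routing with tier fallback, consider the sketch below. The word-count threshold, the hint values, and the model names are assumptions chosen for the example; the platform's real heuristics are not described here.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Model:
    name: str
    tier: int            # 0 = fast/cheap, 1 = larger/more capable
    healthy: bool = True

def wanted_tier(query: str, hint: Optional[str] = None) -> int:
    """Crude heuristic: a tag hint wins, otherwise long queries go to tier 1."""
    if hint == "complex":
        return 1
    if hint == "simple":
        return 0
    return 1 if len(query.split()) > 20 else 0

def route(query: str, pool: List[Model], hint: Optional[str] = None) -> Model:
    """Pick a healthy model at the wanted tier, else the nearest healthy tier."""
    tier = wanted_tier(query, hint)
    healthy = [m for m in pool if m.healthy]
    if not healthy:
        raise RuntimeError("no healthy model available")
    return min(healthy, key=lambda m: abs(m.tier - tier))

pool = [Model("fast-small", tier=0), Model("large", tier=1)]
print(route("What are your opening hours?", pool).name)    # fast-small
print(route("Refund status?", pool, hint="complex").name)  # large
pool[1].healthy = False                                    # simulate an outage
print(route("Refund status?", pool, hint="complex").name)  # fast-small
```

The last call shows the fallback: when the preferred tier is unhealthy, the request degrades to the nearest available tier instead of failing.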
Visual inference trace shows exactly what happened: STT timing, LLM reasoning, TTS synthesis, tool calls — all in a waterfall timeline. Debug latency, understand routing decisions, and optimize per-turn performance.
Voice pipeline — open-source · self-hosted · GPU-accelerated
Total target latency: <800ms · Platform-managed GPU workers · RTX 4090 / A100
AI Providers
Every voice call flows through three AI stages — Speech-to-Text, Language Model, and Text-to-Speech. Choose a provider for each slot independently, or let our platform handle it all.
Caller speaks → text
Platform Managed
Parakeet TDT 0.6B
English — 150ms, streaming
IndicWhisper
Hindi, Bengali, Urdu
Vakyansh
Telugu, Tamil, Malayalam, Kannada
Bring Your Own Key
Groq Whisper v3 Turbo
Ultra-fast, free tier
OpenAI Whisper
Industry standard
Google Chirp
125+ languages
Inworld AI STT
200+ models, smart routing
Understands → reasons → responds
Platform Managed
Llama 3.3 70B INT8
Default — 300ms first token
Qwen 2.5 72B
Best for CJK languages
Bring Your Own Key
GPT-4o / GPT-4o-mini
OpenAI — versatile
Claude Sonnet / Opus
Anthropic — best reasoning
Gemini Pro / Flash
Google — multimodal
Groq Llama / Mixtral
LPU — fastest inference
OpenRouter 400+ models
Any model, one API
Text → natural voice
Platform Managed
CosyVoice2 0.5B
EN, ZH, JA — 150ms streaming
IndicF5
Hindi, Telugu, Tamil + 7 Indic
IndicTTS (IIT Madras)
Dravidian languages
XTTS v2
European languages
MeloTTS
Fast multilingual fallback
Bring Your Own Key
OpenAI TTS-1 / TTS-1-HD
6 voices, natural
Google Chirp3 HD
8 voices per language
Groq Orpheus
English + Arabic, expressive
Inworld AI TTS
Low latency, multi-voice
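The figures quoted for the platform-managed defaults can be checked against the sub-800ms target with simple arithmetic. This sketch only sums the three model-stage numbers listed above; it assumes the remaining headroom has to absorb everything else (network transport, voice activity detection, tool calls), which the page does not quantify.

```python
TARGET_MS = 800  # total target latency quoted for the voice pipeline

# Per-stage figures quoted above for the platform-managed defaults.
stage_ms = {
    "stt (Parakeet TDT 0.6B)": 150,
    "llm first token (Llama 3.3 70B INT8)": 300,
    "tts (CosyVoice2 0.5B)": 150,
}

total_ms = sum(stage_ms.values())
headroom_ms = TARGET_MS - total_ms

print(f"model stages: {total_ms} ms, headroom: {headroom_ms} ms")
# model stages: 600 ms, headroom: 200 ms
```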
Bring Your Own Key
Use our platform-managed GPU models at $0.015/min — or bring your own API keys from OpenAI, Anthropic, Google, Groq, Inworld AI, or OpenRouter. Mix and match STT, LLM, and TTS from different providers. Pay only for what you use.
VOX Platform
Self-hosted open-source models. Lowest cost.
Groq
LPU inference. Ultra-fast Llama & Whisper.
Inworld AI
Top TTS + STT + LLM router. One key.
OpenRouter
400+ LLM models. OpenAI-compatible.
OpenAI
GPT-4o, Whisper, TTS-1. Industry standard.
Google
Gemini Pro & Flash. Multimodal.
Anthropic
Claude Opus, Sonnet, Haiku. Best reasoning.
Custom Mix
Mix any STT + LLM + TTS provider.
Add your API key
Paste your OpenAI, Anthropic, Google, Groq, Inworld AI, or OpenRouter API key in agent settings. Keys are encrypted and never leave your tenant.
Pick your models
Choose STT, LLM, and TTS independently — or use a preset card that auto-configures the best combination per provider.
Pay per use, at cost
Your API key, your bill. No VOX markup on BYOK usage. Platform fee covers orchestration, analytics, and channels only.
Custom Mix — full flexibility
Combine Groq Whisper for STT, Claude for reasoning, and OpenAI TTS for voice — or any other combination. Each model slot is independently configurable.
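A rough sketch of what independently configurable slots could look like as data follows. The class names, field names, and model identifier strings are illustrative placeholders, not the platform's configuration schema or exact provider model IDs.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Slot:
    provider: str
    model: str  # placeholder label, not an exact provider model ID

@dataclass(frozen=True)
class AgentPipeline:
    stt: Slot
    llm: Slot
    tts: Slot

    def providers(self) -> set:
        """Distinct providers this agent needs credentials for."""
        return {self.stt.provider, self.llm.provider, self.tts.provider}

# The combination described above: Groq for STT, Claude for reasoning, OpenAI for voice.
custom_mix = AgentPipeline(
    stt=Slot("groq", "whisper-v3-turbo"),
    llm=Slot("anthropic", "claude-sonnet"),
    tts=Slot("openai", "tts-1"),
)

# Swapping one slot leaves the other two untouched.
indic_voice = replace(custom_mix, tts=Slot("platform", "indicf5"))

print(sorted(custom_mix.providers()))   # ['anthropic', 'groq', 'openai']
print(sorted(indic_voice.providers()))  # ['anthropic', 'groq', 'platform']
```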
BYOK costs are estimates based on typical conversation length (~30s user speech, ~45s agent speech per minute). Actual cost depends on token usage and provider pricing.
Workflow Intelligence
Book appointments, process orders, look up accounts, create tickets — your AI agent executes real business actions mid-conversation using n8n workflows and MCP tools. Fully secured within your organization.
How it works — from voice to action
Customer speaks
"I'd like to book an appointment for Friday afternoon"
Agent understands intent
Extracts: action=book, date=Friday, time=afternoon. Asks follow-ups if needed.
Workflow executes
n8n webhook fires → checks calendar → books slot → sends confirmation SMS
Agent confirms
"Done! You're booked for Friday at 2 PM. You'll get a confirmation text shortly."
Every tenant gets a dedicated, sandboxed n8n instance running inside your infrastructure. Build multi-step automations visually — no code, no external dependencies, no data leaving your network.
400+
Pre-built integrations
Per-agent
Workflow assignment
AES-256
Credential encryption
0
Data leaves your infra
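The booking walkthrough above (extract parameters, ask follow-ups if something is missing, then fire an n8n webhook) could be sketched roughly as follows. The webhook URL, the required field names, and the payload shape are assumptions for illustration; the request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical webhook URL for an n8n workflow inside your own network.
WEBHOOK_URL = "https://n8n.example.com/webhook/book-appointment"

REQUIRED_FIELDS = ("action", "date", "time")

def missing_fields(params: dict) -> list:
    """Fields the agent still needs to ask about before firing the workflow."""
    return [k for k in REQUIRED_FIELDS if not params.get(k)]

def build_webhook_request(params: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST carrying the extracted parameters."""
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Parameters extracted from: "I'd like to book an appointment for Friday afternoon"
extracted = {"action": "book", "date": "Friday", "time": "afternoon"}

if not missing_fields(extracted):
    req = build_webhook_request(extracted)
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would send it; omitted in this sketch.
```

If `missing_fields` returns anything, the agent would ask a follow-up question instead of calling the workflow.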
Model Context Protocol gives your agent instant access to 1,200+ tool integrations. Sub-200ms execution. Self-hostable. Call any CRM, database, API, or payment processor natively — inside a live voice conversation.
1,200+
Tool integrations
<200ms
Execution latency
Self-hosted
MCP servers
Open
Standard protocol
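For readers unfamiliar with the protocol, MCP tool invocations are JSON-RPC 2.0 messages using the `tools/call` method. The sketch below only builds such a message; the tool name `lookup_order` and its arguments are hypothetical, and transport (stdio or HTTP) is left out.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call

def mcp_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)

# Hypothetical tool exposed by a self-hosted MCP server.
payload = mcp_tool_call("lookup_order", {"order_id": "A-1042"})
print(payload)
```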
What your agents can do
Agent asks for date/time preferences, checks availability via n8n → Google Calendar, books the slot, sends SMS confirmation.
Customer orders by voice. Agent extracts items, confirms details, triggers n8n → POS/ERP workflow, sends order confirmation.
Agent verifies identity, queries CRM/database via MCP, reads back balance, transaction history, or policy details — securely.
Agent captures the issue, creates a Zendesk/Freshdesk ticket with full transcript via n8n, emails the customer a ticket number.
Agent collects payment intent, triggers Stripe/Razorpay workflow via n8n, confirms payment status back to the customer.
Customer asks about delivery. Agent queries logistics API via MCP, provides real-time tracking status and ETA.
Integrations — connect VOX to any backend
1,200+
MCP integrations
The open standard for AI tool calls. Sub-200ms. Self-hostable. VOX calls any MCP server natively — CRM, ticketing, calendar, database, payments — inside a live voice conversation.
400+
n8n workflow templates
Complex multi-step workflows with visual editing and 400+ pre-built integrations. VOX calls your n8n webhooks with structured payloads extracted from conversation — you build the logic, VOX triggers it.
Plus direct REST API calls, PostgreSQL, MongoDB, Qdrant vector search, Typesense geo+structured search — VOX connects to anything.
FAQ
Web widget, phone, WhatsApp, or Telegram: all channels, 17 languages. Platform GPU models from $0.015/min or BYOK with 7+ AI providers including OpenAI, Claude, Gemini, Groq, Inworld AI, and OpenRouter. No SaaS lock-in.