Self-hosted · BYOK · Multi-Tenant · Production-Ready

AI voice agents for every customer channel.

Web widget, phone (IVR), WhatsApp, or Telegram: one AI agent handles them all. Use our platform-managed GPU models from $0.015/min, or bring your own API keys (BYOK) from 7+ AI providers, including OpenAI, Anthropic, Google, Groq, Inworld AI, and OpenRouter.

Trusted by businesses in India and the US · No credit card required

4 · Channels: Web, Phone, WhatsApp, Telegram
17 · Supported languages
<800ms · End-to-end latency
~$0.015 · Cost per minute (self-hosted)
7+ · AI Providers (BYOK)
15 · Platform features

Channels

Meet customers wherever they are.

One AI agent. Four channels. The same intelligent conversation — whether they click a widget, call a number, or message on WhatsApp or Telegram.

Live

Web Widget

One line of code. Instant voice on your site.

Embed a voice button into any website or web app. Customers click, speak, and get answers — no app download, no phone number. WebRTC delivers crystal-clear audio in the browser.

<script src="https://widget.scandeer.ai/widget.js"
  data-api-key="YOUR_KEY"
  data-agent-id="YOUR_AGENT">
</script>
Live

Phone / AI IVR

Your AI agent answers every call.

Give tenants virtual phone numbers. Inbound calls route to your AI agent via SIP bridge — no traditional IVR menus, just natural conversation. Powered by Twilio and Telnyx.

Twilio · Telnyx
Live

WhatsApp

The channel your customers already use.

Connect your WhatsApp Business Account. When customers send a message or tap your number, your AI agent responds — text or voice. Powered by Meta Cloud API with full voice call support.

Meta Cloud API · Voice + Text
Live

Telegram

Reach users on the fastest-growing messenger.

Connect a Telegram bot in seconds. Your AI agent handles text and voice messages with the same intelligence — inline replies, voice notes, and seamless human escalation.

Bot API · Voice Notes
All channels converge to the same LiveKit room → same agent worker — one codebase, consistent experience everywhere
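
The convergence described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual API: the `InboundEvent` shape and the room-naming scheme are assumptions made for the example. Each channel adapter would normalize its incoming event into one structure, and a single function maps it to the room the agent worker joins.

```python
from dataclasses import dataclass

# The four channels listed on this page.
CHANNELS = {"web", "phone", "whatsapp", "telegram"}


@dataclass(frozen=True)
class InboundEvent:
    """Channel-agnostic view of an incoming conversation (hypothetical shape)."""
    channel: str      # one of CHANNELS
    tenant_id: str    # which business the conversation belongs to
    agent_id: str     # which agent should answer
    contact_id: str   # caller number, chat id, or browser session id


def room_name(event: InboundEvent) -> str:
    """Map an event to a room name; the naming scheme is an assumption."""
    if event.channel not in CHANNELS:
        raise ValueError(f"unknown channel: {event.channel}")
    # The channel is not part of the name, so every channel lands in the
    # same kind of room and is served by the same agent worker logic.
    return f"{event.tenant_id}:{event.agent_id}:{event.contact_id}"
```

Because the channel is validated but not encoded in the room name, nothing downstream of `room_name` needs channel-specific branches.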
Dashboard

Your entire call operation. One screen.

Real-time contact timeline, full history, AI briefs, and live escalation queue — the command center for your support team.

Live product preview · app.scandeer.ai/conversations

Conversations

Raj Kumar · 2m · SupportBot · Live
Sarah A. · 1h · SalesBot
Mike J. · 3h · SupportBot
Priya L. · 1d · SalesBot

Raj Kumar · 3 calls

Active call · Call in progress · 2m 14s · SupportBot · Escalated

Call history · 3

Today · 2:41 PM · 4m 12s · SupportBot: order enquiry → escalated · Escalated

Customer: Hi, I placed an order yesterday but haven't received a confirmation email.
Agent: I'd be happy to help with that! Could you share the email address used for the order?
Agent: I found your order #ORD-2891. The confirmation was sent; please check your spam folder. Is there anything else I can help with?
Customer: Actually I'd like to speak to a human agent please.
Agent: Of course! Connecting you to a support specialist now. Please hold for a moment.

Yesterday · 11:05 AM · 2m 53s · SupportBot: shipping update · Completed

Mar 6 · 4:22 PM · 1m 38s · SalesBot: product inquiry · Ended

How it works

From idea to live agent in minutes.

01

Describe your business to the AI Configurator

Open the chat panel and tell the AI what your agent should do. It creates the agent, attaches knowledge bases, configures tools, sets up channels — all through natural language. 18 tools, zero forms.

02

Connect tools and channels

Pick from 1,200+ MCP integrations or 400+ n8n workflows. Enable Web widget, Phone, WhatsApp, or Telegram — each channel routes to the same agent.

03

Deploy with one line of code

Drop a single embed into your website. Or buy a phone number. Or link your WhatsApp Business account. Your agent responds in under 800ms.

Multilingual

Your agent speaks their language.

17 languages across 3 model tiers. Purpose-built Indic STT/TTS models deliver native-quality Hindi, Telugu, Tamil, and Bengali accents — not robotic translations. Every Indian language gets a dedicated model pipeline, not generic Whisper fallbacks.

STT: Parakeet (EN) · IndicWhisper (Hindi, Bengali, Urdu) · Vakyansh (Telugu, Tamil, Malayalam) · Whisper large-v3 (all others)
LLM: Llama 3.3 70B (most) · Qwen 2.5 72B (CJK) · Or BYOK: GPT-4o, Claude, Gemini, with system prompt language injection
TTS: CosyVoice2 (EN/ZH/JA) · IndicTTS IIT Madras (Dravidian) · IndicF5 (Indic multilingual) · XTTS v2 (European) · MeloTTS (fast multilingual) · Or BYOK: OpenAI TTS
Tier A — Base pricing
🇺🇸 English
🇨🇳 Chinese
🇮🇩 Indonesian
Tier B — +20%
🇮🇳 Hindi
🇧🇩 Bengali
🇵🇰 Urdu
🇫🇷 French
🇩🇪 German
🇪🇸 Spanish
🇧🇷 Portuguese
🇷🇺 Russian
Tier C — +40% (specialist models)
🇮🇳 Telugu
🇮🇳 Tamil
🇮🇳 Malayalam
🇸🇦 Arabic
🇯🇵 Japanese
🇰🇷 Korean

Indian languages use dedicated Indic models — not English models forced into Hindi. Native accent, natural prosody, regional pronunciation.

BYOK providers (OpenAI, Anthropic, Google, Groq, Inworld AI, OpenRouter) use base compute cost for all tiers. Mix providers for best quality per language.
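
The tier surcharges above reduce to a small lookup. The sketch below assumes the surcharge applies multiplicatively to a per-minute base rate, and that a BYOK agent is billed at the base rate regardless of tier. Both readings are inferred from this page, and the `per_minute_rate` helper is hypothetical.

```python
# Surcharges as listed above: Tier A is base price, Tier B +20%, Tier C +40%.
TIER_MULTIPLIER = {"A": 1.00, "B": 1.20, "C": 1.40}

# Language-to-tier assignment copied from the tier lists above.
LANGUAGE_TIER = {
    "English": "A", "Chinese": "A", "Indonesian": "A",
    "Hindi": "B", "Bengali": "B", "Urdu": "B", "French": "B",
    "German": "B", "Spanish": "B", "Portuguese": "B", "Russian": "B",
    "Telugu": "C", "Tamil": "C", "Malayalam": "C",
    "Arabic": "C", "Japanese": "C", "Korean": "C",
}


def per_minute_rate(language: str, base_rate: float = 0.015, byok: bool = False) -> float:
    """Return the per-minute rate for a language (illustrative only)."""
    if byok:
        # Assumption: BYOK uses the base rate for every tier.
        return base_rate
    multiplier = TIER_MULTIPLIER[LANGUAGE_TIER[language]]
    return round(base_rate * multiplier, 6)
```

Under these assumptions, a Hindi call on platform-managed models is billed at 1.2 times the base rate and a Telugu call at 1.4 times.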

Platform

Built for production at scale.

Every layer engineered to minimise latency, maximise accuracy, and eliminate SaaS dependency.

Voice Pipeline

Sub-800ms. Indistinguishable from human.

Parakeet TDT + Llama 3.3 70B + CosyVoice2 — fully self-hosted, fully streaming. VAD barge-in under 200ms. No SaaS latency taxes.

Learn more →

Smart IVR

Voice IVR that understands intent.

No keypad menus. Caller describes what they need — AI routes to Sales, Support, or any department, or handles it directly. PSTN/SIP bridge turns any phone number into an AI-powered room.

Learn more →

WhatsApp

The world's #1 messaging channel.

Connect your WhatsApp Business Account. Text and voice messages handled by the same AI agent. Dominant channel in India, SE Asia, LATAM.

Learn more →

Workflows

Build a workflow. Assign it to an agent. Done.

Create automations in n8n, assign them per agent, and let VOX extract parameters from conversation to fire them — CRM, calendar, ticketing, payments — mid-call, automatically.

Learn more →

Knowledge Base

Your agent answers only from what you teach it.

Tag documents by agent or department. Contextual Retrieval (reported to cut failed retrievals by 49% versus naive RAG) injects only relevant chunks per query, so answers are grounded strictly in your content rather than in outside knowledge.

Learn more →

Multi-Tenant

One platform. Unlimited businesses.

Isolated agents, API keys, knowledge bases, analytics and billing per tenant. Team management with roles (admin/agent/viewer). Serve hundreds on one deployment.

Learn more →

Self-Hosted

Your data. Your infra. Your margin.

~$0.015/min on platform-managed GPU vs $0.05–0.10/min for Retell or Vapi. No per-minute SaaS fees. Self-hosted GPU workers with real-time health monitoring.

Learn more →

BYOK

Bring your own AI provider. Pay at cost.

Use OpenAI, Anthropic, Google, Groq, Inworld AI (200+ models), or OpenRouter (400+ models) — or mix STT/LLM/TTS across providers. Your API key, your bill. No markup.

Learn more →

AI Providers

12+ STT, LLM, and TTS models. Mix and match.

Platform-managed GPU models for lowest cost, or BYOK from OpenAI, Groq, Google, Anthropic, Inworld AI, and OpenRouter. Pick STT, LLM, and TTS independently per agent.

Learn more →

Conversations

Every caller. Every call. Full context.

Contact-centric history panel — like WhatsApp Web for voice calls. Live active calls with elapsed timers, inline transcripts, copy-to-clipboard, full-text search. Real-time via WebSocket.

Learn more →

Live Escalation

AI tries first. Your team steps in when it matters.

AI detects when to hand off. Call queues to the right team in real time with full transcript context and ringback tone. One click for your agent to take over — zero hold music, zero cold start.

Learn more →

AI Configurator

Describe your business. We build the agent.

Tell the AI what you need in plain English. It creates agents, knowledge bases, toolboxes, tags, channels, and groups — all through natural language. No forms, no code, no guesswork.

Learn more →

Content Tags

Tag everything. Route intelligently.

Assign color-coded tags to documents, workflows, and tools. Per-turn filtering ensures agents only access content relevant to each conversation — no leaking across departments.

Learn more →
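
As a rough sketch of per-turn tag filtering, assume each document, workflow, or tool carries a set of tags and each turn is labeled with the tags relevant to it. The `Resource` type and `visible_resources` function are hypothetical names for illustration, not part of the product's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A document, workflow, or tool with its assigned tags (hypothetical)."""
    name: str
    tags: frozenset


def visible_resources(resources, turn_tags):
    """Keep only resources that share at least one tag with the current turn."""
    wanted = set(turn_tags)
    return [r for r in resources if r.tags & wanted]


catalog = [
    Resource("refund-policy.pdf", frozenset({"support"})),
    Resource("price-list.pdf", frozenset({"sales"})),
    Resource("create-ticket", frozenset({"support", "billing"})),
]
support_view = visible_resources(catalog, {"support"})
```

Here `support_view` contains the refund policy and the ticket workflow but not the sales price list, which is the isolation between departments the card describes.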

Smart Router

The right model for every turn.

Multi-model pool with heuristic routing. Fast models handle simple queries, powerful models tackle complex ones. Tier-based fallback, health tracking, and tag-driven routing hints — automatic cost optimization.

Learn more →
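
One way to picture heuristic routing with tier-based fallback is the sketch below. The complexity heuristic (word count and number of questions) and the two-tier pool are assumptions chosen for illustration; the page does not describe the router's real rules.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    tier: int             # 0 = fast and cheap, 1 = more capable
    healthy: bool = True  # would be updated by health tracking


def looks_complex(text: str) -> bool:
    """Toy heuristic: long or multi-question turns count as complex."""
    return len(text.split()) > 25 or text.count("?") > 1


def pick_model(pool, text):
    """Choose the healthy model closest to the tier the turn seems to need."""
    wanted = 1 if looks_complex(text) else 0
    healthy = [m for m in pool if m.healthy]
    if not healthy:
        raise RuntimeError("no healthy models in pool")
    # Nearest tier wins; on a tie the cheaper tier is preferred.
    return min(healthy, key=lambda m: (abs(m.tier - wanted), m.tier))
```

If the preferred tier has no healthy model, the `min` over tier distance falls back to the nearest available tier instead of failing the turn.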

Observability

See every millisecond of every call.

Visual inference trace shows exactly what happened: STT timing, LLM reasoning, TTS synthesis, tool calls — all in a waterfall timeline. Debug latency, understand routing decisions, and optimize per-turn performance.

Learn more →

Voice pipeline — open-source · self-hosted · GPU-accelerated

LiveKit WebRTC · Real-time audio transport
Silero VAD · Barge-in · 85ms
Parakeet TDT 0.6B · STT · 150ms
Llama 3.3 70B · LLM · 300ms first token
CosyVoice2 0.5B · TTS · 150ms

Total target latency: <800ms · Platform-managed GPU workers · RTX 4090 / A100
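
The per-stage figures above can be checked against the 800ms target with simple arithmetic. The sketch assumes the four stages run back to back and that their latencies simply add, which is a simplification because streaming stages overlap in practice.

```python
# Per-stage figures as quoted above, in milliseconds.
STAGE_MS = {
    "Silero VAD (barge-in)": 85,
    "Parakeet TDT 0.6B (STT)": 150,
    "Llama 3.3 70B (first token)": 300,
    "CosyVoice2 0.5B (TTS)": 150,
}
TARGET_MS = 800


def latency_budget(stages, target_ms):
    """Return (sum of stage latencies, remaining headroom) in milliseconds."""
    total = sum(stages.values())
    return total, target_ms - total


total_ms, headroom_ms = latency_budget(STAGE_MS, TARGET_MS)
print(f"stages: {total_ms} ms, headroom: {headroom_ms} ms")
```

The quoted stages sum to 685 ms, which leaves 115 ms of the 800 ms target for network transport and anything else not listed.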

AI Providers

Pick your models. We wire the pipeline.

Every voice call flows through three AI stages — Speech-to-Text, Language Model, and Text-to-Speech. Choose a provider for each slot independently, or let our platform handle it all.

Platform Managed
Bring Your Own Key

Speech-to-Text

Caller speaks → text

Platform Managed

Parakeet TDT 0.6B

English — 150ms, streaming

IndicWhisper

Hindi, Bengali, Urdu

Vakyansh

Telugu, Tamil, Malayalam, Kannada

Bring Your Own Key

Groq Whisper v3 Turbo

Ultra-fast, free tier

OpenAI Whisper

Industry standard

Google Chirp

125+ languages

Inworld AI STT

200+ models, smart routing

Language Model

Understands → reasons → responds

Platform Managed

Llama 3.3 70B INT8

Default — 300ms first token

Qwen 2.5 72B

Best for CJK languages

Bring Your Own Key

GPT-4o / GPT-4o-mini

OpenAI — versatile

Claude Sonnet / Opus

Anthropic — best reasoning

Gemini Pro / Flash

Google — multimodal

Groq Llama / Mixtral

LPU — fastest inference

OpenRouter 400+ models

Any model, one API

Text-to-Speech

Text → natural voice

Platform Managed

CosyVoice2 0.5B

EN, ZH, JA — 150ms streaming

IndicF5

Hindi, Telugu, Tamil + 7 Indic

IndicTTS (IIT Madras)

Dravidian languages

XTTS v2

European languages

MeloTTS

Fast multilingual fallback

Bring Your Own Key

OpenAI TTS-1 / TTS-1-HD

6 voices, natural

Google Chirp3 HD

8 voices per language

Groq Orpheus

English + Arabic, expressive

Inworld AI TTS

Low latency, multi-voice

Mix freely — e.g. Groq STT + Claude LLM + OpenAI TTS. Each slot is independent per agent.
See all providers and models →
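
A per-agent pipeline with three independent slots might be modeled as below. This is a sketch under assumptions: the `Slot` and `AgentPipeline` types are hypothetical, and the model identifier strings are placeholders, not the exact names any provider's API expects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    provider: str  # e.g. "platform", "groq", "anthropic", "openai"
    model: str     # placeholder model label, not a verified API identifier
    byok: bool     # True when the tenant's own API key is used


@dataclass(frozen=True)
class AgentPipeline:
    stt: Slot
    llm: Slot
    tts: Slot

    def byok_providers(self) -> set:
        """Providers for which the tenant must supply an API key."""
        return {s.provider for s in (self.stt, self.llm, self.tts) if s.byok}


# The mix named above: Groq STT + Claude LLM + OpenAI TTS.
custom_mix = AgentPipeline(
    stt=Slot("groq", "whisper-v3-turbo", byok=True),
    llm=Slot("anthropic", "claude-sonnet", byok=True),
    tts=Slot("openai", "tts-1", byok=True),
)
```

Swapping one slot for a platform-managed model (for example `Slot("platform", "cosyvoice2", byok=False)`) changes only that field, which is the independence the page describes.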

Bring Your Own Key

Your models. Your choice.

Use our platform-managed GPU models at $0.015/min — or bring your own API keys from OpenAI, Anthropic, Google, Groq, Inworld AI, or OpenRouter. Mix and match STT, LLM, and TTS from different providers. Pay only for what you use.

VOX Platform

Self-hosted open-source models. Lowest cost.

Groq

LPU inference. Ultra-fast Llama & Whisper.

Inworld AI

Top TTS + STT + LLM router. One key.

OpenRouter

400+ LLM models. OpenAI-compatible.

OpenAI

GPT-4o, Whisper, TTS-1. Industry standard.

Google

Gemini Pro & Flash. Multimodal.

Anthropic

Claude Opus, Sonnet, Haiku. Best reasoning.

Custom Mix

Mix any STT + LLM + TTS provider.

How BYOK works

1

Add your API key

Paste your OpenAI, Anthropic, Google, Groq, Inworld AI, or OpenRouter API key in agent settings. Keys are encrypted and never leave your tenant.

2

Pick your models

Choose STT, LLM, and TTS independently — or use a preset card that auto-configures the best combination per provider.

3

Pay per use, at cost

Your API key, your bill. No VOX markup on BYOK usage. Platform fee covers orchestration, analytics, and channels only.

Custom Mix — full flexibility

Combine Groq Whisper for STT, Claude for reasoning, and OpenAI TTS for voice — or any other combination. Each model slot is independently configurable.

Cost comparison per minute

Provider | Cost/min | Note
VOX Platform (self-hosted) | ~$0.015/min | Open-source models on your GPU
Groq (BYOK) | ~$0.02/min | LPU-accelerated, pay Groq directly
Inworld AI (BYOK) | ~$0.03/min | 200+ models, smart routing
OpenRouter (BYOK) | ~$0.03/min | 400+ models, OpenAI-compatible
OpenAI (BYOK) | ~$0.04/min | GPT-4o-mini + Whisper + TTS-1
Anthropic + OpenAI (BYOK) | ~$0.05/min | Claude reasoning + OpenAI STT/TTS
Retell AI | $0.07/min | SaaS competitor
Vapi | $0.05–0.10/min | SaaS competitor

BYOK costs are estimates based on typical conversation length (~30s user speech, ~45s agent speech per minute). Actual cost depends on token usage and provider pricing.
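
To see what the per-minute figures mean at volume, the sketch below multiplies them by a monthly minute count. The rates are the approximate values from the comparison above (Vapi is left out because it is quoted as a range), and the 10,000-minute volume is an arbitrary example.

```python
# Approximate per-minute rates from the comparison above, in USD.
COST_PER_MIN = {
    "VOX Platform (self-hosted)": 0.015,
    "Groq (BYOK)": 0.02,
    "Inworld AI (BYOK)": 0.03,
    "OpenRouter (BYOK)": 0.03,
    "OpenAI (BYOK)": 0.04,
    "Anthropic + OpenAI (BYOK)": 0.05,
    "Retell AI": 0.07,
}


def monthly_cost(option: str, minutes: int) -> float:
    """Estimated monthly spend in USD for a given number of call minutes."""
    return round(COST_PER_MIN[option] * minutes, 2)


MINUTES = 10_000  # example volume, not a figure from this page
for option in COST_PER_MIN:
    print(f"{option}: ${monthly_cost(option, MINUTES):,.2f}")
```

At that example volume the listed rates work out to $150 on the self-hosted platform and $700 on Retell AI, with the usual caveat that the BYOK rows are estimates.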

Workflow Intelligence

Your agent doesn't just talk. It acts.

Book appointments, process orders, look up accounts, create tickets — your AI agent executes real business actions mid-conversation using n8n workflows and MCP tools. Fully secured within your organization.

How it works — from voice to action

1

Customer speaks

"I'd like to book an appointment for Friday afternoon"

2

Agent understands intent

Extracts: action=book, date=Friday, time=afternoon. Asks follow-ups if needed.

3

Workflow executes

n8n webhook fires → checks calendar → books slot → sends confirmation SMS

4

Agent confirms

"Done! You're booked for Friday at 2 PM. You'll get a confirmation text shortly."

n8n — Your private workflow engine

Every tenant gets a dedicated, sandboxed n8n instance running inside your infrastructure. Build multi-step automations visually — no code, no external dependencies, no data leaving your network.

400+ · Pre-built integrations
Per-agent · Workflow assignment
AES-256 · Credential encryption
0 · Data leaves your infra

MCP — Direct tool execution

Model Context Protocol gives your agent instant access to 1,200+ tool integrations. Sub-200ms execution. Self-hostable. Call any CRM, database, API, or payment processor natively — inside a live voice conversation.

1,200+ · Tool integrations
<200ms · Execution latency
Self-hosted · MCP servers
Open · Standard protocol
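
For readers unfamiliar with MCP, a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below only builds that message and does not send it. The tool name `track_shipment` and its arguments are hypothetical, chosen to match the shipment-tracking use case mentioned on this page.

```python
import itertools
import json

_request_ids = itertools.count(1)


def mcp_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and arguments, for illustration only.
request = mcp_tool_call("track_shipment", {"order_id": "ORD-2891"})
wire_message = json.dumps(request)
```

How the message reaches the server (stdio or HTTP) depends on the MCP server's transport and is outside this sketch.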

What your agents can do

Book appointments

Agent asks for date/time preferences, checks availability via n8n → Google Calendar, books the slot, sends SMS confirmation.

Process orders

Customer orders by voice. Agent extracts items, confirms details, triggers n8n → POS/ERP workflow, sends order confirmation.

Account enquiries

Agent verifies identity, queries CRM/database via MCP, reads back balance, transaction history, or policy details — securely.

Create support tickets

Agent captures the issue, creates a Zendesk/Freshdesk ticket with full transcript via n8n, emails the customer a ticket number.

Process payments

Agent collects payment intent, triggers Stripe/Razorpay workflow via n8n, confirms payment status back to the customer.

Track shipments

Customer asks about delivery. Agent queries logistics API via MCP, provides real-time tracking status and ETA.

All workflows run within your infrastructure. Credentials are encrypted with AES-256-GCM. No customer data leaves your network. No third-party SaaS touches your API keys.
See how workflows power your agents →

Integrations — connect VOX to any backend

1,200+

MCP integrations

The open standard for AI tool calls. Sub-200ms. Self-hostable. VOX calls any MCP server natively — CRM, ticketing, calendar, database, payments — inside a live voice conversation.

400+

n8n workflow templates

Complex multi-step workflows with visual editing and 400+ pre-built integrations. VOX calls your n8n webhooks with structured payloads extracted from conversation — you build the logic, VOX triggers it.
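
A structured payload of the kind described above could look like the following sketch, using the appointment-booking example from earlier on this page. The payload field names, the `call_id` value, and the webhook URL are all assumptions; the request is built but not sent, so the example runs offline.

```python
import json
import urllib.request


def build_webhook_request(webhook_url: str, action: str, params: dict, call_id: str):
    """Build (but do not send) a JSON POST carrying parameters extracted from a call."""
    payload = {"action": action, "params": params, "call_id": call_id}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical webhook URL and call id; parameters mirror the booking example.
req = build_webhook_request(
    "https://n8n.example.internal/webhook/book-appointment",
    action="book",
    params={"date": "Friday", "time": "afternoon"},
    call_id="call-123",
)
# urllib.request.urlopen(req) would deliver it to the workflow.
```

The workflow on the n8n side then owns the business logic (calendar lookup, booking, confirmation SMS), while the agent only supplies the extracted parameters.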

Plus direct REST API calls, PostgreSQL, MongoDB, Qdrant vector search, Typesense geo+structured search — VOX connects to anything.

FAQ

Common questions answered.

Free to start · No credit card required

Ready to deploy your first voice agent?

Web widget, phone, WhatsApp, or Telegram: all four channels, 17 languages. Platform GPU models from $0.015/min or BYOK with 7+ AI providers including OpenAI, Anthropic, Google, Groq, Inworld AI, and OpenRouter. No SaaS lock-in.