Deepgram

Speech AI API platform for speech-to-text, text-to-speech, audio intelligence, and real-time voice agents with usage-based pricing.

8.3/10 Strong

Active

Monthly: $200 free credit, then pay-as-you-go · Annual: Growth saves up to 20% · Price: Enterprise custom

Best plan

$200 free credit, then pay-as-you-go; Growth saves up to 20%; Enterprise custom

Watch out: Evaluate Deepgram by latency, accuracy, diarization, language coverage, streaming reliability, TTS quality, and total audio hours, not only per-minute headline rates. Voice Agent API stacks STT, TTS, LLM, and telephony charges separately.

Editorial · no paid placements

The call

Deepgram is a developer-first speech AI platform. Pick it for real-time transcription, voice agents, text-to-speech, and audio intelligence in production apps. Skip it if you need an end-user meeting assistant or editing suite.

Buy if: Developers building voice agents and call analytics
Pick: $200 free credit, then pay-as-you-go; Growth saves up to 20%; Enterprise custom
Skip if: Manual podcast editing

Evidence rail

Why this recommendation is trusted

Evidence: Deepgram official site

Source: Registered source
Freshness: Aging
Confidence: Medium confidence
Verified: Jun 2, 2026
Review: Aug 13, 2026
Volatility: Volatile

Evidence is approaching its review window.

Editorial score

Unweighted average of 4 axes · confidence high

Utility 9/10
How much real work it can do for a competent operator, end to end.

Value 8/10
What you get for the dollar relative to the closest alternative.

Moat 8/10
How hard it would be for a competitor to replicate the underlying advantage.

Longevity 8/10
How likely the product is to still be best-in-class 24 months out.
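
The headline 8.3/10 can be reproduced from the four axis scores. A minimal sketch of that arithmetic follows; the half-up rounding mode is an assumption (the page does not say how it rounds), and `editorial_score` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP

# Axis scores as listed in the editorial score panel.
AXES = {"Utility": 9, "Value": 8, "Moat": 8, "Longevity": 8}


def editorial_score(axes: dict[str, int]) -> Decimal:
    """Unweighted mean of the axis scores, rounded to one decimal place."""
    mean = Decimal(sum(axes.values())) / Decimal(len(axes))  # 33 / 4 = 8.25
    # Assumption: ties round half-up. Python's built-in round() rounds
    # ties to even, so Decimal is used to make the mode explicit.
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(editorial_score(AXES))  # 8.3
```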

Key facts

Best For: Developers building voice AI products that need speech-to-text, text-to-speech, audio intelligence, and real-time agent APIs.
Confidence: high · Volatility: Drifts · Verified: 2026-06-12 · Source: Deepgram official site
Pricing Anchor: Free $200 credit; Pay As You Go (no minimums); Growth (annual prepaid credits from $4K+/year, save up to 20%); Enterprise custom. STT Flux English $0.0065/min streaming PAYG, Nova-3 Monolingual $0.0048/min streaming PAYG. TTS Aura-2 $0.030/1k characters PAYG. Voice Agent API $0.050 to $0.163/min depending on configuration.
Confidence: high · Volatility: Volatile · Verified: 2026-06-12 · Source: Deepgram pricing
Watch Out For: Evaluate Deepgram by latency, accuracy, diarization, language coverage, streaming reliability, TTS quality, and total audio hours, not only per-minute headline rates. Voice Agent API stacks STT, TTS, LLM, and telephony charges separately.
Confidence: high · Volatility: Volatile · Verified: 2026-06-12 · Source: Deepgram pricing
API Available: Deepgram is API-first, with developer docs as the source of truth for endpoint behavior, authentication, streaming, and SDK assumptions.
Confidence: high · Volatility: Drifts · Verified: 2026-06-12 · Source: Deepgram docs
Real Time Voice: The Voice Agent API is Deepgram's dedicated surface for low-latency conversational voice agents, with tiered per-minute pricing that varies based on whether the customer brings their own LLM or TTS.
Confidence: high · Volatility: Volatile · Verified: 2026-06-12 · Source: Deepgram Voice Agent API

Deepgram is a speech AI API platform. Its product set covers speech-to-text, text-to-speech, audio intelligence, and real-time voice-agent APIs for developers building voice products. The platform is built for embedded speech workflows: streaming transcription, call analytics, conversational agents, dictation, captions, and product features where voice quality affects the user experience.

It competes with AssemblyAI, OpenAI Whisper, Google Speech-to-Text, Azure AI Speech, Amazon Transcribe, Speechmatics, and voice-agent stacks.

System Verdict

Pick Deepgram for real-time voice infrastructure. It is strongest when transcription, TTS, and live voice-agent performance are product dependencies.

Skip it for human-facing meeting notes. Fathom, Fireflies, Otter.ai, and Read AI package the user workflow.

The product is API-first. That is exactly right for developers and exactly wrong for someone who just wants a transcript from a Zoom call.

Key Facts


Core product	Speech AI APIs
Speech-to-text models	Flux English/Multilingual (real-time agents) · Nova-3 Monolingual/Multilingual (general transcription)
Text-to-speech models	Aura-1 · Aura-2
Voice Agent API	Real-time conversational voice agents · tiered pricing
Audio intelligence	Summarization, topic, sentiment, intent features
Free credit	$200 credit for new projects
STT pricing (verified 2026-06-12)	Flux English $0.0065/min streaming PAYG ($0.0057/min Growth); Nova-3 Monolingual $0.0048/min streaming PAYG
TTS pricing (verified 2026-06-12)	Aura-2 $0.030/1k characters PAYG; Aura-1 $0.0150/1k characters PAYG
Voice Agent pricing (verified 2026-06-12)	$0.050 to $0.163/min depending on tier and bring-your-own LLM
Growth plan	Annual prepaid credits from $4K+/year, save up to 20%
Best fit	Voice products, call centers, agents, audio analytics

When to pick Deepgram

You need streaming transcription. Real-time apps need low latency and stable WebSocket behavior.
You are building voice agents. Deepgram bundles STT, TTS, and voice-agent pieces.
You need developer controls. API-first configuration is better than a meeting-note UI for embedded products.
You handle high audio volume. Usage-based pricing and Growth plans are built for scale.
You care about data residency. Deepgram lists a dedicated EU endpoint for EU processing.
You need model choice for voice. Deepgram separates real-time conversational STT, prerecorded transcription, TTS, and audio intelligence so teams can tune cost and quality by workflow.
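
To make "API-first configuration" concrete, here is a minimal sketch of how a team might assemble a transcription request URL. The host `api.deepgram.com`, the `/v1/listen` path, and the `interim_results` and `diarize` parameter names are assumptions that do not come from this page, so check each against Deepgram's docs before relying on them. `build_listen_url` is a hypothetical helper, and the `host` argument is there so a regional endpoint can be swapped in.

```python
from urllib.parse import urlencode

# Assumed default host; verify against Deepgram's documentation.
DEFAULT_HOST = "api.deepgram.com"


def build_listen_url(model: str, *, streaming: bool,
                     host: str = DEFAULT_HOST, **options) -> str:
    """Build a speech-to-text URL: wss for streaming, https for pre-recorded."""
    scheme = "wss" if streaming else "https"
    params = {"model": model}
    for key, value in options.items():
        # Query strings carry booleans as lowercase text.
        params[key] = str(value).lower() if isinstance(value, bool) else str(value)
    return f"{scheme}://{host}/v1/listen?{urlencode(params)}"


url = build_listen_url("nova-3", streaming=True, interim_results=True, diarize=True)
print(url)
```

Keeping options as plain keyword arguments makes it cheap to A/B different settings (diarization on or off, different models) against the same audio during evaluation.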

When to pick something else

Meeting note workflow: Fathom, Fireflies, Otter.ai, or Read AI.
Audio/video editing: Descript.
Open-source local transcription: Whisper, run on a local GPU.
Speech understanding benchmarks: compare closely with AssemblyAI.

Pricing

Deepgram offers a free $200 credit, then pay-as-you-go pricing. Growth plans use annual prepaid credit commitments and save up to 20% versus PAYG rates, and Enterprise is custom. Pricing varies by endpoint: speech-to-text, text-to-speech, Voice Agent API, and audio intelligence all have separate meters.

As verified on 2026-06-12, the pricing page lists explicit per-minute and per-character rates by model family.

Surface	Model	Streaming (PAYG)	Streaming (Growth)	Pre-recorded (PAYG)	Pre-recorded (Growth)
STT	Flux English	$0.0065/min	$0.0057/min	$0.0077/min	$0.0065/min
STT	Flux Multilingual	$0.0078/min	$0.0068/min	n/a	n/a
STT	Nova-3 Monolingual	$0.0048/min	$0.0042/min	$0.0077/min	$0.0065/min
STT	Nova-3 Multilingual	$0.0058/min	$0.0050/min	$0.0092/min	$0.0078/min

Surface	Model	PAYG	Growth
TTS	Aura-2	$0.030 / 1k characters	$0.027 / 1k characters
TTS	Aura-1	$0.0150 / 1k characters	$0.0135 / 1k characters
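
The "up to 20%" Growth headline is worth checking against the listed rows. A minimal sketch, using only the PAYG and Growth rates from the tables above; `growth_discount_pct` is a hypothetical helper name.

```python
from decimal import Decimal

# (PAYG, Growth) rates copied from the tables above.
# STT rates are $/min; TTS rates are $/1k characters.
RATES = {
    "Flux English streaming": ("0.0065", "0.0057"),
    "Flux English pre-recorded": ("0.0077", "0.0065"),
    "Flux Multilingual streaming": ("0.0078", "0.0068"),
    "Nova-3 Monolingual streaming": ("0.0048", "0.0042"),
    "Nova-3 Monolingual pre-recorded": ("0.0077", "0.0065"),
    "Nova-3 Multilingual streaming": ("0.0058", "0.0050"),
    "Nova-3 Multilingual pre-recorded": ("0.0092", "0.0078"),
    "Aura-2 TTS": ("0.030", "0.027"),
    "Aura-1 TTS": ("0.0150", "0.0135"),
}


def growth_discount_pct(payg: str, growth: str) -> Decimal:
    """Percentage by which the Growth rate undercuts the PAYG rate."""
    p, g = Decimal(payg), Decimal(growth)
    return ((p - g) / p * 100).quantize(Decimal("0.1"))


for name, (payg, growth) in RATES.items():
    print(f"{name}: {growth_discount_pct(payg, growth)}% below PAYG")
```

On these rows the gap works out to roughly 10% to 16%, so the 20% ceiling should be treated as a best case to confirm with Deepgram, not an across-the-board discount.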

Voice Agent API ranges from $0.050 to $0.163 per minute depending on configuration tier and whether the customer brings their own LLM and TTS.

Teams should price by audio hours, concurrency, latency target, and whether TTS and LLM components are bundled or brought separately. The pricing page also lists separate concurrency limits for REST, WebSocket, TTS, Voice Agent API, and Audio Intelligence. Those limits can matter more than the per-minute headline rate in production.
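
A rough budget can be sketched from the listed rates before talking to sales. The example below is a minimal sketch under stated assumptions: the rates are the PAYG figures on this page (verified 2026-06-12 and flagged volatile), the workload numbers are hypothetical, and the helper names are illustrative. It ignores concurrency limits, audio intelligence add-ons, and any LLM or telephony charges billed outside Deepgram.

```python
from decimal import Decimal

# PAYG rates from the pricing tables on this page; re-check before budgeting.
NOVA3_STREAMING_PAYG = Decimal("0.0048")                  # $/min
AURA2_PAYG = Decimal("0.030")                             # $/1k characters
VOICE_AGENT_RANGE = (Decimal("0.050"), Decimal("0.163"))  # $/min, low to high tier
FREE_CREDIT = Decimal("200")                              # $


def stt_tts_cost(stt_minutes: int, tts_characters: int) -> Decimal:
    """Monthly cost of separately metered STT and TTS usage, in dollars."""
    stt = NOVA3_STREAMING_PAYG * stt_minutes
    tts = AURA2_PAYG * Decimal(tts_characters) / 1000
    return (stt + tts).quantize(Decimal("0.01"))


def voice_agent_cost_band(minutes: int) -> tuple[Decimal, Decimal]:
    """Lowest and highest Voice Agent API cost for the given minutes."""
    low, high = VOICE_AGENT_RANGE
    cents = Decimal("0.01")
    return (low * minutes).quantize(cents), (high * minutes).quantize(cents)


def credit_runway_minutes(rate_per_min: Decimal) -> int:
    """Whole minutes the free credit covers at a given per-minute rate."""
    return int(FREE_CREDIT / rate_per_min)


# Hypothetical workload: 100,000 STT minutes and 5,000,000 TTS characters.
print(stt_tts_cost(100_000, 5_000_000))             # 630.00
print(voice_agent_cost_band(10_000))                # 500.00 to 1630.00
print(credit_runway_minutes(NOVA3_STREAMING_PAYG))  # 41666
```

The threefold spread in the Voice Agent band shows why the configuration tier matters more than any single per-minute figure.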

Evaluation checklist

Test Deepgram with your real audio before choosing it:

Noisy calls, accents, crosstalk, far-field audio, and domain jargon.
Streaming latency, partial transcripts, endpointing, and interruption behavior.
Diarization quality when multiple speakers overlap.
Language detection and multilingual behavior if calls switch languages.
TTS voice quality, pronunciation, and interruption handling for agents.
Concurrency limits under your peak traffic pattern.
Add-on cost for summarization, topics, sentiment, intent, or custom models.
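
For the accuracy items on this checklist, one common yardstick is word error rate (WER) against a human-made reference transcript. The page does not prescribe a metric, so treat this as one reasonable choice and not the reviewers' method. A minimal sketch, assuming lowercase whitespace tokenization is good enough for a first pass:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one dropped word (the) over six words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Score the same recordings through each candidate vendor and compare per-recording WER, since an average can hide the noisy or jargon-heavy calls that matter most.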

Buyer fit

A language-learning app or compliance review workflow can justify integration effort because speech quality becomes product quality.

It is less attractive for occasional transcription. If a team only uploads a few recordings per month, a finished app or a simple Whisper wrapper can be cheaper and easier. Deepgram becomes compelling when the team needs low-latency streaming, high volume, voice-agent primitives, or support for production reliability.

Failure Modes

API integration required. There is no polished consumer workflow.
Voice agent cost stacks. STT, TTS, LLM, and telephony can all bill separately.
Accuracy varies by domain. Jargon, accents, noise, and overlapping speech need testing.
Concurrency matters. Real-time systems fail on limits before they fail on average price.
Feature pricing is modular. Audio intelligence and voice-agent features are not the same bill as transcription.
Voice quality is workload-specific. Model choice should be driven by your audio, not by a generic speech benchmark.
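
The concurrency point can be tested against your own call logs before you sign up. The sketch below is a minimal example under stated assumptions: it takes (start, end) timestamps in seconds for past sessions and finds the peak number of simultaneous streams. The session data and the limit of 2 are hypothetical; the actual limits are listed per surface on Deepgram's pricing page.

```python
def peak_concurrency(sessions: list[tuple[float, float]]) -> int:
    """Maximum number of sessions open at the same moment."""
    events = []
    for start, end in sessions:
        events.append((start, 1))   # a stream opens
        events.append((end, -1))    # a stream closes
    # Tuples sort by time, then by delta, so at equal timestamps a close (-1)
    # is processed before an open (+1) and the freed slot is reused.
    events.sort()

    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical call log: (start_second, end_second) per session.
calls = [(0, 60), (30, 90), (45, 50), (60, 120)]
HYPOTHETICAL_LIMIT = 2

peak = peak_concurrency(calls)
print(peak, "exceeds limit" if peak > HYPOTHETICAL_LIMIT else "within limit")
```

Run it on your busiest hour, not a daily average, because the limit is hit at the peak.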

Recent changes

Pricing transparency (reverified 2026-06-12). Deepgram’s pricing page publishes explicit per-minute and per-character rates for Flux, Nova-3, and Aura model families across both PAYG and Growth plans, with Growth starting at $4K+/year in prepaid credits and saving up to 20%.
Voice Agent API pricing surfaced. The Voice Agent API now lists a $0.050 to $0.163 per-minute range, with the upper end reserved for bundled STT, TTS, and LLM and the lower end for bring-your-own-LLM/TTS configurations.
Model lineup confirmed. Flux remains the real-time voice-agent STT model; Nova-3 covers general transcription; Aura-2 is the current premium TTS family.

Methodology

Last verified 2026-06-12 against Deepgram pricing and product documentation. Scoring emphasizes API utility, real-time performance fit, voice-agent breadth, and implementation complexity.

FAQ

Does Deepgram only do speech-to-text? No. It also offers text-to-speech, audio intelligence, and voice-agent APIs.

Is Deepgram good for meeting notes? It can power a meeting-note product, but it is not itself the finished meeting-note app.

Does Deepgram have a free tier? Deepgram lists a $200 free credit, then pay-as-you-go pricing.

Category: AI Voice
See also: AssemblyAI · Whisper · ElevenLabs · Fathom · Read AI


