Open-source observability for LLM applications. Drop one line into your OpenAI, Anthropic, Google, or LiteLLM client config and every request gets logged, traced, costed, and available for replay in the Helicone dashboard.
As of June 12, 2026, Helicone also leans hard into the AI Gateway: unified access to 100+ models, 0% markup provider credits, bring-your-own provider keys, caching, rate limits, automatic fallbacks, and complete observability from the same integration path.
System Verdict
Pick Helicone if you’re shipping an LLM-powered product and need to see what’s happening under the hood. The one-line integration is genuinely one line. Prompt-level cost tracking, latency distributions, per-user analytics, and a prompt playground for iteration all ship in the free tier.
Skip it if you’re already committed to LangSmith. LangChain’s first-party observability integrates more deeply with LangChain/LangGraph than any third-party tool. If you’re deep in that ecosystem and don’t mind LangSmith’s pricing, stick with it.
Helicone’s edge over LangSmith: multi-provider support is first-class (LangSmith is LangChain-centric), the AI Gateway adds real production features (caching saves 30-80% on repeated prompts, failover prevents OpenAI outages from killing your app), and the free tier is meaningfully usable (10k requests/mo vs LangSmith’s smaller limits).
Key Facts
| License | Open source (self-hostable) |
| Hobby (free) | 10,000 requests/month, 1 GB storage, 1 seat, no credit card |
| Pro | $79/mo, unlimited seats, alerts, reports, HQL, usage-based above 10K |
| Team | $799/mo, SOC-2 and HIPAA, 5 organizations, dedicated Slack support |
| Enterprise | Custom with SAML SSO, MSA, on-prem deployment |
| Integration effort | One line of code (changes base URL or adds proxy) |
| Providers | OpenAI, Anthropic, Google, Mistral, Groq, Together, any OpenAI-compatible |
| Core observability | Traces, sessions, metrics (cost, latency, quality), user analytics |
| AI Gateway features | Load balancing, caching, automatic failover, rate limiting |
| Gateway billing | Helicone credits with 0% markup, or bring your own provider keys |
| Current public scale signals | 10B requests processed, 2.5T tokens/month, 64.9M users tracked (vendor-reported) |
| Integrations (2026) | LangGraph, LiteLLM, Vercel AI SDK, OpenAI Realtime API, OpenAI-compatible clients |
| Backed by | Y Combinator (W23) |
When to pick Helicone
- Production LLM apps. Log every request, debug why a prompt degraded, catch cost runaways before the end-of-month bill.
- Multi-provider workloads. Route between OpenAI, Anthropic, Google, Mistral, Groq, Together, Bedrock, Azure, or OpenAI-compatible endpoints based on latency, cost, and fallback needs; Helicone tracks the traffic either way.
- Cost optimization. Prompt caching alone typically saves 30-80% on repeated-prompt workloads. The gateway handles it.
- Agent workflow debugging. LangGraph integration shows you exactly what each node in a graph did, including tool calls and state changes.
- Self-hosted preference. Open-source core lets you run Helicone on your own infrastructure.
When to pick something else
- LangChain-centric shops: LangSmith integrates deeper. If you’re all-in on LangChain, stay there.
- Prompt management + evals focus: Langfuse overlaps and has a stronger prompt management story.
- App-wide observability: Datadog, New Relic, Sentry for full-stack; Helicone is LLM-specific.
- Simple prototypes: Direct provider dashboards (OpenAI Usage, Anthropic Console) suffice until you have real scale.
Pricing
Helicone ships a cloud-hosted service with generous free tier plus optional self-hosting.
| Plan | Price | What’s included |
|---|---|---|
| Hobby | $0 | 10,000 requests/month, 1 GB storage, 1 seat, 1 org |
| Pro | $79/mo | Everything in Hobby plus unlimited seats, alerts, reports, HQL query language, usage-based scaling |
| Team | $799/mo | Everything in Pro plus 5 organizations, SOC-2 and HIPAA compliance, dedicated Slack support |
| Enterprise | Custom | SAML SSO, MSA agreements, on-prem deployment options |
| Self-hosted | $0 | Run Helicone on your own infrastructure |
Usage-based pricing applies above the 10K free tier (calculator estimates vary by request volume, storage, and integration path). AI Gateway credits use provider-cost pass-through positioning; observability-only deployments can bring their own provider keys. See helicone.ai/pricing and Helicone docs for current paid-tier details. Verified 2026-06-12.
Failure modes
- Free tier caps at 10k requests/month. Small production apps can blow through this in days. Plan the upgrade path or self-host.
- Proxy vs async logging tradeoff. Helicone-as-proxy adds latency (~5-20ms). Async logging avoids latency but can miss logs during failures. Know which mode you’re in.
- Prompt caching needs cache-aware prompt design. If your prompts include timestamps or random nonces, cache hit rate is zero.
- Not a replacement for prompt eval harnesses. For systematic evaluation of prompt changes, use Helicone’s evals + a dedicated eval tool (Braintrust, Promptfoo).
- Gateway adds a hop. applications (real-time voice, sub-100ms SLA), the extra proxy hop matters.
Against the alternatives
| Helicone | Langfuse | LangSmith | Braintrust | |
|---|---|---|---|---|
| Open source | Yes | Yes (MIT) | No | No |
| Free tier | 10k req/mo | 50k units/mo | Limited | Limited |
| AI Gateway (proxy features) | Yes | No | No | No |
| LangChain integration | Good | Good | Best (native) | Good |
| Self-hosted | Yes | Yes | No | No |
| Best for | Multi-provider production | Evals + prompt mgmt | LangChain-centric teams | Eval-heavy teams |
Methodology
Produced by the aipedia.wiki editorial pipeline. Last verified 2026-06-12 against helicone.ai, Helicone pricing, Helicone docs, and the Helicone GitHub repository.
FAQ
Is Helicone really free? The Hobby cloud free tier covers 10,000 requests/month with no credit card. Self-hosting is free forever under the open-source license. Pro at $79/mo unlocks unlimited seats, alerts, reports, and the HQL query language; Team at $799/mo adds SOC-2 and HIPAA compliance with 5 organizations.
How does Helicone compare to Langfuse? Helicone emphasizes the AI Gateway (caching, failover, load balancing). Langfuse emphasizes prompt management and evals. Many teams use both. Both are free-tier generous.
Does Helicone work with Claude Code or Cursor? Both tools call LLM APIs; if you configure those APIs to route through Helicone, yes. For Claude Code, you’d set a custom Anthropic base URL. For Cursor, it’s harder because Cursor manages its own API config.
What’s the AI Gateway? A high-performance proxy that sits in front of LLM providers. It can use Helicone credits with 0% markup or bring-your-own provider keys, then add unified access to 100+ models, caching, automatic failover, load balancing, rate limiting, and observability. Functionally like an API gateway, but LLM-aware.
Related
- Category: AI Automation · AI Coding
- Compare: Helicone vs Langfuse
- See also: LangGraph · Mastra