Moonshot AI’s chatbot and LLM family, built in Beijing by Tsinghua alumni and backed by Alibaba, Tencent, and IDG Capital at a reported $10B valuation. Alibaba holds a 36% stake after a $1B round in February 2024. Kimi launched October 2023.
The current flagship is Kimi K2.6, released April 21, 2026 as open-weights with four operating modes: Instant, Thinking, Agent, and Agent Swarm. K2.6 posts the strongest published coding and agentic benchmarks for any open-weights model on release day: HLE with tools 54.0, SWE-Bench Pro 58.6, SWE-bench Multilingual 76.7. Kimi K2 Thinking remains in the lineup as the prior open-weight reasoning flagship (HLE 44.9% with tools).
Recent developments
Verified 2026-06-12.
- April 24, 2026: DeepSeek V4 preview released. The cheapest credible competitor for budget-API long-context work moves first; Kimi K2.6 must defend the agentic and open-weights edge as DeepSeek’s coding scores climb.
- April 21, 2026: Kimi K2.6 released with four modes including Agent Swarm (multi-instance parallel execution with planner/executor/verifier/critic roles). Benchmarks: HLE-with-tools 54.0, SWE-Bench Pro 58.6, SWE-bench Multilingual 76.7. Strongest open-weights coding and agentic model as of release.
System Verdict
Pick Kimi if you need free long-context chat, open-weight frontier reasoning, or agentic workflows with sustained tool use. The 256K context on the free tier eliminates chunking makes migration trivial.
Skip it if you need the absolute cheapest API or Western enterprise trust. providers on raw input tokens, and Moonshot is a well-funded startup rather than an established enterprise vendor. English UX, while improved, still lags ChatGPT and Claude on consumer polish.
Who uses which surface: for current multimodal agentic apps, Kimi K2 Thinking (open weights) for self-hosted reasoning experiments, and Agent Swarm beta for teams testing parallel multi-step tasks.
Key Facts
| Flagship model | Kimi K2.6 (released April 21, 2026, open-weights) |
| Prior reasoning flagship | Kimi K2 Thinking (open weights, modified MIT) |
| Architecture | 1T parameter MoE, 32B active per request (K2.5 lineage; K2.6 architecture details per Moonshot release notes) |
| Context window | 256K tokens (262,144) on chat and API |
| Multimodal | Native vision + text training |
| HLE with tools (K2.6) | 54.0 |
| SWE-Bench Pro (K2.6) | 58.6 |
| SWE-bench Multilingual (K2.6) | 76.7 |
| HLE with tools (K2 Thinking) | 44.9%, 51.0% heavy mode |
| Sequential tool calls | 200-300 per agent task |
| Chat pricing | Free tier with 256K context, paid memberships available |
| K2.6 API pricing | $0.95/M input, $4.00/M output |
| K2.6 cache pricing | $0.16/M cache-hit tokens |
| Agent Swarm | Beta, up to 100 coordinated sub-agents |
Every volatile data point above was rechecked on 2026-06-12. See Sources.
What it actually is
Two surfaces on one model family: kimi.com / kimi.moonshot.cn for chat, and platform.moonshot.ai for API. The chat product offers unlimited basic conversations and full 256K context free, which removes the chunking step most competitors force on long documents.
Kimi K2.6 is now the current model on the API. The platform describes multimodal input, thinking and non-thinking modes, dialogue and agent tasks, automatic context caching, ToolCalls, JSON Mode, Partial Mode, and internet search. Agent Swarm remains a newer surface, so test quotas and reliability before building production dependencies around it.
The real moats are the free 256K chat context, the open-weight K2 Thinking release, and the tool-call depth. Most frontier thinking models cap tool sequences far earlier. Kimi’s 200-300 sequential call depth supports long-horizon automation competitors cannot match at the open-weight tier.
When to pick Kimi
- Long-document or codebase analysis. 256K tokens free handles hundred-page PDFs, research papers, and mid-size repos in one pass.
- Agentic workflows with sustained tool use. Agent mode supports 200-300 sequential tool calls in a single task, enough for multi-step web research, deep refactors, or chained API orchestration.
- Open-weight reasoning deployment. Kimi K2 Thinking ships under modified MIT. Self-host for HLE-leading thinking without vendor lock-in.
- Bilingual Chinese-English research. Trained with deep idiom and technical-terminology coverage in both languages.
- Cache-friendly workloads. Automatic context caching cuts input cost up to 75% on repeated or overlapping prompts.
When to pick something else
- Cheapest API: DeepSeek and other budget API providers can undercut Kimi K2.6 on raw input price.
- Polished English consumer UX: ChatGPT or Claude. Kimi’s English chat is functional but not first-class.
- Google Workspace integration: Gemini. Kimi has no Workspace hooks.
- Best-in-class long-form writing: Claude Opus 4.8. Kimi’s prose lags Claude’s coherence on 10K-word outputs.
- Broadest plugin ecosystem: ChatGPT. No Kimi equivalent to the GPT Store.
Pricing
Chat at kimi.com, API at platform.moonshot.ai.
| Plan / Model | Price | Key details |
|---|---|---|
| Free chat (Adagio) | $0/month | Unlimited basic chat, 256K context, limited DeepResearch and agent tasks |
| Paid Membership | Tiered | Higher DeepResearch and agent quotas, current plans on kimi.com |
| Kimi K2.6 API | $0.95/M cache-miss input, $4.00/M output | Cache-hit input $0.16/M, 256K context, multimodal, current recommended Kimi model |
| Older Kimi K2/K2.5 paths | Verify live before use | The current docs navigation centers K2.6 and K2.5, while the old K2 legacy URL now redirects to the docs overview. Treat legacy K2 pricing as historical for new production. |
Prices verified 2026-06-12 via Moonshot AI Platform, Kimi API pricing docs, and the Kimi K2.6 pricing page. API is pay-as-you-go with no minimum listed on the checked pricing pages. DeepSeek V4 preview release (April 24, 2026) raised competitive pressure on the cheap-API segment; Kimi K2.6 pricing held steady in the June 8 check.
Against the alternatives
| Kimi K2.6 | DeepSeek V3 | Claude Opus 4.8 | Qwen3.6 Plus | |
|---|---|---|---|---|
| Context window | 256K | 64K | 1M | 1M |
| Free long context | Yes, 256K | Limited | Pro $20/mo for 1M | API pay-per-token |
| Open weights | K2 Thinking, modified MIT | V3 open | Closed | Apache 2.0 |
| API input price | $0.95/M | lower-cost options exist | $5.00/M | varies by provider/model |
| Tool-call depth | 200-300 sequential | Shorter | 200+ via Claude Code | Agent-mode |
| Multimodal | Native vision | Text-focused | Text + vision | Text, vision, video input |
| Best viewed as | Long-context + agent specialist | Cheap capable API | Reasoning + writing | Open-weight multilingual |
Failure modes
- English UX is a translation layer. Menu labels, error messages, and help docs read as ported from Chinese. Functional, not native.
- Agent Swarm is beta. Parallel 100-agent coordination is not GA for production. Test before building dependencies.
- Free tier caps advanced features. Basic chat is unlimited at 256K. DeepResearch and agent runs have daily quotas on the free plan.
- API cost sits above the cheapest providers. At $0.95/M input for K2.6, Kimi is not the lowest raw-rate option. Cache hits narrow the gap on repeated prompts.
- Vendor risk. Moonshot is well-capitalized but younger than OpenAI or Anthropic. Enterprise buyers factor in startup longevity.
- Kimi K2 Thinking commercial clause. Modified MIT requires Kimi branding for products exceeding 100M MAU or $20M monthly revenue. Edge case for most users but real for scaled deployments.
- Tool-call depth is a ceiling, not a guarantee. 300 sequential steps works on well-scoped tasks. Long loops still surface plan drift on open-ended queries.
Methodology
This page was produced by the aipedia.wiki editorial pipeline, an automated system that ingests vendor documentation, verifies pricing and model details against primary sources, and generates the editorial analysis you are reading. No individual human wrote this review. Scoring follows the four-dimension rubric at /about/scoring/ (Utility, Value, Moat, Longevity; unweighted average). Last verified 2026-06-12 against Moonshot AI Platform docs, Kimi API pricing docs, Kimi K2.6 pricing docs, the old K2 URL redirect behavior, Kimi K2 Thinking on Hugging Face, and the April 24 DeepSeek V4 preview coverage.
FAQ
Is Kimi free to use? Yes. kimi.com provides unlimited basic chat with a 256K-token context window at no cost. DeepResearch and agent features are quota-capped on the free tier. The API is pay-as-you-go.
What is Kimi K2.6? Moonshot AI’s current Kimi platform model as of June 12, 2026, released April 21, 2026 as open-weights with Agent Swarm mode. Kimi’s docs describe 256K context, text/image/video input, thinking and non-thinking modes, dialogue and agent tasks, ToolCalls, JSON Mode, Partial Mode, internet search, and context caching.
What is Kimi K2 Thinking? The open-weight reasoning flagship. It sets state-of-the-art on Humanity’s Last Exam at 44.9% with tools and 51.0% in heavy mode, supports 200-300 sequential tool calls, and ships under a modified MIT license. Weights are on Hugging Face.
How does Kimi compare to Claude for long documents? Kimi’s 256K context is free. Claude Opus 4.8’s 1M context is larger but requires Pro ($20/mo) or API. At equal context lengths, Claude’s English prose is more coherent; Kimi’s bilingual Chinese-English handling is stronger. Pick based on language mix and budget.
Can I use Kimi as a drop-in OpenAI replacement? Yes. Point your OpenAI SDK features port directly.
Sources
- Moonshot AI API Platform: current models, pricing, modes
- Kimi API pricing documentation: cache rates and tier details
- Kimi K2 Thinking on Hugging Face: open-weight release, modified MIT license
- Kimi K2.6 pricing documentation: current K2.6 model description and pricing
- Legacy Kimi K2 pricing URL: checked for redirect/legacy behavior during the June 8 refresh
Related
- Category: AI Chatbots · AI Research
- Compare: Use AI Chatbots & LLMs for assistant and model-provider alternatives; direct comparison pages are reserved for same-workflow substitutes.