The Ultimate Guide to Kimi Code: Architecture, Installation, and Usage
Published on January 27, 2026
Kimi Code is a next-generation open-source coding agent developed by Moonshot AI. Built on the powerful Kimi K2 and Kimi K2.5 (multimodal) models, it is designed to be a transparent, reliable, and highly capable assistant for developers. Whether you are debugging complex issues, refactoring codebases, or simply need an intelligent pair programmer, Kimi Code integrates seamlessly into your workflow via VS Code, Cursor, JetBrains, Zed, or the Command Line Interface (CLI). Kimi K2.5 is the latest release (January 27, 2026), introducing native multimodal support for text, images, and videos, along with agentic enhancements.
This comprehensive guide covers everything from its underlying architecture to detailed installation steps for all major operating systems, usage patterns, and even how to containerize it with Docker. The CLI is now officially referred to as Kimi CLI (kimi-cli package), with version 1.1 or later incorporating K2.5 features.
1. Use Cases
Kimi Code is versatile and supports a wide range of development activities:
- Intelligent Pair Programming: Context-aware code suggestions and completions within your IDE.
- Automated Refactoring: Identify code smells and automatically apply best-practice refactoring patterns.
- Autonomous Task Execution (Agent Mode): Give high-level instructions (e.g., "Set up a Next.js project with Tailwind"), and Kimi will execute the necessary terminal commands and file edits. The K2.5 Agent Swarm (beta) can direct up to 100 sub-agents in self-directed workflows without predefined structures, improving handling of complex, multi-agent tasks.
- Multimodal Understanding (Kimi K2.5): Upload screenshots of UIs or diagrams and Kimi can generate the corresponding frontend code or logic. K2.5 extends this to video input, enabling code generation from recordings and dynamic UI elements such as animations.
- Terminal Assistant: A natural language interface for your terminal to generate complex shell commands or explain errors.
- Agentic Workflows with Tool Calling: Support for up to 200-300 sequential tool calls, including web searching, shell execution, and custom integrations via Kimi Agent SDK (Python, Node.js, Go).
2. Architecture & System Requirements
At the heart of Kimi Code lies the Kimi K2 series of models, which use a Mixture-of-Experts (MoE) architecture. This allows for massive scale with efficient inference.
Architecture Highlights
- Model Backbone: Sparse Mixture-of-Experts (MoE) Transformer. Kimi K2.5 is built via continual pretraining on ~15 trillion mixed visual and text tokens atop Kimi-K2-Base.
- Parameters: ~1 Trillion Total Parameters, with 32 Billion Activated Parameters per token generation.
- Context Window: Supports up to 256,000 tokens, allowing it to digest entire repositories or large documentation files in a single context.
- Multimodal Support: Kimi K2.5 features a native multimodal architecture that accepts visual and text input, including video understanding and processing.
- Reasoning: Supports "Thinking Mode" for complex chain-of-thought reasoning before answering. Modes include K2.5 Instant, Thinking, Agent, and Agent Swarm (beta).
- Protocols: Agent Client Protocol (ACP) for IDE integrations and Model Context Protocol (MCP) for external tools/models.
- SDK: Kimi Agent SDK for embedding agents into custom applications.
- No Official Docker Image: setup is Python-based (an unofficial Dockerfile is provided in Section 5).
System Requirements
- Operating System:
- macOS: Native support.
- Linux: Native support.
- Windows: Supported via WSL 2 (Windows Subsystem for Linux); native Windows CLI support is not yet available.
- Runtime:
- Python: Version 3.10 or higher (3.13 recommended).
- uv: An extremely fast Python package and project manager (required for CLI installation).
- Editor: VS Code (latest version) for extension support.
- Moonshot AI API Key: Required (free tier available). Source builds also need dependencies such as make and git; Nix/Flakes can be used for reproducible environments.
- Optional Hardware for Local Inference: Minimum 128GB system RAM, 32GB GPU VRAM; recommended 256GB RAM, 80GB GPU (e.g., A100/H100) for full deployment.
3. Installation Guide
A. VS Code Extension
The easiest way to get started is with the VS Code extension.
- Install via Marketplace:
- Open VS Code.
- Go to the Extensions view (Ctrl+Shift+X or Cmd+Shift+X).
- Search for "Kimi Code" (Publisher: moonshot-ai).
- Click Install.
- Direct Link: Kimi Code on Marketplace
- Authentication:
- Once installed, click the Kimi icon in the Activity Bar.
- Select "Sign in with Kimi Account" to authorize via browser.
- Alternative: If you have an API key, click "Skip" and configure it in the extension settings; the key can be used directly without the browser sign-in flow.
B. Command Line Interface (CLI)
The Kimi CLI is a powerful terminal agent.
Prerequisites
You must have uv installed. If you don't, install it first (macOS / Linux / Windows WSL):
curl -LsSf https://astral.sh/uv/install.sh | sh

On Windows, run this inside WSL 2 (enable WSL and install a distribution such as Ubuntu first).
Installation
Option 1: Install as a tool (Recommended). This installs kimi-cli in an isolated environment and makes the kimi command available globally:

uv tool install kimi-cli

Option 2: Install via pip (Standard):

pip install kimi-cli

Option 3: Clone the repository: `git clone https://github.com/MoonshotAI/kimi-cli.git && cd kimi-cli`, run `make prepare` to install dependencies, then authenticate with `kimi login` (OAuth or API key); see the consolidated sketch after the Windows note below.
Note for Windows Users: Native Windows support is still in development. Please use WSL 2 (Ubuntu/Debian) to install and run the CLI for the best experience.
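If you choose the repository-clone route (Option 3 above), the commands combine as follows; run them from wherever you keep source checkouts:

```bash
# Clone the CLI source and enter the project directory
git clone https://github.com/MoonshotAI/kimi-cli.git
cd kimi-cli

# Install dependencies
make prepare

# Authenticate via OAuth or an API key
kimi login
```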
SDK Installation
- Python: `pip install kimi-agent-sdk`
- Node.js: `npm install @moonshot-ai/kimi-agent-sdk`
- Go: `go get github.com/MoonshotAI/kimi-agent-sdk/go`
4. Usage & Implementation
Using the CLI (`kimi`)
Kimi CLI has two primary modes: Shell Mode for single commands and Agent Mode for continuous assistance; toggle Shell Mode with Ctrl-X. MCP servers are managed with commands such as `kimi mcp add --transport http context7 https://mcp.context7.com/mcp --header "CONTEXT7_API_KEY: ctx7sk-your-key"`, `kimi mcp list`, and `kimi mcp remove chrome-devtools`. Ad-hoc MCP configuration can be passed with `--mcp-config-file /path/to/mcp.json` (a config sketch follows below).
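For the `--mcp-config-file` flag, a minimal config might look like the sketch below. The field layout assumes the common `mcpServers` schema used by most MCP clients, and the server name, URL, and header value simply mirror the `kimi mcp add` example above; check the Kimi CLI docs for the exact schema.

```jsonc
{
  // Field names assume the common mcpServers schema; verify against the Kimi CLI docs
  "mcpServers": {
    "context7": {
      "transport": "http",
      "url": "https://mcp.context7.com/mcp",
      "headers": {
        "CONTEXT7_API_KEY": "ctx7sk-your-key"
      }
    }
  }
}
```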
- Start the CLI: `kimi`
- Common Commands: `/exit` or `Ctrl+D` exits the session; `/clear` clears the context history.
- Example Workflow: ask "Find all python files larger than 100 lines and list them." Kimi will generate the `find` command, explain it, and ask for permission to run it.

Using VS Code
- Chat Interface:
- Open via sidebar icon.
- Ask questions like "Explain this function" or "Refactor this class".
- Context Management (the `@` symbol):
  - Type `@` to reference files, folders, or code symbols.
  - Example: "How does `@auth.ts` interact with `@user_model.py`?"
- YOLO Mode (Auto-Approve):
- By default, Kimi asks for permission before editing files.
- Enable YOLO Mode in settings (`kimi.yoloMode`) to let it execute commands and edits autonomously. Use with caution, since it removes the confirmation step; a minimal settings sketch follows.
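For reference, turning this on in VS Code's `settings.json` would look roughly like the sketch below; the key name is the one given above, while using `true` as the enable value is an assumption.

```jsonc
{
  // kimi.yoloMode is the setting referenced above; `true` as the enable value is an assumption
  "kimi.yoloMode": true
}
```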
Additional Usage Details
- ACP Startup: `kimi acp` starts the agent for ACP-compatible IDEs such as Cursor, JetBrains, and Zed.
- Custom Tools: add them to the skills directory or register them via the SDK. A minimal Python SDK example:

```python
from kimi_agent_sdk import Agent

agent = Agent()
response = agent.chat("Write a Python script for hello world.")
print(response)
```
- Zsh Plugin Integration: `git clone https://github.com/MoonshotAI/zsh-kimi-cli.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/kimi-cli` and add to `~/.zshrc`: `plugins=(... kimi-cli)`.
---
## 5. Docker Setup (Unofficial)
Since there is no official Docker image yet, you can use this `Dockerfile` to create a clean environment with `kimi-cli` pre-installed.
**File: `Dockerfile.kimi`**
```dockerfile
# Use a lightweight Python base image
FROM python:3.13-slim
# Install system dependencies (curl, git, build essentials)
RUN apt-get update && apt-get install -y curl git build-essential && rm -rf /var/lib/apt/lists/*
# Install uv package manager
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
# Add uv to PATH (recent installers place uv in ~/.local/bin; older ones used ~/.cargo/bin)
ENV PATH="/root/.local/bin:/root/.cargo/bin:$PATH"
# Install kimi-cli using uv
RUN uv tool install kimi-cli
# Ensure kimi is on the path
ENV PATH="/root/.local/bin:$PATH"
# Set working directory
WORKDIR /workspace
# Default entrypoint
CMD ["kimi"]
```

Note: the container still needs your Moonshot AI API key at runtime, for example via an environment variable or by mounting your local Kimi configuration as a volume.

Build and Run:

```bash
# Build the image
docker build -t kimi-code -f Dockerfile.kimi .
# Run the container (mounting current directory)
docker run -it -v $(pwd):/workspace kimi-code
```

6. Testing & Validation
After installation, validate your setup:
- Check Version: run `kimi --version` and expect output like `kimi-cli 1.1.x` or higher with K2.5 integration.
- Test Connectivity: run a simple query to ensure the API is reachable: `kimi "Hello, are you ready to code?"`
- VS Code Check:
- Open a file.
- Select code and press Cmd+K.
- Type "Add comments".
- Verify that Kimi generates a diff and allows you to accept it.
- Local Model Inference Tests: run local inference via vLLM or the Hugging Face weights; for API testing, use an OpenAI-compatible SDK (`pip install openai`) with your Moonshot API key, as sketched after this list.
- Community Benchmarks: available on Hugging Face or Together.ai.
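As a sketch of the API test mentioned above, using the OpenAI Python SDK against Moonshot's OpenAI-compatible endpoint; the base URL and model name are assumptions, so substitute the values shown on platform.moonshot.ai for your account.

```python
# pip install openai
from openai import OpenAI

# Base URL is an assumption for the international platform; check platform.moonshot.ai
client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",
    base_url="https://api.moonshot.ai/v1",
)

# Model name taken from the pricing section; adjust to a model available on your account
response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[{"role": "user", "content": "Hello, are you ready to code?"}],
)
print(response.choices[0].message.content)
```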
Resources & Links
- Official Website: kimi.com/code
- Documentation: kimi.com/code/docs; CLI docs in English: https://moonshotai.github.io/kimi-cli/en/, Chinese: https://moonshotai.github.io/kimi-cli/zh/
- GitHub Repository: MoonshotAI/kimi-cli
- VS Code Extension: Visual Studio Marketplace
- Kimi K2.5 Tech Blog: https://www.kimi.com/blog/kimi-k2-5.html
- API Platform: https://platform.moonshot.ai/
- Hugging Face Model: https://huggingface.co/moonshotai/Kimi-K2.5
- NVIDIA NIM: https://build.nvidia.com/moonshotai/kimi-k2.5
SDK and Custom Integrations
The Kimi Agent SDK is an open-source library designed for embedding Kimi agents into custom applications, enabling developers to integrate AI-powered agents seamlessly. It supports multiple languages including Python, Node.js (TypeScript), and Go, allowing for flexible implementation across different tech stacks. The SDK acts as a thin client that proxies requests to the Kimi CLI runtime, reusing configurations, tools, and sessions for efficiency.
Key Features
- Multi-Language Support: Libraries for Python, Node.js, and Go.
- Session Management: Reuse CLI sessions for persistent contexts.
- Tool Integration: Register custom tools and handle tool calls.
- Autonomous Agents: Support for agentic workflows with up to 200-300 tool calls.
- Extensibility: Easy to add custom skills and MCP configurations.
- Transparency: Fully open-source under Apache 2.0, with clear runtime visibility.
Installation
- Python: `pip install kimi-agent-sdk`
- Node.js: `npm install @moonshot-ai/kimi-agent-sdk`
- Go: `go get github.com/MoonshotAI/kimi-agent-sdk/go`
Usage Examples
Python Example: Basic Chat
```python
from kimi_agent_sdk import Agent

agent = Agent()
response = agent.chat("Generate a Python function to calculate factorial.")
print(response)
```

Node.js Example: Custom Tools
```typescript
import { Agent } from '@moonshot-ai/kimi-agent-sdk';

const agent = new Agent();

// Register a custom tool that the agent can call during a chat
agent.registerTool({
  name: 'get_weather',
  description: 'Get current weather',
  parameters: { type: 'object', properties: { location: { type: 'string' } } },
  async handler({ location }) {
    // Implement a real weather API call here; this stub returns a fixed string
    return `Weather in ${location}: Sunny`;
  }
});

const response = await agent.chat('What is the weather in Jaipur?');
console.log(response);
```

Go Example: Streaming Response
```go
package main

import (
	"fmt"

	"github.com/MoonshotAI/kimi-agent-sdk/go"
)

func main() {
	agent := kimi.NewAgent()
	// Stream the response and print chunks as they arrive
	stream := agent.ChatStream("Explain quantum computing simply.")
	for chunk := range stream {
		fmt.Print(chunk)
	}
}
```

For more advanced usage, refer to the examples in the SDK repository (e.g., examples/python/customized-tools). Ensure the Kimi CLI is running or configured so that the SDK can proxy requests to it.
Pricing and Limitations
Moonshot AI's Kimi API uses a pay-as-you-go model with usage-based pricing. Prices vary by model and context length.
Pricing Details
For kimi-k2.5 (as of January 2026):
- Input: $0.10 per 1M tokens
- Output: $0.60 per 1M tokens
- Cache Hit: $3.00 per 1M tokens (for prompt caching)
- Context: Up to 262,144 tokens
For kimi-k2-thinking:
- Input: $0.60 per 1M tokens
- Output: $2.50 per 1M tokens
File-related APIs (e.g., content extraction) are temporarily free. Billing is based on both input and output tokens.
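As a quick worked example with the kimi-k2.5 rates above: a request that sends 200,000 input tokens and returns 50,000 output tokens costs roughly 0.2 × $0.10 + 0.05 × $0.60 = $0.02 + $0.03 = $0.05, before any prompt-caching effects.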
Rate Limits
Rate limits are tiered based on cumulative recharge amount (RPM = requests per minute, TPM = tokens per minute, TPD = tokens per day):

| Tier | Recharge | Concurrency | RPM | TPM | TPD |
|------|----------|-------------|-----|-----|-----|
| Tier0 | $1 | 1 | 3 | 500,000 | 1,500,000 |
| Tier1 | $10 | 50 | 200 | 2,000,000 | Unlimited |
| Tier2 | $20 | 100 | 500 | 3,000,000 | Unlimited |
| Tier3 | $100 | 200 | 5,000 | 3,000,000 | Unlimited |
| Tier4 | $1,000 | 400 | 5,000 | 4,000,000 | Unlimited |
| Tier5 | $3,000 | 1,000 | 10,000 | 5,000,000 | Unlimited |
Free tier has temporary access with low limits; paid tiers unlock higher concurrency and unlimited daily tokens.
Limitations
- API Access: Free tier may have temporary restrictions on new models like K2.5.
- Local Model Hardware Needs: For Kimi K2.5 local inference:
- Minimum: Total disk + RAM + VRAM โฅ 250GB (for quantized versions like INT4 or 1-bit).
- Recommended: 512GB+ RAM, 80GB+ GPU VRAM (e.g., 16-32 H100 GPUs for full performance).
- Native INT4 quantization reduces VRAM requirements, enabling runs on consumer hardware with sufficient total memory (e.g., 640GB Ampere setups).
- For CPU-only: possible on low-spec machines but slow; use frameworks like llama.cpp with MoE offloading (see the sketch after this list).
- Other: No native Windows CLI; limited internet access in some integrations; high verbosity in responses.
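For the CPU/MoE-offloading route noted above, a rough llama.cpp invocation is sketched below. The GGUF filename and quantization are placeholders, and the tensor-override pattern that keeps MoE expert weights in system RAM follows community recipes rather than official guidance; verify the flags against your llama.cpp build.

```bash
# Placeholder GGUF filename; -ngl pushes dense/attention layers to the GPU,
# while the -ot pattern (a community MoE-offload recipe) keeps expert weights in system RAM.
llama-cli \
  -m Kimi-K2.5-INT4.gguf \
  -ngl 99 \
  -ot ".ffn_.*_exps.=CPU" \
  -c 16384 \
  -p "Write a Python function that reverses a string."
```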
Contributions and Updates
Kimi Code is open-source, welcoming contributions from the community. Follow standard GitHub workflows to contribute.
How to Contribute
- Fork the Repository: Fork https://github.com/MoonshotAI/kimi-cli on GitHub.
- Clone Locally: `git clone https://github.com/your-username/kimi-cli.git`
- Set Up Environment: Run `make prepare` to install dependencies.
- Make Changes: Develop features or fixes in a new branch (e.g., `git checkout -b feature/new-tool`).
- Format and Check Code: `make format` (black/ruff), `make check` (linting), `make test` (unit/e2e tests).
- Commit and Push: Commit with clear messages and push to your fork.
- Create Pull Request: Submit a PR to the main branch, describing changes and linking issues.
- Guidelines: Follow CONTRIBUTING.md (if available); ensure tests pass; adhere to code style.
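Putting the steps above together, a typical contribution session looks roughly like this (the fork URL, branch name, and commit message are placeholders):

```bash
# Clone your fork and set up the development environment
git clone https://github.com/your-username/kimi-cli.git
cd kimi-cli
make prepare

# Develop on a feature branch
git checkout -b feature/new-tool

# Format, lint, and test before pushing
make format
make check
make test

# Commit, push, and open a pull request against the main branch
git commit -am "Add new tool"
git push origin feature/new-tool
```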
Changelog for v1.1
Version 1.1 (released around late 2025/early 2026) includes:
- OAuth Integration: Added OAuth login flow for easier authentication.
- Rebranding: Updated from โkimi-cliโ to align with โKimi Codeโ branding, with UI enhancements.
- K2.5 Support: Incorporated multimodal features and agent swarm beta.
- Bug Fixes: Improved session management, MCP handling, and cross-platform compatibility.
- Other: Enhanced documentation, added Zsh plugin, and optimized for Nix environments.
For full changelog, check RELEASE_NOTES.md or GitHub releases in the repository.