The Ultimate Guide to Kimi Code: Architecture, Installation, and Usage
Published on January 27, 2026
Kimi Code is a next-generation open-source coding agent developed by Moonshot AI. Built on the powerful Kimi K2 and Kimi K2.5 (multimodal) models, it is designed to be a transparent, reliable, and highly capable assistant for developers. Whether you are debugging complex issues, refactoring codebases, or simply need an intelligent pair programmer, Kimi Code integrates seamlessly into your workflow via VS Code, Cursor, JetBrains, Zed, or the Command Line Interface (CLI). Kimi K2.5 is the latest release (January 27, 2026), introducing native multimodal support for text, images, and videos, along with agentic enhancements.
This comprehensive guide covers everything from its underlying architecture to detailed installation steps for all major operating systems, usage patterns, and even how to containerize it with Docker. The CLI is now officially referred to as Kimi CLI (kimi-cli package), with version 1.1 or later incorporating K2.5 features.
1. Use Cases
Kimi Code is versatile and supports a wide range of development activities:
- Intelligent Pair Programming: Context-aware code suggestions and completions within your IDE.
- Automated Refactoring: Identify code smells and automatically apply best-practice refactoring patterns.
- Autonomous Task Execution (Agent Mode): Give high-level instructions (e.g., "Set up a Next.js project with Tailwind"), and Kimi will execute the necessary terminal commands and file edits. The K2.5 Agent Swarm (beta) can direct up to 100 sub-agents in self-directed workflows without predefined structures, improving handling of complex, multi-agent tasks.
- Multimodal Understanding (Kimi K2.5): Upload screenshots of UIs or diagrams and Kimi can generate the corresponding frontend code or logic. K2.5 extends this to video input, enabling code generation from recordings and dynamic UI elements such as animations.
- Terminal Assistant: A natural language interface for your terminal to generate complex shell commands or explain errors.
- Agentic Workflows with Tool Calling: Support for up to 200-300 sequential tool calls, including web searching, shell execution, and custom integrations via Kimi Agent SDK (Python, Node.js, Go).
2. Architecture & System Requirements
At the heart of Kimi Code lies the Kimi K2 series of models, which use a Mixture-of-Experts (MoE) architecture. This allows for massive scale with efficient inference.
Architecture Highlights
- Model Backbone: Sparse Mixture-of-Experts (MoE) Transformer. Kimi K2.5 is built via continual pretraining on ~15 trillion mixed visual and text tokens atop Kimi-K2-Base.
- Parameters: ~1 Trillion Total Parameters, with 32 Billion Activated Parameters per token generation.
- Context Window: Supports up to 256,000 tokens, allowing it to digest entire repositories or large documentation files in a single context.
- Multimodal Support: Kimi K2.5 features a native multimodal architecture that accepts visual and text input, including video understanding and processing.
- Reasoning: Supports "Thinking Mode" for complex chain-of-thought reasoning before answering. Modes include K2.5 Instant, Thinking, Agent, and Agent Swarm (beta).
- Protocols: Agent Client Protocol (ACP) for IDE integrations and Model Context Protocol (MCP) for external tools/models.
- SDK: Kimi Agent SDK for embedding agents into custom applications.
- No Official Docker Image: setup is Python-based (an unofficial Dockerfile is provided in Section 5).
System Requirements
- Operating System:
- macOS: Native support.
- Linux: Native support.
- Windows: Supported via WSL 2 (Windows Subsystem for Linux); native Windows CLI support is not yet available.
- Runtime:
- Python: Version 3.10 or higher (3.13 recommended).
- uv: An extremely fast Python package and project manager (required for CLI installation).
- Editor: VS Code (latest version) for extension support.
- Moonshot AI API Key: Required (free tier available). Source builds also need dependencies such as make and git; Nix/Flakes can be used for reproducible environments.
- Optional Hardware for Local Inference: Minimum 128GB system RAM, 32GB GPU VRAM; recommended 256GB RAM, 80GB GPU (e.g., A100/H100) for full deployment.
3. Installation Guide
A. VS Code Extension
The easiest way to get started is with the VS Code extension.
- Install via Marketplace:
- Open VS Code.
- Go to the Extensions view (Ctrl+Shift+X or Cmd+Shift+X).
- Search for "Kimi Code" (Publisher: moonshot-ai).
- Click Install.
- Direct Link: Kimi Code on Marketplace
- Authentication:
- Once installed, click the Kimi icon in the Activity Bar.
- Select "Sign in with Kimi Account" to authorize via browser.
- Alternative: If you have an API key, click "Skip" and configure it in the extension settings; the key can be used directly without the browser sign-in flow.
B. Command Line Interface (CLI)
The Kimi CLI is a powerful terminal agent.
Prerequisites
You must have uv installed. If you don't, install it first (macOS / Linux / Windows WSL):
curl -LsSf https://astral.sh/uv/install.sh | sh

On Windows, run this inside WSL 2 (enable WSL and install a distribution such as Ubuntu first).
Installation
Option 1: Install as a tool (Recommended). This installs kimi-cli in an isolated environment and makes the kimi command available globally:

uv tool install kimi-cli

Option 2: Install via pip (Standard):

pip install kimi-cli

Option 3: Clone the repository: `git clone https://github.com/MoonshotAI/kimi-cli.git && cd kimi-cli`, run `make prepare` to install dependencies, then authenticate with `kimi login` (OAuth or API key); see the consolidated sketch after the Windows note below.
Note for Windows Users: Native Windows support is still in development. Please use WSL 2 (Ubuntu/Debian) to install and run the CLI for the best experience.
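If you choose the repository-clone route (Option 3 above), the commands combine as follows; run them from wherever you keep source checkouts:

```bash
# Clone the CLI source and enter the project directory
git clone https://github.com/MoonshotAI/kimi-cli.git
cd kimi-cli

# Install dependencies
make prepare

# Authenticate via OAuth or an API key
kimi login
```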
SDK Installation
- Python: `pip install kimi-agent-sdk`
- Node.js: `npm install @moonshot-ai/kimi-agent-sdk`
- Go: `go get github.com/MoonshotAI/kimi-agent-sdk/go`
4. Usage & Implementation
Using the CLI (`kimi`)
Kimi CLI has two primary modes: Shell Mode for single commands and Agent Mode for continuous assistance; toggle Shell Mode with Ctrl-X. MCP servers are managed with commands such as `kimi mcp add --transport http context7 https://mcp.context7.com/mcp --header "CONTEXT7_API_KEY: ctx7sk-your-key"`, `kimi mcp list`, and `kimi mcp remove chrome-devtools`. Ad-hoc MCP configuration can be passed with `--mcp-config-file /path/to/mcp.json` (a config sketch follows below).
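For the `--mcp-config-file` flag, a minimal config might look like the sketch below. The field layout assumes the common `mcpServers` schema used by most MCP clients, and the server name, URL, and header value simply mirror the `kimi mcp add` example above; check the Kimi CLI docs for the exact schema.

```jsonc
{
  // Field names assume the common mcpServers schema; verify against the Kimi CLI docs
  "mcpServers": {
    "context7": {
      "transport": "http",
      "url": "https://mcp.context7.com/mcp",
      "headers": {
        "CONTEXT7_API_KEY": "ctx7sk-your-key"
      }
    }
  }
}
```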
- Start the CLI: `kimi`
- Common Commands: `/exit` or `Ctrl+D` exits the session; `/clear` clears the context history.
- Example Workflow: ask "Find all python files larger than 100 lines and list them." Kimi will generate the `find` command, explain it, and ask for permission to run it.

Using VS Code
- Chat Interface:
- Open via sidebar icon.
- Ask questions like "Explain this function" or "Refactor this class".
- Context Management (the `@` symbol):
  - Type `@` to reference files, folders, or code symbols.
  - Example: "How does `@auth.ts` interact with `@user_model.py`?"
- YOLO Mode (Auto-Approve):
- By default, Kimi asks for permission before editing files.
- Enable YOLO Mode in settings (`kimi.yoloMode`) to let it execute commands and edits autonomously. Use with caution, since it removes the confirmation step; a minimal settings sketch follows.
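For reference, turning this on in VS Code's `settings.json` would look roughly like the sketch below; the key name is the one given above, while using `true` as the enable value is an assumption.

```jsonc
{
  // kimi.yoloMode is the setting referenced above; `true` as the enable value is an assumption
  "kimi.yoloMode": true
}
```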
Additional Usage Details
- ACP Startup: `kimi acp` starts the agent for ACP-compatible IDEs such as Cursor, JetBrains, and Zed.
- Custom Tools: add them to the skills directory or register them via the SDK. A minimal Python SDK example:

```python
from kimi_agent_sdk import Agent

agent = Agent()
response = agent.chat("Write a Python script for hello world.")
print(response)
```
- Zsh Plugin Integration: `git clone https://github.com/MoonshotAI/zsh-kimi-cli.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/kimi-cli` and add to `~/.zshrc`: `plugins=(... kimi-cli)`.
---
## 5. Docker Setup (Unofficial)
Since there is no official Docker image yet, you can use this `Dockerfile` to create a clean environment with `kimi-cli` pre-installed.
**File: `Dockerfile.kimi`**
```dockerfile
# Use a lightweight Python base image
FROM python:3.13-slim
# Install system dependencies (curl, git, build essentials)
RUN apt-get update && apt-get install -y curl git build-essential && rm -rf /var/lib/apt/lists/*
# Install uv package manager
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
# Add uv to PATH (recent installers place uv in ~/.local/bin; older ones used ~/.cargo/bin)
ENV PATH="/root/.local/bin:/root/.cargo/bin:$PATH"
# Install kimi-cli using uv
RUN uv tool install kimi-cli
# Ensure kimi is on the path
ENV PATH="/root/.local/bin:$PATH"
# Set working directory
WORKDIR /workspace
# Default entrypoint
CMD ["kimi"]
```

Note: the container still needs your Moonshot AI API key at runtime, for example via an environment variable or by mounting your local Kimi configuration as a volume.

Build and Run:

```bash
# Build the image
docker build -t kimi-code -f Dockerfile.kimi .
# Run the container (mounting current directory)
docker run -it -v $(pwd):/workspace kimi-code
```

6. Testing & Validation
After installation, validate your setup:
- Check Version: run `kimi --version` and expect output like `kimi-cli 1.1.x` or higher with K2.5 integration.
- Test Connectivity: run a simple query to ensure the API is reachable: `kimi "Hello, are you ready to code?"`
- VS Code Check:
- Open a file.
- Select code and press Cmd+K.
- Type "Add comments".
- Verify that Kimi generates a diff and allows you to accept it.
- Local Model Inference Tests: run local inference via vLLM or the Hugging Face weights; for API testing, use an OpenAI-compatible SDK (`pip install openai`) with your Moonshot API key, as sketched after this list.
- Community Benchmarks: available on Hugging Face or Together.ai.
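As a sketch of the API test mentioned above, using the OpenAI Python SDK against Moonshot's OpenAI-compatible endpoint; the base URL and model name are assumptions, so substitute the values shown on platform.moonshot.ai for your account.

```python
# pip install openai
from openai import OpenAI

# Base URL is an assumption for the international platform; check platform.moonshot.ai
client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",
    base_url="https://api.moonshot.ai/v1",
)

# Model name taken from the pricing section; adjust to a model available on your account
response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[{"role": "user", "content": "Hello, are you ready to code?"}],
)
print(response.choices[0].message.content)
```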
Resources & Links
- Official Website: kimi.com/code
- Documentation: kimi.com/code/docs; CLI docs in English: https://moonshotai.github.io/kimi-cli/en/, Chinese: https://moonshotai.github.io/kimi-cli/zh/
- GitHub Repository: MoonshotAI/kimi-cli
- VS Code Extension: Visual Studio Marketplace
- Kimi K2.5 Tech Blog: https://www.kimi.com/blog/kimi-k2-5.html
- API Platform: https://platform.moonshot.ai/
- Hugging Face Model: https://huggingface.co/moonshotai/Kimi-K2.5
- NVIDIA NIM: https://build.nvidia.com/moonshotai/kimi-k2.5
SDK and Custom Integrations
The Kimi Agent SDK is an open-source library designed for embedding Kimi agents into custom applications, enabling developers to integrate AI-powered agents seamlessly. It supports multiple languages including Python, Node.js (TypeScript), and Go, allowing for flexible implementation across different tech stacks. The SDK acts as a thin client that proxies requests to the Kimi CLI runtime, reusing configurations, tools, and sessions for efficiency.
Key Features
- Multi-Language Support: Libraries for Python, Node.js, and Go.
- Session Management: Reuse CLI sessions for persistent contexts.
- Tool Integration: Register custom tools and handle tool calls.
- Autonomous Agents: Support for agentic workflows with up to 200-300 tool calls.
- Extensibility: Easy to add custom skills and MCP configurations.
- Transparency: Fully open-source under Apache 2.0, with clear runtime visibility.
Installation
- Python: `pip install kimi-agent-sdk`
- Node.js: `npm install @moonshot-ai/kimi-agent-sdk`
- Go: `go get github.com/MoonshotAI/kimi-agent-sdk/go`
Usage Examples
Python Example: Basic Chat
```python
from kimi_agent_sdk import Agent

agent = Agent()
response = agent.chat("Generate a Python function to calculate factorial.")
print(response)
```

Node.js Example: Custom Tools
```typescript
import { Agent } from '@moonshot-ai/kimi-agent-sdk';

const agent = new Agent();

// Register a custom tool that the agent can call during a chat
agent.registerTool({
  name: 'get_weather',
  description: 'Get current weather',
  parameters: { type: 'object', properties: { location: { type: 'string' } } },
  async handler({ location }) {
    // Implement a real weather API call here; this stub returns a fixed string
    return `Weather in ${location}: Sunny`;
  }
});

const response = await agent.chat('What is the weather in Jaipur?');
console.log(response);
```

Go Example: Streaming Response
```go
package main

import (
	"fmt"

	"github.com/MoonshotAI/kimi-agent-sdk/go"
)

func main() {
	agent := kimi.NewAgent()
	// Stream the response and print chunks as they arrive
	stream := agent.ChatStream("Explain quantum computing simply.")
	for chunk := range stream {
		fmt.Print(chunk)
	}
}
```

For more advanced usage, refer to the examples in the SDK repository (e.g., examples/python/customized-tools). Ensure the Kimi CLI is running or configured so that the SDK can proxy requests to it.
Pricing and Limitations
Moonshot AI's Kimi API uses a pay-as-you-go model with usage-based pricing. Prices vary by model and context length.
Pricing Details
For kimi-k2.5 (as of January 2026):
- Input: $0.10 per 1M tokens
- Output: $0.60 per 1M tokens
- Cache Hit: $3.00 per 1M tokens (for prompt caching)
- Context: Up to 262,144 tokens
For kimi-k2-thinking:
- Input: $0.60 per 1M tokens
- Output: $2.50 per 1M tokens
File-related APIs (e.g., content extraction) are temporarily free. Billing is based on both input and output tokens.
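As a quick worked example with the kimi-k2.5 rates above: a request that sends 200,000 input tokens and returns 50,000 output tokens costs roughly 0.2 × $0.10 + 0.05 × $0.60 = $0.02 + $0.03 = $0.05, before any prompt-caching effects.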
Rate Limits
Rate limits are tiered based on cumulative recharge amount (RPM = requests per minute, TPM = tokens per minute, TPD = tokens per day):

| Tier | Recharge | Concurrency | RPM | TPM | TPD |
|------|----------|-------------|-----|-----|-----|
| Tier0 | $1 | 1 | 3 | 500,000 | 1,500,000 |
| Tier1 | $10 | 50 | 200 | 2,000,000 | Unlimited |
| Tier2 | $20 | 100 | 500 | 3,000,000 | Unlimited |
| Tier3 | $100 | 200 | 5,000 | 3,000,000 | Unlimited |
| Tier4 | $1,000 | 400 | 5,000 | 4,000,000 | Unlimited |
| Tier5 | $3,000 | 1,000 | 10,000 | 5,000,000 | Unlimited |
Free tier has temporary access with low limits; paid tiers unlock higher concurrency and unlimited daily tokens.
Limitations
- API Access: Free tier may have temporary restrictions on new models like K2.5.
- Local Model Hardware Needs: For Kimi K2.5 local inference:
- Minimum: Total disk + RAM + VRAM โฅ 250GB (for quantized versions like INT4 or 1-bit).
- Recommended: 512GB+ RAM, 80GB+ GPU VRAM (e.g., 16-32 H100 GPUs for full performance).
- Native INT4 quantization reduces VRAM requirements, enabling runs on consumer hardware with sufficient total memory (e.g., 640GB Ampere setups).
- For CPU-only: possible on low-spec machines but slow; use frameworks like llama.cpp with MoE offloading (see the sketch after this list).
- Other: No native Windows CLI; limited internet access in some integrations; high verbosity in responses.
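For the CPU/MoE-offloading route noted above, a rough llama.cpp invocation is sketched below. The GGUF filename and quantization are placeholders, and the tensor-override pattern that keeps MoE expert weights in system RAM follows community recipes rather than official guidance; verify the flags against your llama.cpp build.

```bash
# Placeholder GGUF filename; -ngl pushes dense/attention layers to the GPU,
# while the -ot pattern (a community MoE-offload recipe) keeps expert weights in system RAM.
llama-cli \
  -m Kimi-K2.5-INT4.gguf \
  -ngl 99 \
  -ot ".ffn_.*_exps.=CPU" \
  -c 16384 \
  -p "Write a Python function that reverses a string."
```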
Contributions and Updates
Kimi Code is open-source, welcoming contributions from the community. Follow standard GitHub workflows to contribute.
How to Contribute
- Fork the Repository: Fork https://github.com/MoonshotAI/kimi-cli on GitHub.
- Clone Locally: `git clone https://github.com/your-username/kimi-cli.git`
- Set Up Environment: Run `make prepare` to install dependencies.
- Make Changes: Develop features or fixes in a new branch (e.g., `git checkout -b feature/new-tool`).
- Format and Check Code: `make format` (black/ruff), `make check` (linting), `make test` (unit/e2e tests).
- Commit and Push: Commit with clear messages and push to your fork.
- Create Pull Request: Submit a PR to the main branch, describing changes and linking issues.
- Guidelines: Follow CONTRIBUTING.md (if available); ensure tests pass; adhere to code style.
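Putting the steps above together, a typical contribution session looks roughly like this (the fork URL, branch name, and commit message are placeholders):

```bash
# Clone your fork and set up the development environment
git clone https://github.com/your-username/kimi-cli.git
cd kimi-cli
make prepare

# Develop on a feature branch
git checkout -b feature/new-tool

# Format, lint, and test before pushing
make format
make check
make test

# Commit, push, and open a pull request against the main branch
git commit -am "Add new tool"
git push origin feature/new-tool
```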
Changelog for v1.1
Version 1.1 (released around late 2025/early 2026) includes:
- OAuth Integration: Added OAuth login flow for easier authentication.
- Rebranding: Updated from โkimi-cliโ to align with โKimi Codeโ branding, with UI enhancements.
- K2.5 Support: Incorporated multimodal features and agent swarm beta.
- Bug Fixes: Improved session management, MCP handling, and cross-platform compatibility.
- Other: Enhanced documentation, added Zsh plugin, and optimized for Nix environments.
For full changelog, check RELEASE_NOTES.md or GitHub releases in the repository.