
CoreCoder: Claude Code's Architecture in 950 Lines of Python

Tags: ai, claude-code, ai-agents, corecoder, reverse-engineering, llm, coding-agent, python

Claude Code is roughly 512,000 lines of TypeScript spread across a closed repository. CoreCoder, built by Yufeng He over two days, distils that into approximately 950 lines of Python. The result is not a toy or a proof of concept. It is a working coding agent that runs against any OpenAI-compatible API, spawns sub-agents, executes tools in parallel, and manages its own context window. If you want to understand how modern AI coding agents actually work, reading CoreCoder is faster than reading the documentation.

The Core Loop

Every coding agent boils down to the same cycle: receive a user message, assemble a prompt with system instructions and conversation history, send it to an LLM, then act on the response. If the LLM returns tool calls, execute them, feed the results back, and loop. If the LLM returns plain text, the task is done.

CoreCoder's Agent.chat() method makes this explicit in under 40 lines. The user's message gets appended to the conversation history. The agent then enters a loop bounded by max_rounds (default 50). Each iteration calls the LLM with the full message list and the tool schemas. If the response contains no tool calls, the text goes back to the user. If it does, each tool call is dispatched and the results are appended as tool role messages before looping again.

def chat(self, user_input: str, on_token=None, on_tool=None) -> str:
    self.messages.append({"role": "user", "content": user_input})
    for _ in range(self.max_rounds):
        resp = self.llm.chat(
            messages=self._full_messages(),
            tools=self._tool_schemas(),
            on_token=on_token,
        )
        if not resp.tool_calls:
            self.messages.append(resp.message)
            return resp.content
        # execute tool calls and loop

That is the entire agentic loop. Everything else in Claude Code exists to make this loop work reliably at scale.
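To make the elided tool-execution branch concrete, here is a self-contained sketch of the full cycle with a stubbed LLM. All class and function names below are illustrative, not CoreCoder's actual API; the shape of the tool-result message follows the OpenAI convention the snippet above implies.

```python
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    id: str
    name: str
    arguments: dict

@dataclass
class Response:
    content: str = ""
    tool_calls: list = field(default_factory=list)

    @property
    def message(self):
        return {"role": "assistant", "content": self.content,
                "tool_calls": self.tool_calls}

def run_loop(llm_chat, tools, messages, max_rounds=50):
    """Minimal agentic loop: call the LLM, execute any tool calls,
    append the results as tool-role messages, and repeat until the
    model answers with plain text."""
    for _ in range(max_rounds):
        resp = llm_chat(messages)
        if not resp.tool_calls:
            messages.append(resp.message)
            return resp.content
        messages.append(resp.message)  # record the assistant turn
        for call in resp.tool_calls:
            result = tools[call.name](**call.arguments)
            messages.append({"role": "tool",
                             "tool_call_id": call.id,
                             "content": str(result)})
    raise RuntimeError("max_rounds exhausted without a final answer")
```

Wiring in a fake LLM that first requests a tool call and then answers in plain text exercises both branches of the loop.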

Key Components

Context Manager

Context windows fill up. CoreCoder handles this with a three-layer compression system, a condensed version of Claude Code's own four-layer strategy. Layer one, triggered at 50 percent capacity, truncates verbose tool outputs by keeping the first three and last three lines of any result over 1,500 characters. Layer two, at 70 percent, sends old conversation turns to the LLM for summarisation, preserving file paths, decisions, and errors while discarding redundant output. Layer three, at 90 percent, is a hard collapse that keeps only the last four messages plus a generated summary. The whole thing is 196 lines in context.py.
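A minimal sketch of the layer-one rule described above. The 1,500-character and three-line thresholds come from the article; the function name and the omission marker are my own.

```python
def truncate_tool_output(text: str, limit: int = 1500, keep: int = 3) -> str:
    """Layer-one compression sketch: for any tool result over `limit`
    characters, keep only the first and last `keep` lines."""
    if len(text) <= limit:
        return text
    lines = text.splitlines()
    if len(lines) <= keep * 2:
        return text[:limit]  # long single-line output: hard cut
    omitted = len(lines) - keep * 2
    marker = f"... [{omitted} lines truncated] ..."
    return "\n".join(lines[:keep] + [marker] + lines[-keep:])
```

Because the head and tail usually carry the command echo and the final status or error, this keeps the parts the model most needs while discarding the bulk.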

Tool System

CoreCoder ships seven tools: bash execution, search-and-replace file editing, file reading, file writing, glob-based file search, content search via grep, and sub-agent spawning. Each tool inherits from a base Tool class that provides a name, description, JSON Schema parameters, and an execute() method. Adding a new tool requires about 20 lines of code.
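The pattern described can be sketched as a base class plus one concrete tool. The class names and the schema shape below are assumptions modelled on the OpenAI function-calling format, not CoreCoder's exact code.

```python
import glob as globlib

class Tool:
    """Hypothetical base class matching the pattern described: a name,
    a description, JSON Schema parameters, and an execute() method."""
    name: str = ""
    description: str = ""
    parameters: dict = {}

    def schema(self) -> dict:
        # OpenAI-style function schema handed to the LLM
        return {"type": "function",
                "function": {"name": self.name,
                             "description": self.description,
                             "parameters": self.parameters}}

    def execute(self, **kwargs) -> str:
        raise NotImplementedError

class GlobTool(Tool):
    """Glob-based file search in roughly the 20-line budget quoted."""
    name = "glob"
    description = "Find files matching a glob pattern."
    parameters = {"type": "object",
                  "properties": {"pattern": {"type": "string"}},
                  "required": ["pattern"]}

    def execute(self, pattern: str) -> str:
        return "\n".join(sorted(globlib.glob(pattern, recursive=True)))
```

The agent only ever touches the `schema()` and `execute()` surface, which is why adding a tool stays cheap.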

The file editing tool deserves particular attention. Rather than sending whole-file rewrites or line-number patches, it uses an exact substring match approach: the LLM specifies the text to find and the text to replace it with. The substring must appear exactly once in the file, which eliminates ambiguity. A unified diff is generated and returned so the user can see exactly what changed. This is the same pattern Claude Code's FileEditTool uses, distilled from hundreds of lines to 85.
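A sketch of that exact-substring edit pattern using the standard library's difflib; the function name and error messages are illustrative.

```python
import difflib

def edit_text(original: str, find: str, replace: str) -> tuple[str, str]:
    """Apply an exact-substring edit. `find` must occur exactly once;
    returns the updated text and a unified diff of the change."""
    count = original.count(find)
    if count == 0:
        raise ValueError("text to replace not found")
    if count > 1:
        raise ValueError(f"text occurs {count} times; must be unique")
    updated = original.replace(find, replace, 1)
    diff = "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile="before", tofile="after"))
    return updated, diff
```

Rejecting non-unique matches is the key design choice: the LLM must quote enough surrounding context to pin down a single location, which makes edits unambiguous without line numbers.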

Permission System

The bash tool implements a safety layer via regex-based dangerous command detection. Nine patterns catch common destructive operations: recursive deletes on root or home directories, mkfs, raw disk writes, chmod 777 on root, fork bombs, and piping curl or wget directly into bash. Matched commands are blocked with an explanation, not silently rejected. It is not a full permission UI like Claude Code's interactive approval system, but it covers the most catastrophic cases in a handful of regex patterns.
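Illustrative patterns in the spirit of the checks described; CoreCoder's actual nine regexes may differ.

```python
import re

# Hypothetical dangerous-command patterns, each paired with the
# explanation returned when a command is blocked.
DANGEROUS = [
    (re.compile(r"rm\s+(-[a-z]*r[a-z]*f|-[a-z]*f[a-z]*r)\s+(/|~)(\s|$)"),
     "recursive delete of root or home"),
    (re.compile(r"\bmkfs\b"), "filesystem format"),
    (re.compile(r">\s*/dev/sd[a-z]\b"), "raw disk write"),
    (re.compile(r"chmod\s+(-R\s+)?777\s+/(\s|$)"), "chmod 777 on root"),
    (re.compile(r":\(\)\s*\{\s*:\|:&\s*\}\s*;"), "fork bomb"),
    (re.compile(r"\b(curl|wget)\b.*\|\s*(ba)?sh\b"),
     "piping a download straight into a shell"),
]

def check_command(cmd: str):
    """Return a block reason if the command matches a dangerous
    pattern, else None (meaning the command may run)."""
    for pattern, reason in DANGEROUS:
        if pattern.search(cmd):
            return f"blocked: {reason}"
    return None
```

Returning the reason rather than a bare refusal matters: the explanation goes back to the model, which can then choose a safer alternative instead of retrying blindly.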

Streaming Executor

The LLM.chat() method streams responses token by token, accumulating text content and tool call deltas separately. Tool call arguments arrive incrementally across multiple SSE chunks, so CoreCoder maintains a map indexed by tool call position, concatenating argument fragments as they arrive. Once the stream completes, accumulated arguments are parsed from JSON. Retry logic handles rate limits, timeouts, and server errors with exponential backoff, retrying 5xx errors but failing immediately on 4xx client errors.
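The delta-accumulation idea can be sketched as follows. The chunk dictionaries stand in for parsed SSE deltas, and their exact shape is an assumption based on the OpenAI streaming format.

```python
import json

def accumulate_tool_calls(deltas):
    """Concatenate streamed tool-call fragments, indexed by tool-call
    position, then parse the assembled JSON arguments at stream end."""
    pending = {}  # index -> {"name": str, "arguments": str}
    for delta in deltas:
        slot = pending.setdefault(delta["index"],
                                  {"name": "", "arguments": ""})
        if delta.get("name"):
            slot["name"] = delta["name"]
        slot["arguments"] += delta.get("arguments", "")
    return [{"name": slot["name"],
             "arguments": json.loads(slot["arguments"] or "{}")}
            for _, slot in sorted(pending.items())]
```

Note that fragments from different tool calls can interleave, which is why the accumulator keys on position rather than assuming calls arrive one at a time.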

What Claude Code Adds: The 99.8%

Claude Code's remaining half million lines cover everything that makes it a production-grade product rather than a proof of concept. The UI layer is a full React/Tauri desktop application with permission prompts, file diffs, and streaming output. Conversation management persists sessions across restarts. Git integration tracks branches, diffs, and commit history. Context window optimisation includes prompt caching, token-aware message selection, and background compaction. MCP (Model Context Protocol) support lets users extend the tool system with external servers. Parallel tool execution in Claude Code's StreamingToolExecutor (530 lines) begins while the model is still generating, not after the full response arrives. Claude Code also includes telemetry, an update system, retry logic with sophisticated backoff, and 44 feature flags discovered in the source analysis.

CoreCoder omits all of this. What remains is the structural skeleton.

The Multi-Agent Architecture

CoreCoder's AgentTool allows the main agent to spawn sub-agents for complex sub-tasks. A sub-agent gets its own conversation history, access to the same tools (minus the ability to spawn further agents, preventing recursive delegation), and an independent context window. It runs to completion and returns a text summary to the parent. Results longer than 5,000 characters are truncated before being returned, preventing the sub-agent from blowing up the parent's context.
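A sketch of the two guardrails described: removing the spawn tool from the sub-agent's toolset and truncating oversized results before they reach the parent. The names here are hypothetical, and `run_task` stands in for constructing and running a fresh agent.

```python
def subagent_tools(parent_tools: dict) -> dict:
    """Same tools as the parent, minus the spawner, which prevents
    recursive delegation."""
    return {name: tool for name, tool in parent_tools.items()
            if name != "agent"}

def spawn_subagent(run_task, prompt: str, max_result: int = 5000) -> str:
    """Run a task in an isolated sub-agent and return only its final
    text, capped so it cannot blow up the parent's context."""
    result = run_task(prompt)
    if len(result) > max_result:
        result = result[:max_result] + "\n... [result truncated]"
    return result
```

Everything the sub-agent did internally, including its full conversation history, is discarded; the parent sees at most 5,000 characters of summary.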

This mirrors how engineering teams delegate work. A senior engineer hands off a research task to a colleague who works in isolation, reports back with findings, and disappears. The parent agent never sees the sub-agent's intermediate reasoning, only the final output. Claude Code's AgentTool spans 1,397 lines and includes shared context passing, timeout handling, and permission inheritance. CoreCoder captures the same architectural idea in 58 lines.

What This Teaches Us

Yufeng He describes CoreCoder as the nanoGPT of coding agents. The comparison is apt. Karpathy's nanoGPT revealed that the core ideas behind large language model training could fit in a few hundred lines of readable Python. CoreCoder does the same for AI coding agents.

The essential complexity is small. Any competent LLM with tool use can function as a coding agent. You need a loop, a way to call tools, and a way to manage context. The accidental complexity is enormous: permission UIs, streaming parsers, session persistence, multi-provider support, error recovery, parallel execution, and all the infrastructure that turns a working prototype into a product users trust with their codebase.

CoreCoder strips away the accidental complexity and leaves the load-bearing walls visible. For anyone building, extending, or simply understanding coding agents, that visibility is invaluable.

Hidden Features Worth Studying

Yufeng He followed the CoreCoder release with a seven-part article series that dissects Claude Code's architecture in detail. The series covers the agent loop, the tool system, context compression strategies, the streaming executor, multi-agent architecture, and the 44 hidden feature flags discovered during source analysis. The original Chinese-language analysis has accumulated over 170,000 reads and 6,000 bookmarks on Zhihu. The article directory in the repository links to all seven parts.

Conclusion

The best way to learn how something works is to study its simplest correct implementation. CoreCoder takes half a million lines of production code and reduces it to something readable in a single afternoon. It proves that the architecture of modern AI coding agents is not inherently complex. The complexity comes from making them robust, safe, and pleasant to use. Understanding the difference between those two kinds of complexity is the first step toward building something better.

The source code is available at github.com/he-yufeng/CoreCoder under the MIT licence.