What the Claude Code Source Leak Reveals About AI Coding Tool Architecture

Claude Code's source leaked via npm, revealing fake tools, frustration detection, undercover mode, and an unreleased autonomous agent called KAIROS. Here's what API developers need to know.

Ashley Innocent

1 April 2026

TL;DR

Anthropic accidentally shipped a .map file with the Claude Code npm package, exposing the complete readable source code of their CLI tool. The leak reveals anti-distillation mechanisms with fake tool injection, a frustration-detection regex engine, an “undercover mode” that hides AI authorship in open-source commits, and an unreleased autonomous agent mode called KAIROS. Here’s what API developers should know about how AI coding tools work under the hood.

Introduction

On March 31, 2026, security researcher Chaofan Shou discovered that Anthropic shipped a source map file (.map) alongside the Claude Code npm package. Source maps are debug files that map minified production code back to human-readable source. They’re supposed to be stripped before publishing.

They weren’t. The complete Claude Code source code, with comments, internal codenames, and architectural details, was readable by anyone who downloaded the package.

The discovery hit #1 on Hacker News (1,888 points, 926 comments) and spread across Reddit, Twitter, and developer forums within hours. Anthropic removed the package, but the code had already been mirrored and analyzed extensively.

💡
Whether you use Claude Code, Cursor, GitHub Copilot, or Apidog’s API development platform, this leak provides rare technical insight into how AI coding tools work. Understanding these internals helps you make informed decisions about which tools to trust with your codebase. Try Apidog free for transparent, dependency-free API development.

This article analyzes the key technical findings and what they mean for developers who rely on AI coding tools.

How the source code leaked

The root cause: a Bun build tool bug

Claude Code is built on Bun, an alternative JavaScript runtime. On March 11, 2026, a bug was filed against Bun (oven-sh/bun#28001) reporting that source maps are served in production mode despite Bun’s documentation specifying they should be disabled.

Anthropic’s build pipeline triggered this bug. When they published the Claude Code npm package, the .map file was included in the distribution. Anyone running npm pack @anthropic-ai/claude-code or inspecting the package contents could access the complete, un-minified source.

The irony is notable: a bug in Anthropic’s own toolchain, the Bun runtime they chose for Claude Code, leaked their proprietary source code through the npm registry they publish to. It is the same npm registry that, on the same day, distributed the compromised Axios package.

What was exposed

The leak included the full, un-minified TypeScript source of the CLI: implementation files with inline comments, internal codenames, and feature-flag scaffolding intact.

This isn’t a partial leak or a sanitized open-source release. It’s the production codebase with internal engineering context intact.

Anti-distillation: protecting against model theft

Fake tool injection

One of the most discussed findings is Claude Code’s anti-distillation system. In claude.ts (lines 301-313), when the ANTI_DISTILLATION_CC flag is enabled, the system sends anti_distillation: ['fake_tools'] in API requests.

This instructs Anthropic’s server to inject decoy tool definitions into the system prompt. The purpose: if a competitor records API traffic to extract and replicate Claude’s tool-use behavior, the training data contains fake tools that don’t exist. A model trained on this poisoned data would hallucinate non-existent capabilities.

This is a practical defense against a real threat. Competitors can set up proxy services that intercept Claude API calls, record the system prompts and tool definitions, and use that data to fine-tune their own models. The fake tools make this captured data unreliable.
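Based on the description above, the client-side half of this mechanism can be sketched in a few lines. The `anti_distillation: ['fake_tools']` field and the `ANTI_DISTILLATION_CC` flag come from the leaked code; every type and helper name here is an illustrative assumption, not the actual implementation:

```typescript
// Hypothetical sketch of the anti-distillation request flag.
interface ToolDefinition {
  name: string;
  description: string;
}

interface ApiRequest {
  model: string;
  tools: ToolDefinition[];
  anti_distillation?: string[];
}

const ANTI_DISTILLATION_CC = true; // feature flag named in the leaked claude.ts

function buildRequest(realTools: ToolDefinition[]): ApiRequest {
  const req: ApiRequest = { model: "claude-code", tools: realTools };
  if (ANTI_DISTILLATION_CC) {
    // Ask the server to mix decoy tool definitions into the system prompt.
    req.anti_distillation = ["fake_tools"];
  }
  return req;
}

// Server side, conceptually: splice decoys among the real tools so that
// recorded traffic poisons any distillation dataset built from it.
function injectDecoys(
  real: ToolDefinition[],
  decoys: ToolDefinition[]
): ToolDefinition[] {
  return [...real, ...decoys].sort(() => Math.random() - 0.5);
}
```

The key design point is that the client only requests the behavior; the decoys themselves live server-side, so nothing in the shipped binary reveals which tools are fake.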

Connector-text summarization

A second anti-distillation mechanism in betas.ts (lines 279-298) takes a different approach. This server-side system buffers assistant text between tool calls, summarizes it, and returns the summary with a cryptographic signature.

In subsequent conversation turns, the original text can be restored from the signature. But anyone recording API traffic only captures the summaries, not the full reasoning text. This makes it harder to reverse-engineer Claude’s reasoning patterns from intercepted API conversations.
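A toy version of this scheme, with simple truncation standing in for the real summarizer and all names assumed, looks like the following. Only the mechanism, a summary paired with a server-verifiable signature that can restore the original text, is taken from the leak:

```typescript
import { createHmac } from "node:crypto";

// Server-side sketch: replace full assistant text with a summary plus a
// signed token, and restore the original on a later turn.
const SERVER_KEY = "server-only-secret"; // illustrative; never leaves the server
const vault = new Map<string, string>(); // signature -> full original text

function summarizeAndSign(fullText: string): { summary: string; signature: string } {
  const summary = fullText.slice(0, 40) + "..."; // stand-in for a real summarizer
  const signature = createHmac("sha256", SERVER_KEY).update(fullText).digest("hex");
  vault.set(signature, fullText);
  return { summary, signature };
}

function restore(signature: string): string | undefined {
  // Only the server, which holds the key and the vault, can resolve a
  // signature back to the original reasoning text.
  return vault.get(signature);
}
```

Anyone sniffing the wire sees only `summary` and an opaque `signature`; the full reasoning text never round-trips through the client in recoverable form.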

How easy are these to bypass?

The community analysis identified several plausible bypass paths for both mechanisms.

This doesn’t mean the protections are useless. They raise the cost and complexity of automated model distillation. But they’re defense-in-depth measures, not unbreakable shields.

Undercover mode: hiding AI authorship

What undercover mode does

The undercover.ts file contains one of the leak’s most controversial findings. When Claude Code operates in non-Anthropic repositories, it activates a behavior-masking system that stops the tool from revealing internal Anthropic project codenames or identifying itself as the author of its output.

The source code comment is explicit: “There is NO force-OFF. This guards against model codename leaks.”

Why this matters for open source

The practical effect: when Anthropic employees use Claude Code to write commits, pull requests, or code reviews in open-source projects, the tool hides evidence of AI involvement. AI-authored contributions to open-source repositories would lack disclosure of AI authorship.

This raises questions about transparency in open-source development. Several open-source projects have adopted policies requiring disclosure of AI-generated code. If a tool is designed to hide its involvement, those policies become harder to enforce.

The counterargument: undercover mode’s stated purpose is preventing leaks of internal project codenames, not hiding AI usage. But the implementation doesn’t distinguish between “don’t reveal internal names” and “don’t reveal you’re an AI tool.” It blocks both.
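A minimal sketch of such an output filter makes the “blocks both” problem concrete. The patterns below are entirely illustrative (the leaked rules are not public); note how a list built to hide codenames naturally ends up stripping attribution too:

```typescript
// Illustrative output filter in the spirit of undercover mode: strip
// AI-attribution trailers and internal codenames from generated commit
// messages. These patterns are assumptions, not the leaked implementation.
const BLOCKED_PATTERNS: RegExp[] = [
  /^Co-Authored-By: Claude.*$/gim, // attribution trailer
  /Generated with Claude Code/gi,  // authorship banner
  /\bKAIROS\b/g,                   // internal codename
];

function maskCommitMessage(message: string): string {
  let out = message;
  for (const pattern of BLOCKED_PATTERNS) {
    out = out.replace(pattern, "");
  }
  // Collapse blank lines left behind by removed trailers.
  return out.replace(/\n{3,}/g, "\n\n").trim();
}
```

A single pattern list cannot distinguish intent: removing the codename line and removing the attribution line are the same operation.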

Frustration detection via regex

How it works

The userPromptKeywords.ts file implements user frustration detection through regex pattern matching. The system scans user inputs for profanity and emotionally charged language to gauge whether the user is frustrated with Claude Code’s responses.

The community response to this finding was split. Some saw it as reasonable UX research; understanding when users are frustrated helps improve the product. Others viewed it as surveillance of user emotional states.

The technical irony

Several HN commenters pointed out the irony: Anthropic builds some of the most advanced language models in the world, yet uses regex to detect user emotions. The engineering comment in the source explains the rationale: regex-based detection is faster and cheaper than LLM inference for this use case, since running an LLM call to classify sentiment on every user input would add latency and cost to every interaction.

It’s a pragmatic engineering decision. Fast regex for hot-path sentiment detection, saving LLM calls for the core coding tasks. Whether you’re comfortable with your AI coding tool running emotional analysis on your inputs is a personal decision.
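A minimal sketch of this hot-path approach, using made-up patterns rather than the leaked word list, shows why it costs essentially nothing per keystroke:

```typescript
// Illustrative regex-based frustration detection: scan user input for
// profanity and emotionally charged phrasing. Patterns are invented
// stand-ins, not the contents of the leaked userPromptKeywords.ts.
const FRUSTRATION_PATTERNS: RegExp[] = [
  /\b(wtf|ffs|damn|dammit)\b/i,
  /\bthis (is|was) (broken|useless|wrong again)\b/i,
  /!{3,}/, // "fix it!!!"
];

function looksFrustrated(input: string): boolean {
  // A handful of regex tests per prompt: microseconds, versus a full
  // LLM round trip for sentiment classification.
  return FRUSTRATION_PATTERNS.some((p) => p.test(input));
}
```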

Native client attestation

Cryptographic request verification

In system.ts (lines 59-95), Claude Code’s API requests include a cch=554eb placeholder. Bun’s native HTTP stack (written in Zig) overwrites this placeholder with a computed hash before the request leaves the client.

Anthropic’s servers validate this hash to cryptographically verify that requests originated from the legitimate Claude Code binary, not a fork, wrapper, or proxy.
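A toy model of the placeholder-overwrite scheme helps make the flow concrete. It assumes a secret baked into the legitimate binary and a hash computed over the request body; both are assumptions, since the leak does not specify the hash inputs:

```typescript
import { createHash } from "node:crypto";

// Sketch of placeholder-overwrite attestation: the request ships with
// `cch=554eb`, and the native layer swaps it for a computed hash before
// the request leaves the client. Hash format here is illustrative.
const PLACEHOLDER = "cch=554eb";

function attest(rawBody: string, clientSecret: string): string {
  // Hash the canonical body (placeholder still in place) plus a secret
  // only the legitimate binary possesses.
  const digest = createHash("sha256")
    .update(clientSecret + rawBody)
    .digest("hex")
    .slice(0, 16);
  return rawBody.replace(PLACEHOLDER, `cch=${digest}`);
}

function serverValidates(sentBody: string, clientSecret: string): boolean {
  // The server restores the canonical form, recomputes, and compares.
  const received = /cch=([0-9a-f]{16})/.exec(sentBody)?.[1];
  const canonical = sentBody.replace(/cch=[0-9a-f]{16}/, PLACEHOLDER);
  const expected = createHash("sha256")
    .update(clientSecret + canonical)
    .digest("hex")
    .slice(0, 16);
  return received === expected;
}
```

A fork or proxy that doesn’t know the secret (or doesn’t replicate the native overwrite) produces requests the server can reject.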

Why this exists

This attestation system is the technical enforcement mechanism behind Anthropic’s legal actions against unauthorized Claude Code forks. If a fork can’t produce valid attestation hashes, Anthropic’s servers can reject its requests.

The implementation has boundaries, though. It’s gated behind compile-time feature flags and can be disabled via the CLAUDE_CODE_ATTRIBUTION_HEADER setting or GrowthBook killswitches. This suggests the enforcement is graduated, with Anthropic able to tighten or loosen restrictions as needed.

For API developers, this is relevant because it demonstrates how SaaS tools can enforce client authenticity at the protocol level. Similar patterns exist in mobile API development, where app attestation prevents unauthorized API access. If you’re designing APIs with client verification, Apidog’s testing tools can help you validate attestation flows and certificate pinning across different client configurations.

KAIROS: the unreleased autonomous agent mode

What the code reveals

References throughout the codebase point to an unreleased, feature-gated mode called KAIROS. The discovered scaffolding includes background daemon workers, GitHub webhook subscriptions, and a /dream skill for memory distillation.

What this means

KAIROS appears to be an always-on, background-running agent that monitors your repositories and performs autonomous tasks without direct user interaction. Think of it as Claude Code running continuously, watching for changes, and proactively suggesting or making code modifications.

This aligns with the broader industry trend toward autonomous coding agents. GitHub Copilot’s Agent Mode, Cursor’s background processing, and Google’s Agent Smith all point toward AI coding tools that don’t wait for you to ask. They watch, learn, and act on their own.

For API development teams, autonomous agents that modify code repositories raise questions about API contract stability. If an agent updates your API endpoint code, does it also update the OpenAPI spec? The tests? The documentation? These are the workflow problems integrated platforms like Apidog are built to solve, keeping API design, tests, mocks, and docs in sync regardless of what triggers a code change.

Performance optimizations exposed

Terminal rendering: game-engine techniques

The ink/screen.ts and ink/optimizer.ts files reveal that Claude Code applies game-engine-style optimization techniques to its terminal rendering layer.

This explains why Claude Code feels responsive even during long output streams. The rendering layer is optimized at a level unusual for CLI tools.

Prompt cache economics

promptCacheBreakDetection.ts tracks 14 distinct cache-break vectors with “sticky latches” that prevent mode toggles from invalidating cached prompts. This reflects how important prompt caching is economically for Claude Code’s business model.

Each cache break forces Anthropic to reprocess the entire system prompt and conversation context. At Claude’s token pricing, preventing unnecessary cache invalidation saves significant infrastructure costs. The fact that they track 14 separate cache-break vectors suggests the engineering team treats prompt cache optimization as a first-class performance concern.
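The “sticky latch” idea can be illustrated in a few lines. The class and method names below are invented; only the behavior, a mode that latches on so toggling it cannot churn the cached prompt, comes from the description of the leaked code:

```typescript
// Illustrative sticky latch: once a mode has been seen in a session, it
// stays latched on, so toggling it off and on again does not change the
// system prompt and invalidate the server-side prompt cache.
class StickyLatch {
  private latched = new Set<string>();

  observe(mode: string, enabled: boolean): void {
    if (enabled) this.latched.add(mode); // latches on, never off
  }

  // The prompt is derived from the latched set, so it stays stable
  // (and cacheable) across mode toggles within a session.
  promptKey(): string {
    return [...this.latched].sort().join("|");
  }
}
```

The trade-off is carrying a slightly larger prompt for the rest of the session in exchange for never paying a full cache rebuild on a toggle.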

The autocompact failure cascade

A comment in autoCompact.ts (lines 68-70) revealed a significant production issue: “1,279 sessions had 50+ consecutive failures (up to 3,272) in a single session, wasting ~250K API calls/day globally.”

The three-line fix set MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3. This bug only surfaces at scale. When context management fails, the system retries aggressively, burning through API calls without making progress. For a tool with millions of active sessions, 250K wasted API calls per day translates to substantial cost.
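The fix amounts to a retry cap. The constant name is from the leak; the loop around it is a sketch of how such a cap would bound the runaway retries:

```typescript
// Illustrative retry cap: stop autocompact after a fixed number of
// consecutive failures instead of retrying unboundedly. Only the
// constant name comes from the leaked autoCompact.ts.
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

function runAutoCompact(
  tryCompact: () => boolean
): { succeeded: boolean; attempts: number } {
  let failures = 0;
  while (failures < MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES) {
    if (tryCompact()) return { succeeded: true, attempts: failures + 1 };
    failures++;
  }
  // Give up rather than burn API calls on a session that cannot compact.
  return { succeeded: false, attempts: failures };
}
```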

This context helps explain the recent Hacker News post about Claude Code users “hitting usage limits way faster than expected” (275 points). Some of that limit consumption may trace back to internal efficiency bugs like this one.

Security hardening details

Bash security: 23 numbered checks

bashSecurity.ts implements 23 numbered security checks for shell command execution, defending against a range of command-injection and shell-escape vectors.

This is unusually thorough for a CLI tool. Most AI coding tools that execute shell commands have basic sanitization. Claude Code’s 23 checks suggest they’ve dealt with (or proactively defended against) creative attack vectors.
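For flavor, two checks in this spirit might look like the following. These are illustrative patterns, not any part of the leaked list of 23:

```typescript
// Illustrative command-safety checks in the spirit of numbered shell
// security checks. Both patterns are assumptions for demonstration.
interface CheckResult {
  check: number;
  reason: string;
}

function checkShellCommand(cmd: string): CheckResult | null {
  // Check 1: block command substitution that could smuggle a second command.
  if (/\$\(|`/.test(cmd)) {
    return { check: 1, reason: "command substitution" };
  }
  // Check 2: block redirection into sensitive dotfiles.
  if (/>\s*~?\/?\.(bashrc|profile|ssh\/)/.test(cmd)) {
    return { check: 2, reason: "write to sensitive dotfile" };
  }
  return null; // passed all checks
}
```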

For API developers who use AI tools to generate and execute API testing scripts, this level of shell security is relevant. If your AI coding tool runs curl commands, database queries, or infrastructure scripts, the security of the command execution layer matters.

What API developers should take away from this

1. Understand what your AI coding tools do behind the scenes

The Claude Code leak reveals capabilities that most users didn’t know existed: anti-distillation measures, frustration detection, undercover mode, client attestation. Other AI coding tools have their own internal mechanisms that users can’t inspect.

Ask yourself: do you know what data your AI coding tool collects? What it sends to external servers? Whether it masks its own involvement in your code?

2. The build toolchain is an attack surface

Anthropic’s source leaked because of a Bun bug. On the same day, Axios was compromised through npm account hijacking. Your build tools, package managers, and runtime environments are all potential failure points.

For API development, this means treating build tools, package managers, and publishing pipelines as part of your security surface, and inspecting what actually ships in your published packages.

3. AI coding tools are converging on autonomous operation

KAIROS, GitHub Copilot’s Agent Mode, Google’s Agent Smith. The direction is clear: AI tools that run continuously, watch repositories, and act autonomously.

API teams need to prepare for this by ensuring their API lifecycle is managed in a single platform. When an autonomous agent modifies your API implementation, your tests, mocks, documentation, and specs need to stay synchronized. Disconnected tools create drift. Integrated platforms like Apidog keep the entire API lifecycle in sync, whether changes come from human developers or AI agents.

4. Source code transparency matters

This leak happened because the code was proprietary and accidentally exposed. Open-source AI tools don’t have this risk because their code is already public.

When evaluating AI coding tools, consider whether you prefer tools whose internals you can inspect versus tools that rely on trust in the vendor. Both approaches have trade-offs, but the Claude Code leak demonstrates what “trust the vendor” looks like when the vendor’s code reveals unexpected behaviors.

FAQ

Is Claude Code safe to use after the source leak?

Yes. The leak exposed source code, not user data. Anthropic removed the .map file and the source is no longer distributed with the npm package. The features revealed (anti-distillation, frustration detection, undercover mode) are architectural decisions, not security vulnerabilities. Whether you’re comfortable with those decisions is a separate question from safety.

What is the “undercover mode” in Claude Code?

Undercover mode prevents Claude Code from revealing internal Anthropic project names, codenames, and its own identity when operating in non-Anthropic repositories. It activates automatically and cannot be disabled. The practical effect is that AI-generated code in open-source projects won’t identify itself as written by Claude Code.

What are the fake tools in Claude Code?

When anti-distillation is enabled, Anthropic’s server injects decoy tool definitions into the system prompt. These fake tools don’t do anything. They exist to poison the training data of competitors who record API traffic to train competing models. If someone tries to replicate Claude’s behavior from intercepted data, their model will hallucinate non-existent capabilities.

What is KAIROS in Claude Code?

KAIROS is an unreleased, feature-flagged autonomous agent mode found in the Claude Code source. It includes scaffolding for background daemon workers, GitHub webhook subscriptions, and a /dream skill for memory distillation. It suggests Anthropic is building an always-on coding agent that monitors repositories and acts autonomously.

How did the Claude Code source code leak?

A Bun runtime bug (oven-sh/bun#28001) causes source maps to be included in production builds even when they shouldn’t be. Since Claude Code uses Bun as its build tool, this bug shipped the .map file with the npm package. Anyone inspecting the package could read the complete, un-minified source code.

Does this leak affect Claude API users?

No. The leak exposed the Claude Code CLI tool’s source code, not the Claude API itself. API keys, user data, and model weights were not involved. Claude API users can continue using the API normally. The revealed anti-distillation mechanisms are specific to Claude Code’s request pipeline.

Should I worry about frustration detection in my AI coding tools?

That depends on your comfort level. Claude Code uses regex patterns to detect user frustration (profanity, emotional language) in prompts. This is faster and cheaper than LLM-based sentiment analysis. The data appears to be used for product improvement, not shared externally. Other AI tools may have similar features without disclosing them.

How does this relate to the Axios npm attack on the same day?

Both events occurred on March 31, 2026, but they’re unrelated. The Axios attack was a deliberate supply chain compromise by state-sponsored hackers. The Claude Code leak was an accidental build configuration error. Together, they’ve intensified scrutiny of npm package security and the trust developers place in tools distributed through package registries.

Key takeaways

Understanding how your AI coding tools work under the hood helps you make better decisions about trust, privacy, and workflow design. For API teams, the key lesson is that your development tools are part of your security surface. Choose tools you can verify, and build workflows that stay consistent regardless of whether a human or an AI agent makes the next change.

