After discussing so many of Claude Code’s strengths, you might think it’s flawless. But the truth is: no system is perfect, only systems that keep improving.

Today, let’s talk about Claude Code’s shortcomings - not to dampen anyone’s enthusiasm, but to understand its boundaries objectively so you can use it better.

Shortcoming 1: Context Window Ceiling

Have you encountered this: the project is too large, Claude Code can’t read all the code, can only read part by part, resulting in limited global perspective?

Is 200K Tokens Really Enough?

Claude Code’s context window is 200K tokens, which sounds like a lot, but it fills up faster than you might expect:

  • System prompts: 15-20K
  • Skill list (100 skills): ~8K
  • One file read (2000 lines): 5-20K
  • Code search results (10 results): 10-30K
  • After several rounds of tool calls: already half used

For large codebases (like Linux kernel, Chromium browser), 200K can’t even fit the “directory tree,” let alone detailed content.
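The arithmetic above can be sketched in a few lines. All numbers are the illustrative ranges quoted in the list, not measurements:

```python
# Rough context-budget arithmetic; the figures are the illustrative
# ranges quoted above, not measured values.
WINDOW = 200_000  # context window size in tokens

usage = {
    "system_prompt": 18_000,       # 15-20K
    "skill_list": 8_000,           # ~8K for 100 skills
    "file_reads": 3 * 12_000,      # three 2000-line files at 5-20K each
    "search_results": 2 * 20_000,  # two searches at 10-30K each
    "tool_call_history": 30_000,   # several rounds of conversation
}

used = sum(usage.values())
remaining = WINDOW - used
print(f"used {used:,} / {WINDOW:,} ({used / WINDOW:.0%}); {remaining:,} tokens left")
```

A modest working set already consumes about two thirds of the window before the task itself has made much progress.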

Limitations of Existing Solutions

Claude Code uses various tricks to mitigate this problem:

  • Compression: turns old conversations into summaries, freeing space
  • Paginated reading: large files are read in chunks instead of all at once
  • Selective restoration: after compression, only restore recently used files
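The compaction trick can be sketched as follows. This is a toy model with hypothetical helper names, not Claude Code’s actual implementation; "tokens" are approximated by word count:

```python
def compact(history, max_tokens=200, keep_recent=2):
    """Toy compaction: once the (word-count) budget is exceeded,
    replace old messages with a one-line summary, keep recent ones."""
    total = sum(len(m.split()) for m in history)  # toy token count
    if total <= max_tokens:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    summary = f"[summary of {len(old)} earlier messages]"
    return [summary] + recent

history = [f"message {i} " + "word " * 30 for i in range(10)]
compacted = compact(history)
print(len(compacted))  # 3: one summary line plus the two recent messages
```

The space is freed, but everything inside the summarized messages is now only as good as the summary.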

But these are all ways of making do within limited space, not of truly expanding capacity. When a project exceeds a certain scale, Claude Code can only glimpse fragments of it, never the whole picture.

This is like: giving you a 200-page capacity folder to organize a library - no matter how skilled the technique, what doesn’t fit simply doesn’t fit.

Shortcoming 2: Tool Latency Accumulation

Have you felt that Claude Code sometimes “thinks” for quite a while? Especially in complex tasks, one round of conversation takes dozens of seconds.

Where Does Latency Come From

Every tool call has latency:

  1. API round-trip: send request → model generates → return result (2-10 seconds)
  2. Tool execution: read file, search code, execute command (0.1-5 seconds)
  3. Multi-round iteration: complex tasks require multiple tool calls (10-50 rounds)

A task like “help me refactor this module” might need:

  • Search related files (3-5 rounds)
  • Read critical code (5-10 rounds)
  • Edit multiple files (5-10 rounds)
  • Run tests to verify (2-5 rounds)

At 2-10 seconds per round, the total comes to several minutes. And that is the smooth case: if errors occur along the way and retries are needed, it takes even longer.
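The same back-of-envelope arithmetic, using the illustrative round counts and per-round latency from the text:

```python
# Estimate wall-clock time for the refactoring task sketched above.
# Values are (min_rounds, max_rounds) per phase, from the text.
rounds = {
    "search_files": (3, 5),
    "read_code": (5, 10),
    "edit_files": (5, 10),
    "run_tests": (2, 5),
}
per_round_s = (2, 10)  # seconds per round, best and worst case

lo = sum(r[0] for r in rounds.values()) * per_round_s[0]
hi = sum(r[1] for r in rounds.values()) * per_round_s[1]
print(f"{lo}s to {hi}s, i.e. roughly {lo / 60:.1f} to {hi / 60:.0f} minutes")
```

Even the optimistic end is half a minute of pure round-trip time, before any retries.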

Limitations of Parallelization

Claude Code supports parallel tool calls (sending multiple tools at once), but this only adds “width,” not “depth.” If steps depend on each other (you must find a file before you can edit it), parallelization doesn’t help.
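The width-versus-depth distinction can be illustrated with asyncio. The tool names are hypothetical stand-ins; each call just sleeps to simulate latency:

```python
import asyncio
import time

async def tool(name, seconds=0.1):
    await asyncio.sleep(seconds)  # stand-in for one tool call's latency
    return name

async def main():
    start = time.monotonic()
    # "Width": three independent searches overlap, total is ~one latency.
    await asyncio.gather(tool("grep"), tool("glob"), tool("ls"))
    width = time.monotonic() - start

    start = time.monotonic()
    # "Depth": each step needs the previous result, so latencies add up.
    found = await tool("find_file")
    content = await tool(f"read:{found}")
    await tool(f"edit:{content}")
    depth = time.monotonic() - start
    return width, depth

width, depth = asyncio.run(main())
print(f"parallel width: {width:.2f}s, dependent chain: {depth:.2f}s")
```

Three independent calls cost about one round-trip; three dependent calls cost about three, no matter how wide the parallelism.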

This is like: you have a super-smart consultant, but each question takes a few seconds to answer, and they can only process one step at a time. A smart brain is dragged down by slow “hands and feet.”

Shortcoming 3: Cost-Quality Tradeoff

Have you thought about how much it costs to write code with Claude Code?

Real Cost of Token Consumption

Claude Code API calls aren’t free:

  • Input tokens: cheaper per token, but they dominate spend because each request carries lots of context
  • Output tokens: pricier per token, but there are far fewer of them
  • Cache hits: roughly 90% cheaper than uncached input
  • Cache misses: full price

One moderately complex task might consume:

  • Input: 500K-2M tokens
  • Output: 50K-200K tokens
  • At current prices: from a few tenths of a yuan to a few yuan

Doesn’t sound like much? But if you use it writing code for 8 hours every day:

  • Daily: dozens to hundreds of tasks
  • Monthly: hundreds to thousands of yuan
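As a sketch of that arithmetic, with hypothetical per-million-token prices (actual Anthropic pricing varies by model and changes over time; the cache discount is the rough figure quoted above):

```python
# Hypothetical prices in USD per million tokens; check current pricing.
PRICE_IN, PRICE_OUT = 3.00, 15.00
CACHE_READ_DISCOUNT = 0.90  # cached input tokens cost ~90% less

def task_cost(input_tokens, output_tokens, cached_fraction=0.0):
    """Estimated cost of one task, given how much input hits the cache."""
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    cost_in = (uncached + cached * (1 - CACHE_READ_DISCOUNT)) / 1e6 * PRICE_IN
    cost_out = output_tokens / 1e6 * PRICE_OUT
    return cost_in + cost_out

one = task_cost(1_000_000, 100_000, cached_fraction=0.8)
print(f"one task: ${one:.2f}; 50 tasks/day, 22 days: ${one * 50 * 22:.0f}/month")
```

Per task the number looks harmless; multiplied out over a working month, it is a real line item.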

Complexity from Optimization

To control costs, Claude Code does lots of optimizations:

  • Prompt caching (cached prefix reads are roughly 90% cheaper; creating a cache entry actually costs a small premium)
  • Smart compression (reduces input tokens)
  • Token budget (limits tool result size)

But these optimizations also increase system complexity. Cache-breakpoint detection, compression-strategy tuning, budget allocation - none of these comes free; each requires engineering investment and ongoing maintenance.
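One concrete example of why caching is a tradeoff rather than a free win: writing a cache entry costs a premium, so it only pays off if the prefix is actually reused. The premium and discount figures below are illustrative assumptions:

```python
def caching_worth_it(prefix_tokens, reuses, write_premium=0.25, read_discount=0.90):
    """Compare total input cost with and without caching a stable prefix.
    Costs are in 'uncached token' units, so the per-token price cancels."""
    without = prefix_tokens * (1 + reuses)                       # full price every time
    with_cache = prefix_tokens * (1 + write_premium)             # one cache write
    with_cache += prefix_tokens * reuses * (1 - read_discount)   # discounted reads
    return with_cache < without

print(caching_worth_it(20_000, reuses=0))  # no reuse: the write premium is wasted
print(caching_worth_it(20_000, reuses=5))  # reuse amortizes the write quickly
```

Deciding when to write a cache breakpoint, and detecting when an edit invalidates it, is exactly the kind of engineering complexity the text describes.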

This is like: driving a high-performance car, but fuel consumption isn’t low. You can use various tricks to save gas, but either sacrifice speed or increase complexity.

Shortcoming 4: Lack of Offline Capability

Have you ever lost your network connection and found Claude Code completely unusable?

Limitations of Complete Cloud Dependency

Claude Code is “cloud-native” - all model inference happens on Anthropic’s servers. This means:

  • No network means no work: can’t use offline locally
  • API failure means stoppage: if Anthropic servers have issues, you suffer too
  • Data must be uploaded: code must be sent to the cloud to process

For certain scenarios, this is a hard limitation:

  • On a plane with no Wi-Fi, you can’t code even if you want to
  • On an isolated corporate intranet, there is no outbound connection
  • With sensitive code, you may not want to upload it to a third party

Gap with Local Models

Someone might say: why not run an open-source model locally?

Theoretically possible, but practically the gap is large:

  • Capability gap: local models’ code ability usually weaker than Claude
  • Tool integration: no tool system as complete as Claude Code
  • Context length: local models usually can’t support 200K context

This is like: you have a super-smart remote assistant, but they can only work remotely. Once the network cuts out, you’re on your own.

Shortcoming 5: Complex Logic Limitations

Have you noticed that Claude Code handles simple tasks smoothly, but “flops” when encountering complex algorithms?

Model Capability Boundaries

Claude Code is powered by large language models, which have inherent limitations:

  • Weak symbolic reasoning: complex math proofs and algorithm derivations are often error-prone
  • Poor long-range dependencies: logical relationships spanning multiple files are easily “forgotten”
  • Boundary-condition blind spots: edge cases and error paths are easily missed

For example:

  • “Implement a red-black tree” - it might get the basic structure right, but the rebalancing operations are often wrong
  • “Optimize this SQL query” - it can offer suggestions, but its analysis of a complex query plan is not necessarily accurate
  • “Refactor this concurrency module” - it might introduce race conditions
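The concurrency pitfall is worth making concrete, because it is intermittent: the toy counter below is correct with a lock and silently loses increments without one, which is precisely the kind of bug that slips past a model and only a test (or a human) catches. This is illustrative code, not from any real refactor:

```python
import threading

def increment(counter, lock=None, times=50_000):
    for _ in range(times):
        if lock:
            with lock:
                counter["n"] += 1
        else:
            counter["n"] += 1  # unsynchronized read-modify-write: a data race

def run(lock=None):
    counter, threads = {"n": 0}, []
    for _ in range(4):
        t = threading.Thread(target=increment, args=(counter, lock))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return counter["n"]

print(run(lock=threading.Lock()))  # always 200000
print(run())                       # may be less: increments can be lost
```

The racy version often passes a quick run, which is why "it looked fine when I tried it" is not verification.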

Necessity of Verification Mechanisms

Because of these limitations, Claude Code needs:

  • Verification Agent: specifically verifies implementation correctness
  • Test requirements: run tests to verify modifications
  • YOLO Classifier: requests confirmation for high-risk operations

These aren’t “icing on the cake,” but “necessary safety nets.”
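The safety-net pattern reduces to a simple loop: apply the model’s edit, then trust the tests rather than the model. The helper names below are hypothetical; a real harness would shell out to an actual test runner:

```python
def edit_then_verify(apply_edit, run_tests, max_retries=3):
    """Apply an AI-proposed edit, then trust the tests, not the model."""
    for attempt in range(max_retries):
        apply_edit(attempt)
        if run_tests():      # e.g. invoke pytest in a real harness
            return True      # tests pass: accept the edit
    return False             # give up: hand back to a human reviewer

# Toy demo: the "edit" only becomes correct on the second attempt.
state = {"fixed_on": None}
def fake_edit(attempt):
    state["fixed_on"] = attempt

ok = edit_then_verify(fake_edit, run_tests=lambda: state["fixed_on"] == 1)
print(ok, state["fixed_on"])  # True 1
```

The loop is trivial; the point is that correctness is decided by an external check, not by the model’s own confidence.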

This is like: you hired a smart but somewhat careless assistant. Handles daily affairs well, but for important documents you must review them yourself.

Shortcoming 6: Memory System Boundaries

Have you noticed that Claude Code’s “memory” is sometimes unreliable? After switching sessions, some details are forgotten.

Limitations of Cross-Session Memory

Claude Code has a cross-session memory system (Memdir, Extract Memories, Auto-Dream), but it has boundaries:

  • Granularity issue: memory is coarse-grained topic files, not fine-grained conversation records
  • Latency issue: Auto-Dream is overnight consolidation, new information doesn’t take effect immediately
  • Accuracy issue: automatic extraction might miss critical information or extract incorrectly
  • Privacy issue: sensitive information might be written to memory files

Gap with “Real Memory”

Human memory is:

  • Immediate effect: something just learned is remembered right away
  • Fine-grained: can recall specific details
  • Rich associations: related content automatically connected

Claude Code’s memory is:

  • Delayed effect: must wait for next session or overnight consolidation
  • Coarse-grained: only summaries, no details
  • Passive retrieval: doesn’t automatically associate
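A toy model of coarse-grained, passively retrieved memory makes the contrast concrete. This is not Claude Code’s actual memory format; it just shows that detail dropped at consolidation time cannot be recalled later:

```python
class TopicMemory:
    """Coarse-grained memory: one summary line per topic.
    Consolidation keeps only the summary; details are lost for good."""

    def __init__(self):
        self.topics = {}

    def consolidate(self, topic, conversation):
        # Keep only the first sentence as the "summary".
        self.topics[topic] = conversation.split(".")[0] + "."

    def recall(self, topic):
        # Passive retrieval: needs the exact topic key, no association.
        return self.topics.get(topic, "(no memory of this topic)")

mem = TopicMemory()
mem.consolidate("auth", "We chose JWT. The secret rotates weekly via a cron job.")
print(mem.recall("auth"))    # the rotation detail is already gone
print(mem.recall("tokens"))  # a related topic is not found: no association
```

Whatever the summarizer drops is unrecoverable in the next session, which is why critical details belong in explicit notes rather than automatic memory.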

This is like: your assistant has a notebook recording important matters, but after shift change, the new assistant can only see summaries from the notebook, not the detailed discussions from before.

How to View These Shortcomings

Shortcomings Are Results of Design Tradeoffs

These limitations aren’t “bugs,” but results of design tradeoffs:

| Limitation | Design Choice | What If Flipped |
|---|---|---|
| Limited context | Controllable cost and latency | Unlimited context = unlimited cost + latency |
| Cloud dependent | Access to the strongest models | Local running = greatly reduced capability |
| Non-zero cost | High-quality service | Free = unsustainable service quality |
| Weak at complex logic | Strong general capability | Specialized symbolic reasoning = weaker natural language |

Behind each “shortcoming” is a strength in another dimension.

Using the Right Scenario Matters Most

Claude Code is suitable for:

  • Medium-sized projects (main code fits in context)
  • Iterative development (can accept multi-round latency)
  • Cost-sensitive but controllable (willing to pay for efficiency)
  • Has network environment (cloud dependency acceptable)
  • Assisted not replaced (humans still review)

Claude Code is not suitable for:

  • Very large projects (need global understanding)
  • Extremely high real-time requirements (latency unacceptable)
  • Extremely cost-sensitive (free is the only option)
  • Completely offline environments (can’t connect to network)
  • Zero-error scenarios (cannot have any errors)

Future Improvement Directions

Short-Term Achievable

  • Larger context windows: as models upgrade, context might expand to 500K or even 1M
  • Faster inference speed: optimize model architecture and inference infrastructure
  • Better local model support: near-cloud-quality models may someday run locally
  • Smarter memory systems: more precise extraction, more timely consolidation

Long-Term Potentially Achievable

  • True persistent state: like humans maintaining complete context across sessions
  • Zero-latency tool calls: local execution + cloud inference hybrid architecture
  • Enhanced symbolic reasoning: combining neural networks and symbolic systems
  • Cost approaching zero: marginal cost reduction from technological progress

Summary

Claude Code’s six major shortcomings:

| Shortcoming | Core Manifestation | Coping Strategy |
|---|---|---|
| Context ceiling | 200K tokens can’t hold large projects | Modular development, batch processing |
| Tool latency accumulation | Complex tasks need many rounds; time adds up | Parallelization, task splitting |
| Cost-quality tradeoff | High quality = high cost | Cache optimization, budget control |
| Lack of offline capability | No network means no work | Plan ahead, keep an offline fallback |
| Complex logic limitations | Algorithms and boundary conditions err easily | Verification mechanisms, human review |
| Memory system boundaries | Cross-session memory is coarse-grained and delayed | Proactive memory management, CLAUDE.md notes |

These shortcomings don’t mean Claude Code is hard to use - it’s still one of the most advanced AI coding assistants today. But understanding boundaries makes using it smoother:

  • Know what it’s good at - daily coding, standard refactoring, code review
  • Know what it’s not good at - complex algorithms, very large projects, zero-error scenarios
  • Know when to intervene - key decisions, boundary conditions, test verification

This is like knowing your car’s top speed, fuel consumption, and off-road capability: the knowledge isn’t a limitation, it’s what lets you drive more safely and smoothly.


This concludes the main text of the Harness Engineering series. Next: Appendix - File Index, Environment Variables Reference, Glossary, and Feature Flag Checklist.