Tool Execution Orchestration: The Art of Parallelism, Streaming, and Interruption

Have you noticed that when you ask Claude Code to search multiple files at once, it returns results almost instantly? How is this achieved?
Or when you have it execute a time-consuming command, you can see the output in real-time rather than waiting idly. How is this implemented?
The answer to both is tool execution orchestration. Today we’re talking about this “backend scheduling system.”
From Request to Execution: The Complete Flow
When the model decides to call tools, the execution flow is:
Model Issues tool_use Request
↓
Permission Check (Three-tier Check)
↓
Concurrency Safety Assessment
↓
Add to Execution Queue
↓
Actual Execution
↓
Streaming Progress Feedback
↓
Return Results
Every step matters.
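The flow above can be sketched as a single driver function. This is an illustrative reconstruction, not Claude Code’s actual internals: `runToolUse`, the `checkPermissions` call, and the result shape are all assumed names.

```javascript
// Illustrative driver for one tool_use request. Note that both a
// permission denial and an execution error come back as tool results
// for the model, not as thrown exceptions.
async function runToolUse(tool, input, onProgress) {
  // Step 1: permission check (detailed in the next section)
  const decision = tool.checkPermissions(input);
  if (decision.behavior === 'deny') {
    return { is_error: true, content: decision.reason };
  }
  try {
    // Steps 2-5: execute, streaming progress through the callback
    const content = await tool.call(input, onProgress);
    return { is_error: false, content };
  } catch (err) {
    // Step 6: even failures are returned to the model as information
    return { is_error: true, content: String(err) };
  }
}
```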
The Complete Permission Check Flow
Tools must pass permission checks before execution. This isn’t a single gate but three checkpoints:
First Gate: User Rule Check
Check user-configured alwaysAllow/alwaysDeny/askBefore rules. If denied, reject directly; if in alwaysAllow, skip subsequent checks.
Second Gate: Tool-level Permission Check
Call the tool’s checkPermissions method. This method is implemented by the tool itself and allows fine-grained control.
For example, BashTool checks if commands are dangerous:
```javascript
checkPermissions(input) {
  // Hard-deny obviously destructive commands
  if (input.command.includes('rm -rf /')) {
    return { behavior: 'deny', reason: 'Prohibited from deleting root directory' };
  }
  // Read-only commands are safe to auto-allow
  if (this.isReadOnly(input)) {
    return { behavior: 'allow' };
  }
  // Everything else is escalated for confirmation
  return { behavior: 'ask' };
}
```
Third Gate: YOLO Classifier Check
For complex scenarios, use an AI classifier to judge safety.
A tool can only execute after passing all three checkpoints.
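Chained together, the three gates might look like the sketch below. The rule shapes (`alwaysAllow`/`alwaysDeny` as lists of tool names) and the classifier interface are assumptions for illustration, not the real configuration format.

```javascript
// Hypothetical composition of the three permission gates.
function checkAllGates(tool, input, userRules, classifierAllows) {
  // Gate 1: user-configured rules win outright
  if (userRules.alwaysDeny.includes(tool.name)) {
    return { behavior: 'deny', reason: 'Denied by user rule' };
  }
  if (userRules.alwaysAllow.includes(tool.name)) {
    return { behavior: 'allow' }; // skip the remaining gates
  }
  // Gate 2: the tool's own fine-grained check
  const decision = tool.checkPermissions(input);
  if (decision.behavior !== 'ask') {
    return decision;
  }
  // Gate 3: fall back to the safety classifier for ambiguous cases
  return classifierAllows(tool, input)
    ? { behavior: 'allow' }
    : { behavior: 'ask' };
}
```

Note how `alwaysAllow` short-circuits: once the user has vouched for a tool, the later gates never run.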
Concurrent Execution Scheduling Strategy
When the model requests multiple tools in one turn, Claude Code attempts to execute them in parallel.
But parallelism isn’t unlimited—considerations include:
Concurrency Safety: Only tools marked isConcurrencySafe can run in parallel with others.
Resource Limits: There’s a cap on simultaneously executing tools to avoid resource exhaustion.
Dependencies: If tool B depends on tool A’s results, B must wait for A to complete.
Actual scheduling algorithm:
Received Tool List [A, B, C, D]
↓
Classify:
- A: Concurrency-safe → Execute immediately
- B: Concurrency-safe → Execute immediately
- C: Unsafe, but queue empty → Execute immediately
- D: Unsafe, queue has C → Wait
↓
Execute A, B, C in parallel
↓
After C completes, execute D
↓
All complete, return results
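The classification pass above can be approximated with a small scheduler: concurrency-safe tools launch immediately, while unsafe tools join a serial chain. Treating `isConcurrencySafe` as a boolean field and giving each tool a `run()` method are simplifications of the real tool interface.

```javascript
// Sketch of the scheduling pass: safe tools start at once and run
// side by side; unsafe tools are chained so only one runs at a time.
async function schedule(tools) {
  const results = new Map();
  const parallel = [];
  let serialChain = Promise.resolve();
  for (const tool of tools) {
    if (tool.isConcurrencySafe) {
      // Safe tools launch immediately, in parallel with everything else
      parallel.push(tool.run().then((r) => results.set(tool.name, r)));
    } else {
      // Each unsafe tool waits for the previous unsafe tool to finish
      serialChain = serialChain
        .then(() => tool.run())
        .then((r) => results.set(tool.name, r));
    }
  }
  await Promise.all([...parallel, serialChain]);
  return results;
}
```

A production scheduler would also enforce the resource cap mentioned above, e.g. by limiting how many parallel promises are in flight at once.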
Streaming Progress Propagation Mechanism
For long-running tools (like BashTool executing time-consuming commands), Claude Code supports streaming progress feedback.
Implementation mechanism:
onProgress Callback: During tool execution, periodically call onProgress to report progress.
```javascript
call(input, context, canUseTool, parentMessage, onProgress) {
  // Keep a handle on the child process so abort() can kill it later
  this.process = spawn(input.command, { shell: true });
  // Forward each chunk of output as a progress event
  this.process.stdout.on('data', (data) => {
    onProgress({ stdout: data.toString() });
  });
  this.process.stderr.on('data', (data) => {
    onProgress({ stderr: data.toString() });
  });
  // Resolve with the exit code once the command finishes
  return new Promise((resolve) => {
    this.process.on('close', (code) => resolve({ code }));
  });
}
```
Progress Aggregation: Progress from multiple tools is aggregated in the UI, showing each tool’s execution status.
Render Updates: React Ink components receive progress updates and redraw the interface in real-time.
This lets users see what the AI “is doing” rather than staring at a blank screen.
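The aggregation step can be sketched as below, assuming progress events carry stdout/stderr chunks and are keyed by a tool id; the real React Ink rendering is considerably more involved.

```javascript
// Minimal progress aggregator: events from each tool accumulate under
// that tool's id, and every event triggers a re-render with the full
// snapshot so the UI shows one status line per tool.
function createProgressAggregator(render) {
  const status = {};
  return function onProgress(toolId, event) {
    const chunk = event.stdout ?? event.stderr ?? '';
    status[toolId] = (status[toolId] ?? '') + chunk;
    render(status); // the UI redraws with the latest snapshot
  };
}
```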
Interruption Mechanism: Users Can Stop Anytime
Users can press Ctrl+C at any time to interrupt execution. This requires:
Signal Capture: Catch SIGINT signal to trigger the cancellation flow.
Tool Cancellation: Send cancellation signals to tools currently executing.
```javascript
// BashTool cancels execution
abort() {
  if (this.process) {
    this.process.kill('SIGTERM'); // terminate the child process gracefully
  }
}
```
State Rollback: Completed tool results are preserved; incomplete ones are cancelled.
Notify Model: Send a “user cancelled” message to the model so it knows what happened.
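One plausible wiring of these steps uses Node’s SIGINT handler plus an `AbortController`. Everything here except the `abort()` method shown above is an assumption about how the pieces could fit together.

```javascript
// Hypothetical interrupt wiring: Ctrl+C flags cancellation, kills the
// running tools' child processes, and queues a message for the model.
function wireInterrupt(runningTools, notifyModel) {
  const controller = new AbortController();
  const onSigint = () => {
    controller.abort();                          // flag cancellation for in-flight work
    runningTools.forEach((tool) => tool.abort()); // send SIGTERM to child processes
    notifyModel('Request interrupted by user');   // tell the model what happened
  };
  process.once('SIGINT', onSigint);
  return { signal: controller.signal, onSigint };
}
```

Passing `signal` down to each tool lets long loops check `signal.aborted` and stop cooperatively, while already-completed results stay untouched.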
Error Handling: Failure Isn’t the End
Tool execution may fail. Error handling strategies:
Tool-level Errors: Non-zero exit codes from commands, file not found, etc. These are returned to the model as tool_result, and the model decides the next step.
System-level Errors: Network interruption, memory exhaustion, etc. These are reported to the user.
Timeout Handling: Each tool has a timeout limit; execution terminates automatically after timeout.
Errors aren’t the end—they’re information. Upon receiving errors, the model can:
- Retry
- Switch to other tools
- Explain to the user
- Ask the user for help
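The timeout rule can be sketched with `Promise.race`: the tool’s promise races a timer, and a timeout becomes an ordinary error result the model can react to like any other failure. The result shape here is a simplification.

```javascript
// Wrap a tool's promise with a per-tool timeout. Whichever settles
// first wins; a timeout is reported as an error result, not a crash.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(
      () => resolve({ is_error: true, content: `Timed out after ${ms}ms` }),
      ms,
    );
  });
  return Promise.race([
    promise.then((content) => ({ is_error: false, content })),
    timeout,
  ]).finally(() => clearTimeout(timer)); // don't leak the timer on success
}
```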
Practical: Optimizing Tool Usage
Understanding tool execution orchestration helps you:
Leverage Parallelism: Request multiple independent read operations at once to improve efficiency.
Avoid Blocking: Don’t mix time-consuming operations with lightweight ones in serial execution.
Interrupt Appropriately: If the AI is doing useless work, interrupt in time to avoid wasting tokens.
Handle Errors: When tools report errors, give the AI clear feedback to help it adjust strategy.
Summary
Tool execution orchestration is Claude Code’s “backend scheduling system”:
- Three-tier permission checks ensure safety
- Concurrent scheduling improves efficiency
- Streaming feedback enhances experience
- Interruption mechanism maintains user control
Understanding this mechanism helps you collaborate better with AI and can inform your own AI Agent designs.
That’s the article. Hopefully it helps you understand tool execution orchestration. In the next article, we’ll talk about model-specific tuning—how different models have different “scripts.”
