Tool Execution Orchestration: The Art of Parallelism, Streaming, and Interruption

Have you noticed that when you ask Claude Code to search multiple files at once, it returns results almost instantly? How is this achieved?
Or when you have it execute a time-consuming command, you can see the output in real-time rather than waiting idly. How is this implemented?
The answer to both is tool execution orchestration. Today we’re talking about this “backend scheduling system.”
From Request to Execution: The Complete Flow
When the model decides to call tools, the execution flow is:
Model Issues tool_use Request
↓
Permission Check (Three-tier Check)
↓
Concurrency Safety Assessment
↓
Add to Execution Queue
↓
Actual Execution
↓
Streaming Progress Feedback
↓
Return Results
Every step matters.
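The flow above can be sketched as a single driver function. This is an illustrative reconstruction, not Claude Code’s actual internals: `runToolUse`, the `checkPermissions` call, and the result shape are all assumed names.

```javascript
// Illustrative driver for one tool_use request. Note that both a
// permission denial and an execution error come back as tool results
// for the model, not as thrown exceptions.
async function runToolUse(tool, input, onProgress) {
  // Step 1: permission check (detailed in the next section)
  const decision = tool.checkPermissions(input);
  if (decision.behavior === 'deny') {
    return { is_error: true, content: decision.reason };
  }
  try {
    // Steps 2-5: execute, streaming progress through the callback
    const content = await tool.call(input, onProgress);
    return { is_error: false, content };
  } catch (err) {
    // Step 6: even failures are returned to the model as information
    return { is_error: true, content: String(err) };
  }
}
```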
The Complete Permission Check Flow
Tools must pass permission checks before execution. This isn’t a single gate but three checkpoints:
First Gate: User Rule Check
Check user-configured alwaysAllow/alwaysDeny/askBefore rules. If denied, reject directly; if in alwaysAllow, skip subsequent checks.
Second Gate: Tool-level Permission Check
Call the tool’s checkPermissions method. This method is implemented by the tool itself and allows fine-grained control.
For example, BashTool checks if commands are dangerous:
```javascript
checkPermissions(input) {
  // Hard-deny obviously destructive commands
  if (input.command.includes('rm -rf /')) {
    return { behavior: 'deny', reason: 'Prohibited from deleting root directory' };
  }
  // Read-only commands are safe to auto-allow
  if (this.isReadOnly(input)) {
    return { behavior: 'allow' };
  }
  // Everything else is escalated for confirmation
  return { behavior: 'ask' };
}
```
Third Gate: YOLO Classifier Check
For complex scenarios, use an AI classifier to judge safety.
A tool can only execute after passing all three checkpoints.
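Chained together, the three gates might look like the sketch below. The rule shapes (`alwaysAllow`/`alwaysDeny` as lists of tool names) and the classifier interface are assumptions for illustration, not the real configuration format.

```javascript
// Hypothetical composition of the three permission gates.
function checkAllGates(tool, input, userRules, classifierAllows) {
  // Gate 1: user-configured rules win outright
  if (userRules.alwaysDeny.includes(tool.name)) {
    return { behavior: 'deny', reason: 'Denied by user rule' };
  }
  if (userRules.alwaysAllow.includes(tool.name)) {
    return { behavior: 'allow' }; // skip the remaining gates
  }
  // Gate 2: the tool's own fine-grained check
  const decision = tool.checkPermissions(input);
  if (decision.behavior !== 'ask') {
    return decision;
  }
  // Gate 3: fall back to the safety classifier for ambiguous cases
  return classifierAllows(tool, input)
    ? { behavior: 'allow' }
    : { behavior: 'ask' };
}
```

Note how `alwaysAllow` short-circuits: once the user has vouched for a tool, the later gates never run.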
Concurrent Execution Scheduling Strategy
When the model requests multiple tools in one turn, Claude Code attempts to execute them in parallel.
But parallelism isn’t unlimited—considerations include:
Concurrency Safety: Only tools marked isConcurrencySafe can run in parallel with others.
Resource Limits: There’s a cap on simultaneously executing tools to avoid resource exhaustion.
Dependencies: If tool B depends on tool A’s results, B must wait for A to complete.
Actual scheduling algorithm:
Received Tool List [A, B, C, D]
↓
Classify:
- A: Concurrency-safe → Execute immediately
- B: Concurrency-safe → Execute immediately
- C: Unsafe, but queue empty → Execute immediately
- D: Unsafe, queue has C → Wait
↓
Execute A, B, C in parallel
↓
After C completes, execute D
↓
All complete, return results
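The classification pass above can be approximated with a small scheduler: concurrency-safe tools launch immediately, while unsafe tools join a serial chain. Treating `isConcurrencySafe` as a boolean field and giving each tool a `run()` method are simplifications of the real tool interface.

```javascript
// Sketch of the scheduling pass: safe tools start at once and run
// side by side; unsafe tools are chained so only one runs at a time.
async function schedule(tools) {
  const results = new Map();
  const parallel = [];
  let serialChain = Promise.resolve();
  for (const tool of tools) {
    if (tool.isConcurrencySafe) {
      // Safe tools launch immediately, in parallel with everything else
      parallel.push(tool.run().then((r) => results.set(tool.name, r)));
    } else {
      // Each unsafe tool waits for the previous unsafe tool to finish
      serialChain = serialChain
        .then(() => tool.run())
        .then((r) => results.set(tool.name, r));
    }
  }
  await Promise.all([...parallel, serialChain]);
  return results;
}
```

A production scheduler would also enforce the resource cap mentioned above, e.g. by limiting how many parallel promises are in flight at once.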
Streaming Progress Propagation Mechanism
For long-running tools (like BashTool executing time-consuming commands), Claude Code supports streaming progress feedback.
Implementation mechanism:
onProgress Callback: During tool execution, periodically call onProgress to report progress.
```javascript
call(input, context, canUseTool, parentMessage, onProgress) {
  // Keep a handle on the child process so abort() can kill it later
  this.process = spawn(input.command, { shell: true });
  // Forward each chunk of output as a progress event
  this.process.stdout.on('data', (data) => {
    onProgress({ stdout: data.toString() });
  });
  this.process.stderr.on('data', (data) => {
    onProgress({ stderr: data.toString() });
  });
  // Resolve with the exit code once the command finishes
  return new Promise((resolve) => {
    this.process.on('close', (code) => resolve({ code }));
  });
}
```
Progress Aggregation: Progress from multiple tools is aggregated in the UI, showing each tool’s execution status.
Render Updates: React Ink components receive progress updates and redraw the interface in real-time.
This lets users see what the AI “is doing” rather than staring at a blank screen.
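The aggregation step can be sketched as below, assuming progress events carry stdout/stderr chunks and are keyed by a tool id; the real React Ink rendering is considerably more involved.

```javascript
// Minimal progress aggregator: events from each tool accumulate under
// that tool's id, and every event triggers a re-render with the full
// snapshot so the UI shows one status line per tool.
function createProgressAggregator(render) {
  const status = {};
  return function onProgress(toolId, event) {
    const chunk = event.stdout ?? event.stderr ?? '';
    status[toolId] = (status[toolId] ?? '') + chunk;
    render(status); // the UI redraws with the latest snapshot
  };
}
```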
Interruption Mechanism: Users Can Stop Anytime
Users can press Ctrl+C at any time to interrupt execution. This requires:
Signal Capture: Catch SIGINT signal to trigger the cancellation flow.
Tool Cancellation: Send cancellation signals to tools currently executing.
```javascript
// BashTool cancels execution
abort() {
  if (this.process) {
    this.process.kill('SIGTERM'); // terminate the child process gracefully
  }
}
```
State Rollback: Completed tool results are preserved; incomplete ones are cancelled.
Notify Model: Send a “user cancelled” message to the model so it knows what happened.
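One plausible wiring of these steps uses Node’s SIGINT handler plus an `AbortController`. Everything here except the `abort()` method shown above is an assumption about how the pieces could fit together.

```javascript
// Hypothetical interrupt wiring: Ctrl+C flags cancellation, kills the
// running tools' child processes, and queues a message for the model.
function wireInterrupt(runningTools, notifyModel) {
  const controller = new AbortController();
  const onSigint = () => {
    controller.abort();                          // flag cancellation for in-flight work
    runningTools.forEach((tool) => tool.abort()); // send SIGTERM to child processes
    notifyModel('Request interrupted by user');   // tell the model what happened
  };
  process.once('SIGINT', onSigint);
  return { signal: controller.signal, onSigint };
}
```

Passing `signal` down to each tool lets long loops check `signal.aborted` and stop cooperatively, while already-completed results stay untouched.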
Error Handling: Failure Isn’t the End
Tool execution may fail. Error handling strategies:
Tool-level Errors: Non-zero exit codes from commands, file not found, etc. These are returned to the model as tool_result, and the model decides the next step.
System-level Errors: Network interruption, memory exhaustion, etc. These are reported to the user.
Timeout Handling: Each tool has a timeout limit; execution terminates automatically after timeout.
Errors aren’t the end—they’re information. Upon receiving errors, the model can:
- Retry
- Switch to other tools
- Explain to the user
- Ask the user for help
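The timeout rule can be sketched with `Promise.race`: the tool’s promise races a timer, and a timeout becomes an ordinary error result the model can react to like any other failure. The result shape here is a simplification.

```javascript
// Wrap a tool's promise with a per-tool timeout. Whichever settles
// first wins; a timeout is reported as an error result, not a crash.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(
      () => resolve({ is_error: true, content: `Timed out after ${ms}ms` }),
      ms,
    );
  });
  return Promise.race([
    promise.then((content) => ({ is_error: false, content })),
    timeout,
  ]).finally(() => clearTimeout(timer)); // don't leak the timer on success
}
```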
Practical: Optimizing Tool Usage
Understanding tool execution orchestration helps you:
Leverage Parallelism: Request multiple independent read operations at once to improve efficiency.
Avoid Blocking: Don’t mix time-consuming operations with lightweight ones in serial execution.
Interrupt Appropriately: If the AI is doing useless work, interrupt in time to avoid wasting tokens.
Handle Errors: When tools report errors, give the AI clear feedback to help it adjust strategy.
Summary
Tool execution orchestration is Claude Code’s “backend scheduling system”:
- Three-tier permission checks ensure safety
- Concurrent scheduling improves efficiency
- Streaming feedback enhances experience
- Interruption mechanism maintains user control
Understanding this mechanism helps you collaborate better with AI and can inform your own AI Agent designs.
That’s the article. Hopefully it helps you understand tool execution orchestration. In the next article, we’ll talk about model-specific tuning—how different models have different “scripts.”
