Your nodes can now talk to each other, sync files, and send commands — but honestly, that’s not quite useful enough yet.
Last week I was thinking: wouldn’t it be great if I could just throw tasks at any idle worker in the cluster, and have the results automatically come back? Like ordering food on a delivery app — the system automatically assigns the nearest rider, and when it’s delivered, you get an automatic confirmation.
So this time, let’s build something truly practical.
What Are We Building?
Simply put, I want these capabilities:
- Build a distributed task queue - Like a food delivery dispatch system, it automatically assigns work to idle workers
- Route tasks to other nodes - Whether the node is in Shanghai or Beijing, one command sends the task there
- Results find their way back - After a task completes, the result knows who sent it and returns automatically
- Lay the foundation for MapReduce - Basically the "break big tasks into small ones, do them separately, then aggregate" pattern
Sounds cool, right? Actually, it’s not that complex to implement.
Step 1: Define Jobs and Results
First, we need to agree on what “tasks” and “results” look like. Like food delivery orders, you need order numbers, contents, delivery addresses:
```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct Job {
    pub id: String,   // Task ID, like an order number
    pub data: String, // Task content, e.g., "process user data"
}

#[derive(Debug, Serialize, Deserialize)]
pub struct JobResult {
    pub id: String,          // Task ID, corresponds to the Job above
    pub result: String,      // Processing result
    pub worker_node: String, // Which node did the work
}
```
These two structs are our “delivery order” and “delivery confirmation”. Simple and clear, with all the necessary info.
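If you're curious what these look like on the wire, here's a standalone round trip through serde_json (assuming serde and serde_json are in your Cargo.toml; Job is repeated so the snippet compiles on its own):

```rust
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize)]
pub struct Job {
    pub id: String,
    pub data: String,
}

fn main() {
    let job = Job {
        id: "job-123".to_string(),
        data: "process user data".to_string(),
    };

    // Serialize: this JSON string is what actually crosses the network
    let json = serde_json::to_string(&job).unwrap();
    println!("{json}"); // {"id":"job-123","data":"process user data"}

    // Deserialize: the receiving node turns it back into a Job
    let parsed: Job = serde_json::from_str(&json).unwrap();
    assert_eq!(parsed.id, "job-123");
}
```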
Step 2: Create a Worker Actor
Now we need an actual worker to do the job. In my design, each node can run several Worker Actors — they’re like delivery riders: accept orders, do work, report back:
```rust
struct Worker {
    router: Router,    // handle to the message router from the earlier transport articles
    node_name: String, // this node's name, e.g., "node-b"
}

#[async_trait::async_trait]
impl Actor for Worker {
    type Message = Job;

    async fn handle(&mut self, msg: Job) {
        println!("Received job: {:?}", msg);

        // Start working (this is just an example; real work might be heavy computation)
        let result = JobResult {
            id: msg.id.clone(),
            result: format!("Processed: {}", msg.data),
            worker_node: self.node_name.clone(),
        };

        // Send the result back (hardcoded address for now; Step 5 replaces it with a return address)
        let json = serde_json::to_string(&result).unwrap();
        self.router.send("coordinator@node-a", &json).await;
    }
}
```
See? The worker receives a task, processes it, and sends the result straight back to whoever assigned it. Just like a rider marking "delivered" in the app after dropping off the food.
The key is self.router.send(): it automatically routes the message to the correct node and Actor, so we never have to worry about network transmission details.
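One thing the snippet glosses over is how the Worker gets created and registered. That depends on the node API from the earlier articles; as a rough sketch, assuming a register method that binds a name to an actor (adapt it to whatever your node type actually exposes):

```rust
// Hypothetical wiring on node-b: `node`, `router`, and the `register`
// method are assumptions standing in for the node API built earlier
// in this series.
let worker = Worker {
    router: router.clone(),
    node_name: "node-b".to_string(),
};
node.register("worker", worker).await;
```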
Step 3: Build a Coordinator
With workers, we also need someone to assign and collect orders. In distributed systems, this role is called a Coordinator:
```rust
struct Coordinator;

#[async_trait::async_trait]
impl Actor for Coordinator {
    type Message = JobResult;

    async fn handle(&mut self, msg: JobResult) {
        println!("Result received from {}: {}", msg.worker_node, msg.result);
        // Here you can aggregate results, update a database, etc.
    }
}
```
The coordinator’s job is simple: receive results, record them, maybe do some aggregation and statistics. Like the backend of a delivery platform, recording the completion status of each order.
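When you want real aggregation, give the Coordinator some state. A minimal sketch, using the same Actor trait as above, that collects results in a HashMap and notices when the whole batch is done:

```rust
use std::collections::HashMap;

struct Coordinator {
    results: HashMap<String, String>, // Job ID -> result
    expected: usize,                  // how many jobs we dispatched
}

#[async_trait::async_trait]
impl Actor for Coordinator {
    type Message = JobResult;

    async fn handle(&mut self, msg: JobResult) {
        self.results.insert(msg.id.clone(), msg.result);

        // Once every dispatched job has reported back, run the aggregation
        if self.results.len() == self.expected {
            println!("All {} jobs done!", self.expected);
            // Aggregate, write to a database, kick off the next stage...
        }
    }
}
```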
Step 4: Cross-Node Task Assignment
Now for the interesting part — how do we send tasks to other nodes?
Assume node-a runs the Coordinator and node-b runs the Worker. On node-a, we send a task like this:
```rust
let job = Job {
    id: "job-123".to_string(),
    data: "Organize user data".to_string(),
};

let payload = serde_json::to_string(&job).unwrap();
router.send("worker@node-b", &payload).await;
```
It's that simple! You just need to know the recipient's "address" (worker@node-b), and the message gets delivered automatically.
But here’s a key question: how does the result know where to go back?
Step 5: The Secret of Return Addresses
This brings us to the design of our message format:
```rust
#[derive(Debug, Serialize, Deserialize)]
pub struct NetworkMessage {
    pub to: String,      // Recipient: "worker@node-b"
    pub from: String,    // Sender: "coordinator@node-a"
    pub payload: String, // Actual content (a serialized Job or JobResult)
}
```
See the from field? That’s the return address.
When the Worker receives a task, it can tell from the from field who sent it, and send results directly back there after processing. Like the sender’s address on a parcel — use it directly for returns.
This design makes bidirectional communication super simple: Workers no longer need to hardcode the Coordinator's address.
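Concretely, here's a variant of the Step 2 Worker that reads the return address from the envelope. This is a sketch that assumes the router hands the Worker the whole NetworkMessage rather than the bare Job:

```rust
#[async_trait::async_trait]
impl Actor for Worker {
    type Message = NetworkMessage;

    async fn handle(&mut self, msg: NetworkMessage) {
        // Unpack the actual job from the envelope
        let job: Job = serde_json::from_str(&msg.payload).unwrap();

        let result = JobResult {
            id: job.id.clone(),
            result: format!("Processed: {}", job.data),
            worker_node: self.node_name.clone(),
        };

        // msg.from is the return address: no hardcoded coordinator anywhere
        let json = serde_json::to_string(&result).unwrap();
        self.router.send(&msg.from, &json).await;
    }
}
```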
Step 6: Smart Task Assignment
Now let’s upgrade: what if I don’t want to specify which node, but rather “just find any idle person to do the work”?
```rust
// Ask the cluster layer for every known peer, then pick one at random.
// choose_random is a stand-in helper; use the rand crate or any picker you like.
let peers = cluster.get_all_peers();
let target_node = choose_random(&peers);

let addr = format!("worker@{}", target_node);
router.send(&addr, &payload).await;
```
That’s the most basic load balancing. You can extend this further:
- Choose the node with lowest CPU usage
- Choose the node with lowest network latency
- Round-robin assignment to each node (sketched below)
Just like food delivery dispatch algorithms, intelligently assign based on rider distance, order volume, ratings, etc.
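That last option needs nothing but the standard library. A minimal round-robin picker (RoundRobin is our own little helper here, not a library type):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Round-robin picker: each call to `pick` returns the next peer in turn.
struct RoundRobin {
    counter: AtomicUsize,
}

impl RoundRobin {
    fn new() -> Self {
        Self { counter: AtomicUsize::new(0) }
    }

    fn pick<'a>(&self, peers: &'a [String]) -> Option<&'a String> {
        if peers.is_empty() {
            return None;
        }
        let i = self.counter.fetch_add(1, Ordering::Relaxed) % peers.len();
        peers.get(i)
    }
}

fn main() {
    let rr = RoundRobin::new();
    let peers = vec!["node-b".to_string(), "node-c".to_string(), "node-d".to_string()];
    for _ in 0..5 {
        println!("dispatch to worker@{}", rr.pick(&peers).unwrap());
    }
    // node-b, node-c, node-d, node-b, node-c
}
```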
What Can We Do?
At this point, we have a pretty powerful distributed task system. Let’s summarize the capabilities:
| Feature | Description |
|---|---|
| Job / JobResult | Standard format for tasks and results |
| Coordinator | Dispatches tasks and collects results |
| Worker | Executes tasks and sends results back |
| from field | Lets results automatically return to sender |
| send(to, payload) | Seamless cross-node message passing |
This mechanism can support many real-world scenarios:
- Video transcoding - Split videos and send chunks to different nodes
- Data analytics - MapReduce-style big data computation
- Web crawling - Distribute URLs to multiple crawler nodes
- Image processing - Batch compression, watermarking, etc.
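All four scenarios above follow the same fan-out shape: split the input, wrap each piece in a Job, spread the Jobs over the cluster. A sketch of the dispatch side, reusing the RoundRobin picker from earlier (Router is the type assumed throughout this article, and the chunking is purely illustrative):

```rust
// Fan out one big input as many small Jobs across the cluster.
async fn fan_out(router: &Router, peers: &[String], chunks: Vec<String>) {
    let rr = RoundRobin::new();
    for (i, chunk) in chunks.into_iter().enumerate() {
        let job = Job {
            id: format!("job-{i}"),
            data: chunk,
        };
        let payload = serde_json::to_string(&job).unwrap();

        // Spread the chunks evenly over the available workers
        let target = rr.pick(peers).expect("no peers available");
        router.send(&format!("worker@{target}"), &payload).await;
    }
}
```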
What Black Magic Can We Add?
Basic functionality is there, but production environments need to consider many edge cases:
- Auto retry - Failed tasks are resent automatically, up to 3 attempts
- Timeout mechanism - Track progress with Job.id; consider a job failed after 30 seconds of silence
- Performance monitoring - Record each node's average task processing time
- Task categorization - Route by type, e.g., "image processing" always goes to GPU nodes
- Map-Shuffle-Reduce - Break big tasks into small ones, then aggregate the results

(The first two are sketched right after this list.)
These optimizations transform the system from “works” to “production-ready”.
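As a taste of retry plus timeout, here's a sketch of the coordinator side using tokio's timer. The send_and_wait closure is a stand-in for "send the job via the router, then await the matching JobResult"; how you match results to jobs (typically a HashMap from Job.id to oneshot channels) depends on your setup:

```rust
use std::time::Duration;
use tokio::time::timeout;

/// Dispatch a job and wait for its result, retrying up to `max_attempts` times.
async fn dispatch_with_retry<F, Fut>(
    mut send_and_wait: F,
    max_attempts: u32,
) -> Result<JobResult, &'static str>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = JobResult>,
{
    for attempt in 1..=max_attempts {
        // Consider the job failed after 30 seconds of silence
        match timeout(Duration::from_secs(30), send_and_wait()).await {
            Ok(result) => return Ok(result),
            Err(_) => eprintln!("attempt {attempt} timed out, retrying..."),
        }
    }
    Err("job failed after all retry attempts")
}
```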
Next Up: Something Even Better
Now that you have a task queue, the next article covers:
Fault Tolerance and Self-Healing - What happens when nodes crash? Where do in-flight tasks go?
Specifically:
- Build a retry queue
- Handle nodes suddenly going offline
- Implement self-healing loops
- Design supervisor patterns to rescue failed tasks
These are the real hard problems in distributed systems. But don’t worry, I’ll explain it all in the same simple way.
If You Want to Learn Rust In-Depth
This article is part of the Distributed Actor System series. If you’re having trouble keeping up, check out the basics:
Rust Basics Series:
- Rust #1: Ownership — Explained for People Who’ve Used Variables
- Rust #2: Functions, Structs & Traits — Cooler Than OOP
- Rust #3: Enums & Pattern Matching — The Power You Didn’t Know You Needed
- Rust #4: Error Handling — unwrap() Is Not a Strategy
Rust Async Series:
- Async Programming Basics - What async/await Really Does
- How Async Works Under the Hood - Future, poll() and State Machines
- Building Real Concurrent Apps with tokio
- Building Real Web Servers with axum
Rust Actor Series:
- What Is the Actor Model? Build One From Scratch
- Advanced Actor Messaging - Timeouts, Replies, Broadcasts
- Actor Lifecycle & State Isolation
- Supervisors & Restart Mechanisms
Distributed Actor Series (current):
- Distributed Actor System Overview
- Build Local Node + WebSocket Transport Layer
- Send Messages to Remote Nodes
- Reliable Peer Communication (Channels + Task Pattern)
- Cross-Node Actor Messaging
- Actor Discovery, Gossip & Dynamic Node Join
- Cluster Commands & Admin Messaging
- File Distribution, Config Sync & Hot Reloads
- Distributed Task Queues & Result Routing (this article)
Next article will be the series finale: Fault Tolerance, Retry Queues & Self-Healing Clusters.
Don’t Miss Updates
Follow “Dream Beast Programming” WeChat official account to get notified when new articles are published.
We also have a Rust tech discussion group with many developers actually building projects with Rust, sharing experiences and discussing problems.
Code speaks, theory is just the map. Try it yourself, and you’ll find Rust distributed development isn’t that scary.
See you next time!
