WebSocket vs SSE Benchmark: SSE Uses 40% Less Memory at 100K Connections
A team was building a real-time dashboard — prices, inventory counts, threshold alerts. They picked WebSocket without hesitation, thinking it was the obvious choice for real-time communication. When the benchmark results came in, the memory usage made them reconsider.
The Bottom Line
At 100K concurrent connections, SSE uses roughly 40% less memory than WebSocket.
These are Ark Protocol’s real benchmark numbers. They implemented both protocols in Rust + Axum, then ran identical stress tests:
| Metric | WebSocket | SSE | Delta |
|---|---|---|---|
| Memory/connection | ~52KB | ~31KB | -40% |
| 100K total memory | ~5.2GB | ~3.1GB | -2.1GB |
| CPU (low-frequency) | baseline | baseline | ~same |
| Horizontal scaling | needs stateful LB | standard HTTP LB | SSE wins |
The gap isn't from some micro-optimization in a corner of the code. It comes from fundamental differences in how the two protocols work under the hood.
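The totals in the table follow directly from the per-connection figures. As a quick sanity check, the arithmetic can be sketched like this; the function names are made up for illustration, and decimal units (1 GB = 1,000,000 KB) are assumed to match the table's rounding:

```rust
/// Total memory in GB for `connections` connections at `kb_per_conn` KB each.
/// Assumes decimal units (1 GB = 1_000_000 KB), as the table's rounding suggests.
fn total_gb(connections: u64, kb_per_conn: u64) -> f64 {
    (connections * kb_per_conn) as f64 / 1_000_000.0
}

/// Relative saving of `b` compared with `a`, as a percentage of `a`.
fn saving_percent(a: f64, b: f64) -> f64 {
    (a - b) / a * 100.0
}
```

With the table's inputs, `total_gb(100_000, 52)` and `total_gb(100_000, 31)` give 5.2 and 3.1, and `saving_percent(52.0, 31.0)` comes out at roughly 40.4.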
Protocol Mechanics: Why the Memory Gap Exists
WebSocket: A Permanent Dedicated Phone Line
Once a WebSocket connection is established, the TCP connection stays open. Both sides can send data anytime. This means:
- Each connection needs an independent Task on the server: read/write separation, state tracking, connection lifecycle management
- Protocol has frame parsing overhead: FIN bit and opcode (4 bits), mask bit (1 bit), payload length (7 bits, extended to 16 or 64 bits), plus a 4-byte masking key on client-to-server frames
- Application-layer heartbeat needs its own timer: even with no data flowing, the timer and its state still consume memory
When connection counts are low, this isn’t a problem. At scale, memory starts screaming — every single connection is eating RAM.
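To make the frame parsing step concrete, here is a minimal sketch of decoding a WebSocket frame header as laid out in RFC 6455. The struct and function names are illustrative, not taken from Ark Protocol's code or from any library, and the sketch ignores the reserved bits and the payload itself:

```rust
/// Decoded fields of a WebSocket frame header (RFC 6455, section 5.2).
#[derive(Debug, PartialEq)]
struct FrameHeader {
    fin: bool,
    opcode: u8,
    masked: bool,
    payload_len: u64,
    /// Bytes consumed by the header, including the masking key if present.
    header_len: usize,
}

/// Parses the header at the start of `buf`. Returns `None` if `buf` is too short.
fn parse_frame_header(buf: &[u8]) -> Option<FrameHeader> {
    let b0 = *buf.first()?;
    let b1 = *buf.get(1)?;
    let fin = b0 & 0x80 != 0;
    let opcode = b0 & 0x0F;
    let masked = b1 & 0x80 != 0;

    // The 7-bit length field is either the length itself (0..=125) or a marker
    // for a 16-bit (126) or 64-bit (127) extended length that follows.
    let (payload_len, mut header_len) = match b1 & 0x7F {
        126 => {
            let ext = [*buf.get(2)?, *buf.get(3)?];
            (u64::from(u16::from_be_bytes(ext)), 4)
        }
        127 => {
            let mut ext = [0u8; 8];
            ext.copy_from_slice(buf.get(2..10)?);
            (u64::from_be_bytes(ext), 10)
        }
        n => (u64::from(n), 2),
    };

    // Client-to-server frames carry a 4-byte masking key after the length.
    if masked {
        header_len += 4;
        if buf.len() < header_len {
            return None;
        }
    }
    Some(FrameHeader { fin, opcode, masked, payload_len, header_len })
}
```

A server runs something like this for every inbound frame and then has to unmask the payload, which is part of the per-connection state that SSE does not need.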
SSE: A Mailbox on Your Door
SSE is HTTP-based, with the server pushing data one-way to the client. Once the connection is established, the client just needs an EventSource API, and the server sends text events on demand.
- Connections stay plain HTTP: under HTTP/1.1 each stream holds one connection, while HTTP/2 can multiplex several streams from the same client over a single TCP connection
- No WebSocket-style frame parsing overhead: it's just a streaming HTTP response with Content-Type text/event-stream
- Standard HTTP infrastructure can carry SSE: NGINX, Cloudflare, and AWS ALB treat it as an ordinary long-lived HTTP response, though response buffering and idle timeouts may still need tuning
The trade-off is clear: one-way server→client only. Client wants to send data? Open another HTTP request.
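The wire format behind those text events is simple enough to write by hand. The following sketch serializes one event and one keep-alive comment the way the `text/event-stream` format describes; the helper names are hypothetical, and in an Axum service the `Event` type does this work for you:

```rust
/// Serializes one SSE event. Each line of `data` gets its own `data:` field,
/// and a blank line terminates the event.
fn format_sse_event(event: Option<&str>, data: &str) -> String {
    let mut out = String::new();
    if let Some(name) = event {
        out.push_str("event: ");
        out.push_str(name);
        out.push('\n');
    }
    for line in data.split('\n') {
        out.push_str("data: ");
        out.push_str(line);
        out.push('\n');
    }
    out.push('\n');
    out
}

/// A comment line (leading colon) that clients ignore; commonly used as a keep-alive.
fn format_sse_comment(text: &str) -> String {
    format!(": {}\n\n", text)
}
```

There is no length prefix and no masking: the server appends text to an open HTTP response, and the browser's EventSource splits it on blank lines.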
When to Use Which: Real-World Scenarios
Go with SSE when
Single-direction data flow is your use case. For example:
- Real-time price feeds (server pushes, client watches)
- Inventory count notifications
- Log streams, monitoring dashboards, CI/CD build status
- Alert threshold triggers
In these scenarios the client never initiates, and SSE handles it perfectly.
Horizontal scaling is SSE’s strong suit. SSE runs on standard HTTP, so you can route traffic through any HTTP-aware load balancer. Adding 100 backend servers is trivial — no connection state sharing problems like with WebSocket.
Go with WebSocket when
Bidirectional, frequent, low-latency interaction is required. For example:
- Chat rooms, collaborative editing (multiple people operating simultaneously)
- Game commands (you move, the other player sees immediately)
- Financial order submission (bidirectional handshake confirmation)
In these scenarios the bidirectional channel is non-negotiable. Forcing SSE means pairing the push stream with separate HTTP requests for everything the client sends, which adds complexity.
Gray area: Hybrid Architecture
Some teams use both: SSE for high-volume downstream data (market data, notifications), WebSocket for small high-frequency upstream commands (orders, chat). This is a reasonable compromise.
Interpreting the Benchmark: CPU and Latency Are a Different Story
Ark Protocol’s tests showed a 40% memory gap, but CPU usage was nearly identical. SSE saves memory, WebSocket saves CPU? Be careful with that conclusion.
What actually drives CPU is message frequency and payload size more than the protocol itself. In millisecond-level high-frequency push scenarios, WebSocket's binary frames (a 2-byte header for server-to-client payloads of up to 125 bytes) are more compact than SSE's text events (data: ...\n\n), so WebSocket can come out ahead on CPU.
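The compactness argument can be put into numbers. This sketch counts only framing bytes per message, using the RFC 6455 header sizes for WebSocket and assuming the common `data: ` prefix (with the optional space) for SSE; the function names are illustrative, and byte counts are only a rough proxy for CPU cost:

```rust
/// Header bytes for one WebSocket frame carrying `payload_len` bytes (RFC 6455).
/// `masked` is true for client-to-server frames, which add a 4-byte masking key.
fn ws_frame_overhead(payload_len: usize, masked: bool) -> usize {
    let base = if payload_len <= 125 {
        2
    } else if payload_len <= 65_535 {
        4
    } else {
        10
    };
    base + if masked { 4 } else { 0 }
}

/// Framing bytes for one SSE event whose payload spans `lines` lines:
/// "data: " (6 bytes) plus "\n" per line, plus the blank line that ends the event.
fn sse_event_overhead(lines: usize) -> usize {
    lines * 7 + 1
}
```

For a single-line payload of 20 bytes pushed from the server, that is 2 bytes of framing for WebSocket against 8 for SSE, a difference that only matters once message rates get high.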
Don’t make decisions based on memory alone:
| Dimension | SSE advantage | WebSocket advantage |
|---|---|---|
| Memory usage | ✅ 40%↓ at 100K connections | - |
| CPU efficiency (high-freq) | - | ✅ binary frames more compact |
| Bidirectional | ❌ | ✅ native support |
| Horizontal scaling | ✅ standard HTTP LB | ❌ needs state sharing |
| Middleware compatibility | ✅ standard HTTP | ❌ proxies must support the HTTP Upgrade handshake |
| Reconnection | browser auto-reconnects | manual implementation |
| Heartbeat | keep-alive comment lines (built into Axum's Sse) | ping/pong frames, scheduled by the app |
Code Comparison: Axum SSE vs WebSocket
SSE Version
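// Assumes axum 0.7 or later and tokio-stream with the `sync` feature enabled.
use std::{convert::Infallible, time::Duration};
use axum::{extract::State, response::sse::{Event, KeepAlive, Sse}, routing::get, Router};
use tokio::sync::broadcast;
use tokio_stream::{wrappers::BroadcastStream, Stream, StreamExt};

// Each request subscribes to the shared broadcast channel and streams it as SSE events.
async fn sse_handler(
    State(tx): State<broadcast::Sender<String>>,
) -> Sse<impl Stream<Item = Result<Event, Infallible>>> {
    let stream = BroadcastStream::new(tx.subscribe())
        // Drop "lagged" errors from slow subscribers instead of sending empty events.
        .filter_map(|msg| msg.ok())
        .map(|msg| Ok(Event::default().data(msg)));
    // Send a comment line every 15 seconds so proxies do not close an idle stream.
    Sse::new(stream).keep_alive(KeepAlive::new().interval(Duration::from_secs(15)))
}

#[tokio::main]
async fn main() {
    let (tx, _rx) = broadcast::channel::<String>(100);
    // Clone tx before this point to publish from other tasks with tx.send(...).
    let app = Router::new()
        .route("/stream", get(sse_handler))
        .with_state(tx);
    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await.unwrap();
    println!("SSE server running on :8080");
    axum::serve(listener, app).await.unwrap();
}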
SSE core advantages:
- A few dozen lines of code, done
- No connection mapping table to maintain
- Browser handles reconnection automatically
- Load balancers support it natively
WebSocket Version
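// Assumes axum 0.7 or later (with the `ws` feature) and the futures-util crate.
use std::{collections::HashMap, sync::Arc};
use axum::{
    extract::{
        ws::{Message, WebSocket, WebSocketUpgrade},
        Path, State,
    },
    response::IntoResponse,
};
use futures_util::{SinkExt, StreamExt};
use tokio::sync::{broadcast, Mutex};

#[derive(Clone)]
struct WSState {
    // Shared channel for messages that go to every connection.
    tx: broadcast::Sender<String>,
    // Connection table: client id -> sender for messages aimed at that client.
    peers: Arc<Mutex<HashMap<String, broadcast::Sender<String>>>>,
}

async fn ws_handler(
    ws: WebSocketUpgrade,
    State(state): State<WSState>,
    Path(client_id): Path<String>,
) -> impl IntoResponse {
    ws.on_upgrade(move |socket| handle_socket(socket, state, client_id))
}

async fn handle_socket(socket: WebSocket, state: WSState, client_id: String) {
    // Register this connection so other tasks can address it by id.
    let (sender, mut direct_rx) = broadcast::channel::<String>(100);
    state.peers.lock().await.insert(client_id.clone(), sender);

    let mut rx = state.tx.subscribe();
    let (mut ws_sender, mut ws_receiver) = socket.split();

    // Writer: forward shared and per-client messages to the socket.
    let writer = async {
        loop {
            let next = tokio::select! {
                m = rx.recv() => m,
                m = direct_rx.recv() => m,
            };
            match next {
                Ok(msg) => {
                    if ws_sender.send(Message::Text(msg.into())).await.is_err() {
                        break;
                    }
                }
                // A slow consumer skipped some messages; keep the connection open.
                Err(broadcast::error::RecvError::Lagged(_)) => continue,
                Err(broadcast::error::RecvError::Closed) => break,
            }
        }
    };

    // Reader: handle messages sent by the client until it disconnects.
    let reader = async {
        while let Some(Ok(msg)) = ws_receiver.next().await {
            if let Message::Text(text) = msg {
                println!("Received: {}", text.as_str());
            }
        }
    };

    // Stop as soon as either side finishes, then unregister the connection.
    tokio::select! {
        _ = writer => {},
        _ = reader => {},
    }
    state.peers.lock().await.remove(&client_id);
}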
WebSocket core challenges:
- Must maintain connection mapping table (HashMap + Mutex)
- Reconnection needs manual implementation
- Load balancing requires sticky sessions or WebSocket-aware LB
- Noticeably more code than the SSE version for the same broadcast feature
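Manual reconnection usually means retrying with a growing delay so that a server restart does not trigger a reconnect storm. A minimal sketch of that delay calculation, assuming plain exponential backoff with a cap (the function name and parameters are illustrative, and a production client would normally add random jitter on top):

```rust
use std::time::Duration;

/// Delay before reconnect attempt number `attempt` (0-based): doubles from
/// `base` on each attempt and is capped at `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(max)
}
```

With a 500 ms base and a 30 s cap, the first four attempts wait 0.5 s, 1 s, 2 s, and 4 s, and the delay stays at 30 s from the seventh attempt onward. EventSource gives SSE clients a fixed-interval version of this behavior for free.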
One-Line Decision
Need to watch data, prioritizing scalability → SSE
Need to talk back, low-latency bidirectional → WebSocket
Most monitoring, notification, and real-time data stream scenarios can just use SSE.
