typescript

Node.js Scalability: Implementing a Node.js Cluster

Node.js executes your JavaScript on a single thread: one process, one event loop, effectively one CPU core. On a 16-core machine, the other 15 cores sit mostly idle.

29 Apr 2024

The cluster module fixes this. It forks your process into multiple workers, typically one per CPU core, each handling requests independently.

Implementation

TypeScript
// The default import needs esModuleInterop (or an ESM build).
import cluster from 'node:cluster';
import * as os from 'node:os';

// isPrimary replaces isMaster, which is deprecated since Node.js 16.
if (cluster.isPrimary) {
  const numCores = os.cpus().length;

  // Fork one worker per logical CPU core.
  for (let i = 0; i < numCores; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} exited (code: ${code}, signal: ${signal})`);
    cluster.fork(); // Replace dead workers automatically
  });
} else {
  console.log(`Worker ${process.pid} started`);
  // Start your HTTP server here
}

The primary process (called the master before Node.js 16) doesn't run your request handlers. It spawns workers, distributes incoming connections among them, and restarts them if they crash. Each worker is a full copy of your application running in its own process.
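To make the "start your HTTP server here" step concrete, here is a minimal sketch of what each worker might run. The helper name `createWorkerServer`, the response text, and port 3000 are assumptions for illustration, not part of the original setup. Workers in a cluster can all listen on the same port, so every worker can call the same code.

```typescript
import * as http from 'node:http';

// Hypothetical helper: builds the HTTP server that each worker would run.
function createWorkerServer(): http.Server {
  return http.createServer((req, res) => {
    // Include the PID so you can see which worker answered.
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(`Handled by worker ${process.pid}\n`);
  });
}

// In the worker branch (port 3000 is an assumed value):
//   createWorkerServer().listen(3000);
```

Hitting the port repeatedly should show different PIDs in the responses as connections land on different workers.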

What you get

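  • Throughput scales with cores. For CPU-bound request handling, four cores give up to roughly four times the request capacity; expect less if the bottleneck is a database or another downstream service.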
  • Fault isolation. One worker crashing doesn't take down the others.
  • Zero-downtime restarts. Roll workers one at a time for deployments.
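Rolling workers one at a time could look roughly like the sketch below. `rollingRestart` is a hypothetical helper meant to run in the primary process. It assumes every worker starts a server (so the `'listening'` event fires), and it assumes the `'exit'` handler is adjusted to skip workers whose `worker.exitedAfterDisconnect` is true. Without that adjustment, each intentional shutdown would also trigger an extra fork.

```typescript
import cluster from 'node:cluster';
import { once } from 'node:events';

// Hypothetical helper: replaces workers one at a time. Primary process only.
async function rollingRestart(): Promise<number> {
  // Snapshot the current workers so replacements are not restarted again.
  const oldWorkers = Object.values(cluster.workers ?? {});
  let replaced = 0;

  for (const oldWorker of oldWorkers) {
    if (!oldWorker) continue;

    // Bring up the replacement first and wait until it accepts connections.
    const replacement = cluster.fork();
    await once(replacement, 'listening');

    // Then let the old worker finish in-flight requests and exit.
    const exited = once(oldWorker, 'exit');
    oldWorker.disconnect();
    await exited;
    replaced++;
  }

  return replaced; // number of workers that were swapped out
}
```

How you trigger this is up to you; a signal handler in the primary is one common choice. Starting the replacement before disconnecting the old worker keeps capacity constant during the roll, at the cost of briefly running one extra process.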

The trade-off

Workers don't share memory. If your app relies on in-memory state (session stores, caches, counters), that state is per-worker. Two requests from the same user might hit different workers and see different data.

Solutions: use Redis or a database for shared state. Or use sticky sessions to pin users to specific workers (but that reduces the load-balancing benefit).
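For the sticky-session option, the core idea is a stable mapping from a client to a worker. The sketch below shows only that mapping, under the assumption that the client's IP address is the key. `hashString` and `pickWorkerIndex` are hypothetical helpers, and how the connection is then handed to the chosen worker is left out.

```typescript
// Hypothetical helper: FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function hashString(value: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < value.length; i++) {
    hash ^= value.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Hypothetical helper: maps a client key (here an IP address) to a worker slot.
function pickWorkerIndex(clientKey: string, workerCount: number): number {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new RangeError('workerCount must be a positive integer');
  }
  return hashString(clientKey) % workerCount;
}
```

The same key always lands on the same slot as long as the worker count stays fixed. Changing the count remaps most clients, which is one more reason to prefer an external store for state you cannot afford to lose.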

Also, for many production deployments, you might not need cluster at all. Container orchestrators like Kubernetes or process managers like PM2 handle multi-instance scaling at a higher level. Use cluster when you want process-level parallelism within a single machine without external tooling.