When building an application today, a developer will want to take advantage of multi-core CPUs for performance. JavaScript runs on a single thread, meaning that only one thread executes your code at any given time, so by default a Node.JS application can use only a single CPU core. To work around this, Node.JS provides the Cluster API. In this article we will look at the Cluster API, how it creates child processes, and how to use it.

Clustering - Multi-core utilization for Node.JS

By default, a Node.JS process runs on a single CPU core. To take advantage of a multi-core machine, we can use the Cluster API of Node.JS to distribute load across several processes and run tasks in parallel.

The cluster module is built into Node.JS. It takes care of spawning the worker processes and distributing incoming connections among them, so the programmer only needs to worry about the application code.

Cluster API

To see how the Node.JS Cluster API can improve performance, we first create a sample Express app and then convert it to use the Cluster API.

//server.js
const express = require("express");
const app = express();
const port = 3000;

app.get("/", (req, res) => {
  res.send("Hello World!");
});

app.get("/api/:n", function (req, res) {
  let n = parseInt(req.params.n);
  let count = 0;

  if (n > 5000000000) n = 5000000000;

  for (let i = 0; i <= n; i++) {
    count += i;
  }

  res.send(`Final count is ${count}`);
});

app.listen(port, () => {
  console.log(`App listening on port ${port}`);
});

The example above is not something we would find in real-world use; I chose it only to show the difference between code with and without the cluster module. It creates a CPU-intensive task so that the difference is clearly visible. Let’s dissect the code above.

Code Explanation

The code above has two endpoints: the root / and /api/:n. We are interested in /api/:n. This endpoint takes a number n and returns the sum of all numbers from 0 up to n. If the supplied n is bigger than 5 billion, it is capped at 5 billion.

Now run the app using node server.js and open http://localhost:3000/api/5000000000 in a browser. The app will take a few seconds to respond to that request. Try opening a few more browser tabs and sending several requests at once. Each additional request has to wait for the ones before it, so it takes a few seconds longer to be processed.
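The queuing described above can be reproduced without Express. The following is a minimal sketch, assuming only Node.JS itself; the helper name sumUpTo and the loop size are illustrative choices, not part of the app. It schedules a timer for 0 ms and then runs the same kind of synchronous loop, so the timer can only fire once the loop has finished.

```javascript
// Same summing loop as the /api/:n handler, extracted into a helper.
// (sumUpTo is a hypothetical name used only for this sketch.)
function sumUpTo(n) {
  let count = 0;
  for (let i = 0; i <= n; i++) {
    count += i;
  }
  return count;
}

const scheduledAt = Date.now();

// Ask for the callback to run "immediately" (0 ms).
setTimeout(() => {
  // This runs only after the synchronous loop below has returned,
  // because the single JavaScript thread was busy until then.
  console.log(`Timer fired after ${Date.now() - scheduledAt} ms (asked for 0 ms)`);
}, 0);

// Keep the thread busy; a pending HTTP request would wait just like the timer.
console.log(`Sum: ${sumUpTo(5e7)}`);
```

The reported delay depends on the machine, but it will be roughly the time the loop took, which is the same effect a second browser tab experiences.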

Master communication with workers

Communication between the master and the worker processes happens through an IPC (inter-process communication) channel, which allows messages to be passed back and forth between parent and child. The cluster module uses worker.send() in the master and process.send() in the worker to send messages, and the 'message' event (worker.on('message') and process.on('message') respectively) to receive them.

Advantages

  • Child processes can be respawned when a child process dies

  • All cores are utilized, improving performance

  • Resource wastage is reduced

  • Easy to implement, as it is a built-in module

Disadvantages

  • Session management is not available

  • Managing IPC is tedious

How to use Node.JS Cluster API

Now let’s convert the code from the previous example to use the Cluster API and see if that helps.

//server_with_cluster.js
const express = require("express");
const port = 3000;
const cluster = require("cluster");
const CPUCount = require("os").cpus().length;

// Note: since Node.JS 16, cluster.isMaster is a deprecated alias of cluster.isPrimary.
if (cluster.isMaster) {
  console.log(`Total no of CPU Cores available are:  ${CPUCount}`);
  console.log(`Master ${process.pid} is running`);

  // Fork child workers
  for (let i = 0; i < CPUCount; i++) {
    cluster.fork();
  }

  cluster.on("exit", (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} died`);
    console.log("Let's fork another worker!");
    cluster.fork();
  });
} else {
  const app = express();
  console.log(`Worker ${process.pid} started`);

  app.get("/", (req, res) => {
    res.send("Hello World!");
  });

  app.get("/api/:n", function (req, res) {
    let n = parseInt(req.params.n);
    let count = 0;

    if (n > 5000000000) n = 5000000000;

    for (let i = 0; i <= n; i++) {
      count += i;
    }

    res.send(`Final count is ${count}`);
  });

  app.listen(port, () => {
    console.log(`App listening on port ${port}`);
  });
}

Code Dissection

In the above code snippet, we converted our single-process code to use the Node.JS Cluster API. The process that is started first acts as the master: it does not serve requests itself, but forks one worker process per CPU core and forks a replacement whenever a worker exits. Each worker runs its own copy of the Express app, all workers share port 3000, and incoming connections are distributed among them.

The Express framework is used here only as an example; the cluster module does not care which framework the application uses. In fact, it works independently of the framework (if implemented well).

Now, if you open a browser and request the same URL in three tabs, you will see that most of them respond in a similar time. In the previous implementation, the second request had to wait for the first one to be processed before it could be taken up.
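The exit handler in the snippet forks a replacement for every worker that exits, including workers that were shut down on purpose. A slightly more defensive variant is sketched below. The helper names shouldRespawn and installRespawn are made up for this example; worker.exitedAfterDisconnect is a property of cluster workers that is true when the exit followed worker.disconnect() or worker.kill().

```javascript
const cluster = require("cluster");

// Decide whether an exited worker should be replaced.
// Intentional shutdowns (disconnect/kill) are not respawned.
function shouldRespawn(worker) {
  return !worker.exitedAfterDisconnect;
}

// Call this once in the master process, after forking the initial workers.
function installRespawn() {
  cluster.on("exit", (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} exited (code=${code}, signal=${signal})`);
    if (shouldRespawn(worker)) {
      console.log("Unexpected exit, forking a replacement worker");
      cluster.fork();
    }
  });
}
```

This is only one possible policy; an application might also want to limit how quickly workers are restarted if they crash during startup.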

Performance Metrics:

To benchmark this, you can use any benchmarking tool (e.g. Apache Benchmark) to load test both applications. The example was tested using both approaches: the server without the cluster module and the server with it. Results were great, and performance improved by 60-70% with 4 cores for CPU-intensive tasks. Results may vary depending on what is being executed and the number of cores being used.

Let’s run a load test against our two apps to see how much it improves with clustering. We’ll use the loadtest package for this.

Using the loadtest package, we can simulate a large number of concurrent connections against our application's endpoints to measure its performance. If loadtest is not already installed, you can install it using the following command:

npm install -g loadtest

Now run the app without the cluster module using node server.js. Then open another terminal and run the load test using the following command:

loadtest http://localhost:3000/api/500000 -n 1000 -c 100

The command above will simulate 1000 requests to our /api/:n endpoint with n set to 500000, with a concurrency of 100 requests. The following is the output from running the above command:

Requests: 0 (0%), requests per second: 0, mean latency: 0 ms

Target URL:          http://localhost:3000/api/500000
Max requests:        1000
Concurrency level:   100
Agent:               none

Completed requests:  1000
Total errors:        0
Total time:          1.268364041 s
Requests per second: 788
Mean latency:        119.4 ms

Percentage of the requests served within a certain time
  50%      121 ms
  90%      132 ms
  95%      135 ms
  99%      141 ms
 100%      142 ms (longest request)

We can observe that, for n = 500000, the server without the cluster module was able to handle 788 requests per second with a mean latency of 119.4 milliseconds (the average time taken to serve a single request).

Now stop the non-clustered app server.js, run the clustered one with node server_with_cluster.js, and finally run the load test again using the following command:

loadtest http://localhost:3000/api/500000 -n 1000 -c 100

Below are the results of load testing the clustered app server_with_cluster.js:

Requests: 0 (0%), requests per second: 0, mean latency: 0 ms

Target URL:          http://localhost:3000/api/500000
Max requests:        1000
Concurrency level:   100
Agent:               none

Completed requests:  1000
Total errors:        0
Total time:          0.701446328 s
Requests per second: 1426
Mean latency:        65 ms

Percentage of the requests served within a certain time
  50%      61 ms
  90%      81 ms
  95%      90 ms
  99%      106 ms
 100%      112 ms (longest request)

We can observe that, for the same n = 500000, the app that uses the Node.JS Cluster API was able to handle 1426 requests per second, a significant increase of about 80% over the 788 requests per second of the app without clustering. The mean latency of the clustered app is 65 milliseconds, compared to 119.4 milliseconds for the non-clustered app. You can clearly see the improvement that clustering added to the app.
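For readers who want to check the percentages, the arithmetic can be sketched as follows. The figures are copied from the two loadtest runs above; the helper name percentChange is just an illustrative choice.

```javascript
// Figures copied from the two loadtest outputs above.
const single = { rps: 788, meanLatencyMs: 119.4 };
const clustered = { rps: 1426, meanLatencyMs: 65 };

// Relative change from a to b, expressed as a percentage of a.
function percentChange(a, b) {
  return ((b - a) / a) * 100;
}

const throughputChange = percentChange(single.rps, clustered.rps);
const latencyChange = percentChange(single.meanLatencyMs, clustered.meanLatencyMs);

console.log(`Throughput change: ${throughputChange.toFixed(1)}%`);
console.log(`Mean latency change: ${latencyChange.toFixed(1)}%`);
```

With these inputs the throughput rises by roughly 81% and the mean latency drops by roughly 46%.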

Conclusion:

Clustering shines when it comes to CPU-intensive tasks. If your application is likely to run CPU-intensive tasks, clustering will increase the number of such tasks it can run at a given time and improve the performance of the application.

However, if your application is not running a lot of CPU-intensive tasks, clustering might not be worth adding. In that case it only adds the overhead of spawning and managing so many workers.

Remember, each process you create has its own memory and V8 instance. Because of these additional resource allocations, spawning a large number of child Node.JS processes is not always recommended.
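One simple way to keep the number of processes in check is to cap the worker count instead of always forking one worker per core. The sketch below assumes a hypothetical limit, MAX_WORKERS, and a hypothetical helper, workerCount; neither is part of the cluster API, and the right limit depends on how much memory each worker needs.

```javascript
const os = require("os");

// Hypothetical upper bound on workers; tune it to the memory you can spare.
const MAX_WORKERS = 4;

// Never fork more workers than cores or than the cap, and always at least one.
function workerCount(cpuCount, cap = MAX_WORKERS) {
  return Math.max(1, Math.min(cpuCount, cap));
}

const cores = os.cpus().length;
console.log(`Cores: ${cores}, workers to fork: ${workerCount(cores)}`);

// Resident memory of this single process, as a rough per-process baseline.
const rssMb = process.memoryUsage().rss / (1024 * 1024);
console.log(`This process uses about ${rssMb.toFixed(1)} MB of memory`);
```

In server_with_cluster.js, the fork loop could then run up to workerCount(CPUCount) instead of CPUCount.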

Further Reading

There is a tool that can help manage the processes a bit better: the PM2 process manager. PM2 is a production process manager for Node.JS applications with a built-in load balancer. When properly configured, PM2 will automatically run your app in cluster mode, spawn workers for you, and take care of spawning new workers when a worker dies. PM2 makes it easy to stop, delete, and start processes, and it also has some monitoring tools that can help you monitor and tweak your app’s performance. We will discuss PM2 in detail in future articles.
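As a small preview, a PM2 configuration for cluster mode might look like the sketch below. The app name is a placeholder, and the script path assumes the non-clustered server.js from earlier in this article, since PM2 does the forking itself in cluster mode.

```javascript
// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: "cluster-demo",   // placeholder name, pick your own
      script: "./server.js",  // the plain app; PM2 handles the clustering
      instances: "max",       // one process per available CPU core
      exec_mode: "cluster",   // run the app in PM2's cluster mode
    },
  ],
};
```

With PM2 installed, the app can then be started with pm2 start ecosystem.config.js. The same effect is available without a config file through pm2 start server.js -i max.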


About The Author

I am Pankaj Baagwan, a System Design Architect. A computer scientist at heart, process enthusiast, and open source author/contributor/writer. I advocate Karma and love working with cutting-edge, fascinating, open source technologies.

Stay tuned <3. Signing off for RAAM