Event based model, its power and pitfalls
First of all, let's understand what an event is. An event is a significant change in a system's state. Events can carry the state (the data associated with the event) or simply be identifiers that signal that a state change has occurred.
The use of events offers several benefits, including improved agility, scalability, and the ability to react quickly to changes, allowing for real-time data processing and decision-making.
Several well-known web engines already implement an event-based model: Node.js, Nginx, Tornado, Jetty, Lighttpd, etc. The most prominent representative of the event-based model is arguably Node.js, an open-source, cross-platform JavaScript runtime environment. So the rest of this discussion is based mostly on Node.js.
Node.js structure
Node.js is built on several dependencies: V8, libuv, http-parser (replaced by llhttp since Node.js 12), c-ares, OpenSSL, zlib, etc.
The Node.js engine can be roughly split into two main parts: V8 and libuv. V8 is about 70% C++ and 30% JavaScript, while libuv is written almost entirely in C.
V8 implements ECMAScript and WebAssembly. It compiles and executes JavaScript source code, handles memory allocation for objects, and garbage-collects objects that are no longer needed.
ECMAScript is the language standard that JavaScript implements. Its sixth edition, released in 2015 (known as ECMAScript 2015 or ES6), was a major revision, and the standard has been updated yearly since then. ECMAScript keeps evolving, and the V8 engine (and Node.js with it) evolves in parallel so that it can support the latest edition. For example, ES2022 (ES13) features are almost fully supported in the current LTS version of Node.js (20.11.x), which is built against Chrome's V8 engine (11.3.244.x).
Note: ECMAScript support in the V8 engine is usually added feature by feature rather than version by version. This means that even if a specific ECMAScript version is not explicitly mentioned, individual features from that version may or may not be supported, depending on the V8 engine version used by Node.js (for details see node.green). By the way, the following command checks the V8 engine version that ships with a specific Node.js version, which gives you an idea of the ECMAScript features supported by your Node.js installation:
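One way to do this, as a minimal sketch that relies only on the standard process.versions object (the shell one-liner in the comment is an equivalent assumed for illustration):

```javascript
// Print the version of Node.js and of the V8 engine bundled with it.
// Roughly the same from a shell: node -p process.versions.v8
const { node, v8 } = process.versions;
console.log(`Node.js ${node} ships V8 ${v8}`);
```

The V8 version can then be looked up on node.green to see which ECMAScript features it covers.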
Note: Here I have used the latest LTS version of Node.js:
WebAssembly (WASM) is a high-performance, assembly-like binary instruction format that serves as a compilation target for various languages (C and C++, Rust, Go, Perl, JavaScript, TypeScript, Java, Python, PHP, etc.). It's important to note that WASM does not replace JavaScript. It complements it by allowing developers to use their preferred programming languages for web development while offering near-native performance.
libuv is an open-source library that handles the thread pool, signaling, inter-process communication, and all the other magic needed to make asynchronous tasks work at all. It was originally written in C for Node.js itself, but by now many other projects use it as well. Many people think libuv is the event loop itself, but that is not quite true: libuv implements a full-featured event loop and is also home to several other key parts of Node.js.
In a Node.js server (event-based model), every request is put into an “event queue”, which is then processed by the server's single thread, known as the “event loop”.
The event loop continually checks the event queue for pending events or callbacks and executes them in the order they were added. It runs in a loop, handling events as they become available, which enables asynchronous programming in Node.js.
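The ordering described above can be observed with a small experiment. This is only a sketch: the variable names order and orderReady are illustrative, and it shows just three kinds of work (synchronous code, a promise callback, and a timer callback).

```javascript
// Records the order in which differently scheduled callbacks actually run.
const order = [];

setTimeout(() => order.push('timeout'), 0);             // timer callback, queued for the event loop
Promise.resolve().then(() => order.push('microtask'));  // runs right after the current synchronous code
order.push('sync');                                     // plain synchronous code runs first

// A later timer fires after the first one, so the full order is known by then.
const orderReady = new Promise((resolve) => {
  setTimeout(() => resolve(order), 10);
});

orderReady.then((result) => console.log(result.join(' -> ')));
// prints: sync -> microtask -> timeout
```

The timer callback runs last even with a delay of 0, because it has to wait for the event loop to pick it up after the current code and its promise callbacks have finished.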
Detailed explanations of how the event loop works can be found in a number of articles. So, to save time, we will concentrate here on some of the more important aspects of event loop processing.
Event Loop Blocking
Node.js operates primarily on a single thread for executing JavaScript code, which is managed by the V8 engine. However, the libuv library hands certain operations that would otherwise block the event loop (some I/O operations and other tasks) over to additional threads, which makes Node.js effectively multithreaded for those operations.
Node.js uses 7 threads by default per process:
An additional 4 threads are used for DNS lookup operations when Node.js is used as a web client.
Although Node.js supports asynchronous operations, there are still some synchronous tasks that block the main thread until they complete. libuv (the embedded library) provides a pool of additional threads to which some of these operations can be offloaded, distributing the CPU load.
The libuv library was created to abstract and handle asynchronous non-blocking I/O operations like:
This library is also responsible for giving Node.js multithreading, that is, a pool of threads inside the Node.js process on which such tasks can run. By default the thread pool consists of four threads, created to handle heavy-duty tasks that shouldn't run on the main thread. With this setup, the application is not blocked by these tasks.
Unfortunately, Node.js has no way to recognize most CPU-bound tasks and move them to additional threads automatically. So, when JavaScript features are used incorrectly (typically by calling synchronous methods), event loop processing can be blocked completely.
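As a rough illustration of the thread pool at work, the sketch below starts four password hashes with the asynchronous crypto.pbkdf2, which Node.js executes on libuv pool threads. The helper name hashInPool and the chosen parameters are illustrative assumptions. The pool size itself can be changed with the UV_THREADPOOL_SIZE environment variable.

```javascript
const crypto = require('node:crypto');

// Each asynchronous pbkdf2 call is executed on a libuv thread-pool thread,
// so the main thread stays free while the hashes are computed.
function hashInPool(label) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2('secret', 'salt', 50000, 64, 'sha512', (err, key) => {
      if (err) return reject(err);
      resolve({ label, hex: key.toString('hex') });
    });
  });
}

const hashes = Promise.all([1, 2, 3, 4].map((n) => hashInPool(`task ${n}`)));
console.log('main thread is not blocked'); // printed before any hash is ready

hashes.then((results) => console.log(`${results.length} hashes computed`));
```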
Obviously, the easiest way to block a Node.js application is to insert an infinite loop. This kind of behavior is introduced more easily than you might think, even by good programmers. Let's look at an example with a while loop:
The issue with this code (left pane) is that the while loop runs synchronously and never returns control to the event loop, while using the CPU very intensively. As a result, we see the line "waiting" repeated endlessly, and the setTimeout callback never gets a chance to run. In such a case the event loop is blocked completely, and no other request can be processed at all.
In Node.js, handling asynchronous operations within loops, such as a "while" loop, can be challenging due to the nature of JavaScript's asynchronous execution model. However, there are strategies for managing asynchronous operations within loops effectively. The right pane shows one possible way to correct the code, by replacing the synchronous "while" loop with an asynchronous operation.
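The idea can be sketched as follows. This is not the article's original listing: the blocking variant is reconstructed in a comment from the description above, and the names sleep and waitUntilDone are illustrative.

```javascript
// Blocking variant (do not run as-is, it never ends):
//   let done = false;
//   setTimeout(() => { done = true; }, 100);
//   while (!done) { console.log('waiting'); } // the timer callback can never run
//
// Non-blocking variant: each iteration awaits a timer, which hands control
// back to the event loop so the setTimeout callback gets its turn.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitUntilDone() {
  let done = false;
  let iterations = 0;
  setTimeout(() => { done = true; }, 100);

  while (!done) {
    iterations += 1;
    await sleep(10); // yield to the event loop between checks
  }
  return iterations;
}

const waited = waitUntilDone();
waited.then((n) => console.log(`finished after ${n} checks`));
```

The loop still looks like a "while" loop, but every await is a point where the event loop can process timers and other pending events.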
Node.js application developers should remember that although blocking code can be simpler to understand, it can lead to performance issues in Node.js applications and even to critical situations. This is because a blocking operation prevents the event loop from handling any other operation until the blocking operation is complete.
Note also that Node.js, building on the asynchronous nature of JavaScript, relies heavily on callbacks, which allow the application to continue executing other tasks without waiting for the current operation to complete. However, the complexity of the operations performed in a callback affects how quickly it completes, and a slow callback blocks the event loop. Operations with high computational complexity, or those that involve blocking I/O (such as synchronous file system operations), can significantly slow down the execution of the callback. It's important to be mindful of the computational complexity of your callbacks and to avoid performing heavy operations directly in them.
Therefore, it's recommended to avoid blocking operations in Node.js applications whenever possible in order to take full advantage of the event-driven model. Moreover, note that Node.js, used mostly as a web server, is not really intended for CPU-intensive operations but rather for handling large numbers of web requests. So, in server-side applications, it's highly recommended to use the asynchronous versions of synchronous methods (where they exist, of course) to avoid blocking the event loop and to maintain high performance and scalability.
In Node.js, blocking operations are in most cases executed synchronously, meaning the program waits for each operation to finish before moving on to the next one. Here is a list of some blocking operations in Node.js:
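One common way to keep a heavy callback from monopolizing the event loop is to split the work into slices. The sketch below assumes a hypothetical helper, sumInChunks, and uses setImmediate to yield between slices. It illustrates the technique rather than serving as a general-purpose utility.

```javascript
// Sums the numbers 1..n in slices, yielding to the event loop between slices
// so that other callbacks (timers, I/O) can run while the work is in progress.
function sumInChunks(n, chunkSize = 100000) {
  return new Promise((resolve) => {
    let total = 0;
    let next = 1;

    function runChunk() {
      const end = Math.min(next + chunkSize - 1, n);
      for (; next <= end; next += 1) total += next;
      if (next <= n) {
        setImmediate(runChunk); // let pending events run before the next slice
      } else {
        resolve(total);
      }
    }
    runChunk();
  });
}

const chunkedSum = sumInChunks(1000000);
chunkedSum.then((total) => console.log(total)); // prints 500000500000
```

Slicing keeps the server responsive but does not make the computation itself faster, since it still runs on the main thread.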
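Synchronous file system calls are a typical case. The sketch below contrasts fs.readFileSync with its asynchronous counterpart fs.promises.readFile. The temporary file name is a hypothetical one created only for the demonstration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// A throwaway file in the OS temp directory (hypothetical name, for illustration only).
const file = path.join(os.tmpdir(), `event-loop-demo-${process.pid}.txt`);
fs.writeFileSync(file, 'hello');

// Blocking: nothing else can run on the main thread until the read finishes.
const syncText = fs.readFileSync(file, 'utf8');

// Non-blocking: the read runs in the background and the event loop stays free.
const asyncText = fs.promises.readFile(file, 'utf8');

asyncText
  .then((text) => console.log(syncText === text)) // prints true
  .finally(() => fs.promises.unlink(file));       // clean up the demo file
```

In a request handler, the synchronous form would stall every other request for the duration of the read, while the promise-based form would not.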
Worker Threads in Node.js
Node.js operates primarily on a single thread for executing JavaScript code, which is managed by the V8 engine. However, Node.js uses additional threads (thanks to libuv) for handling I/O operations and other tasks, making it effectively multithreaded for certain operations. In addition, the worker_threads module can be used to create additional threads that execute CPU-intensive tasks in parallel, without blocking the main thread of your application. This is particularly useful for applications that require heavy computation or data processing, as it can significantly improve performance by utilizing multiple CPU cores. The worker_threads module was introduced in Node.js v10.x and has been stable since 12.x.
Worker threads differ from the threads offered by languages such as Java or C++ mainly in how they relate to the rest of the program. Each Node.js worker is backed by a real OS thread scheduled by the operating system kernel, but it runs its own isolated V8 instance with its own event loop, so it does not implicitly share JavaScript objects with other threads. This makes a worker considerably heavier than lightweight constructs such as Java virtual threads, green threads, or fibers, which are scheduled by the runtime rather than by the kernel.
To use worker threads, import the worker_threads module and create a new worker by calling the new Worker() constructor with the path to the JavaScript file you want to run in the worker thread. You can then use the worker's postMessage method to send data to the worker, and listen for messages from the worker using the on('message') event handler. Very easy, isn't it?
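A minimal sketch of these steps is shown below. To keep it in a single file, the worker's source is passed as a string with the eval: true option instead of a file path. The names workerSource and sumInWorker are illustrative.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source is inlined (eval: true) to keep the sketch in a single file;
// in a real project this would normally be a separate file passed by path.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (n) => {
    let sum = 0;
    for (let i = 1; i <= n; i += 1) sum += i; // CPU-bound work off the main thread
    parentPort.postMessage(sum);
  });
`;

function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    worker.once('message', (result) => {
      resolve(result);
      worker.terminate(); // the worker would otherwise keep the process alive
    });
    worker.once('error', reject);
    worker.postMessage(n);
  });
}

const workerSum = sumInWorker(1000);
workerSum.then((sum) => console.log(sum)); // prints 500500
```

Wrapping the message exchange in a promise is a design choice for this sketch. Plain event handlers work just as well.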
When to use Worker Threads
Worker threads are particularly beneficial in scenarios where you need to perform CPU-intensive tasks without blocking the main thread. It's important to note that worker threads are not suitable for I/O-bound tasks, as Node.js already provides efficient mechanisms for handling asynchronous I/O operations.
Here are some common use cases where worker threads can provide significant benefits:
The slowdown in all these operations is caused by the CPU spending a lot of time executing code, as opposed to reading data from disk or the network. They're iterative tasks, so increasing the number of passes performed in parallel is the best route to a performance improvement. Worker threads are a mechanism for achieving this.
Pool of worker threads
The number of Node.js worker threads is not explicitly limited by the Node.js runtime itself. Instead, the practical limit is determined by the system's resources, such as available memory and CPU cores. Each worker thread consumes system resources, and creating too many threads can lead to performance degradation or even system instability.
Launching a separate worker thread is quite a heavy operation, and creating many worker threads can quickly turn into a bottleneck. This can be mitigated by creating a pool of workers (not to be confused with the libuv thread pool) and reusing them, rather than creating and destroying a worker for each task.
Of course, a pool of worker threads can be created manually. Below is a simple example of how to implement a worker pool in Node.js:
First, the pool of worker threads is created in the main thread. Next, a number of tasks are sent to the worker threads for processing. As soon as a worker thread finishes processing a task, it posts a message to say so. The main thread also checks the state of the worker threads in the pool and exits as soon as all of them are idle (have no task to process).
The output of this code looks as follows:
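The flow described above can be sketched roughly like this. It is an illustrative reconstruction, not the original listing: the WorkerPool class, the squareWorker source, and the squaring task are all assumptions chosen to keep the example short.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source inlined via eval: true so the sketch fits in one file.
const squareWorker = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (n) => parentPort.postMessage(n * n));
`;

class WorkerPool {
  constructor(size) {
    this.workers = []; // every worker owned by the pool
    this.idle = [];    // workers waiting for a task
    this.queue = [];   // tasks waiting for a worker
    for (let i = 0; i < size; i += 1) {
      const worker = new Worker(squareWorker, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queues a task and resolves with the worker's reply.
  run(value) {
    return new Promise((resolve, reject) => {
      this.queue.push({ value, resolve, reject });
      this.dispatch();
    });
  }

  // Hands queued tasks to idle workers until one of the two runs out.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { value, resolve, reject } = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // the worker is idle again
        resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        reject(err);
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(value);
    }
  }

  // Terminates all workers so the process can exit.
  close() {
    return Promise.all(this.workers.map((w) => w.terminate()));
  }
}

const pool = new WorkerPool(2);
const squares = Promise.all([1, 2, 3, 4, 5].map((n) => pool.run(n)));
squares
  .then((results) => console.log(results)) // prints [ 1, 4, 9, 16, 25 ]
  .finally(() => pool.close());
```

Two workers serve five tasks here, so three tasks wait in the queue until a worker reports back. A failed worker is simply dropped in this sketch, whereas a production pool would replace it.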
Worker threads are the only way to get something similar to multithreading when you're programming with Node.js. CPU-intensive operations, background processing, and any parallel code execution can be implemented using worker threads. The module and the concept it implements come with several caveats though.
It is important to understand that worker threads are not threads in the traditional shared-memory sense. Each Node.js worker runs on its own OS thread, but inside an isolated instance of the V8 JavaScript engine with its own event loop. So, there's no implicit memory sharing between the main program and the worker thread. Instead, an event-based messaging system is provided so values can be exchanged between the threads. Additional simple examples of worker thread pools can be found on GitHub.
Of course, a worker thread pool can be created manually (as we did above), but fortunately a few popular libraries already exist that provide a more convenient and flexible interface for thread pooling.
Some examples of using the pool libraries mentioned above can be found in a few articles.
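The absence of implicit sharing can be observed directly. In the sketch below (names such as mutator and isolation are illustrative), a plain object sent with postMessage arrives in the worker as a copy, while a SharedArrayBuffer is one of the few ways to share memory explicitly.

```javascript
const { Worker } = require('node:worker_threads');

// The worker mutates whatever it receives and then reports back.
const mutator = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', ({ plain, shared }) => {
    plain.count += 1;                           // changes the worker's private copy only
    Atomics.add(new Int32Array(shared), 0, 1);  // changes memory visible to both threads
    parentPort.postMessage('done');
  });
`;

const plain = { count: 0 };
const shared = new SharedArrayBuffer(4); // explicitly shared memory
const view = new Int32Array(shared);

const isolation = new Promise((resolve, reject) => {
  const worker = new Worker(mutator, { eval: true });
  worker.once('error', reject);
  worker.once('message', () => {
    resolve({ plainCount: plain.count, sharedCount: Atomics.load(view, 0) });
    worker.terminate();
  });
  worker.postMessage({ plain, shared });
});

isolation.then((result) => console.log(result));
// prints { plainCount: 0, sharedCount: 1 }
```

The main thread's object is untouched because the message was cloned, whereas the counter in the shared buffer reflects the worker's update.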
In summary, Node.js primarily operates on a single thread for JavaScript execution but uses additional threads for I/O operations and other tasks, making it effectively multithreaded for certain operations. The exact number of threads can vary based on the type of operations and whether network I/O is involved.
Wrapping up
Using worker threads in Node.js offers several advantages and disadvantages, which are crucial to consider when deciding whether to implement them in your application.
Advantages
Disadvantages
Worker threads give Node.js developers a way to run code in parallel and avoid blocking the event loop, but this isn't multithreading in the traditional shared-memory sense.
It is important to note that the worker_threads module is an invaluable part of the Node.js ecosystem, because there's no other way of achieving multithreading and parallel processing within the confines of JavaScript. However, it's important to recognize the limitations of worker threads so you can make an informed choice about when to use them. Adding a worker in the wrong situation could reduce your app's performance and increase resource utilization.
Naturally, highly CPU-intensive code where performance is critical will run faster with native shared-memory threads (which are not available to JavaScript code in Node.js). However, worker threads are sufficient for most Node.js use cases.
Finally note that while worker threads in Node.js offer powerful capabilities for parallel execution and isolation, they come with significant overhead and complexity. It's essential to carefully consider the specific needs of your application and evaluate the performance implications before deciding to use worker threads.