Event-based model, its power and pitfalls

First of all, let's understand what an event is. An event is a significant change that affects a system's state. Events can carry that state (the data associated with the event) or simply act as identifiers that signal that a state change has occurred.

The use of events offers several benefits, including improved agility, scalability, and the ability to react quickly to changes, allowing for real-time data processing and decision-making.

Several well-known web engines already implement an event-based model: Node.js, Nginx, Tornado, Jetty, Lighttpd, etc. The most prominent representative of the event-based model, however, is Node.js - an open-source, cross-platform JavaScript runtime environment. The rest of this article will therefore focus mostly on the Node.js engine.

Node.js structure

Node.js is composed of a few dependencies: V8, libuv, http-parser, c-ares, OpenSSL, zlib, etc.

Normally the Node.js engine can be split into two main parts: V8 and libuv. V8 is about 70% C++ and 30% JavaScript, while libuv is written almost completely in C.

V8 implements ECMAScript and WebAssembly; it compiles and executes JavaScript source code, handles memory allocation for objects, and garbage-collects objects that are no longer needed.


ECMAScript is the standardized specification that JavaScript implements; its landmark revision, ECMAScript 2015 (also known as ES6), was released in 2015. ECMAScript is constantly evolving, and the V8 engine (and thus Node.js) naturally evolves in parallel so that it can support the latest ECMAScript version. Thus, for example, ES2022 (ES13) features are almost fully supported in the current LTS version of Node.js (20.11.x), as it is built against Chrome's V8 engine (11.3.244.x).

Note: ECMAScript support in the V8 engine is often provided on a feature-by-feature basis rather than a version-by-version basis. This means that even if a specific ECMAScript version is not explicitly mentioned, certain features from that version may still be supported (or not supported) if they are part of the V8 engine version used by Node.js (for details, see node.green). By the way, checking which V8 engine version ships with a specific Node.js version will give you an idea of the ECMAScript features supported by your Node.js installation:
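For example, a minimal way to perform this check is to read process.versions from Node.js itself (the printed version strings below are illustrative):

```javascript
// Print the Node.js version and the V8 version bundled with it
console.log(process.versions.node); // e.g. "20.11.1"
console.log(process.versions.v8);   // e.g. "11.3.244.8-node.17"
// or directly from the shell: node -p process.versions.v8
```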

Note: Here I have used the latest LTS version of Node.js.

WebAssembly (WASM) is a high-performance, assembly-like language that code written in various languages (C and C++, Rust, Go, Perl, JavaScript, TypeScript, Java, Python, PHP, etc.) can be compiled to. It's important to note that WASM does not replace JavaScript; it complements it, allowing developers to use their preferred programming languages for web development while getting near-native performance.


libuv is an open-source library that handles the thread pool, signaling, inter-process communication, and all the other magic needed to make asynchronous tasks work at all. It was originally written in C for Node.js itself, but by now many other projects use it as well. Many people think libuv is the event loop itself, but this is not true: libuv implements a full-featured event loop and is also home to several other key parts of Node.js.

In a Node.js server (event-based model), every request is put into an “event queue”, which is then processed by the web server's single thread, called the “event loop”.

The event loop continually checks the event queue for pending events or callbacks and executes them in the order they were added. It runs in a loop, handling events as they become available, which enables asynchronous programming in Node.js.
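A tiny illustration of this ordering: synchronous code runs to completion first, and the queued callback runs only when the event loop gets back to the queue.

```javascript
console.log('first');                                              // runs immediately
setTimeout(() => console.log('third (from the event queue)'), 0);  // queued callback
console.log('second');                                             // still synchronous
// Output: first, second, third (from the event queue)
```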

Excellent explanations of how the event loop works can be found in a number of articles, so, to save time, we will concentrate here on some of the more important aspects of event loop processing.

Event Loop Blocking

Node.js operates primarily on a single thread for executing JavaScript code, which is managed by the V8 engine. However, the libuv library used by Node.js is clever enough to “recognize” certain operations that could block the event loop and uses additional threads for handling I/O operations and other tasks, making Node.js effectively multithreaded for certain operations.

Node.js uses 7 threads by default per process:

  • 1 thread for the event loop,
  • 1 thread for JavaScript execution,
  • 1 thread for garbage collection,
  • 4 threads for libuv (used for file system, zlib, crypto, and other long-running operations).

An additional 4 threads are used for DNS lookup operations when Node.js is used as a web client.

Although Node.js supports asynchronous operations, there are still synchronous tasks that can block the main thread until they complete. The embedded libuv library provides a pool of additional threads for some of these synchronous operations, across which it can distribute the CPU load.

The libuv library was created to abstract and handle asynchronous non-blocking I/O operations like:

  • Asynchronous file operations
  • Asynchronous DNS resolution
  • Child processes
  • Signal handling
  • Named pipes
  • Timers
  • Asynchronous TCP and UDP sockets
  • Thread pooling

This library is also responsible for giving Node.js a form of multithreading: a pool of threads within a Node.js process on which synchronous tasks can ride. By default the thread pool consists of four threads, created to handle heavy-duty tasks that shouldn't run on the main thread. With this setup, our application is not blocked by these tasks.
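As a small sketch of this thread pool in action (assuming the default pool size of four; exact timings depend on your CPU), several asynchronous crypto.pbkdf2 calls run concurrently on the pool threads while the main thread stays free:

```javascript
// Five pbkdf2 jobs: the first four run in parallel on the default
// 4-thread libuv pool, the fifth has to wait for a free thread.
const crypto = require('node:crypto');

const start = Date.now();
for (let i = 1; i <= 5; i++) {
  crypto.pbkdf2('secret', 'salt', 500000, 64, 'sha512', () => {
    console.log(`job ${i} finished after ${Date.now() - start} ms`);
  });
}
// The pool size can be tuned with the UV_THREADPOOL_SIZE environment
// variable (it must be set before the pool is first used).
```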

Unfortunately, for many CPU-bound tasks Node.js has no chance of recognizing them and automatically utilizing additional threads. So, with incorrect usage of JavaScript features (basically, by using synchronous methods), event loop processing can be blocked completely.

Obviously, the easiest way to block a Node.js application is to insert an infinite loop. Sometimes this kind of behavior is created faster than you might think, even by good programmers. Let's see an example with a while loop:
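A minimal sketch of such code (the flag and message names are illustrative): a flag is supposed to be flipped by a setTimeout callback, but the busy while loop never yields control back to the event loop.

```javascript
let done = false;

setTimeout(() => {
  done = true;               // scheduled, but never gets a chance to run
  console.log('done');
}, 1000);

while (!done) {
  console.log('waiting');    // spins forever, monopolizing the CPU
}
```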

The issue with this code (left pane) is that the while loop uses the CPU very intensively and never yields control back to the event loop. As a result we see the line "waiting" repeated endlessly, and the setTimeout callback never gets a chance to run. The event loop in such a case is fully blocked, and no other request can be processed at all.

In Node.js, handling asynchronous operations within loops, such as a “while” loop, can be challenging due to the nature of JavaScript's asynchronous execution model. However, there are some strategies to manage asynchronous operations within loops effectively. The right pane shows (as one possible approach) how the code can be corrected by replacing the synchronous “while” loop with an asynchronous operation, as sketched below.
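A minimal sketch of one such correction (using setInterval to poll the flag instead of a busy while loop, so the event loop stays free):

```javascript
let done = false;

setTimeout(() => {
  done = true;
  console.log('done');       // now it actually runs after ~1 second
}, 1000);

const timer = setInterval(() => {
  if (done) {
    clearInterval(timer);    // stop polling once the work is finished
  } else {
    console.log('waiting');  // printed a few times, then the program exits
  }
}, 100);
```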

A Node.js app developer should remember that while blocking code can be simpler to understand, it can lead to performance issues in Node.js applications and even to critical situations. This is because blocking operations prevent the event loop from handling other operations until the blocking operation is complete.

Notice also that Node.js, building on JavaScript's asynchronous nature, makes heavy use of callbacks, which allow the application to continue executing other tasks without waiting for the current operation to complete. However, the complexity of the operations performed in a callback affects how quickly it completes and can therefore block the event loop. Operations with high computational complexity, or ones that involve blocking I/O (such as synchronous file system operations), can significantly slow down the execution of the callback. It's important to be mindful of the computational complexity of your callbacks and to avoid performing heavy operations directly in the callback.

Therefore, it's recommended to avoid blocking operations in Node.js applications whenever possible, to take full advantage of the event-driven model. Moreover, note that Node.js, used mostly as a web server, is basically not intended for CPU-intensive work but rather for handling huge numbers of web requests. So, in server-side applications it's highly recommended to use the asynchronous versions of synchronous methods (if they exist, of course) to avoid blocking the event loop and to maintain high performance and scalability.
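For instance, a minimal illustration with the fs module (the file path is hypothetical):

```javascript
const fs = require('node:fs');

// Blocking: the event loop waits until the whole file has been read
const data = fs.readFileSync('/path/to/large-file.txt', 'utf8');
console.log(data.length);

// Non-blocking: the read is handed off to libuv; the callback runs later
fs.readFile('/path/to/large-file.txt', 'utf8', (err, contents) => {
  if (err) throw err;
  console.log(contents.length);
});
```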

In Node.js, blocking operations are in most cases executed synchronously, meaning the program waits for each operation to finish before moving on to the next one. Here is a list of typical blocking operations in Node.js:

  • File System Operations: The fs module provides both synchronous and asynchronous methods for file system operations. The synchronous methods, like fs.readFileSync(), are examples of blocking operations.
  • Network Requests: While Node.js is designed for non-blocking I/O operations, making network requests synchronously leads to blocking behavior. The built-in http module deliberately offers no synchronous request methods, but synchronous wrappers (for example, third-party “sync” request packages) block execution until the request is completed.
  • CPU-Intensive Tasks: Any JavaScript code that performs CPU-intensive tasks without yielding control back to the event loop can be considered blocking.
  • Crypto Operations: The crypto module offers many synchronous methods for cryptographic operations. These methods block the execution until the cryptographic operation is finished.
  • Compression Operations: The zlib module provides synchronous methods for compression and decompression. These methods block the execution until the compression or decompression is complete.
  • Child Process Operations: The child_process module offers synchronous methods for spawning child processes. Naturally, these methods block the execution until the child process completes.
  • Database Operations: Interacting with databases is usually a long operation and can significantly impact performance.
  • Streaming Data: Node.js supports streaming, which is useful for handling large amounts of data that cannot be processed all at once; however, heavy synchronous processing of each chunk can still tie up the event loop.
  • Image and Video Processing: Resizing and compressing large videos or images is one of the most taxing compute tasks around and can take several seconds or more. This is common in applications that convert uploaded photos into thumbnails, as well as small and large formats.
  • Sorting and Searching Large Amounts of Data: Filtering and sorting data require extensive iteration to compare each value.

So what is the way to resolve the situation with a blocked event loop? To be clear, we are not talking about erroneous code that fully blocks the event loop and therefore program execution, but about long-running yet genuinely needed operations that lead to quite significant blocking. Unfortunately, there is no silver-bullet solution to this kind of Node.js problem; rather, each case needs to be addressed individually. Still, a few approaches are well known: clustering, child processes, and worker threads in Node.js are mechanisms designed to avoid blocking the event loop, thereby improving the performance and scalability of Node.js applications.
  • Clustering: The Node.js cluster module allows you to create multiple worker processes (child processes) that run simultaneously, sharing the same server port. Each worker process has its own event loop, memory, and V8 instance. This setup enables the application to take advantage of multi-core systems by distributing the load among the worker processes, and it prevents the event loop from being blocked by CPU-intensive operations, as each worker process can handle requests independently of the others. There are many articles about clustering, and one of them can be found here; a small sketch follows after this list.
  • Child Processes: Node.js provides the child_process module, which allows you to spawn child processes. These run as separate processes, not threads, each with its own main thread. This means that when you spawn a child process asynchronously, it does not block the event loop of your Node.js application: the parent Node.js process can continue executing other tasks while the child process runs independently in its own process space. Some explanation can be found here or here.
  • Worker Threads: Node.js worker threads are designed to help avoid blocking the event loop by offloading CPU-intensive tasks to separate threads, which allows the main event loop to remain responsive to I/O operations and other asynchronous tasks, thereby improving the overall performance and scalability of Node.js applications.
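As a minimal sketch of the clustering approach mentioned above (the port number is arbitrary), the primary process forks one worker per CPU core, and each worker handles requests with its own event loop:

```javascript
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

if (cluster.isPrimary) {
  // Fork one worker per CPU core; they all share the same server port
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, starting a new one`);
    cluster.fork();
  });
} else {
  http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(8000);
}
```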

Worker Threads in Node.js

Node.js operates primarily on a single thread for executing JavaScript code, which is managed by the V8 engine. However, Node.js uses additional threads (thanks to libuv) for handling I/O operations and other tasks, making it effectively multithreaded for certain operations. In addition, the worker_threads module can be used to create additional threads for executing CPU-intensive tasks in parallel, without blocking the main thread of your application. This is particularly useful for applications that require heavy computation or data processing, as it can significantly improve performance by utilizing multiple CPU cores. The worker_threads module was introduced in Node.js v10.x and has been stable since v12.x.

Worker threads and OS threads (also known as platform threads) differ primarily in how they are managed and in their relationship with the operating system. Worker threads are managed by the application or the runtime environment, whereas OS threads are managed by the operating system kernel. In this respect Node.js worker threads are similar to Java virtual threads and generally represent a lightweight concurrency primitive. Such threads can be implemented with various concurrency models, such as green threads or fibers, which allows for more flexible and efficient concurrency management, especially in scenarios where many threads are needed but blocking operations are common.

To use worker threads, import the worker_threads module and create a new worker by calling the new Worker() constructor with the path to the JavaScript file you want to run in the worker thread. You can then use the worker's postMessage() method to send data to the worker, and listen for messages from the worker using the on('message') event handler. Very easy, isn't it?
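A minimal sketch of this pattern (the file names and message payload are illustrative):

```javascript
// main.js — create a worker and exchange messages with it
const { Worker } = require('node:worker_threads');

const worker = new Worker('./worker.js');

worker.on('message', (result) => console.log('result from worker:', result));
worker.on('error', (err) => console.error(err));

worker.postMessage({ n: 21 }); // send data to the worker
```

```javascript
// worker.js — receive a message, do the "work", post the result back
const { parentPort } = require('node:worker_threads');

parentPort.on('message', ({ n }) => {
  parentPort.postMessage(n * 2); // stand-in for a CPU-heavy computation
});
```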

When should worker threads be used

Worker threads are particularly beneficial in scenarios where you need to perform CPU-intensive tasks without blocking the main thread. It's important to note that worker threads are not suitable for I/O-bound tasks, as Node.js already provides efficient mechanisms for handling asynchronous I/O operations.

Here are some common use cases where worker threads can provide significant benefits:

  • CPU-Intensive Operations: Worker threads are ideal for offloading compute-heavy operations to separate threads, allowing the main thread to remain responsive.
  • Image resizing: Resizing large images can take several seconds, and the delays add up quickly if you need to generate multiple sizes.
  • Video compression: Video compression is one of the most taxing compute tasks around. Worker threads can accelerate it by processing multiple frames in parallel.
  • Cryptography: Cryptographic operations are intentionally complex. Encrypting and decrypting files, generating secret keys, and performing signature verification can all create perceptible delays in a program.
  • Sorting and searching: Filtering and sorting data requires extensive iteration to compare each value.
  • Complex mathematical operations: Mathematical computation - such as generating primes, factorizing large numbers, and complex data analysis - is inherently CPU-intensive.

  • Parallel Processing: If you have a large process that can be broken down into smaller, independent tasks, worker threads can execute these tasks in parallel.
  • Improving Performance: By offloading CPU-intensive tasks to worker threads, the main thread can be freed up to handle other tasks, such as serving requests in a web application. This can lead to improved performance and responsiveness in applications that are CPU-bound.
  • Handling Large Amounts of Data: When dealing with large datasets that require processing, worker threads can be used to process these datasets in chunks, thereby reducing memory usage and improving the overall efficiency of the application.

The slowdown in all these operations is caused by the CPU spending a lot of time executing code, as opposed to reading data from disk or the network. They're iterative tasks, so increasing the number of passes performed in parallel is the best route to a performance improvement. Worker threads are a mechanism for achieving this.


Pool of worker threads

The number of Node.js worker threads is not explicitly limited by the Node.js runtime itself. Instead, the practical limit is determined by the system's resources, such as available memory and CPU cores. Each worker thread consumes system resources, and creating too many threads can lead to performance degradation or even system instability.

Launching a separate worker thread is quite a heavy operation and creating a lot of worker threads can quickly turn into a bottleneck; this can be mitigated by creating a pool of workers (not to be confused with the libuv thread pool). So, it’s recommended practice to create a pool of worker threads rather than creating and destroying them for each task.

Sure, a pool of worker threads can be created manually. Below is a simple example of how to implement a worker pool in Node.js:
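The following is a minimal, hand-rolled sketch of such a pool (the pool size, task list, and file names are illustrative):

```javascript
// pool.js — a simple manual worker pool
const { Worker } = require('node:worker_threads');

const POOL_SIZE = 4;
const tasks = [5, 10, 15, 20, 25, 30, 35, 40]; // dummy workload

const pool = [];
for (let i = 0; i < POOL_SIZE; i++) {
  const worker = new Worker('./task-worker.js');
  worker.busy = false;
  worker.on('message', (result) => {
    console.log(`worker ${worker.threadId} finished: ${result}`);
    worker.busy = false;
    dispatch(); // hand out the next pending task
  });
  pool.push(worker);
}

function dispatch() {
  // Assign pending tasks to idle workers
  for (const worker of pool) {
    if (!worker.busy && tasks.length > 0) {
      worker.busy = true;
      worker.postMessage(tasks.shift());
    }
  }
  // When no tasks remain and all workers are idle, shut the pool down
  if (tasks.length === 0 && pool.every((w) => !w.busy)) {
    pool.forEach((w) => w.terminate());
  }
}

dispatch();
```

```javascript
// task-worker.js — simulate a CPU-heavy task and report the result
const { parentPort } = require('node:worker_threads');

parentPort.on('message', (n) => {
  let sum = 0;
  for (let i = 0; i < n * 1e6; i++) sum += i; // busy work
  parentPort.postMessage(sum);
});
```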

First, in the main thread the pool of worker threads is created. Next, a number of tasks are sent to the worker threads for processing. As soon as a worker thread finishes processing a task, it posts a message about it. The main thread also tracks the state of the worker threads in the pool and shuts down as soon as all of them are idle (have no task left to process).

Running this code produces output along these lines: each worker reports when it has finished its task, and the program exits once every worker is idle.

Worker threads are the only way to get something similar to multithreading when you're programming with Node.js. CPU-intensive operations, background processing, and any parallel code execution can be implemented using worker threads. The module and the concept it implements come with several caveats though.

It's important to understand that worker threads aren't true threads in the usual sense; they are more like virtual threads in Java. Node.js worker threads work by spawning an isolated instance of Node's V8 JavaScript runtime, so there is no implicit memory sharing between the main program and the worker thread. Instead, an event-based messaging system is provided so values can be exchanged between them. Additional simple examples of worker thread pools can be found on GitHub.

Sure, a worker thread pool can be created manually (as we did above), but fortunately a few popular libraries already exist that provide a more convenient and agile interface for thread pooling.

  • Piscina: Piscina simplifies working with pools of workers by providing a convenient interface around the worker threads API. It allows you to create task queues, track their completion, and cancel tasks if necessary.
  • Bree: Bree is a job scheduler for Node.js that uses worker threads internally to run task code outside the main loop. It allows you to execute async tasks at specified intervals with concurrency limits, retry support, and cancellation.
  • Poolifier: This library allows you to create fixed or dynamic thread pools for executing tasks in parallel. It simplifies the management of worker threads by providing a pool of threads that can be reused, reducing the overhead of creating and destroying threads for each task.
  • Node-worker-threads-pool: A simple worker threads pool which supports StaticPool and DynamicPool.
  • Workerpool: Offers an easy way to create a pool of workers, both for dynamically offloading computations and for managing a pool of dedicated workers.

Some primers on using the above-mentioned pool libraries can be found in a few articles.
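As a quick illustration, a Piscina-based version of the pattern above might look roughly like this (a sketch based on the library's documented API; the file names are illustrative):

```javascript
// main.js — offload tasks to a Piscina-managed pool of worker threads
const path = require('node:path');
const Piscina = require('piscina');

const piscina = new Piscina({
  filename: path.resolve(__dirname, 'worker.js'),
});

(async () => {
  const result = await piscina.run({ a: 4, b: 6 });
  console.log(result); // 10
})();
```

```javascript
// worker.js — the exported function is executed inside a pool thread
module.exports = ({ a, b }) => a + b;
```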

In summary, Node.js primarily operates on a single thread for JavaScript execution but uses additional threads for I/O operations and other tasks, making it effectively multithreaded for certain operations. The exact number of threads can vary based on the type of operations and whether network I/O is involved.

Wrapping up

Using worker threads in Node.js offers several advantages and disadvantages, which are crucial to consider when deciding whether to implement them in your application.


Advantages

  • Parallel Execution: Worker threads allow for the parallel execution of CPU-intensive tasks, which can significantly improve the performance of Node.js applications by offloading heavy computations from the main thread.
  • Isolation: Each worker thread runs in its own V8 instance, which means it does not share memory with the main thread. This isolation prevents race conditions and makes it easier to manage memory.
  • Flexibility: Worker threads can be used for a wide range of tasks beyond CPU-bound operations, including background processing and parallel code execution.
  • Communication: Worker threads can communicate with the main thread using message passing, which is a powerful feature for coordinating tasks and sharing data between threads.

Disadvantages

  • Overhead: Creating and managing worker threads can introduce significant overhead, especially if not managed properly.
  • Not Suitable for I/O Tasks: Worker threads are not suitable for I/O-bound tasks. Node.js already provides efficient asynchronous I/O operations, and using worker threads for these tasks can be wasteful and inefficient.
  • Complexity: Implementing worker threads can add complexity to your application, as you need to manage the lifecycle of worker threads, handle communication between threads, and ensure proper error handling.
  • Memory Management: Each worker thread has its own memory space, which means sharing data between threads requires serialization and deserialization.


Worker threads give Node.js developers a way to run code in parallel and avoid blocking the event loop, but this isn't real multithreading.

It's important to note that the worker_threads module is, of course, an invaluable part of the Node.js ecosystem, because there's no other way of achieving multithreading and parallel processing within the confines of JavaScript. However, it's important to recognize the limitations of worker threads so you can make an informed choice about when to use them. Adding a worker in the wrong situation could reduce your app's performance and increase resource utilization.

Naturally, highly CPU-intensive code where performance is critical will run faster using real threads (out of reach in the Node.js environment); however, worker threads are sufficient for most Node.js use cases.

Finally note that while worker threads in Node.js offer powerful capabilities for parallel execution and isolation, they come with significant overhead and complexity. It's essential to carefully consider the specific needs of your application and evaluate the performance implications before deciding to use worker threads.

