The term "asynchronous" became massively popular with the wave of Web 2.0, which swept across the Web with JavaScript and AJAX. Yet asynchrony is rare in most high-level programming languages. PHP reflects this best: it not only executes in a synchronous, blocking manner, but also does not provide multithreading. This makes it easy for programmers to write business logic sequentially, but in complex network applications, blocking prevents it from achieving higher concurrency.
On the server side, I/O is very expensive, and distributed I/O is even more expensive. Only when the backend can respond with resources quickly can the front-end experience be good. Node.js is the first platform to take asynchrony as its primary programming model and design philosophy. Together with asynchronous I/O, event-driven design and the single-threaded model set the tone of Node. This article introduces how Node implements asynchronous I/O.
1. Basic concepts
"Async" and "non-blocking" sound the same thing, and in terms of actual results, both achieve the purpose of parallelism. But from the perspective of computer kernel I/O, there are only two ways: blocking and non-blocking. So asynchronous/synchronous and blocking/nonblocking are actually two different things.
1.1 Blocking I/O and non-blocking I/O
One feature of blocking I/O is that after the call is made, the caller must wait until the entire operation has completed at the system-kernel level before the call returns. Taking reading a file from disk as an example, the call ends only after the kernel has finished the disk seek, read the data, and copied it into memory.
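A minimal Node sketch of a blocking call (the file name 'data.txt' is just a placeholder): fs.readFileSync does not return until the kernel has finished the seek, the read, and the copy into memory, so JavaScript execution waits the whole time:

const fs = require('fs');

// The call below blocks: nothing else runs on this thread until the
// entire file has been read into memory.
const data = fs.readFileSync('data.txt', 'utf8');
console.log('read finished, length =', data.length);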
Blocking I/O leaves the CPU waiting on I/O; the waiting time is wasted and the CPU's processing power cannot be fully utilized. The characteristic of non-blocking I/O is that the call returns immediately, and the CPU time slice can then be used to handle other tasks. Since the I/O has not actually completed, what is returned immediately is not the data the business layer expects, but merely the status of the current call. To obtain the complete data, the application needs to repeatedly call the I/O operation to check whether it has finished (i.e. polling). The main polling techniques are the following:
1. read: checking the I/O status through repeated calls; the most primitive and lowest-performance method.
2. select: an improvement on read that judges event status on file descriptors; its drawback is a hard limit on the maximum number of file descriptors.
3. poll: an improvement on select that uses a linked list to avoid the maximum-number limit, but performance is still poor when there are many descriptors.
4. epoll: if no I/O event is detected during polling, it sleeps until an event occurs and wakes it up; this is the most efficient I/O event notification mechanism on Linux.
Polling satisfies the need of non-blocking I/O to eventually obtain the complete data, but for the application it still amounts to a kind of synchronization, because it still has to wait for the I/O to return completely. While waiting, the CPU is either used to traverse the states of file descriptors or put to sleep waiting for events to occur.
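The following toy sketch only illustrates the idea of polling; nonBlockingRead() is a hypothetical stand-in for a non-blocking system call, not a real Node or OS API:

// Each call returns immediately with only a status; the caller must keep
// asking until the data is ready, which keeps the CPU busy checking.
let checks = 0;
function nonBlockingRead() {            // hypothetical, for illustration only
  checks += 1;
  return checks < 5
    ? { done: false }                   // I/O not finished yet
    : { done: true, data: 'file contents' };
}

let result = nonBlockingRead();
while (!result.done) {                  // this loop is the polling
  result = nonBlockingRead();
}
console.log(result.data, '(after', checks, 'checks)');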
1.2 Asynchronous I/O in Ideal and Reality
Ideal asynchronous I/O would have the application initiate a non-blocking call and move on to the next task directly, without any polling; once the I/O completes, the data is passed to the application through a signal or a callback.
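In Node this ideal shows up directly in the callback style of the fs module; a minimal sketch (again assuming a placeholder file 'data.txt') looks like this:

const fs = require('fs');

// The call returns immediately; the data is delivered later through the
// callback once the I/O has completed.
fs.readFile('data.txt', 'utf8', function (err, data) {
  if (err) throw err;
  console.log('I/O completed, length =', data.length);
});

console.log('the next task runs without waiting for the read'); // printed first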
In reality, asynchronous I/O has different implementations on different operating systems: *nix platforms use a custom thread pool, while Windows uses the IOCP model. Node provides libuv as an abstraction layer that encapsulates the platform-compatibility judgments and ensures that the asynchronous I/O implementations of the upper Node layer and the lower platform layer stay independent of each other. It should be emphasized that when we say Node is single-threaded, this only means that JavaScript executes on a single thread; inside Node there are other thread pools that actually carry out the I/O tasks.
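A small sketch of this division of labor: the JavaScript below stays on one thread, while the reads are handed to libuv's thread pool (whose size can be adjusted with the UV_THREADPOOL_SIZE environment variable); the file names are placeholders:

const fs = require('fs');

['a.txt', 'b.txt', 'c.txt'].forEach(function (name) {
  fs.readFile(name, function (err, data) {
    // Each callback runs back on the main JS thread after its worker finishes.
    console.log(name, err ? 'failed' : 'read ' + data.length + ' bytes');
  });
});

console.log('all three reads dispatched without blocking'); // printed first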
2. Node's asynchronous I/O
2.1 Event loop
Node's execution model is essentially an event loop. When the process starts, Node creates a loop, and each pass through the loop body is called a Tick. Each Tick checks whether there are events waiting to be processed; if so, the events and their associated callback functions are taken out, the callbacks, if any, are executed, and then the next loop begins. If there are no more events to process, the process exits.
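The following is only a toy illustration of that loop, not Node's actual implementation; each iteration of the while loop plays the role of one Tick:

// A minimal stand-in for the observers: a queue of pending events.
const pendingEvents = [
  { callback: function () { console.log('event 1 handled'); } },
  { callback: function () { console.log('event 2 handled'); } }
];

let event;
while ((event = pendingEvents.shift()) !== undefined) { // one iteration = one Tick
  if (typeof event.callback === 'function') {
    event.callback();                                    // run the associated callback
  }
}
// The loop ends when no events remain, mirroring how the process exits.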
2.2 Observer
Each event loop has one or more observers, and by asking these observers we can determine whether there are events to be processed. The event loop is a typical producer/consumer model. In Node, events mainly come from network requests, file I/O, and so on, and they have corresponding observers such as the network I/O observer and the file I/O observer. The event loop takes events from the observers and processes them.
2.3 Request object
During the transition from JavaScript to the kernel actually performing the I/O operation, there is an intermediate product called the request object. Take fs.open() on Windows (open a file according to the specified path and parameters and obtain a file descriptor) as the simplest example: the JS call goes through the built-in module and then, via libuv, actually invokes the uv_fs_open() method. During this call an FSReqWrap request object is created, and the parameters and the method passed in from the JS layer are encapsulated in this request object; the callback function we care about most is set on the object's oncomplete_sym property. Once the object has been wrapped, the FSReqWrap object is pushed into the thread pool to wait for execution.
At this point the JS call returns immediately and the JS thread can continue with subsequent operations. The current I/O operation waits in the thread pool to be executed; this completes the first phase of the asynchronous call.
2.4 Execute callbacks
Callback notification is the second phase of asynchronous I/O. After the I/O operation in the thread pool has completed, the result is stored, IOCP is notified that the operation on the current object has finished, and the thread is returned to the thread pool. During each Tick, the event loop's I/O observer calls the relevant method to check whether there are completed requests in the thread pool; if there are, the request object is added to the I/O observer's queue and then processed as an event.
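The two phases can be observed from JavaScript with a minimal fs.open() sketch (the path is just an example):

const fs = require('fs');

fs.open('/tmp/example.txt', 'r', function (err, fd) {
  // Second phase: the callback from the request object runs once the
  // thread pool reports that the operation has completed.
  if (err) {
    console.error('open failed:', err.message);
    return;
  }
  console.log('callback executed, file descriptor =', fd);
  fs.close(fd, function () {});
});

// First phase: fs.open() has already returned, so this line prints first.
console.log('fs.open returned; JS keeps running');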
3. Non-I/O asynchronous API
Node also has some asynchronous APIs that are unrelated to I/O, such as the timers setTimeout() and setInterval(), and process.nextTick() and setImmediate(), which execute a task asynchronously right away. They are briefly introduced here.
3.1 Timer API
setTimeout() and setInterval() are consistent with the corresponding browser-side APIs. Their implementation principle is similar to that of asynchronous I/O, except that it does not involve the I/O thread pool. A timer created by calling a timer API is inserted into a red-black tree inside the timer observer. On each Tick of the event loop, timer objects are iterated out of the red-black tree and checked to see whether they have expired; if so, an event is formed and the callback function is executed immediately. The main problem with timers is that their timing is not particularly accurate (at millisecond granularity, within tolerance).
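A small sketch makes that tolerance visible: the callback fires on a later Tick, so the measured delay is usually somewhat longer than the 10 ms requested (and can be much longer if the loop is busy):

const start = Date.now();

setTimeout(function () {
  console.log('requested 10 ms, actual delay:', Date.now() - start, 'ms');
}, 10);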
3.2 Asynchronous task execution API
Before Node appeared, many people may have called the following in order to execute a task asynchronously right away:
setTimeout(function() {
// TODO
}, 0);
Because of the nature of the event loop, timers are not accurate enough; moreover, using a timer involves the red-black tree, whose operations have O(log(n)) complexity. The process.nextTick() method simply puts the callback function into a queue and takes it out for execution on the next Tick; its complexity is O(1), so it is more efficient.
In addition, there is a setImmediate() method that is similar to process.nextTick(): both delay the execution of a callback function. However, process.nextTick() has a higher priority than setImmediate(), because the event loop checks the observers in order. Also, process.nextTick() callbacks are stored in an array, and every callback in the array is executed in each round of Tick; setImmediate() results are stored in a linked list, and only one callback is executed per round of Tick.
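A short sketch of the priority difference (the output order in the comments is what the description above implies):

setImmediate(function () {
  console.log('setImmediate');        // printed last
});

process.nextTick(function () {
  console.log('process.nextTick');    // printed before setImmediate
});

console.log('synchronous code');       // printed first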
4. Event-driven and high-performance servers
The previous example illustrates how Node implements asynchronous I/O. In fact, Node also applies asynchronous I/O to network sockets, which is the basis on which Node builds web servers. The classic server models are:
1. Synchronous: only one request can be processed at a time; the remaining requests wait.
2. Per process/per request: start a process for each request; system resources are limited, so this does not scale.
3. Per thread/per request: start a thread for each request. Threads are lighter than processes, but each thread occupies a certain amount of memory, so when a large number of concurrent requests arrive, memory is exhausted quickly.
The famous Apache uses the per-thread/per-request form, which is why it struggles to cope with high concurrency. Node handles requests in an event-driven way, which saves the overhead of creating and destroying threads; at the same time the operating system has fewer threads to schedule, so context-switching costs are also very low. Node can handle requests in an orderly manner even with a large number of connections.
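A minimal event-driven server sketch (the port 8080 is an arbitrary example): every incoming request is handled as an event on the single JS thread, with no per-request process or thread:

const http = require('http');

const server = http.createServer(function (req, res) {
  // Each request arrives as an event; the callback runs on the main JS thread.
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('hello from an event-driven server\n');
});

server.listen(8080, function () {
  console.log('listening on http://localhost:8080');
});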
The well-known server Nginx also abandons the multi-threaded approach and uses the same event-driven model as Node; Nginx is now replacing Apache on a large scale. Nginx is written in pure C and performs very well, but it is only suited to being a web server, reverse proxy, or load balancer. Node can build the same functionality as Nginx, can also handle various specific business logic, and its own performance is good. In actual projects we can combine them so that the application achieves the best performance.