In traditional programming models, an I/O operation behaves like an ordinary local function call: the program blocks until the call returns and cannot continue in the meantime. Blocking I/O originated in the early time-sharing model, in which each process is like an independent person: the point is to keep everyone separate, and each person can usually do only one thing at a time and must wait for the previous task to finish before deciding what to do next. However, this "one user, one process" model, which is widely used on computer networks and the Internet, scales poorly. Managing many processes consumes a lot of memory, and context switching also takes up considerable resources. Both place a heavy burden on the operating system, and as the number of processes grows, system performance degrades sharply.
Multithreading is an alternative. A thread is a lightweight process that shares memory with the other threads in the same process. It is essentially an extension of the traditional model that allows multiple threads of execution to run concurrently. While one thread waits for an I/O operation, other threads can take over the CPU; when the I/O operation completes, the waiting thread is woken up. In other words, a running thread can be interrupted and resumed later. On some systems, threads can also run in parallel on different cores of a multi-core CPU.
Programmers cannot know when a given thread will run, so they must handle concurrent access to shared memory carefully, using synchronization primitives such as locks or semaphores to guard access to a data structure and force threads to execute in a particular order. Applications that rely heavily on state shared between threads are prone to strange bugs that appear at random and are hard to track down.
Another approach is cooperative multithreading, in which you are responsible for explicitly releasing the CPU and handing CPU time over to other threads. Because you control the scheduling of threads yourself, the need for synchronization is reduced, but the program becomes more complex and more error-prone, and the problems of multithreading are still not avoided.
What is event-driven programming
Event-driven programming is a programming style in which events determine the flow of execution of a program. Events are handled by event handlers, also called event callbacks: functions that are called when a specific event occurs, such as a database returning a query result or a user clicking a button.
Recall that in the traditional blocking I/O model, a database query might look like this:
result = query('SELECT * FROM posts WHERE id = 1');
do_something_with(result);
The query function above keeps the current thread or process waiting until the underlying database completes the query and returns.
In the event-driven model, the same query looks like this:
var query_finished = function(result) {
  do_something_with(result);
};
query('SELECT * FROM posts WHERE id = 1', query_finished);
First you define a function called query_finished, which contains what to do once the query completes. Then you pass this function as an argument to the query function. Instead of query simply returning the result, query_finished is called with the result when the query completes.
When an event you are interested in occurs, the function you defined is called instead of a result value simply being returned. This programming model is called event-driven programming or asynchronous programming, and it is one of Node's most distinctive features. It means that the current process is not blocked while performing I/O operations. As a result, multiple I/O operations can be in progress at the same time, and the corresponding callback function is called as each operation completes.
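The non-blocking behavior described above can be sketched in runnable form. This is only an illustration: the query function here is a hypothetical stand-in that simulates a database with setTimeout, and the shape of the result object is an assumption, not the API of any real driver.

```javascript
// Hypothetical stand-in for a database driver: it returns immediately
// and delivers a made-up result to the callback a little later.
function query(sql, callback) {
  setTimeout(function () {
    callback({ sql: sql, rows: [{ id: 1 }] });
  }, 10);
}

var log = [];

// Issue two queries back to back; neither call blocks.
query('SELECT * FROM posts WHERE id = 1', function (result) {
  log.push('first query finished');
});
query('SELECT * FROM posts WHERE id = 2', function (result) {
  log.push('second query finished');
});

// This line runs before either callback, because query() has
// already returned while both "queries" are still in progress.
log.push('both queries issued');
```

Both simulated queries are pending at the same time, and the callbacks only run once the main script has finished and the timers fire.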
Under the hood, event-driven programming relies on an event loop. An event loop is essentially a construct that performs two functions over and over: event detection and event handler triggering. On each iteration, the event loop detects which events have occurred; when an event has occurred, it finds the corresponding callback function and calls it.
The event loop is just a single thread running inside the process. When an event occurs, its event handler runs on its own and is not interrupted, that is:
1. At most one event handler is running at any given moment
2. No event handler is interrupted while it is running
With these guarantees, developers no longer need to worry about thread synchronization or concurrent modification of shared memory.
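The two guarantees can be seen in a toy event loop. This is a deliberately simplified sketch and not how Node implements its loop; the names on, emit, and runLoop are hypothetical helpers invented for this illustration.

```javascript
// A toy event loop, for illustration only.
var handlers = {};  // event name -> registered callback
var pending = [];   // events that have occurred but are not yet handled

function on(name, callback) { handlers[name] = callback; }
function emit(name, data) { pending.push({ name: name, data: data }); }

// Take one event at a time and run its handler to completion.
function runLoop() {
  while (pending.length > 0) {
    var event = pending.shift();
    var handler = handlers[event.name];
    if (handler) handler(event.data);
  }
}

var trace = [];
on('greet', function (who) {
  trace.push('start ' + who);
  emit('done', who);  // queued for later; does not interrupt this handler
  trace.push('end ' + who);
});
on('done', function (who) { trace.push('done ' + who); });

emit('greet', 'a');
emit('greet', 'b');
runLoop();
// trace: start a, end a, start b, end b, done a, done b
```

Each handler finishes before the next one starts, so the "start" and "end" entries for one event are never interleaved with those of another.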
A well-known secret:
The systems programming community has long known that event-driven programming is the best way to build high-concurrency services: it does not need to keep much context around, which saves a lot of memory, and it involves far fewer context switches, which saves a lot of execution time.
Gradually, this idea spread to other platforms and communities, and some well-known event loop implementations emerged, such as Ruby's EventMachine, Perl's AnyEvent, and Python's Twisted. There are many other implementations in other languages as well.
To develop with these frameworks, you need to learn framework-specific knowledge and framework-specific libraries. For example, when using EventMachine, to get the benefits of non-blocking I/O you have to avoid synchronous libraries and use only EventMachine's asynchronous libraries. If you use any blocking library (such as most of Ruby's standard library), your server loses its optimal scalability, because the event loop keeps getting blocked and the processing of I/O events is held up from time to time.
Node was designed from the start as a non-blocking I/O server platform, so in general you can expect all code running on it to be non-blocking. Because JavaScript is a very small language that does not impose any I/O model (it has no standard I/O library), Node could be built on a clean foundation with no legacy baggage.
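The effect of a blocking call on an event loop can be shown with a small sketch. It is written in JavaScript rather than Ruby, and the busy-wait loop is only an assumed stand-in for a call into a blocking library.

```javascript
var order = [];

// Ask the event loop to run this callback as soon as possible.
setTimeout(function () {
  order.push('timer callback ran');
}, 0);

// Stand-in for a blocking library call: spin for about 50 ms.
// While this loop runs, the event loop cannot dispatch anything.
var start = Date.now();
while (Date.now() - start < 50) {
  // busy wait
}
order.push('blocking call returned');
```

Although the timer was due immediately, its callback can only run after the blocking code has returned control to the event loop, so 'blocking call returned' is recorded first.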
How Node and JavaScript simplify asynchronous applications
Node's author, Ryan Dahl, initially developed the project in C, but found that maintaining the context of function calls was too complicated and led to highly complex code. He then switched to Lua, but Lua already had several blocking I/O libraries, and mixing blocking and non-blocking code could confuse developers and keep many of them from building scalable applications, so Dahl abandoned Lua as well. Finally he turned to JavaScript, whose closures and first-class functions make it very well suited to event-driven programming. The magic of JavaScript is one of the main reasons Node is so popular.
What is a closure
A closure can be thought of as a special kind of function: one that inherits and can access the variables of the scope in which it was defined. When you pass a callback function as an argument to another function, it will be called later. The magic is that when the callback is eventually called, it still remembers the context in which it was defined, along with the variables of the parent contexts, and can access them normally. This powerful feature is at the core of Node's success.
The following example shows how JavaScript closures work in a web browser. If you want to listen for a click event on a button, you can do this:
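As a minimal sketch of that idea outside the browser, consider a function that returns another function. The name makeCounter is a hypothetical example chosen for illustration.

```javascript
// makeCounter returns a function that keeps access to `count`
// even after makeCounter itself has returned.
function makeCounter() {
  var count = 0;
  return function () {
    count += 1;
    return count;
  };
}

var next = makeCounter();
var first = next();    // 1
var second = next();   // 2

// A second counter gets its own, independent `count`.
var other = makeCounter();
var otherFirst = other();  // 1
```

Each call to makeCounter creates a fresh scope, and the returned function is the only code that can still reach the count variable inside it.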
var clickCount = 0;
document.getElementById('myButton').onclick = function() {
  clickCount += 1;
  alert("clicked " + clickCount + " times.");
};
With jQuery, it looks like this:
var clickCount = 0;
$('button#mybutton').click(function() {
  clickCount++;
  alert('Clicked ' + clickCount + ' times.');
});
In JavaScript, functions are first-class objects, which means you can pass functions as arguments to other functions. Of the two examples above, the first assigns a function to a property (onclick), and the second passes a function as an argument to another function. The click handler (the callback) can access every variable in scope where the function was defined. In this example, it can access the clickCount variable defined in its enclosing scope.
The clickCount variable lives in the global scope (the outermost scope in JavaScript) and stores the number of times the user has clicked the button. Storing variables in the global scope is usually a bad habit, because they can easily conflict with other code; you should keep variables in the local scope where they are used. Most of the time, simply wrapping the code in a function creates another closure, which is an easy way to avoid polluting the global environment, like this:
(function() {
  var clickCount = 0;
  $('button#mybutton').click(function() {
    clickCount++;
    alert('Clicked ' + clickCount + ' times.');
  });
}());
Note: The seventh line of the code above calls the function immediately after defining it. This is a common design pattern in JavaScript: creating a new scope by creating a function.
How closures help asynchronous programming
In the event-driven programming model, you first write the code that should run after the event occurs, then wrap that code in a function, and finally pass the function as an argument to the function you are calling, which calls it back later.
In JavaScript, a function is not an isolated definition. It also remembers the context of the scope in which it was declared. This mechanism lets JavaScript functions access the context in which they were defined and all variables in the parent contexts.
When you pass a callback function as an argument to another function, the callback will be called at some later point. Even if the scope that defined the callback has already ended by then, the callback can still access all variables in that ended scope and its parent scopes. As in the last example, the callback is called inside jQuery's click(), yet it can still access the clickCount variable.
This is the magic of closures shown earlier. Because state variables travel with the function, you can do event-driven programming without maintaining that state yourself; JavaScript's closure mechanism maintains it for you.
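A short sketch can make this concrete. The functions fetchPost and handleRequest are hypothetical names, and fetchPost only simulates asynchronous I/O with setTimeout; the point is that the callback still sees requestId after handleRequest has returned.

```javascript
// Hypothetical asynchronous operation, simulated with a timer.
function fetchPost(id, callback) {
  setTimeout(function () {
    callback({ id: id });
  }, 10);
}

var messages = [];

function handleRequest(requestId) {
  // handleRequest returns right after calling fetchPost, but the
  // callback below keeps access to requestId through its closure.
  fetchPost(1, function (post) {
    messages.push('request ' + requestId + ' got post ' + post.id);
  });
}

// Two requests in flight at once, each with its own requestId.
handleRequest('A');
handleRequest('B');
```

No lookup table mapping pending operations to requests is needed; each callback carries the state of the request that created it.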
Summary
Event-driven programming is a programming model in which the flow of execution is determined by events. Programmers register callback functions (usually called event handlers) for the events they are interested in, and the system calls the registered handler when the event occurs. This model has many advantages over the traditional blocking model. In the past, achieving something similar required multiple processes or multiple threads.
JavaScript is a powerful language: its first-class functions and closures make it very well suited to event-driven programming.