Introduction
If you have heard of Node, or read articles proclaiming how great it is, you might be wondering, “What exactly is Node?” Node is not for everyone, but it may be the right choice for some people.
To explain what Node.js is, this article explores the problems it can solve, how it works, how to run a simple application, and finally, when Node is and is not a good solution. This article does not cover how to write a complex Node application, nor is it a comprehensive Node tutorial. Reading it should help you decide whether Node is worth learning for use in your own business.
What is Node designed to solve?
Node's stated goal is "to provide a simple way to build scalable network programs." What is wrong with current server programs? Let's do the math. In languages like Java™ and PHP, each connection spawns a new thread, which may require 2 MB of accompanying memory. On a system with 8 GB of RAM, that puts the theoretical maximum number of concurrent connections at about 4,000 users. As your client base grows, you want your web application to support more users, so you have to add more servers. Of course, this increases business costs, especially server costs, traffic costs, and labor costs. On top of these cost increases there is a technical problem: a user may hit a different server for each request, so any shared resource must be shared across all servers. For example, in Java, static variables and caches need to be shared between the JVMs on each server. This is the bottleneck in the entire web application architecture: the maximum number of concurrent connections a server can handle.
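The arithmetic behind that ceiling can be checked with a few lines of JavaScript. This is only a sketch: the function name maxThreads is made up for illustration, and the 2 MB per-thread figure is the rough estimate used above, not a measured value.

```javascript
// Hypothetical helper: how many threads fit in RAM if each one
// needs a fixed amount of accompanying memory.
function maxThreads(ramBytes, bytesPerThread) {
  return Math.floor(ramBytes / bytesPerThread);
}

const MB = 1024 * 1024;
const GB = 1024 * MB;

// 8 GB of RAM at an assumed 2 MB per thread
console.log(maxThreads(8 * GB, 2 * MB)); // prints 4096
```

The result, 4096, is where the "about 4,000 users" figure comes from.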
Node solves this problem by changing how a connection is made to the server. Instead of spawning a new OS thread for each connection (and allocating the accompanying memory to it), each connection fires an event that runs within the Node engine's process. Node claims that it will never deadlock because it does not allow locks at all, and it never blocks directly on I/O calls. Node also claims that a server running it can support tens of thousands of concurrent connections. In effect, Node changes the server landscape by moving the bottleneck in the whole system from the maximum number of connections to the traffic throughput of a single system.
Now that you have a program that can handle tens of thousands of concurrent connections, what can you actually build with Node? It would be wonderful to have a web application that needs to handle that many connections! That is one of those "if you have this problem, it's not really a problem" kinds of problems. Before answering the question above, let's look at how Node works and how it is designed to run.
What Node definitely is not
Yes, Node is a server program. However, it certainly doesn't look like Apache or Tomcat. Those servers are standalone server products that allow applications to be installed and deployed immediately. With them, you can have a server up and running within a minute. Node is certainly not that kind of product. Apache can add a PHP module to let developers create dynamic web pages, and programmers using Tomcat can deploy JSPs to create dynamic web pages. Node is not that kind of product either.
At this early stage of Node (currently version 0.4.6), it is not a "ready-to-run" server program: you can't install it, drop files into it, and have a fully functional web server. Even getting the basic functionality of a web server up and running after installation still requires a fair amount of work.
How Node works
Node itself runs on V8 JavaScript. Wait, JavaScript on the server? Yes, you read that right. Server-side JavaScript is a relatively new concept; it was mentioned about two years ago on developerWorks in a discussion of the Aptana Jaxer product (see Resources). Although Jaxer never really caught on, the idea itself is not far-fetched: why not use the client-side programming language on the server?
What is V8? The V8 JavaScript engine is the underlying JavaScript engine Google uses for its Chrome browser. Few people think about what actually happens to JavaScript on the client. In fact, a JavaScript engine is responsible for interpreting and executing the code. With V8, Google created a very fast engine written in C++ that has another unique feature: you can download the engine and embed it in any application. It is not restricted to running in a browser. So, Node takes the V8 JavaScript engine written by Google and repurposes it for use on the server. Perfect! Why create a new language when a good solution is already available?
Event-driven programming
Many programmers have been taught to believe that object-oriented programming is the perfect programming design and to dismiss every other approach. Node uses what is called an event-driven programming model.
Listing 1. Event-driven programming using jQuery on the client
// jQuery code on the client-side showing how Event-Driven programming works
// When a button is pressed, an Event occurs - deal with it
// directly right here in an anonymous function, where all the
// necessary variables are present and can be referenced directly
$("#myButton").click(function(){
    if ($("#myTextField").val() != $(this).val())
        alert("Field must match button text");
});
In fact, the server side is no different from the client side. True, there are no buttons to click and no text fields to type into, but at a higher level, events are still taking place. A connection is made - event! Data is received through the connection - event! Data stops coming through the connection - event!
Why is this type of setup ideal for Node? JavaScript is a great language for event-driven programming because it allows anonymous functions and closures and, more importantly, its syntax is familiar to nearly anyone who has written code. The callback function that is called when an event occurs can be written at the same spot where you capture the event. That makes the code easy to write and easy to maintain, with no complicated object-oriented frameworks, no interfaces, and no temptation to over-architect anything. Just listen for an event, write a callback function, and event-driven programming takes care of everything!
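The three server-side events named above (connection made, data received, data stopped) map directly onto Node's standard net module. The sketch below is one way to wire them up, not a definitive implementation; the function names attachHandlers and createConnectionServer and the log callback are hypothetical. Note how the closure keeps a per-connection byte count without any class or interface.

```javascript
const net = require("net");

// Hypothetical helper: register one callback per event a connection can raise.
// The "received" counter lives in the closure, one copy per connection.
function attachHandlers(socket, log) {
  let received = 0;
  socket.on("data", function (chunk) {
    received += chunk.length;
    log("data: " + chunk.length + " bytes");
  });
  socket.on("end", function () {
    log("end: " + received + " bytes in total");
  });
}

// The callback given to net.createServer runs once per new connection.
function createConnectionServer(log) {
  return net.createServer(function (socket) {
    log("connection made");
    attachHandlers(socket, log);
  });
}
```

Calling createConnectionServer(console.log).listen(8124) would start it; the port number is a placeholder.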
Sample Node Application
Finally, let's look at some code! Let's pull together everything we've discussed and create our first Node application. Since we already know that Node is ideal for handling high-traffic applications, we will create a very simple web application - one built for maximum speed. Here are the specific requirements for our sample application, as explained by "Boss": Create a random number generator RESTful API. The application should accept one input: a parameter named "number". It then generates a random number between 0 and that parameter and returns the generated number to the caller. Since "Boss" wants it to be a widely popular application, it should be able to handle 50,000 concurrent users. Let's take a look at the code:
Listing 2. Node random number generator
// These modules need to be imported in order to use them.
// Node has several modules. They are like any #include
// or import statement in other languages.
var http = require("http");
var url = require("url");

// The most important line in any Node file. This function
// does the actual process of creating the server. Technically,
// Node tells the underlying operating system that whenever a
// connection is made, this particular callback function should be
// executed. Since we're creating a web service with a REST API,
// we want an HTTP server, which requires the http variable
// we created in the lines above.
// Finally, you can see that the callback method receives a 'request'
// and 'response' object automatically. This should be familiar
// to any PHP or Java programmer.
http.createServer(function(request, response) {

    // The response needs to handle all the headers and the return codes.
    // These types of things are handled automatically in server programs
    // like Apache and Tomcat, but Node requires everything to be done yourself.
    response.writeHead(200, {"Content-Type": "text/plain"});

    // Here is some unique-looking code. This is how Node retrieves
    // parameters passed in from client requests. The url module
    // handles all these functions. The parse function
    // deconstructs the URL, and places the query key-values in the
    // query object. We can find the value for the "number" key
    // by referencing it directly - the beauty of JavaScript.
    var params = url.parse(request.url, true).query;
    var input = params.number;

    // These are the generic JavaScript methods that will create
    // our random number that gets passed back to the caller.
    // Number() converts the query string to a number; toFixed(0) rounds
    // the result and returns it as a string, ready to be written out.
    var numInput = Number(input);
    var numOutput = (Math.random() * numInput).toFixed(0);

    // Write the random number to response
    response.write(numOutput);

    // Node requires us to explicitly end this connection. This is because
    // Node allows you to keep a connection open and pass data back and forth,
    // though that advanced topic isn't discussed in this article.
    response.end();

// When we create the server, we have to explicitly connect the HTTP server to
// a port. Standard HTTP port is 80, so we'll connect it to that one.
}).listen(80);

// Output a String to the console once the server starts up, letting us know
// everything starts up correctly
console.log("Random Number Generator Running...");
Put the above code into a file called "random.js". To start the application (which creates an HTTP server and listens for connections on port 80), enter the following command at your command prompt: % node random.js. On many systems, binding to port 80 requires administrator privileges. Here is what it looks like when the server is up and running:
root@ubuntu:/home/moila/ws/mike# node random.js
Random Number Generator Running...
Access the application
The application is up and running, and Node is listening for connections, so let's test it. Since we created a simple RESTful API, we can use a web browser to access the application. Type the following address (make sure you have completed the steps above): http://localhost/?number=27.
Your browser window will display a random number between 0 and 27. Click the Reload button on your browser and you will get another random number. That's it, this is your first Node app!
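The listing in this section trusts its input: if the "number" parameter is missing or is not numeric, the response body is the text NaN. One possible way to guard against that is sketched below. The function name randomUpTo and its optional random argument (which makes the function testable) are hypothetical and not part of the original listing.

```javascript
// Hypothetical helper: returns a whole number between 0 and the given
// maximum, or null when the input is missing, non-numeric, or negative.
// "random" is an optional replacement for Math.random, useful in tests.
function randomUpTo(input, random) {
  const max = Number(input);
  if (input === undefined || input === "" || !Number.isFinite(max) || max < 0) {
    return null;
  }
  const pick = random || Math.random;
  return Math.round(pick() * max);
}

console.log(randomUpTo("abc")); // prints null
```

Inside the request callback, a null result could be answered with a 400 status code instead of 200.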
What is Node good for?
By now you should be able to answer the question "What is Node?", but you may not yet be clear about when you should use it. That is an important question to ask, because Node is good for some things but, conversely, may not currently be a good solution for others. You need to be careful about deciding when to use Node, because using it in the wrong situation can lead to a LOT of extra coding.
What is it good for?
As you have seen, Node is well suited to situations where you expect high traffic and where the server-side logic and processing required before responding to the client is not necessarily large. Good examples of where Node excels include:
1. RESTful API
A web service that provides a RESTful API takes in a few parameters, interprets them, assembles a response, and returns that response (usually a relatively small amount of text) to the user. This is an ideal situation for Node, because you can build it to handle tens of thousands of connections. It also does not require much logic; it just looks up some values in a database and assembles a response. Because both the inbound request and the response are small amounts of text, the traffic volume is not high, and a single machine can likely handle the API needs of even the busiest company.
2. Twitter queue
Imagine a company like Twitter that has to receive tweets and write them to a database. Thousands of tweets arrive every second, and the database cannot keep up with the number of writes required during peak periods. Node becomes an important part of the solution. As you have seen, Node can handle tens of thousands of inbound tweets. It can quickly and easily write them to an in-memory queuing mechanism (memcached, for example), from which a separate process can write them to the database. Node's role here is to gather the tweets quickly and hand them off to another process that is responsible for writing them. Imagine the alternative design, in which a regular PHP server tries to handle the writes to the database itself: every tweet causes a brief delay while it is written, because the database call blocks the channel. Because of that database latency, a machine designed this way may handle only 2,000 inbound tweets per second. Handling 1 million tweets per second would then take 500 servers. Node, by contrast, handles every connection without blocking the channel, so it can capture as many tweets as possible. If a Node machine can handle 50,000 tweets per second, the same load requires only 20 servers.
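The collect-then-hand-off idea and the server arithmetic above can be sketched as follows. This is a simplification under stated assumptions: a plain in-process array stands in for an external store such as memcached, and the names acceptTweet, drain, writeBatch, and serversNeeded are all hypothetical.

```javascript
// In-memory queue standing in for an external queuing mechanism (assumption).
const queue = [];

// Fast path: record the tweet and return right away, without touching the database.
function acceptTweet(text) {
  queue.push({ text: text, receivedAt: Date.now() });
  return queue.length;
}

// Slow path, meant to run separately: hand up to batchSize queued tweets
// to writeBatch, a caller-supplied function that performs the database write.
function drain(batchSize, writeBatch) {
  const batch = queue.splice(0, batchSize);
  if (batch.length > 0) {
    writeBatch(batch);
  }
  return batch.length;
}

// Servers required for a given load, rounded up.
function serversNeeded(tweetsPerSecond, capacityPerServer) {
  return Math.ceil(tweetsPerSecond / capacityPerServer);
}

console.log(serversNeeded(1000000, 2000));  // prints 500
console.log(serversNeeded(1000000, 50000)); // prints 20
```

The design choice is that accepting a tweet never waits on the database; only the separate drain step does.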
3. Image file server
A company with a large distributed website, such as Facebook or Flickr, might decide to dedicate entire machines to serving images. Node would be a good solution to this problem, because the company can use it to write a simple file retriever and then handle tens of thousands of connections. Node would look for the image file, return either the file or a 404 error, and do nothing else. This setup would allow such distributed websites to reduce the number of servers they need to serve static files such as images, .js files, and .css files.
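A simple file retriever of the kind described above might look like the sketch below. It uses only Node's standard http, fs, and path modules; the function names resolveFile and createFileServer and the small content-type table are assumptions chosen for illustration, and a production server would need more care (caching, percent-decoding, range requests).

```javascript
const fs = require("fs");
const http = require("http");
const path = require("path");

// Illustrative subset of file types to serve; anything else gets a 404.
const CONTENT_TYPES = {
  ".png": "image/png", ".jpg": "image/jpeg", ".gif": "image/gif",
  ".js": "text/javascript", ".css": "text/css"
};

// Map a URL path to a file inside rootDir, or null if it would escape rootDir.
function resolveFile(rootDir, urlPath) {
  const root = path.resolve(rootDir);
  const target = path.resolve(root, "." + urlPath);
  return target.startsWith(root + path.sep) ? target : null;
}

function createFileServer(rootDir) {
  return http.createServer(function (request, response) {
    const filePath = resolveFile(rootDir, request.url.split("?")[0]);
    const type = filePath && CONTENT_TYPES[path.extname(filePath).toLowerCase()];
    const notFound = function () {
      response.writeHead(404, { "Content-Type": "text/plain" });
      response.end("Not found");
    };
    if (!type) { notFound(); return; }

    // Stream the file so it is never loaded into memory as a whole.
    const stream = fs.createReadStream(filePath);
    stream.on("open", function () {
      response.writeHead(200, { "Content-Type": type });
      stream.pipe(response);
    });
    stream.on("error", function () {
      if (response.headersSent) { response.destroy(); } else { notFound(); }
    });
  });
}
```

Calling createFileServer("/srv/images").listen(8080) would start it; both the directory and the port are placeholders.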
What is it bad for?
Of course, in some cases Node is not ideal. Here are the areas where Node does not excel:
1. Dynamically created pages
Currently, Node does not provide a default way to create dynamic pages. For example, with JavaServer Pages (JSP) technology, you can create an index.jsp page that contains loops in JSP code snippets. Node does not support these kinds of dynamic, HTML-driven pages. Again, Node is not well suited to being a web page server the way Apache and Tomcat are. So if you wanted to provide such a server-side solution in Node, you would have to write the entire solution yourself. A PHP programmer would not want to write a PHP converter for Apache every time they deploy a web application, but at this point, that is exactly what Node would ask you to do.
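To give a sense of what "write it yourself" means, the sketch below builds a small HTML fragment with a loop in plain JavaScript, the kind of thing a JSP page would express with embedded snippets. The function names escapeHtml and renderList are hypothetical, and this is hand-rolled string building, not a Node feature.

```javascript
// Escape characters that have special meaning in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: render a heading and a bulleted list, one <li> per item.
function renderList(title, items) {
  const rows = items.map(function (item) {
    return "  <li>" + escapeHtml(item) + "</li>";
  });
  return ["<h1>" + escapeHtml(title) + "</h1>", "<ul>"]
    .concat(rows, ["</ul>"])
    .join("\n");
}

console.log(renderList("Fruits", ["apple", "<pear>"]));
```

A request callback could send the returned string with a Content-Type of text/html. Everything a page framework normally does, such as escaping, layout, and includes, has to be coded by hand in this way.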
2. Relational Database Heavy Applications
Node's goals are to be fast, asynchronous, and non-blocking. Databases do not necessarily share these goals. They are synchronous and blocking, in that calls to the database for reads and writes block the channel until a result is produced. Therefore, a web application that requires many database calls, many reads, and many writes per request is a poor fit for Node, because the relational database itself cancels out many of Node's advantages. (The newer NoSQL databases are a better fit for Node, but that is another topic altogether.)
Conclusion
The question "What is Node.js?" should now be answered. After reading this article, you should be able to answer it in a few clear and concise sentences. If so, you are ahead of many coders and programmers. I have talked about Node with many people, and they have been confused about what exactly Node is. Understandably, they have the Apache mindset: a server is an application that you drop HTML files into and everything just works. Node, by contrast, is purpose-driven. It is a software program that uses JavaScript to let programmers easily and quickly create fast, scalable web servers. Apache is ready to run, while Node is ready to code.
Node accomplishes its goal of providing highly scalable servers. Instead of a "one thread per connection" model, it handles each connection as an event inside a single process, using only the memory that each connection requires. It uses a very fast JavaScript engine from Google: the V8 engine. It uses an event-driven design to keep the code minimal and easy to read. All of these factors contribute to Node's ultimate goal: making it easier to write highly scalable solutions.
Just as important as understanding what Node is, is understanding what it is not. Node is not simply a replacement for Apache that will instantly make your PHP web application more scalable. Far from it. At this early stage of Node, it is unlikely that a large number of programmers will use it, but in the scenarios where it fits, it performs very well.
What should you expect from Node in the future? This is perhaps the most important question that this article raises. Now that you know what it does today, you should be wondering what it will do next. Over the next year, I look forward to Node providing better integration with existing third-party support libraries. Many third-party programmers have already developed plug-ins for Node, including file server support and MySQL support. Hopefully Node will start integrating them into its core functionality. I would also like Node to support some kind of dynamic page module, so that you can do in an HTML file what you can do in PHP and JSP (perhaps an NSP, a Node server page). Finally, hopefully one day a "deployment-ready" Node server will appear that you can download and install, then simply drop your HTML files into, just as you would with Apache or Tomcat. Node is still in its early stages, but it is developing very quickly and may soon be on your horizon.