This article describes an idea I came up with recently while learning Node.js, and I would like to discuss it with you.
Node.js HTTP server
Node.js makes it very easy to implement an HTTP service. The simplest example is the one from the official website:
The code is as follows:
var http = require('http');
http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
}).listen(1337, '127.0.0.1');
This quickly builds a web service that listens to all http requests on port 1337.
However, in a real production environment we rarely use Node.js directly as the user-facing front-end web server, mainly for the following reasons:
1. Because Node.js is single-threaded, guaranteeing robustness places relatively high demands on the developer.
2. Another HTTP service on the server may already occupy port 80, and a web service on a non-standard port is obviously less friendly to users.
3. Node.js has no particular advantage in static file I/O; a regular website, for example, also needs to serve file resources such as images.
4. Distributed load scenarios are also a challenge.
Therefore, Node.js web services are more likely to be used as game server interfaces and similar back ends, mostly handling services that only exchange data and are not accessed directly by users.
Node.js web service with Nginx as the front end
For the reasons above, if you are building a website-like product with Node.js, the conventional approach is to put another mature HTTP server in front of the Node.js web service; Nginx is the most commonly used.
Nginx then acts as a reverse proxy to reach the Node.js-based web service, like this:
The code is as follows:
server {
    listen 80;
    server_name yekai.me;
    root /home/andy/wwwroot/yekai;
    location / {
        proxy_pass http://127.0.0.1:1337;
    }
    location ~ \.(gif|jpg|png|swf|ico|css|js)$ {
        root /home/andy/wwwroot/yekai/static;
    }
}
This setup addresses the problems raised above fairly well.
Communication using the FastCGI protocol
However, the proxy approach above has a few shortcomings.
One is that you may later need to restrict direct HTTP access to the Node.js web service; though if that is the only concern, you can also handle it in the service itself or rely on a firewall to block it.
The other is that the proxy approach is, after all, a solution at the application layer of the network stack, so it is not very convenient to directly obtain and process the data exchanged with the HTTP client, such as keep-alive, chunked transfer, and even cookies. Of course, this also depends on the capabilities and completeness of the proxy server itself.
So I wanted to try another approach. The first thing that came to mind is the FastCGI model that is now commonly used for PHP web applications.
What is FastCGI
Fast Common Gateway Interface (FastCGI) is a protocol that allows interactive programs to communicate with web servers.
FastCGI was created as an alternative to CGI for web applications. Its most notable feature is that a single FastCGI service process can handle a series of requests: the web server passes the environment variables and the page request to the FastCGI process over a socket, which can be either a Unix domain socket or a TCP/IP connection. For more background, see the Wikipedia entry.
FastCGI implementation in Node.js
So in theory we only need to use Node.js to create a FastCGI process and have Nginx forward the requests it receives to that process. Since Nginx and Node.js both use an event-driven service model, they should in theory be a perfect match. Let's try it ourselves.
In Node.js, the net module can be used to set up a socket service. For convenience, we choose the Unix domain socket approach.
With a slight modification to the Nginx configuration:
The code is as follows:
...
location / {
    fastcgi_pass unix:/tmp/node_fcgi.sock;
}
...
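Note that, depending on your Nginx installation, you may also need an `include fastcgi_params;` line inside this location block (or equivalent `fastcgi_param` directives) so that standard variables such as QUERY_STRING and REMOTE_ADDR are actually sent to the FastCGI process; the examples later in this article assume those variables arrive.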
Create a new file node_fcgi.js, with the following content:
The code is as follows:
var net = require('net');
var server = net.createServer();
server.listen('/tmp/node_fcgi.sock');
server.on('connection', function(sock){
    console.log('connection');
    sock.on('data', function(data){
        console.log(data);
    });
});
Then run it (due to permissions, make sure Nginx and the Node script run as the same user, or as accounts with access to each other's files; otherwise you will hit permission problems reading and writing the .sock file):
node node_fcgi.js
When we access it from a browser, the terminal running the Node script receives the data as expected, something like this:
The code is as follows:
connection
< Buffer 01 01 00 01 00 08 00 00 00 01 00 00 00 00 00 00 01 04 00 01 01 87 01...>
This shows that our theory has passed its first test. Next, we just need to figure out how to parse the content of this buffer.
FastCGI protocol basics
A FastCGI record consists of a fixed-length prefix followed by a variable number of content bytes and padding bytes. The record structure is as follows:
The code is as follows:
typedef struct {
    unsigned char version;
    unsigned char type;
    unsigned char requestIdB1;
    unsigned char requestIdB0;
    unsigned char contentLengthB1;
    unsigned char contentLengthB0;
    unsigned char paddingLength;
    unsigned char reserved;
    unsigned char contentData[contentLength];
    unsigned char paddingData[paddingLength];
} FCGI_Record;
version: the FastCGI protocol version; currently 1 by default
type: the record type, which can be thought of as different states; discussed in detail later
requestId: the request id, which must match in the response. If you are not multiplexing concurrent requests, just use 1 here
contentLength: the content length; the maximum here is 65535
paddingLength: the padding length, used to pad the data out to a multiple of 8 bytes so that aligned data can be processed more efficiently; this is mainly a performance consideration
reserved: a reserved byte for future extension
contentData: the actual content data, covered in detail later
paddingData: the padding data; it is all zeros anyway, so you can simply ignore it
For specific structure and description, please refer to the official website document (http://www.fastcgi.com/devkit/doc/fcgi-spec.html#S3.3).
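To make the layout concrete, here is a minimal sketch (a hypothetical helper, not part of the original code) that encodes such an 8-byte record header into a Buffer; the response code later in this article fills in these same bytes by hand:
The code is as follows:
// Hypothetical helper: encode an 8-byte FastCGI record header.
function writeHeader(type, requestId, contentLength, paddingLength){
    var header = new Buffer(8);
    header[0] = 1;                       // version: always 1
    header[1] = type;                    // record type
    header[2] = requestId >> 8;          // requestIdB1
    header[3] = requestId & 0xff;        // requestIdB0
    header[4] = contentLength >> 8;      // contentLengthB1
    header[5] = contentLength & 0xff;    // contentLengthB0
    header[6] = paddingLength;           // paddingLength
    header[7] = 0;                       // reserved
    return header;
}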
Request part
It looks very simple: just parse the data in one go. There is a pitfall here, however: what is defined above is the structure of a single data unit (a record), not of the whole buffer. The whole buffer is made up of one record after another. This may feel unfamiliar at first to developers used to front-end work, but it is the basis for understanding the FastCGI protocol, and we will see more examples of it later.
Therefore, we need to parse out each record and distinguish records by the type field described above. Here is a simple function that extracts all the records:
The code is as follows:
function getRcds(data, cb){
    var rcds = [],
        start = 0,
        length = data.length;
    return function next(){
        if(start >= length){
            cb && cb(rcds);
            rcds = null;
            return;
        }
        // parse the fixed 8-byte record header
        var end = start + 8,
            header = data.slice(start, end),
            version = header[0],
            type = header[1],
            requestId = (header[2] << 8) + header[3],
            contentLength = (header[4] << 8) + header[5],
            paddingLength = header[6];
        start = end + contentLength + paddingLength;
        var body = contentLength ? data.slice(end, end + contentLength) : null;
        rcds.push([type, body, requestId]);
        return next();
    };
}
// use
sock.on('data', function(data){
    getRcds(data, function(rcds){
    })();
});
Note that this is only a simplified treatment, intended as the simplest possible demonstration; in more complex situations, such as file uploads where a request spans multiple 'data' events, this function is no longer adequate. The requestId parameter is also ignored here; with multiplexing it cannot be ignored, and the handling becomes considerably more complicated.
Next, different records can be processed according to their type. The type values are defined as follows:
The code is as follows:
#define FCGI_BEGIN_REQUEST 1
#define FCGI_ABORT_REQUEST 2
#define FCGI_END_REQUEST 3
#define FCGI_PARAMS 4
#define FCGI_STDIN 5
#define FCGI_STDOUT 6
#define FCGI_STDERR 7
#define FCGI_DATA 8
#define FCGI_GET_VALUES 9
#define FCGI_GET_VALUES_RESULT 10
#define FCGI_UNKNOWN_TYPE 11
#define FCGI_MAXTYPE (FCGI_UNKNOWN_TYPE)
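The Node.js code below references a TYPES object; the original article does not show its definition, so here is an assumed constant map mirroring the C defines above:
The code is as follows:
// Assumed TYPES map mirroring the C #defines above.
var TYPES = {
    FCGI_BEGIN_REQUEST: 1,
    FCGI_ABORT_REQUEST: 2,
    FCGI_END_REQUEST: 3,
    FCGI_PARAMS: 4,
    FCGI_STDIN: 5,
    FCGI_STDOUT: 6,
    FCGI_STDERR: 7,
    FCGI_DATA: 8,
    FCGI_GET_VALUES: 9,
    FCGI_GET_VALUES_RESULT: 10,
    FCGI_UNKNOWN_TYPE: 11
};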
Next, you can parse the actual data according to the record type. I will only use the most common ones, FCGI_PARAMS, FCGI_GET_VALUES, and FCGI_GET_VALUES_RESULT, as illustrations; fortunately they are all parsed the same way. Records of other types have their own rules, and you can implement them by referring to the specification; I won't go into detail here.
FCGI_PARAMS, FCGI_GET_VALUES, and FCGI_GET_VALUES_RESULT all carry "name-value pair" encoded data. The standard format is: the name length is transmitted first, followed by the value length, then the name, then the value. Lengths of 127 bytes or less can be encoded in a single byte, while longer lengths are always encoded in four bytes. The high bit of the first length byte indicates which encoding is used: 0 means the one-byte encoding, 1 means the four-byte encoding. Let's look at a mixed example, the case of a long name and a short value:
The code is as follows:
typedef struct {
    unsigned char nameLengthB3;  /* nameLengthB3 >> 7 == 1 */
    unsigned char nameLengthB2;
    unsigned char nameLengthB1;
    unsigned char nameLengthB0;
    unsigned char valueLengthB0; /* valueLengthB0 >> 7 == 0 */
    unsigned char nameData[nameLength
            ((B3 & 0x7f) << 24) + (B2 << 16) + (B1 << 8) + B0];
    unsigned char valueData[valueLength];
} FCGI_NameValuePair41;
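As a concrete illustration of this length encoding, here is a small hypothetical helper (not part of the original code) that encodes a single length the way FastCGI expects:
The code is as follows:
// Hypothetical sketch: encode a name/value length.
// Lengths up to 127 fit in one byte; longer lengths take four bytes
// with the high bit of the first byte set.
function encodeLength(len){
    if(len < 128){
        return new Buffer([len]);
    }
    return new Buffer([
        ((len >> 24) & 0x7f) | 0x80,
        (len >> 16) & 0xff,
        (len >> 8) & 0xff,
        len & 0xff
    ]);
}
console.log(encodeLength(12));  // <Buffer 0c>
console.log(encodeLength(300)); // <Buffer 80 00 01 2c>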
Here is a corresponding JavaScript method that parses these name-value pairs:
The code is as follows:
function parseParams(body){
    var j = 0,
        params = {},
        length = body.length;
    while(j < length){
        var name,
            value,
            nameLength,
            valueLength;
        if(body[j] >> 7 == 1){
            nameLength = ((body[j++] & 0x7f) << 24) + (body[j++] << 16) + (body[j++] << 8) + body[j++];
        } else {
            nameLength = body[j++];
        }
        if(body[j] >> 7 == 1){
            valueLength = ((body[j++] & 0x7f) << 24) + (body[j++] << 16) + (body[j++] << 8) + body[j++];
        } else {
            valueLength = body[j++];
        }
        var ret = body.toString('ascii', j, j + nameLength + valueLength);
        name = ret.substring(0, nameLength);
        value = ret.substring(nameLength);
        params[name] = value;
        j += (nameLength + valueLength);
    }
    return params;
}
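As a quick sanity check of parseParams, here is a small hypothetical example that hand-encodes one short-name/short-value pair (both lengths under 128, so one length byte each) and runs it through the function:
The code is as follows:
// Hypothetical sanity check for parseParams.
var pairName = new Buffer('QUERY_STRING'),
    pairValue = new Buffer('name=yekai'),
    pair = Buffer.concat([
        new Buffer([pairName.length, pairValue.length]), // one-byte lengths
        pairName,
        pairValue
    ]);
console.log(parseParams(pair)); // { QUERY_STRING: 'name=yekai' }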
This gives us a simple way to obtain the various parameters and environment variables. Let's improve the earlier code and demonstrate how to get the client IP:
The code is as follows:
sock.on('data', function(data){
    getRcds(data, function(rcds){
        for(var i = 0, l = rcds.length; i < l; i++){
            var bodyData = rcds[i],
                type = bodyData[0],
                body = bodyData[1];
            if(body && (type === TYPES.FCGI_PARAMS || type === TYPES.FCGI_GET_VALUES || type === TYPES.FCGI_GET_VALUES_RESULT)){
                var params = parseParams(body);
                console.log(params.REMOTE_ADDR);
            }
        }
    })();
});
So far we have covered the basics of the FastCGI request side. Next we will implement the response side and finally complete a simple echo service.
Response part
The response part is relatively simple. In the simplest case, you only need to send two records, that is, FCGI_STDOUT and FCGI_END_REQUEST.
I won't go into the specific contents of the record bodies; just look at the code:
The code is as follows:
var res = (function(){
    var MaxLength = Math.pow(2, 16) - 1; // a record's contentLength is at most 65535
    function buffer0(len){
        return new Buffer((new Array(len + 1)).join('\u0000'));
    }
    function writeStdout(data){
        var rcdStdoutHd = new Buffer(8),
            contentLength = data.length,
            paddingLength = 8 - contentLength % 8;
        rcdStdoutHd[0] = 1;
        rcdStdoutHd[1] = TYPES.FCGI_STDOUT;
        rcdStdoutHd[2] = 0;
        rcdStdoutHd[3] = 1;
        rcdStdoutHd[4] = contentLength >> 8;
        rcdStdoutHd[5] = contentLength;
        rcdStdoutHd[6] = paddingLength;
        rcdStdoutHd[7] = 0;
        return Buffer.concat([rcdStdoutHd, data, buffer0(paddingLength)]);
    }
    function writeHttpHead(){
        return writeStdout(new Buffer("HTTP/1.1 200 OK\r\nContent-Type:text/html; charset=utf-8\r\nConnection: close\r\n\r\n"));
    }
    function writeHttpBody(bodyStr){
        var bodyBuffer = [],
            body = new Buffer(bodyStr);
        for(var i = 0, l = body.length; i < l; i += MaxLength){
            bodyBuffer.push(writeStdout(body.slice(i, i + MaxLength)));
        }
        return Buffer.concat(bodyBuffer);
    }
    function writeEnd(){
        var rcdEndHd = new Buffer(8);
        rcdEndHd[0] = 1;
        rcdEndHd[1] = TYPES.FCGI_END_REQUEST;
        rcdEndHd[2] = 0;
        rcdEndHd[3] = 1;
        rcdEndHd[4] = 0;
        rcdEndHd[5] = 8;
        rcdEndHd[6] = 0;
        rcdEndHd[7] = 0;
        return Buffer.concat([rcdEndHd, buffer0(8)]);
    }
    return function(data){
        return Buffer.concat([writeHttpHead(), writeHttpBody(data), writeEnd()]);
    };
})();
In the simplest case, this is enough to send a complete response. Now update our final code:
The code is as follows:
var visitors = 0;
server.on('connection', function(sock){
    visitors++;
    sock.on('data', function(data){
        ...
        var querys = querystring.parse(params.QUERY_STRING);
        var ret = res('Welcome, ' + (querys.name || 'Dear Friend') + '! You are visitor number ' + visitors + ' of this site~');
        sock.write(ret);
        ret = null;
        sock.end();
        ...
    });
});
Open a browser and visit http://domain/?name=yekai, and you will see something like "Welcome, yekai! You are visitor number 7 of this site~".
At this point we have successfully implemented the simplest possible FastCGI service in Node.js. To use it as a real service, we would just need to flesh out the logic against the protocol specification.
Comparative test
Finally, the question to consider is whether this solution is actually practical. Some readers may have already spotted the problem, so let me first show the results of a simple load test:
The code is as follows:
//FastCGI method:
500 clients, running 10 sec.
Speed=27678 pages/min, 63277 bytes/sec.
Requests: 3295 susceed, 1318 failed.
500 clients, running 20 sec.
Speed=22131 pages/min, 63359 bytes/sec.
Requests: 6523 susceed, 854 failed.
//proxy method:
500 clients, running 10 sec.
Speed=28752 pages/min, 73191 bytes/sec.
Requests: 3724 susceed, 1068 failed.
500 clients, running 20 sec.
Speed=26508 pages/min, 66267 bytes/sec.
Requests: 6716 susceed, 2120 failed.
//Directly access Node.js service method:
500 clients, running 10 sec.
Speed=101154 pages/min, 264247 bytes/sec.
Requests: 15729 susceed, 1130 failed.
500 clients, running 20 sec.
Speed=43791 pages/min, 115962 bytes/sec.
Requests: 13898 susceed, 699 failed.
Why does the proxy method beat the FastCGI method? Because in the proxy scheme the back-end service runs directly on Node.js's native HTTP module, whereas our FastCGI scheme is implemented by ourselves in JavaScript. Still, you can see that the efficiency gap between the two solutions is not large (of course, this is only a simple comparison; in real business scenarios the gap might be larger), and if Node.js supported FastCGI natively, the efficiency should be even better.
Postscript
If you are interested in playing with this further, you can check out the source code of the examples implemented in this article. I have only spent the past two days studying the protocol specification, so it is not difficult.
At the same time, I will go back and play with uWSGI, though officially it is said to be preparing direct support for V8 already.
My knowledge here is still shallow; if there are any mistakes, please correct me and let's discuss.