Node provides a set of stream APIs that let you process files the same way you process network streams. They are convenient, but they only allow sequential access and cannot read or write at arbitrary positions in a file. For that, you need the lower-level file system operations.
This chapter covers the basics of file processing, including how to open a file, read a part of a file, write data, and close a file.
Many of Node's file APIs closely mirror the corresponding UNIX (POSIX) file APIs, including the use of file descriptors. As in UNIX, a file descriptor in Node is an integer that serves as an index into the process's file descriptor table.
There are three special file descriptors: 0, 1 and 2. They represent standard input, standard output and standard error, respectively. Standard input, as the name implies, is a read-only stream that a process uses to read data from the console or from another process. Standard output and standard error are write-only descriptors, typically used to send data to the console, to other processes or to files. Standard error carries error messages, while standard output carries the ordinary output of the process.
These file descriptors are available as soon as the process starts, and they do not correspond to physical files. You cannot read or write at an arbitrary position in them; you can only read and write sequentially, as with a network stream, and data that has been written cannot be modified. (Translator's note: the original text reads "You can write to and read from specific positions within the file"; judging from the context, the author probably left out a "not".)
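As a small illustration of the descriptor numbers described above, the following sketch writes directly to file descriptors 1 and 2 with fs.writeSync. The constant names STDIN_FD, STDOUT_FD and STDERR_FD are made up for this example and are not part of Node.

```javascript
var fs = require('fs');

// Conventional POSIX numbering of the three standard descriptors.
// These constant names are illustrative only; Node does not define them.
var STDIN_FD = 0;
var STDOUT_FD = 1;
var STDERR_FD = 2;

// Node's standard stream objects report the same descriptor numbers.
var stdoutMatches = (process.stdout.fd === STDOUT_FD);
var stderrMatches = (process.stderr.fd === STDERR_FD);

// The descriptors are open from the start, so no fs.open() call is needed.
fs.writeSync(STDOUT_FD, 'ordinary output\n');
fs.writeSync(STDERR_FD, 'error output\n');
```

Running the script with standard error redirected to a file (for example with 2> errors.txt in a UNIX shell) should leave only the first line on the console.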
Ordinary files are not subject to this limitation. In Node you can, for example, open a file so that data can only be appended to its end, or open a file for reading and writing at arbitrary positions.
Almost every file-related operation involves file paths. This chapter first introduces the path utility functions and then covers reading, writing and manipulating file data in depth.
Handle file paths
File paths come in two kinds, relative and absolute, and they identify specific files. You can join paths, extract file name information, and even check whether a file exists.
In Node you can manipulate file paths as plain strings, but that quickly gets complicated. For example, when you join several path fragments, some may end with "/" and others may not, and the path separator differs between operating systems, so the joining code becomes verbose and error-prone.
Fortunately, Node has a module called path that helps you normalize, join and resolve paths, convert absolute paths to relative ones, extract the components of a path, and check whether a file exists. In general, the path module only manipulates strings and does not consult the file system to verify anything (the path.exists function is the exception).
Normalize paths
It is usually a good idea to normalize paths before storing or using them. Paths obtained from user input or configuration files, or built by joining two or more paths, should generally be normalized. The normalize function of the path module does this, and it handles "..", "." and "//" segments. For example:
var path = require('path');
path.normalize('/foo/bar//baz/asdf/quux/..');
// => '/foo/bar/baz/asdf'
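A few more cases may help show what normalization does with ".", ".." and repeated slashes. This sketch uses path.posix so that the results do not depend on the operating system the example runs on; the input paths are made up.

```javascript
var path = require('path');

// path.posix always applies POSIX rules, whatever platform Node runs on.
var posix = path.posix;

// "." segments and duplicate slashes disappear, ".." removes the previous segment.
var a = posix.normalize('/foo/./bar//baz/../qux'); // '/foo/bar/qux'

// A relative path may still start with ".." after normalization.
var b = posix.normalize('a/b/../../..');           // '..'

// A trailing slash is preserved.
var c = posix.normalize('/foo/bar/');              // '/foo/bar/'

// An empty string normalizes to the current directory.
var d = posix.normalize('');                       // '.'

console.log(a, b, c, d);
```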
Join paths
With path.join() you can concatenate any number of path strings. Just pass them all to join() in order:
var path = require('path');
path.join('/foo', 'bar', 'baz/asdf', 'quux', '..');
// => '/foo/bar/baz/asdf'
As you can see, path.join() automatically normalizes the resulting path.
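To make the contrast with plain string concatenation concrete, here is a minimal sketch. The directory and file names are invented for the example, and path.posix and path.win32 are used so the results are the same on every platform.

```javascript
var path = require('path');

var base = '/var/www/';

// Naive concatenation keeps the duplicate separator.
var naive = base + '/' + 'static';                               // '/var/www//static'

// join() inserts separators where needed and normalizes the result.
var joined = path.posix.join(base, '/static', 'img//logo.png');  // '/var/www/static/img/logo.png'

// Empty segments are ignored.
var skipped = path.posix.join('a', '', 'b');                     // 'a/b'

// The Windows flavour of join() uses the backslash separator.
var windows = path.win32.join('C:\\data', 'logs', 'app.log');    // 'C:\\data\\logs\\app.log'

console.log(naive, joined, skipped, windows);
```

Plain path.join() picks the POSIX or Windows behaviour according to the platform Node is running on, which is usually what application code wants.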
Resolve paths
Use path.resolve() to resolve a series of paths into one absolute path. It works as if you ran "cd" on each path in turn. Unlike the arguments to the cd command, these paths may be files, and they do not have to exist: path.resolve() never accesses the underlying file system to check whether a path exists; it only performs string operations. For example:
var path = require('path');
path.resolve('/foo/bar', './baz');
// => /foo/bar/baz
path.resolve('/foo/bar', '/tmp/file/');
// => /tmp/file
If the result is still not an absolute path after all arguments have been processed, path.resolve() prepends the current working directory to it. For example:
path.resolve('wwwroot', 'static_files/png/', '../gif/image.gif');
// If the current working directory is /home/myself/node, it will return
// => /home/myself/node/wwwroot/static_files/gif/image.gif
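The two rules above (a later absolute path replaces everything before it, and a relative result is anchored at the current working directory) can be checked with a short sketch. The paths are invented for the example.

```javascript
var path = require('path');

// A later absolute argument discards everything to its left.
var r1 = path.posix.resolve('/foo', '/bar', 'baz');          // '/bar/baz'

// Relative arguments are applied like successive "cd" commands.
var r2 = path.posix.resolve('/foo/bar', '../baz', './qux');  // '/foo/baz/qux'

// With only relative arguments, the current working directory is used as the base,
// so the result equals joining process.cwd() with the same segments.
var r3 = path.resolve('wwwroot', 'static_files');
var sameAsJoin = (r3 === path.join(process.cwd(), 'wwwroot', 'static_files'));

console.log(r1, r2, r3, sameAsJoin);
```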
Calculate the relative path between two absolute paths
path.relative() tells you how to get from one absolute path to another. For example:
var path = require('path');
path.relative('/data/orandea/test/aaa', '/data/orandea/impl/bbb');
// => ../../impl/bbb
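One way to convince yourself that the relative path is right is to resolve it again from the starting point. A minimal sketch of that round trip, reusing the paths from the example above:

```javascript
var path = require('path');
var posix = path.posix; // POSIX rules on every platform, for stable results

var from = '/data/orandea/test/aaa';
var to = '/data/orandea/impl/bbb';

// How to get from "from" to "to".
var rel = posix.relative(from, to);      // '../../impl/bbb'

// Resolving the relative path from the starting point leads back to the target.
var back = posix.resolve(from, rel);     // '/data/orandea/impl/bbb'

// The relative path from a directory to itself is the empty string.
var none = posix.relative(from, from);   // ''

console.log(rel, back, JSON.stringify(none));
```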
Extract data from a path
Take the path "/foo/bar/myfile.txt" as an example. If you want to list the contents of its parent directory (/foo/bar), or read other files in the same directory, use path.dirname(filePath) to get the directory part of the file path:
var path = require('path');
path.dirname('/foo/bar/baz/asdf/quux.txt');
// => /foo/bar/baz/asdf
Or, if you want to get the file name from the file path, that is, the last part of the file path, you can use the path.basename function:
var path = require('path');
path.basename('/foo/bar/baz/asdf/quux.html')
// => quux.html
The file path may also contain a file extension, usually the part of the string after the last "." character in the file name.
path.basename also accepts an extension string as an optional second parameter. The returned file name then has that extension removed, leaving only the name part of the file:
var path = require('path');
path.basename('/foo/bar/baz/asdf/quux.html', '.html');
// => quux
To do this, you first need to know the file's extension. You can get it with path.extname():
var path = require('path');
path.extname('/a/b/index.html');
// => '.html'
path.extname('/a/bc/index');
// => ''
path.extname('/a/bc/.');
// => ''
path.extname('/a/bc/d.');
// => '.'
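The three extraction functions combine naturally. The sketch below builds a small helper that swaps a file's extension; replaceExtension is a hypothetical name invented for this example, not a function of the path module. It also shows path.parse(), which newer Node versions provide to return all components at once.

```javascript
var path = require('path');
var posix = path.posix; // POSIX rules on every platform, for stable results

// Hypothetical helper (not part of Node): replace the extension of a file path.
function replaceExtension(filePath, newExt) {
  var dir = posix.dirname(filePath);                             // directory part
  var name = posix.basename(filePath, posix.extname(filePath));  // file name without extension
  return posix.join(dir, name + newExt);
}

var renamed = replaceExtension('/foo/bar/myfile.txt', '.md');    // '/foo/bar/myfile.md'
var added = replaceExtension('/a/b/index', '.html');             // '/a/b/index.html'

// path.parse() returns the same pieces in a single object.
var parts = posix.parse('/foo/bar/myfile.txt');
// { root: '/', dir: '/foo/bar', base: 'myfile.txt', ext: '.txt', name: 'myfile' }

console.log(renamed, added, parts);
```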
Check whether a path exists
So far, none of the path operations described above touch the underlying file system; they are just string operations. Sometimes, however, you need to determine whether a file or directory actually exists. For that you can use path.exists():
var path = require('path');
path.exists('/etc/passwd', function(exists) {
  console.log('exists:', exists);
  // => true
});
path.exists('/does_not_exist', function(exists) {
  console.log('exists:', exists);
  // => false
});
Note: Starting with Node 0.8, exists moved from the path module to the fs module and became fs.exists. Apart from the namespace, nothing else changed. (In current Node releases fs.exists is deprecated; fs.access or fs.stat is recommended instead.)
var fs = require('fs');
fs.exists('/does_not_exist', function(exists) {
  console.log('exists:', exists);
  // => false
});
path.exists() performs an I/O operation. Because it is asynchronous, it needs a callback function, which is called with the result once the I/O operation completes. You can also use the synchronous version, path.existsSync() (fs.existsSync() from Node 0.8 on), which does the same thing but returns the result directly instead of calling a callback:
var path = require('path');
path.existsSync('/etc/passwd');
// => true
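On current Node versions, where path.exists no longer exists, an existence check might be sketched with fs.access() and fs.existsSync() as follows. The helper name pathExists and the temporary paths are assumptions made for this example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper (not part of Node): callback-style existence check
// built on fs.access(), which reports an error when the path is not visible.
function pathExists(filePath, callback) {
  fs.access(filePath, fs.constants.F_OK, function (err) {
    callback(!err);
  });
}

var tmpDir = os.tmpdir();
// A name that is very unlikely to exist; invented for the example.
var missing = path.join(tmpDir, 'does_not_exist_' + process.pid + '_' + Date.now());

pathExists(tmpDir, function (exists) {
  console.log('tmp dir exists:', exists);
});
pathExists(missing, function (exists) {
  console.log('missing path exists:', exists);
});

// Synchronous variant: returns the result directly.
var existsNow = fs.existsSync(tmpDir);
```

Checking for existence and then opening the file leaves a window in which another process can create or delete it, so when the goal is to open the file it is usually more robust to open it directly and handle the error.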
Introduction to the fs module
The fs module contains all the functions for querying and manipulating files. With them you can query file information and read, write and close files. Import the fs module like this:
var fs = require('fs')
Query file information
Sometimes you need to know information about a file, such as its size, creation date or permissions. Use the fs.stat function to query the metadata of a file or directory:
var fs = require('fs');
fs.stat('/etc/passwd', function(err, stats) {
  if (err) { throw err; }
  console.log(stats);
});
This snippet produces output similar to the following:
{ dev: 234881026,
ino: 95028917,
mode: 33188,
nlink: 1,
uid: 0,
gid: 0,
rdev: 0,
size: 5086,
blksize: 4096,
blocks: 0,
atime: Fri, 18 Nov 2011 22:44:47 GMT,
mtime: Thu, 08 Sep 2011 23:50:04 GMT,
ctime: Thu, 08 Sep 2011 23:50:04 GMT }
The fs.stat() call passes an instance of the Stats class to its callback function. You can use the following methods of the stats instance:
1. stats.isFile() - returns true if it is a regular file (not a directory, socket, symbolic link or device), otherwise false
2. stats.isDirectory() - returns true if it is a directory, otherwise false
3. stats.isBlockDevice() - returns true if it is a block device. On most UNIX systems, block devices usually live in the /dev directory
4. stats.isCharacterDevice() - returns true if it is a character device
5. stats.isSymbolicLink() - returns true if it is a symbolic link (only meaningful for stats obtained with fs.lstat(), because fs.stat() follows links)
6. stats.isFIFO() - returns true if it is a FIFO (a special type of UNIX named pipe)
7. stats.isSocket() - returns true if it is a UNIX domain socket
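The type checks listed above could be combined into one small classifier. This is only a sketch: describeStats is a hypothetical helper, the temporary file name is invented, and the synchronous fs.statSync() is used instead of fs.stat() to keep the example short.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper (not part of Node): map a Stats object to a short label.
function describeStats(stats) {
  if (stats.isFile()) { return 'file'; }
  if (stats.isDirectory()) { return 'directory'; }
  if (stats.isSymbolicLink()) { return 'symbolic link'; } // only with fs.lstat()
  if (stats.isFIFO()) { return 'fifo'; }
  if (stats.isSocket()) { return 'socket'; }
  if (stats.isBlockDevice() || stats.isCharacterDevice()) { return 'device'; }
  return 'unknown';
}

// Create a small temporary file so the example has something to inspect.
var filePath = path.join(os.tmpdir(), 'stat_example_' + process.pid + '.txt');
fs.writeFileSync(filePath, 'hello');

var fileStats = fs.statSync(filePath);
var dirStats = fs.statSync(os.tmpdir());

console.log(describeStats(fileStats), fileStats.size + ' bytes');
console.log(describeStats(dirStats));

fs.unlinkSync(filePath); // clean up
```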
Open the file
Before reading or writing a file, you must first open it with the fs.open function. The callback you provide is then called with the file's descriptor, which you can later use to read from and write to the opened file:
var fs = require('fs');
fs.open('/path/to/file', 'r', function(err, fd) {
  if (err) { throw err; }
  // fd is now the descriptor of the open file
});
The first parameter of fs.open is the file path, and the second is a flag string indicating the mode in which to open the file. The flag can be r, r+, w, w+, a or a+. The flags are explained below (adapted from the fopen page of the UNIX manual):
1. r - Open the file for reading only. The file must already exist. The stream is positioned at the beginning of the file
2. r+ - Open the file for reading and writing. The file must already exist. The stream is positioned at the beginning of the file
3. w - Open the file for writing only. If the file exists, it is truncated to zero length, so its contents are lost. If it does not exist, it is created. The stream is positioned at the beginning of the file
4. w+ - Open the file for reading and writing. If the file does not exist, it is created. If it exists, it is truncated to zero length, so its contents are lost. The stream is positioned at the beginning of the file
5. a - Open the file for writing only. If the file does not exist, it is created. The stream is positioned at the end of the file, and every subsequent write appends data to the end of the file
6. a+ - Open the file for reading and writing. If the file does not exist, it is created. The stream is positioned at the end of the file, and every subsequent write appends data to the end of the file
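The difference between the truncating and appending flags is easy to observe. The sketch below uses the synchronous counterparts fs.openSync, fs.writeSync and fs.closeSync for brevity; writeWithFlag and the temporary file name are invented for the example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var filePath = path.join(os.tmpdir(), 'flags_example_' + process.pid + '.txt');

// Hypothetical helper (not part of Node): open with the given flag,
// write some text and always close the descriptor again.
function writeWithFlag(flag, text) {
  var fd = fs.openSync(filePath, flag);
  try {
    fs.writeSync(fd, text);
  } finally {
    fs.closeSync(fd);
  }
}

writeWithFlag('w', 'first');     // creates the file
writeWithFlag('a', ' second');   // appends to the end
var afterAppend = fs.readFileSync(filePath, 'utf8');    // 'first second'

writeWithFlag('w', 'third');     // truncates, then writes
var afterTruncate = fs.readFileSync(filePath, 'utf8');  // 'third'

// 'r' does not create files: opening a missing file fails with ENOENT.
var missingCode = null;
try {
  fs.openSync(filePath + '.missing', 'r');
} catch (err) {
  missingCode = err.code;
}

console.log(afterAppend, '|', afterTruncate, '|', missingCode);
fs.unlinkSync(filePath); // clean up
```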
Read the file
Once the file is open, you can start reading its contents, but first you have to create a buffer to hold the data. The buffer is passed to the fs.read function, which fills it with data.
var fs = require('fs');
fs.open('./my_file.txt', 'r', function opened(err, fd) {
  if (err) { throw err; }
  var readBuffer = Buffer.alloc(1024),   // zero-filled; replaces the deprecated new Buffer(1024)
      bufferOffset = 0,                  // where in the buffer to start placing data
      bufferLength = readBuffer.length,  // maximum number of bytes to read
      filePosition = 100;                // where in the file to start reading
  fs.read(fd,
          readBuffer,
          bufferOffset,
          bufferLength,
          filePosition,
          function read(err, readBytes) {
            if (err) { throw err; }
            console.log('just read ' + readBytes + ' bytes');
            if (readBytes > 0) {
              console.log(readBuffer.slice(0, readBytes));
            }
          });
});
The code above tries to open a file. Once the file has been opened successfully (the opened function is called), it asks to read up to 1024 bytes starting at byte offset 100 of the file (the fs.read call).
The last parameter of fs.read() is a callback function (named read in the example). It is called in any of the following three situations:
1. An error occurred
2. The data was successfully read
3. No data to read
If an error occurs, the first parameter (err) holds an object describing the error; otherwise it is null. If data was read successfully, the second parameter (readBytes) indicates how many bytes were placed in the buffer. A value of 0 means that the end of the file has been reached.
Note: Once the buffer object has been passed to fs.read(), control of the buffer passes to the read operation and only returns to you when the callback is called. Until then, do not read or write the buffer or let any other function use it; otherwise you may read incomplete data or, worse, write to the buffer concurrently with the read.
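The end-of-file rule and the buffer ownership rule can be put together in a loop that reads a file chunk by chunk. This is a minimal sketch, not the only way to do it: readInChunks is a hypothetical helper, the demo file is invented, and a fresh buffer is allocated for every read so that no buffer is touched while a read is pending.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper (not part of Node): read a whole file with repeated
// fs.read() calls and pass the concatenated data to callback(err, buffer).
function readInChunks(filePath, chunkSize, callback) {
  fs.open(filePath, 'r', function opened(err, fd) {
    if (err) { return callback(err); }
    var chunks = [];

    // Close the descriptor before reporting the outcome.
    function finish(readErr) {
      fs.close(fd, function closed(closeErr) {
        if (readErr || closeErr) { return callback(readErr || closeErr); }
        callback(null, Buffer.concat(chunks));
      });
    }

    function readNext() {
      var chunk = Buffer.alloc(chunkSize);
      // Position null: read from the current file position, which then advances.
      fs.read(fd, chunk, 0, chunkSize, null, function read(err, readBytes) {
        if (err) { return finish(err); }
        if (readBytes === 0) { return finish(null); } // end of file
        chunks.push(chunk.slice(0, readBytes));
        readNext();
      });
    }

    readNext();
  });
}

// Usage with a small temporary file.
var demoPath = path.join(os.tmpdir(), 'read_example_' + process.pid + '.txt');
fs.writeFileSync(demoPath, 'abcdefghij');
readInChunks(demoPath, 4, function (err, data) {
  if (err) { throw err; }
  console.log('read ' + data.length + ' bytes: ' + data.toString());
  fs.unlinkSync(demoPath); // clean up
});
```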
Write a file
To write data to an open file, pass a buffer object containing the data to fs.write():
var fs = require('fs');
fs.open('./my_file.txt', 'a', function opened(err, fd) {
  if (err) { throw err; }
  var writeBuffer = Buffer.from('writing this string'), // replaces the deprecated new Buffer(string)
      bufferPosition = 0,
      bufferLength = writeBuffer.length, filePosition = null;
  fs.write(fd,
           writeBuffer,
           bufferPosition,
           bufferLength,
           filePosition,
           function write(err, written) {
             if (err) { throw err; }
             console.log('wrote ' + written + ' bytes');
           });
});
In this example, the second line of code opens a file in append mode (a), and the seventh line (translator's note: the original text says 9) writes data to the file. After the file descriptor, fs.write takes the following parameters:
1. The buffer containing the data
2. The offset in the buffer at which the data to write starts
3. The number of bytes to write
4. The position in the file at which to write the data
5. The callback function to call once the write has finished
In this example, the filePosition parameter is null, which means that the data is written at the current position of the file pointer. Because the file was opened in append mode, the file pointer is at the end of the file.
As with read operations, do not use the buffer you passed in while fs.write is executing. Once fs.write starts, it takes control of the buffer, and you must wait until the callback is called before reusing it.
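When the position parameter is a number instead of null, the data is written at that byte offset, which is what makes random-access writes possible. The sketch below overwrites part of an existing file opened with the r+ flag. It uses the synchronous fs.writeSync, which takes the same buffer, offset, length and position parameters as fs.write but returns the byte count instead of calling a callback; the file name and contents are invented for the example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var filePath = path.join(os.tmpdir(), 'write_example_' + process.pid + '.txt');
fs.writeFileSync(filePath, 'hello world');

// 'r+' opens the existing file for reading and writing without truncating it.
var fd = fs.openSync(filePath, 'r+');
var patch = Buffer.from('WORLD');

// Write all of patch (buffer offset 0, length patch.length) at byte 6 of the file.
var written = fs.writeSync(fd, patch, 0, patch.length, 6);
fs.closeSync(fd);

var result = fs.readFileSync(filePath, 'utf8'); // 'hello WORLD'
console.log('wrote ' + written + ' bytes, file now contains: ' + result);

fs.unlinkSync(filePath); // clean up
```

Note that a file opened in append mode is not suitable for this: on Linux, for instance, data written to such a file always ends up at the end, whatever position is given.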
Close the file
You may have noticed that none of the examples in this chapter so far close the file. They are small, simple examples that run once, and the operating system makes sure all files are closed when the Node process ends.
In a real application, however, you want to make sure that every file you open is eventually closed. To do that, you need to keep track of all open file descriptors and call fs.close(fd[, callback]) when they are no longer needed. If you are not careful, it is easy to miss one. The following example provides a function called openAndWriteToSystemLog that shows how to close files carefully:
var fs = require('fs');
function openAndWriteToSystemLog(writeBuffer, callback) {
  fs.open('./my_file', 'a', function opened(err, fd) {
    if (err) { return callback(err); }
    // Close the file first, then report the error.
    function notifyError(err) {
      fs.close(fd, function() {
        callback(err);
      });
    }
    var bufferOffset = 0,
        bufferLength = writeBuffer.length,
        filePosition = null;
    fs.write(fd, writeBuffer, bufferOffset, bufferLength, filePosition,
      function write(err, written) {
        if (err) { return notifyError(err); }
        // Success: close the file and report completion (err is null here).
        fs.close(fd, function() {
          callback(err);
        });
      }
    );
  });
}
openAndWriteToSystemLog(
  Buffer.from('writing this string'), // replaces the deprecated new Buffer(string)
  function done(err) {
    if (err) {
      console.log("error while opening and writing:", err.message);
      return;
    }
    console.log('All done with no errors');
  }
);
Here, the function openAndWriteToSystemLog accepts a buffer object containing the data to write and a callback function that is called when the operation completes or an error occurs. If an error occurs, the first parameter of the callback holds the error object.
Note the internal function notifyError, which closes the file and then reports the error that occurred.
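The same close-on-every-path idea can be factored into a reusable wrapper, so that individual operations do not have to repeat the bookkeeping. The following is one possible sketch under that assumption; withOpenFile, its parameter names and the temporary log file are hypothetical and not part of Node.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper (not part of Node): open a file, hand the descriptor to
// work(fd, done), and close the descriptor whether or not the work failed.
function withOpenFile(filePath, flags, work, callback) {
  fs.open(filePath, flags, function opened(err, fd) {
    if (err) { return callback(err); }
    work(fd, function done(workErr, result) {
      fs.close(fd, function closed(closeErr) {
        // Report the first error that occurred, if any.
        callback(workErr || closeErr || null, result);
      });
    });
  });
}

// Usage: append one line to a temporary log file.
var logPath = path.join(os.tmpdir(), 'close_example_' + process.pid + '.log');
withOpenFile(logPath, 'a', function (fd, done) {
  var data = Buffer.from('writing this string\n');
  fs.write(fd, data, 0, data.length, null, done);
}, function (err, written) {
  if (err) {
    console.log('error while opening and writing:', err.message);
    return;
  }
  console.log('wrote ' + written + ' bytes');
  fs.unlinkSync(logPath); // clean up
});
```

The caller only describes what to do with the descriptor; opening, closing and error propagation live in one place.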
Note: You now know how to use the low-level primitive operations to open, read, write and close files. However, Node also has a set of higher-level constructs that let you work with files more simply.
For example, if you want a safe way for two or more write operations to append data to a file concurrently, you can use a WriteStream.
And if you want to read a particular region of a file, consider using a ReadStream. Both use cases are covered in Chapter 9, "Reading and writing streams of data".
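As a brief preview of those two use cases, the sketch below appends through a stream created with fs.createWriteStream and then reads a byte range with fs.createReadStream. The helper readRange, the file name and the chosen offsets are assumptions made for this example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper (not part of Node): read the bytes from start to end
// (both inclusive) of a file as text and pass them to callback(err, text).
function readRange(filePath, start, end, callback) {
  var input = fs.createReadStream(filePath, { start: start, end: end, encoding: 'utf8' });
  var text = '';
  input.on('error', callback);
  input.on('data', function (chunk) { text += chunk; });
  input.on('end', function () { callback(null, text); });
}

var filePath = path.join(os.tmpdir(), 'stream_example_' + process.pid + '.txt');

// The 'a' flag makes the stream append; writes are queued and applied in order.
var out = fs.createWriteStream(filePath, { flags: 'a' });
out.write('first line\n');
out.write('second line\n');
out.end(function finished() {
  // Bytes 6 to 9 of the file hold the word "line" of the first line.
  readRange(filePath, 6, 9, function (err, text) {
    if (err) { throw err; }
    console.log(text);
    fs.unlinkSync(filePath); // clean up
  });
});
```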
Summary
When working with files, you usually need to process file paths and extract information from them. With the path module you can join paths, normalize them, compute the difference between paths, and convert relative paths to absolute ones. You can also extract path components such as the extension, file name and directory.
In the fs module, Node provides a set of low-level APIs for accessing the file system; they use file descriptors to manipulate files. You can open a file with fs.open, write to it with fs.write, read from it with fs.read, and close it with fs.close.
When an error occurs, your error handling logic should always close the file, so that every open file descriptor is closed before the call returns.