Detailed explanation of using Buffer to encode and decode binary data in Node.js

Author：Eve Cole Update Time：2025-03-28 18:16:01

JavaScript is good at handling strings, but because it was originally designed to work with HTML documents, it is not very good at handling binary data. JavaScript has no byte type, no structured types, and not even byte arrays, only numbers and strings. (Original text: JavaScript doesn't have a byte type (it just has numbers), or structured types, or even byte arrays: it just has strings.)

Because Node is based on JavaScript, it handles text protocols such as HTTP naturally, but you can also use it to talk to databases, process images, handle file uploads, and so on. You can imagine how difficult these tasks would be with strings alone. Early on, Node handled binary data by encoding bytes as text characters, but this approach proved impractical: it wasted resources and was slow, inflexible, and hard to maintain.

Node provides a binary buffer implementation called Buffer. This pseudo-class offers a set of APIs for working with binary data, which simplifies tasks that involve it. The length of a buffer is its size in bytes, and you can read or write the byte at any position in the buffer.

Note: The Buffer class is special in one respect. The memory that holds the byte data of a buffer is not allocated in the JavaScript VM's memory heap. That is, this data is not handled by JavaScript's garbage collection algorithm; it sits at a fixed memory address that does not change, which also avoids the CPU cost of copying buffer contents around in memory.

Create a buffer

You can create a buffer with a UTF-8 string like this:

var buf = Buffer.from('Hello World!');

You can also create a buffer with a specified encoding:

var buf = Buffer.from('8b76fde713ce', 'base64');

The accepted character encodings and their identifiers are as follows:

1. ascii - ASCII, applicable only to the ASCII character set.

2. utf8 - UTF-8, a variable-width encoding that can represent any character in the Unicode character set. It has become the preferred encoding on the web and is also Node's default encoding.

3. base64 - Base64, an encoding that represents binary data using 64 printable ASCII characters. Base64 is usually used to embed binary data in a text document as a string, which can be converted back to the original binary data losslessly when needed.
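As a rough illustration of how the encoding affects the bytes in a buffer, here is a minimal sketch. It assumes a Node.js version that provides Buffer.from, Buffer.byteLength, and Buffer.isEncoding; the sample strings and variable names are made up for this example.

```javascript
const { Buffer } = require('buffer');

// The same text can occupy more bytes than it has characters.
const text = 'h\u00e9llo'; // five characters; U+00E9 is outside ASCII

const utf8Bytes = Buffer.from(text, 'utf8');  // U+00E9 is stored as two bytes
console.log(utf8Bytes.length);                // -> 6
console.log(Buffer.byteLength(text, 'utf8')); // -> 6, computed without creating a buffer

// Base64 text decodes to the raw bytes it represents.
const fromBase64 = Buffer.from('aGVsbG8=', 'base64');
console.log(fromBase64.toString('ascii'));    // -> hello

// Buffer.isEncoding reports whether Node recognizes an encoding name.
console.log(Buffer.isEncoding('base64'));              // -> true
console.log(Buffer.isEncoding('not-a-real-encoding')); // -> false
```

Checking the byte length rather than the string length matters whenever the text may contain non-ASCII characters.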

If there is no data to initialize the buffer with, you can create a zero-filled buffer of a specified size:

var buf = Buffer.alloc(1024); // Create a 1024-byte buffer filled with zeros

Get and set buffered data

After creating or receiving a buffer object, you may want to view or modify its contents. You can access an individual byte of the buffer with the [] operator:

var buf = Buffer.from('my buffer content');

// Access the byte at index 10 (the 11th byte) in the buffer

console.log(buf[10]); // -> 99

Note: When you create an uninitialized buffer (by giving only its size to Buffer.allocUnsafe, or to new Buffer(size) in Node versions before 8), be aware that its contents are not set to 0; they are whatever data happened to be in that memory.

var buf = Buffer.allocUnsafe(1024);

console.log(buf[100]); // -> 5 (some random value)
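In current Node.js releases, zero-filled and uninitialized allocation are exposed as separate methods. The sketch below assumes a Node.js version that provides Buffer.alloc and Buffer.allocUnsafe; the variable names are illustrative only.

```javascript
const { Buffer } = require('buffer');

// Buffer.alloc always returns zero-filled memory.
const zeroed = Buffer.alloc(8);
console.log(zeroed.every(function (byte) { return byte === 0; })); // -> true

// Buffer.allocUnsafe skips the zero-fill, so the contents are whatever
// happened to be in that memory. Overwrite every byte before reading it.
const scratch = Buffer.allocUnsafe(8);
scratch.fill(0);
console.log(scratch.equals(zeroed)); // -> true

// Buffer.alloc can also fill the new buffer with a chosen byte value.
const filled = Buffer.alloc(4, 0xff);
console.log(filled.toString('hex')); // -> ffffffff
```

An uninitialized buffer is only safe to use when the program is about to overwrite all of it, for example when it is the target of a read or a copy.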

You can modify the data anywhere in the buffer like this:

buf[99] = 125; // Set the value of the 100th byte to 125

Note: Some buffer operations that might look like errors do not raise one. For example:

1. The maximum value of a byte in the buffer is 255. If a byte is assigned a number greater than 255, the number is reduced modulo 256 and the result is stored in the byte.

2. If a byte is assigned 256, its actual value will be 0 (Translator's note: this is really the same case as the first one, 256 % 256 = 0).

3. If you assign a floating-point number such as 100.7 to a byte, the value stored is the integer part of the number, 100.

4. If you try to assign a value to a position beyond the end of the buffer, the assignment is silently ignored and the buffer is not modified.
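The four cases above can be checked with a short sketch like the following. It assumes Buffer.alloc is available and that the script is not running in strict mode; the negative-value line is an extra case not covered in the list, included to show that the wrap-around applies in both directions.

```javascript
const { Buffer } = require('buffer');

const buf = Buffer.alloc(4); // four zero bytes

buf[0] = 300;   // 300 % 256 = 44
buf[1] = 256;   // 256 % 256 = 0
buf[2] = 100.7; // the fractional part is dropped
buf[3] = -1;    // negative values wrap around as well
buf[10] = 42;   // index outside the buffer: silently ignored

console.log(buf[0]);     // -> 44
console.log(buf[1]);     // -> 0
console.log(buf[2]);     // -> 100
console.log(buf[3]);     // -> 255
console.log(buf[10]);    // -> undefined
console.log(buf.length); // -> 4
```

Because none of these assignments throws, values should be range-checked before they are written if silent wrap-around would be a bug.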

You can use the length property to get the size of the buffer in bytes:

var buf = Buffer.alloc(100);

console.log(buf.length); // -> 100

You can also iterate over the buffered contents using the buffer length to read or set each byte:

var buf = Buffer.alloc(100);

for (var i = 0; i < buf.length; i++) {
  buf[i] = i; // store the index itself in each byte
}

The above code creates a new buffer containing 100 bytes and sets each byte to its own index, so the buffer holds the values 0 to 99.
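Reading works the same way as writing. The following sketch, with a made-up three-byte buffer, shows an index-based loop next to a for...of loop, which a Buffer supports because it is iterable like other typed arrays.

```javascript
const { Buffer } = require('buffer');

const buf = Buffer.from('abc'); // bytes 0x61, 0x62, 0x63

// Index-based loop: add up all byte values.
let sum = 0;
for (let i = 0; i < buf.length; i++) {
  sum += buf[i];
}
console.log(sum); // -> 294 (97 + 98 + 99)

// for...of yields the byte values directly.
const hexParts = [];
for (const byte of buf) {
  hexParts.push(byte.toString(16).padStart(2, '0'));
}
console.log(hexParts.join(' ')); // -> 61 62 63
```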

Slice buffered data

Once a buffer is created or received, you may need to extract part of its data. You can slice an existing buffer by specifying start and end positions, which creates another, smaller buffer:

var buffer = Buffer.from("this is the content of my buffer");

var smallerBuffer = buffer.slice(8, 19);

console.log(smallerBuffer.toString()); // -> "the content"

Note that slicing a buffer does not allocate new memory or copy any data. The new buffer uses the parent buffer's memory; it is just a reference to a region of that data (specified by the start and end positions). This has several implications.

First, if your program modifies the contents of the parent buffer, the modifications also affect any child buffers that cover the same region. Because the parent buffer and the child buffer are different JavaScript objects, it is easy to overlook this and introduce subtle bugs.

Secondly, when you create a smaller child buffer from a parent buffer in this way, the parent buffer's memory stays alive for as long as the child buffer is referenced and cannot be garbage collected. If you are not careful, this can easily cause memory leaks.

Note: If you are worried about memory leaks, you can use the copy method instead of slice. The copy method is introduced in the next section.
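The shared-memory behavior can be sketched as follows. This example uses subarray, which on a Buffer returns the same kind of memory-sharing view as slice, and Buffer.from to take an independent copy of the bytes; the variable names are illustrative only.

```javascript
const { Buffer } = require('buffer');

const parent = Buffer.from('this is the content of my buffer');
const view = parent.subarray(8, 19); // shares memory with parent
const copy = Buffer.from(view);      // independent copy of the same bytes

// Modify the parent at index 8, inside the region the view covers.
parent[8] = 'T'.charCodeAt(0);

console.log(view.toString()); // -> The content (the view sees the change)
console.log(copy.toString()); // -> the content (the copy does not)
```

Taking a copy costs an allocation, but it lets the large parent buffer be released once nothing else refers to it.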

Copy buffered data

You can use copy to copy a portion of the buffer to another buffer like this:

var buffer1 = Buffer.from("this is the content of my buffer");

var buffer2 = Buffer.alloc(11);

var targetStart = 0;

var sourceStart = 8;

var sourceEnd = 19;

buffer1.copy(buffer2, targetStart, sourceStart, sourceEnd);

console.log(buffer2.toString()); // -> "the content"

The code above copies the bytes at indexes 8 through 18 of the source buffer (the 9th to 19th bytes, since sourceEnd is exclusive) to the start of the target buffer.
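Two details of copy are worth seeing in a sketch: it returns the number of bytes it wrote, and it writes only as many bytes as fit when the target is too small. The buffers and variable names below are made up for this example.

```javascript
const { Buffer } = require('buffer');

const source = Buffer.from('this is the content of my buffer');

// copy() returns the number of bytes actually written.
const target = Buffer.alloc(11);
const written = source.copy(target, 0, 8, 19);
console.log(written);           // -> 11
console.log(target.toString()); // -> the content

// When the target is too small, the copy is truncated rather than failing.
const small = Buffer.alloc(3);
const writtenSmall = source.copy(small, 0, 8, 19);
console.log(writtenSmall);     // -> 3
console.log(small.toString()); // -> the

// Unlike a slice, the copied data is independent of the source.
source[8] = 'T'.charCodeAt(0);
console.log(target.toString()); // -> the content
```

Comparing the return value with the expected length is a simple way to detect a truncated copy.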

Decode buffered data

The buffered data can be converted into a UTF-8 string like this:

var str = buf.toString();

Buffer data can also be converted into a string in any supported encoding by specifying the encoding type. For example, to encode the contents of a buffer as a base64 string, you can do this:

var b64Str = buf.toString("base64");

Using the toString function, you can also transcode a UTF-8 string into a base64 string:

var utf8String = 'my string';

var buf = Buffer.from(utf8String);

var base64String = buf.toString('base64');
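A round trip makes the transcoding easier to verify. The sketch below goes from UTF-8 text to base64 and back, and also prints the same bytes as hexadecimal using the hex encoding name, which is not in the list given earlier but is one Node accepts. The variable names are illustrative only.

```javascript
const { Buffer } = require('buffer');

const original = 'my string';

// UTF-8 text -> bytes -> base64 text
const base64 = Buffer.from(original, 'utf8').toString('base64');
console.log(base64); // -> bXkgc3RyaW5n

// base64 text -> bytes -> UTF-8 text
const roundTrip = Buffer.from(base64, 'base64').toString('utf8');
console.log(roundTrip === original); // -> true

// The same bytes rendered as hexadecimal text
const hex = Buffer.from(original, 'utf8').toString('hex');
console.log(hex); // -> 6d7920737472696e67
```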

Summary

Sometimes you have to deal with binary data, but native JavaScript has no clear way to do so, so Node provides the Buffer class, which wraps a set of operations on contiguous blocks of memory. You can slice a buffer, or copy data from one buffer to another.

You can also convert a buffer into a string in a given encoding or, going the other way, convert a string into a buffer in order to access or process each byte.