Preface
In the previous article we looked at Java's file byte streams. This article turns to file character streams.
First, the distinction: a byte stream processes a file one byte at a time, while a character stream uses the character as its basic unit.
In essence, though, a character stream is a wrapper around two steps: "byte stream operation" + "encoding". Think about it: to write a character to a file, you must first encode it into bytes and then write those bytes out; to read a character into memory, you must first read bytes and then decode them into a character.
Keep this in mind, because it shapes everything else about character streams. Let's look at how the related APIs are designed.
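The "byte stream operation" + "encoding" idea can be checked with a small sketch. The class and method names below are hypothetical, and UTF-8 is chosen only for illustration. One path lets a character stream do the work; the other encodes by hand and writes raw bytes. Both should produce the same bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class EncodeThenWriteDemo {

    // Path 1: a character stream encodes the text and writes the bytes.
    static byte[] viaCharacterStream(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) at this point.
        return sink.toByteArray();
    }

    // Path 2: encode by hand, then write the raw bytes to a byte stream.
    static byte[] viaManualEncoding(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        sink.write(encoded, 0, encoded.length);
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // 'é' needs two bytes in UTF-8
        System.out.println(Arrays.equals(viaCharacterStream(text), viaManualEncoding(text)));
    }
}
```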
Base class Reader/Writer
Before looking at the character stream base classes, we need to know how a character is represented in Java.
Inside the JVM, a char is a UTF-16 code unit. That is separate from the default charset used to encode and decode files, which is platform dependent on older JDKs and is UTF-8 by default since JDK 18. UTF-8 stores a character in 1 to 4 bytes, and the most common characters (ASCII) take the fewest.
The char type is two bytes (16 bits) wide. An ordinary character fits in one char, but a supplementary character (a code point above U+FFFF, such as most emoji) is represented by a pair of chars called a surrogate pair.
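A short sketch makes the one-char versus two-char cases concrete. The class name is hypothetical; the code point U+1F600 is just an example of a supplementary character.

```java
// Hypothetical demo class, not part of the JDK.
class SurrogatePairDemo {

    // Number of 16-bit char units the string occupies.
    static int charUnits(String s) {
        return s.length();
    }

    // Number of actual Unicode code points (characters) in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String letter = "A";
        // Build a supplementary character from its code point.
        String emoji = new String(Character.toChars(0x1F600));

        System.out.println(charUnits(letter) + " char, " + codePoints(letter) + " code point");
        System.out.println(charUnits(emoji) + " chars, " + codePoints(emoji) + " code point");
        System.out.println(Character.isHighSurrogate(emoji.charAt(0)));
    }
}
```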
Reader is the base class for reading character streams, and it provides the most basic character reading operations.
Let's look at its constructors first:
    protected Object lock;

    protected Reader() {
        this.lock = this;
    }

    protected Reader(Object lock) {
        if (lock == null) {
            throw new NullPointerException();
        }
        this.lock = lock;
    }
Reader is an abstract class, so these constructors exist to be called by subclasses. They initialize the lock object that stream operations synchronize on, which we will explain in detail later.
    public int read() throws IOException {
        char[] cb = new char[1];
        if (read(cb, 0, 1) == -1)
            return -1;
        else
            return cb[0];
    }

    public int read(char[] cbuf) throws IOException {
        return read(cbuf, 0, cbuf.length);
    }

    public abstract int read(char[] cbuf, int off, int len) throws IOException;
These are the basic read operations. The first method reads a single character and returns -1 at the end of the stream. As with InputStream, the return type is int rather than char, and for the same reason: char is an unsigned 16-bit type, so every char value is a legal character and none is left over to signal the end of the stream. Returning an int lets -1 serve as an unambiguous marker.
The second and third methods read up to a given number of characters into the target array. The third is abstract and must be implemented by subclasses; the first and second both delegate to it.
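The usual way to consume the single-character read() is a loop that stops at -1. Here is a minimal sketch, with a hypothetical class name and a StringReader standing in for a file. It also shows why a char return type would not work: the value 0xFFFF is a legal char, so it has to stay distinguishable from the end-of-stream marker.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the JDK.
class ReadLoopDemo {

    // Reads every character until read() reports end of stream with -1.
    static String readAll(Reader reader) {
        StringBuilder out = new StringBuilder();
        try {
            int c; // int, so that -1 cannot collide with any real char value
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(readAll(new StringReader("abc")));
        // A stream containing the char 0xFFFF is still read in full,
        // because read() returns 65535 for it rather than -1.
        System.out.println(readAll(new StringReader("\uffff")).length());
    }
}
```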
There are a few other similar methods, such as skip, ready, markSupported, mark, reset, and close.
These will look familiar because they mirror InputStream, and the base class gives them little or no real implementation. I won't go into details here; a rough idea of what is available is enough.
Writer is the base class for writing character streams and is used to write one or more characters to a destination such as a file. Its core write method is likewise abstract and left to subclasses, so we will not repeat the discussion here.
Adapter InputStreamReader/OutputStreamWriter
The adapter character streams inherit from the base classes Reader and Writer and are key members of the character stream family. Their main job is to convert a byte stream into a character stream. Let's take the reading adapter, InputStreamReader, as the example.
First, its core member:
private final StreamDecoder sd;
StreamDecoder is a decoder that turns byte-level operations into the corresponding character-level operations. It comes up repeatedly in what follows, so we will not give it a separate treatment here.
Then there is the constructor:
    public InputStreamReader(InputStream in) {
        super(in);
        try {
            sd = StreamDecoder.forInputStreamReader(in, this, (String) null);
        } catch (UnsupportedEncodingException e) {
            throw new Error(e);
        }
    }

    public InputStreamReader(InputStream in, String charsetName)
            throws UnsupportedEncodingException {
        super(in);
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        sd = StreamDecoder.forInputStreamReader(in, this, charsetName);
    }
Both constructors exist to initialize the decoder. Each calls forInputStreamReader, just with different arguments. Let's look at the implementation of that method:
This is a typical static factory. Of its three parameters, var0 and var1 need little comment: they are the byte stream instance and the adapter instance respectively.
The parameter var2 is the name of a character encoding. If it is null, the default charset is used (platform dependent on older JDKs, UTF-8 by default since JDK 18).
The result is an instance of the decoder.
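To see what the charset argument changes, here is a small sketch that decodes the same two bytes under two different encodings. The class and method names are hypothetical. It uses the InputStreamReader constructor that takes a Charset object, rather than the charset-name constructor shown above, only to avoid the checked UnsupportedEncodingException.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
class DecoderCharsetDemo {

    // Decodes the given bytes through an InputStreamReader using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 'é' encoded as UTF-8 is the two bytes 0xC3 0xA9.
        byte[] bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded with the matching charset: one character.
        System.out.println(decode(bytes, StandardCharsets.UTF_8).length());
        // Decoded with the wrong charset: each byte becomes its own character.
        System.out.println(decode(bytes, StandardCharsets.ISO_8859_1).length());
    }
}
```

Passing the charset explicitly, as here, avoids depending on whatever the default happens to be on a given machine.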
Almost all of the methods introduced next rely on this decoder.
    public String getEncoding() {
        return sd.getEncoding();
    }

    public int read() throws IOException {
        return sd.read();
    }

    public int read(char[] cbuf, int offset, int length) throws IOException {
        return sd.read(cbuf, offset, length);
    }

    public void close() throws IOException {
        sd.close();
    }
The decoder's own implementation of these methods is fairly complex, and we will not dig into it here. The general idea is simply "byte stream reading + decoding".
Naturally, OutputStreamWriter holds the counterpart, a StreamEncoder instance, for encoding characters.
Apart from that, its operations are nothing new: you can write a character array, write a string, or write a single character passed as the lower 16 bits of an int.
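The three ways of writing can be sketched as follows. The class name is hypothetical, and an in-memory ByteArrayOutputStream stands in for a file. The int passed to write deliberately has a bit set above the low 16, to show that only the lower 16 bits are used.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
class WriterOverloadsDemo {

    static String writeThreeWays() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            writer.write(new char[] {'a', 'b'}); // from a character array
            writer.write("cd");                  // from a string
            writer.write(0x10065);               // from an int: only 0x0065 ('e') is written
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Decode the bytes again to see which characters were written.
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(writeThreeWays()); // prints "abcde"
    }
}
```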
File character stream FileReader/FileWriter
The file character streams are very simple. They add no methods beyond their constructors and depend entirely on the file byte streams.
Let's take FileReader as an example.
FileReader inherits from InputStreamReader and has only the following three constructors (Java 11 added overloads that also take a Charset):
    public FileReader(String fileName) throws FileNotFoundException {
        super(new FileInputStream(fileName));
    }

    public FileReader(File file) throws FileNotFoundException {
        super(new FileInputStream(file));
    }

    public FileReader(FileDescriptor fd) {
        super(new FileInputStream(fd));
    }
In principle, every character stream that is backed by bytes has to go through the adapter, because only the adapter converts between characters and bytes. Reading or writing, you cannot do without it.
FileReader adds no methods of its own. The character operations already implemented in its parent InputStreamReader are all it needs; it only has to pass in the matching byte stream instance.
The same is true of FileWriter, so I won't go into details here.
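A round trip through FileWriter and FileReader might look like the sketch below. The class name and the temporary file prefix are hypothetical. Both classes are constructed here without a charset, so they use the platform default; the sample text is plain ASCII to keep the result independent of that default.

```java
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the JDK.
class FileCharStreamDemo {

    // Writes the text to a temporary file, then reads it back character by character.
    static String roundTrip(String text) {
        try {
            File file = File.createTempFile("char-stream-demo", ".txt");
            file.deleteOnExit();

            try (FileWriter writer = new FileWriter(file)) {
                writer.write(text); // inherited from OutputStreamWriter
            }

            StringBuilder out = new StringBuilder();
            try (FileReader reader = new FileReader(file)) {
                int c;
                while ((c = reader.read()) != -1) { // inherited from InputStreamReader
                    out.append((char) c);
                }
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```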
Character array stream CharArrayReader/CharArrayWriter
Character array streams are similar to byte array streams: they are useful when the amount of content is large or not known in advance.
CharArrayWriter grows its internal array on demand, so it can hold whatever is written without allocating a large buffer up front and wasting memory.
Take CharArrayReader as an example:
    protected char[] buf;

    public CharArrayReader(char[] buf) {
        this.buf = buf;
        this.pos = 0;
        this.count = buf.length;
    }

    public CharArrayReader(char[] buf, int offset, int length) {
        // ...
    }
The constructor's core task is to store the given character array in the internal buf field. All subsequent reads on the stream are served from buf.
I will not go through the other methods of CharArrayReader and CharArrayWriter; they closely mirror the byte array streams from the previous article.
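As a rough illustration, the sketch below writes more characters into a CharArrayWriter than its initial capacity allows, then reads the result back with a CharArrayReader. The class and method names are hypothetical, and the tiny initial size of 2 is chosen only to force the internal array to grow.

```java
import java.io.CharArrayReader;
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the JDK.
class CharArrayStreamDemo {

    // Collects 'count' letters in a writer that starts with room for only 2 chars.
    static char[] collect(int count) {
        CharArrayWriter writer = new CharArrayWriter(2);
        for (int i = 0; i < count; i++) {
            writer.write('a' + (i % 26)); // the internal array grows as needed
        }
        return writer.toCharArray();
    }

    // Reads everything back out of the character array.
    static String readBack(char[] data) {
        StringBuilder out = new StringBuilder();
        try (CharArrayReader reader = new CharArrayReader(data)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collect(100).length);
        System.out.println(readBack(collect(3)));
    }
}
```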
There are also StringReader and StringWriter. They work in essentially the same way as the character array streams, since a String is conceptually a sequence of char values.
BufferedReader/BufferedWriter
Likewise, BufferedReader and BufferedWriter are buffered streams. They are decorators that add buffering to another character stream and closely resemble the buffered byte streams, so a brief introduction will do.
    private Reader in;
    private char[] cb;
    private static int defaultCharBufferSize = 8192;

    public BufferedReader(Reader in, int sz) { ... }

    public BufferedReader(Reader in) {
        this(in, defaultCharBufferSize);
    }
cb is a character array that caches characters read from the underlying stream. You can set its length through the constructor; otherwise the default of 8192 is used.
    public int read() throws IOException { ... }

    public int read(char[] cbuf, int off, int len) throws IOException { ... }
read relies on the read method of the member in. Since in is a Reader, it in turn usually relies on the read method of an InputStream held internally.
So almost every character stream ultimately depends on a byte stream instance. (The in-memory streams such as CharArrayReader and StringReader are the exceptions.)
I won't repeat the discussion for BufferedWriter. It is essentially the same, except that it writes rather than reads, and it too revolves around an internal character array.
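The effect of the internal character array can be observed from the outside. In this sketch (hypothetical class and method names, in-memory streams in place of files), readLines uses BufferedReader.readLine, and beforeAndAfterFlush shows that text written to a BufferedWriter stays in its buffer until flush is called.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class BufferedCharStreamDemo {

    // Splits the text into lines using BufferedReader.readLine().
    static List<String> readLines(String text) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    // Returns what the underlying writer holds before and after flush().
    // Assumes the text is shorter than the default buffer size.
    static String[] beforeAndAfterFlush(String text) {
        StringWriter target = new StringWriter();
        try (BufferedWriter writer = new BufferedWriter(target)) {
            writer.write(text);               // lands in the internal char array
            String before = target.toString();
            writer.flush();                   // pushes the buffer to the target
            String after = target.toString();
            return new String[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readLines("a\nb\r\nc"));
        String[] result = beforeAndAfterFlush("hi");
        System.out.println("[" + result[0] + "] then [" + result[1] + "]");
    }
}
```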
Standard print streams
There are two print streams, PrintStream and PrintWriter. The former is a byte stream and the latter is a character stream.
Each can be seen as the all-in-one stream of its category. They wrap a rich set of methods, though the implementation is somewhat involved. Let's look at the byte stream PrintStream first:
It has several main constructors:
As usual, the simple constructors delegate to the more complete ones, a familiar pattern in the JDK. What sets PrintStream apart from other byte streams is the autoFlush flag, which specifies whether the buffer is flushed automatically.
Next come the write methods of PrintStream:
In addition, PrintStream wraps a large number of print methods that write values of different types to the file, such as:
Of course, these methods do not write the binary form of a number to the file. They write its string representation, for example:
print(123);
The file then contains the text 123, not the binary representation of the integer 123. That is what makes it a print stream.
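The difference between printed text and a binary value can be made visible by comparing bytes. The sketch below uses hypothetical names. It pins the PrintStream to US-ASCII so the result does not depend on the default charset, and it brings in DataOutputStream.writeInt purely as a contrast; that class is not otherwise covered in this article.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class, not part of the JDK.
class PrintTextDemo {

    // Bytes produced by print(value): the characters of its decimal text.
    static byte[] printed(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, false, "US-ASCII");
            out.print(value);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Bytes produced by writing the same int in binary form (4 bytes, big-endian).
    static byte[] binary(int value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(printed(123))); // prints [49, 50, 51]
        System.out.println(Arrays.toString(binary(123)));  // prints [0, 0, 0, 123]
    }
}
```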
PrintStream carries out all print operations through an internal buffered character stream. If automatic flushing is enabled, the underlying stream is flushed whenever a newline character "\n" is written, one of the println methods is called, or a byte array is written.
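Here is a minimal sketch of the autoFlush flag at work. The class and method names are hypothetical. A BufferedOutputStream is placed between the PrintStream and the final sink so that unflushed bytes are held back and the difference becomes observable.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical demo class, not part of the JDK.
class AutoFlushDemo {

    // Returns how many bytes have reached the sink right after one println,
    // before any explicit flush() or close().
    static int bytesReachedSink(boolean autoFlush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The BufferedOutputStream keeps bytes until something flushes it.
        PrintStream out = new PrintStream(new BufferedOutputStream(sink), autoFlush);
        out.println("hello");
        return sink.size();
    }

    public static void main(String[] args) {
        System.out.println(bytesReachedSink(false)); // prints 0: still in the buffer
        System.out.println(bytesReachedSink(true) > 0);
    }
}
```

Without autoFlush, the caller is responsible for calling flush() or close() before the data can be relied on to have reached the destination.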
So PrintStream combines the output methods of byte streams and character streams: its write methods are byte stream operations, while its print methods are character stream operations. This distinction is worth keeping clear.
PrintWriter, by contrast, is a pure character stream. Both its write methods and its print methods operate on characters.
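The write/print distinction can be sketched by passing the same int, 65, to both methods on each class. The names below are hypothetical, and US-ASCII is chosen for the PrintStream only so the byte values are predictable.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of the JDK.
class WriteVersusPrintDemo {

    // PrintStream: write(int) emits one raw byte, print(int) emits text.
    static byte[] printStreamBytes() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, true, "US-ASCII");
            out.write(65); // byte operation: the single byte 0x41
            out.print(65); // character operation: the text "65"
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // PrintWriter: both calls are character operations.
    static String printWriterChars() {
        StringWriter target = new StringWriter();
        PrintWriter out = new PrintWriter(target);
        out.write(65); // the char with code 65, 'A'
        out.print(65); // the text "65"
        out.flush();
        return target.toString();
    }

    public static void main(String[] args) {
        System.out.println(printStreamBytes().length); // prints 3
        System.out.println(printWriterChars());        // prints "A65"
    }
}
```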
To sum up, we have spent three articles on byte streams and character streams in Java. Byte streams move data between disk and memory in units of bytes. The most typical one is the file byte stream, whose underlying operations are implemented as native methods. On top of this basic byte transfer, buffering can be added to improve efficiency.
The most basic character streams are InputStreamReader and OutputStreamWriter. In principle they can already carry out character stream operations, though only the most basic ones. Constructing either of them requires "a byte stream instance" + "an encoding".
So the relationship between a character stream and a byte stream is exactly the equation given at the start. To write a character to a disk file, the character must be encoded in the specified encoding, and then a byte stream writes the encoded bytes to the file. Reading is the reverse.
All code, images, and files from this article are stored on my GitHub:
(https://github.com/SingleYam/overview_java)
You can also download them locally.