When reading a file stream, you often run into garbled text. Garbled text has more than one possible cause; this article focuses on the garbling caused by the file's encoding format. First, let us be clear about the concepts of text files and binary files and how they differ.
Text files are character-encoded files. Common encodings include ASCII, Unicode, and ANSI. Binary files are value-encoded files: you decide what a given value means according to the specific application (this can be regarded as a custom encoding).
It follows that text files are basically fixed-width encoded (although there are also variable-width encodings such as UTF-8). Binary files can be regarded as variable-length encoded: because they encode values, how many bits represent one value is entirely up to you.
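To make the width distinction concrete, here is a minimal sketch. The class name, the helper names, and the 5-byte record layout are all made up for illustration and do not come from the article's code. It counts how many bytes the same characters occupy in UTF-8 and UTF-16BE, and packs a value into a custom binary layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from the article.
class EncodingWidths {

    // Bytes needed for the string in UTF-8 (variable width: 1 to 4 bytes per code point).
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed in UTF-16BE without a BOM (2 bytes per char for BMP characters).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // A made-up binary layout: a 4-byte big-endian int followed by a 1-byte flag.
    static byte[] packRecord(int value, byte flag) {
        return ByteBuffer.allocate(5).putInt(value).put(flag).array();
    }

    public static void main(String[] args) {
        String latin = "A";
        String cjk = "\u4e2d"; // the character U+4E2D, written as an escape
        System.out.println("UTF-8:    " + utf8Length(latin) + " vs " + utf8Length(cjk));
        System.out.println("UTF-16BE: " + utf16Length(latin) + " vs " + utf16Length(cjk));
        System.out.println("record bytes: " + packRecord(1000, (byte) 1).length);
    }
}
```

In the binary record, nothing in the bytes themselves says that the first four belong together as one integer; only the program that wrote them knows that, which is why such data should not be interpreted as characters.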
For binary files, never use a String as the carrier, because a String built from bytes is decoded with the system default encoding unless you specify one. Read, process, and write such files through byte streams instead.
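The following sketch shows why. It pushes a few arbitrary bytes through a String and back, and the result no longer matches the input. The class name, method name, and sample bytes are hypothetical, and UTF-8 is named explicitly (rather than relying on the system default) only so that the behavior is the same on every machine.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the byte values are arbitrary sample data.
class BinaryThroughString {

    // Decode bytes to a String and encode them again with the same charset.
    static byte[] roundTrip(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x89 and 0xFF are not valid UTF-8 here, so the decoder replaces them.
        byte[] raw = {(byte) 0x89, 0x50, 0x4E, 0x47, (byte) 0xFF, 0x00};
        byte[] back = roundTrip(raw);
        System.out.println("unchanged after round trip: " + Arrays.equals(raw, back));
    }
}
```

Bytes that are not valid in the chosen charset are replaced during decoding, so the original values cannot be recovered afterwards. Pure ASCII data survives this particular round trip, which is why the problem often goes unnoticed until a real binary file is processed.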
For text files, because the encoding is fixed, it is enough to determine the file's own encoding before reading, obtain the bytes, and then initialize a String with that encoding; the text will then not be garbled. A text encoding can also be detected for a binary file, but the result is unreliable, so the two cases cannot be treated alike.
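As a small illustration of this point, the sketch below decodes the same bytes once with the charset they were written in and once with a different one. The class and method names are hypothetical. ISO-8859-1 stands in for "the wrong encoding" only because it is guaranteed to be available on every Java runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the article's utility code.
class DecodeWithCharset {

    // Build a String from raw bytes using an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two CJK characters, written as escapes
        byte[] stored = original.getBytes(StandardCharsets.UTF_8); // what the file would contain

        String right = decode(stored, StandardCharsets.UTF_8);
        String wrong = decode(stored, StandardCharsets.ISO_8859_1);

        System.out.println("matching charset restores text: " + right.equals(original));
        System.out.println("mismatched charset restores text: " + wrong.equals(original));
    }
}
```

The bytes are identical in both calls; only the charset passed to the String constructor decides whether the text comes out readable.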
The specific steps are as follows:
1) Get the encoding format of the text file
    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    public static String getFileEncode(String path) {
        String charset = "asci";
        byte[] first3Bytes = new byte[3];
        BufferedInputStream bis = null;
        try {
            boolean checked = false;
            bis = new BufferedInputStream(new FileInputStream(path));
            bis.mark(3); // allow reset() after reading the 3 BOM bytes
            int read = bis.read(first3Bytes, 0, 3);
            if (read == -1)
                return charset;
            if (first3Bytes[0] == (byte) 0xFF && first3Bytes[1] == (byte) 0xFE) {
                charset = "Unicode"; // UTF-16LE
                checked = true;
            } else if (first3Bytes[0] == (byte) 0xFE && first3Bytes[1] == (byte) 0xFF) {
                charset = "Unicode"; // UTF-16BE
                checked = true;
            } else if (first3Bytes[0] == (byte) 0xEF && first3Bytes[1] == (byte) 0xBB
                    && first3Bytes[2] == (byte) 0xBF) {
                charset = "UTF8";
                checked = true;
            }
            bis.reset();
            if (!checked) {
                // No BOM: scan the bytes for a valid three-byte UTF-8 sequence
                int loc = 0;
                while ((read = bis.read()) != -1) {
                    loc++;
                    if (read >= 0xF0)
                        break;
                    if (0x80 <= read && read <= 0xBF) // a lone byte in 0x80-0xBF is also treated as GBK
                        break;
                    if (0xC0 <= read && read <= 0xDF) {
                        read = bis.read();
                        if (0x80 <= read && read <= 0xBF)
                            // two bytes (0xC0-0xDF)(0x80-0xBF); may also fall within GB encoding
                            continue;
                        else
                            break;
                    } else if (0xE0 <= read && read <= 0xEF) { // may also be wrong, but less likely
                        read = bis.read();
                        if (0x80 <= read && read <= 0xBF) {
                            read = bis.read();
                            if (0x80 <= read && read <= 0xBF) {
                                charset = "UTF-8";
                                break;
                            } else
                                break;
                        } else
                            break;
                    }
                }
                // TextLogger.getLogger().info(loc + " " + Integer.toHexString(read));
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (bis != null) {
                try {
                    bis.close();
                } catch (IOException ex) {
                }
            }
        }
        return charset;
    }

    private static String getEncode(int flag1, int flag2, int flag3) {
        String encode = "";
        // A text file may start with a few extra bytes: FF, FE (Unicode),
        // FE, FF (Unicode big endian), EF, BB, BF (UTF-8)
        if (flag1 == 255 && flag2 == 254) {
            encode = "Unicode";
        } else if (flag1 == 254 && flag2 == 255) {
            encode = "UTF-16";
        } else if (flag1 == 239 && flag2 == 187 && flag3 == 191) {
            encode = "UTF8";
        } else {
            encode = "asci"; // ASCII
        }
        return encode;
    }

2) Read the file stream using the file's encoding format
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;

    /**
     * Gets the content of a file from its path. Because this method uses a String
     * as the carrier, it can only read text files if the content is to be read
     * correctly (without garbling). Safe method!
     */
    public static String readFile(String path) {
        String data = null;
        // Check whether the file exists
        File file = new File(path);
        if (!file.exists()) {
            return data;
        }
        // Get the file's encoding format
        String code = FileEncode.getFileEncode(path);
        InputStreamReader isr = null;
        try {
            // Parse the file according to its encoding format
            if ("asci".equals(code)) {
                // Use GBK rather than the environment encoding, because the environment's
                // default encoding is not necessarily the operating system's encoding
                // code = System.getProperty("file.encoding");
                code = "GBK";
            }
            isr = new InputStreamReader(new FileInputStream(file), code);
            // Read the file content
            int length = -1;
            char[] buffer = new char[1024];
            StringBuilder sb = new StringBuilder();
            while ((length = isr.read(buffer, 0, 1024)) != -1) {
                sb.append(buffer, 0, length);
            }
            data = sb.toString();
        } catch (Exception e) {
            e.printStackTrace();
            // log is the logger of the enclosing class
            log.info("getFile IO Exception:" + e.getMessage());
        } finally {
            try {
                if (isr != null) {
                    isr.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
                log.info("getFile IO Exception:" + e.getMessage());
            }
        }
        return data;
    }

3) Write the file using the specified encoding format
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;

    /**
     * Saves the file content to the specified path in the specified encoding.
     * Because this method uses a String as the carrier, it can only write text
     * content if the file is to be written correctly (without garbling). Safe method.
     *
     * @param data
     *            the byte data to be written to the file
     * @param path
     *            the file path, including the file name
     * @param code
     *            the encoding used to write the file
     * @return boolean
     *            true when writing has completed
     */
    public static boolean writeFile(byte data[], String path, String code) {
        boolean flag = true;
        OutputStreamWriter osw = null;
        try {
            File file = new File(path);
            if (!file.exists()) {
                // Create missing parent directories; getParentFile() is null for a bare file name
                File parent = file.getParentFile();
                if (parent != null && !parent.exists()) {
                    parent.mkdirs();
                }
            }
            if ("asci".equals(code)) {
                code = "GBK";
            }
            osw = new OutputStreamWriter(new FileOutputStream(path), code);
            osw.write(new String(data, code));
            osw.flush();
        } catch (Exception e) {
            e.printStackTrace();
            // log is the logger of the enclosing class
            log.info("toFile IO Exception:" + e.getMessage());
            flag = false;
        } finally {
            try {
                if (osw != null) {
                    osw.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
                log.info("toFile IO Exception:" + e.getMessage());
            }
        }
        return flag;
    }

4) For binary files with relatively little content, such as Word documents, you can read and write the file in the following way
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    /**
     * Reads a file from the specified path into a byte array. This method can be
     * chosen for content that is not in a text format.
     *
     * @param path
     *            the file path, including the file name
     * @return byte[]
     *            the file's byte array
     */
    public static byte[] getFile(String path) throws IOException {
        File file = new File(path);
        // Size the buffer from the file length (suitable for small files only)
        int size = (int) file.length();
        byte data[] = new byte[size];
        FileInputStream stream = new FileInputStream(file);
        try {
            int off = 0;
            // read() may return fewer bytes than requested, so loop until the array is full
            while (off < size) {
                int n = stream.read(data, off, size - off);
                if (n == -1) {
                    break;
                }
                off += n;
            }
        } finally {
            stream.close();
        }
        return data;
    }

    /**
     * Writes the byte content to the corresponding file. This method can be used
     * for files that are not text files.
     *
     * @param data
     *            the byte data to be written to the file
     * @param path
     *            the file path, including the file name
     * @return boolean isOK true when writing has completed
     * @throws Exception
     */
    public static boolean toFile(byte data[], String path) throws Exception {
        FileOutputStream out = new FileOutputStream(path);
        try {
            out.write(data);
            out.flush();
        } finally {
            out.close();
        }
        return true;
    }

That is all for this article. I hope it is helpful for your studies.