When working on projects, I often need to check whether characters are Chinese, so I collected a few code snippets for detecting Chinese characters. I am sharing them here for reference.
The code is posted directly below, with detailed comments in it.
package com.coder4j.main;

import java.util.regex.Pattern;

/**
 * Checking for Chinese characters in Java
 *
 * @author Chinaxiang
 * @date 2015-08-11
 */
public class CheckChinese {

    public static void main(String[] args) {
        // Pure English
        String s1 = "Hello,Tom.!@#$%^&*()_+-={}|[];':\"?";
        // Pure Chinese (excluding Chinese punctuation)
        String s2 = "你好中国";
        // Pure Chinese (including Chinese punctuation)
        String s3 = "你好，中国。《》：“”‘’；（）【】！￥、";
        // Korean
        String s4 = "한국어난";
        // Japanese
        String s5 = "ぎじゅつ";
        // Special characters
        String s6 = "��";
        String s7 = "╃";
        String s8 = "╂";
        // Traditional Chinese
        String s9 = "繁體中文";

        // 1. Check by character range
        System.out.println("s1 contains Chinese: " + hasChineseByRange(s1)); // false
        System.out.println("s2 contains Chinese: " + hasChineseByRange(s2)); // true
        System.out.println("s3 contains Chinese: " + hasChineseByRange(s3)); // true
        System.out.println("s4 contains Chinese: " + hasChineseByRange(s4)); // false
        System.out.println("s5 contains Chinese: " + hasChineseByRange(s5)); // false
        System.out.println("s6 contains Chinese: " + hasChineseByRange(s6)); // false
        System.out.println("s7 contains Chinese: " + hasChineseByRange(s7)); // false
        System.out.println("s8 contains Chinese: " + hasChineseByRange(s8)); // false
        System.out.println("s9 contains Chinese: " + hasChineseByRange(s9)); // true
        System.out.println("-------------------");
        System.out.println("s1 is all Chinese: " + isChineseByRange(s1)); // false
        System.out.println("s2 is all Chinese: " + isChineseByRange(s2)); // true
        System.out.println("s3 is all Chinese: " + isChineseByRange(s3)); // false, Chinese punctuation is outside the range
        System.out.println("s4 is all Chinese: " + isChineseByRange(s4)); // false
        System.out.println("s5 is all Chinese: " + isChineseByRange(s5)); // false
        System.out.println("s6 is all Chinese: " + isChineseByRange(s6)); // false
        System.out.println("s7 is all Chinese: " + isChineseByRange(s7)); // false
        System.out.println("s8 is all Chinese: " + isChineseByRange(s8)); // false
        System.out.println("s9 is all Chinese: " + isChineseByRange(s9)); // true
        System.out.println("-------------------");

        // 2. Check by character range with a regular expression (same results as 1)
        System.out.println("s1 contains Chinese: " + hasChineseByReg(s1)); // false
        System.out.println("s2 contains Chinese: " + hasChineseByReg(s2)); // true
        System.out.println("s3 contains Chinese: " + hasChineseByReg(s3)); // true
        System.out.println("s4 contains Chinese: " + hasChineseByReg(s4)); // false
        System.out.println("s5 contains Chinese: " + hasChineseByReg(s5)); // false
        System.out.println("s6 contains Chinese: " + hasChineseByReg(s6)); // false
        System.out.println("s7 contains Chinese: " + hasChineseByReg(s7)); // false
        System.out.println("s8 contains Chinese: " + hasChineseByReg(s8)); // false
        System.out.println("s9 contains Chinese: " + hasChineseByReg(s9)); // true
        System.out.println("-------------------");
        System.out.println("s1 is all Chinese: " + isChineseByReg(s1)); // false
        System.out.println("s2 is all Chinese: " + isChineseByReg(s2)); // true
        System.out.println("s3 is all Chinese: " + isChineseByReg(s3)); // false, Chinese punctuation is outside the range
        System.out.println("s4 is all Chinese: " + isChineseByReg(s4)); // false
        System.out.println("s5 is all Chinese: " + isChineseByReg(s5)); // false
        System.out.println("s6 is all Chinese: " + isChineseByReg(s6)); // false
        System.out.println("s7 is all Chinese: " + isChineseByReg(s7)); // false
        System.out.println("s8 is all Chinese: " + isChineseByReg(s8)); // false
        System.out.println("s9 is all Chinese: " + isChineseByReg(s9)); // true
        System.out.println("-------------------");

        // 3. Check by CJK Unicode blocks
        System.out.println("s1 contains Chinese: " + hasChinese(s1)); // false
        System.out.println("s2 contains Chinese: " + hasChinese(s2)); // true
        System.out.println("s3 contains Chinese: " + hasChinese(s3)); // true
        System.out.println("s4 contains Chinese: " + hasChinese(s4)); // false
        System.out.println("s5 contains Chinese: " + hasChinese(s5)); // false
        System.out.println("s6 contains Chinese: " + hasChinese(s6)); // false
        System.out.println("s7 contains Chinese: " + hasChinese(s7)); // false
        System.out.println("s8 contains Chinese: " + hasChinese(s8)); // false
        System.out.println("s9 contains Chinese: " + hasChinese(s9)); // true
        System.out.println("-------------------");
        System.out.println("s1 is all Chinese: " + isChinese(s1)); // false
        System.out.println("s2 is all Chinese: " + isChinese(s2)); // true
        System.out.println("s3 is all Chinese: " + isChinese(s3)); // true, Chinese punctuation is included
        System.out.println("s4 is all Chinese: " + isChinese(s4)); // false
        System.out.println("s5 is all Chinese: " + isChinese(s5)); // false
        System.out.println("s6 is all Chinese: " + isChinese(s6)); // false
        System.out.println("s7 is all Chinese: " + isChinese(s7)); // false
        System.out.println("s8 is all Chinese: " + isChinese(s8)); // false
        System.out.println("s9 is all Chinese: " + isChinese(s9)); // true
    }

    /**
     * Whether the string contains any Chinese character.
     * Chinese punctuation counts as Chinese.
     *
     * @param str the string to check
     * @return true if at least one character is Chinese
     */
    public static boolean hasChinese(String str) {
        if (str == null) {
            return false;
        }
        char[] ch = str.toCharArray();
        for (char c : ch) {
            if (isChinese(c)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Whether the string consists entirely of Chinese characters.
     * Chinese punctuation counts as Chinese.
     *
     * @param str the string to check
     * @return true if every character is Chinese
     */
    public static boolean isChinese(String str) {
        if (str == null) {
            return false;
        }
        char[] ch = str.toCharArray();
        for (char c : ch) {
            if (!isChinese(c)) {
                return false;
            }
        }
        return true;
    }

    /**
     * Whether a single character is Chinese.
     * Chinese punctuation counts as Chinese.
     *
     * @param c the character to check
     * @return true if the character belongs to one of the listed Unicode blocks
     */
    private static boolean isChinese(char c) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
        return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_B
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_C
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_D
                || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS;
    }

    /**
     * Whether the string contains any Chinese character.
     * Checks against the code range of the CJK Unified Ideographs
     * (Chinese punctuation such as ，《》（）“”‘’、！￥ is not included).
     *
     * @param str the string to check
     * @return true if at least one character is in the range
     */
    public static boolean hasChineseByReg(String str) {
        if (str == null) {
            return false;
        }
        Pattern pattern = Pattern.compile("[\\u4E00-\\u9FBF]+");
        return pattern.matcher(str).find();
    }

    /**
     * Whether the string consists entirely of Chinese characters.
     * Checks against the code range of the CJK Unified Ideographs
     * (Chinese punctuation such as ，《》（）“”‘’、！￥ is not included).
     *
     * @param str the string to check
     * @return true if every character is in the range
     */
    public static boolean isChineseByReg(String str) {
        if (str == null) {
            return false;
        }
        Pattern pattern = Pattern.compile("[\\u4E00-\\u9FBF]+");
        return pattern.matcher(str).matches();
    }

    /**
     * Whether the string contains any Chinese character.
     * Checks against the code range of the CJK Unified Ideographs
     * (Chinese punctuation such as ，《》（）“”‘’、！￥ is not included).
     *
     * @param str the string to check
     * @return true if at least one character is in the range
     */
    public static boolean hasChineseByRange(String str) {
        if (str == null) {
            return false;
        }
        char[] ch = str.toCharArray();
        for (char c : ch) {
            if (c >= 0x4E00 && c <= 0x9FBF) {
                return true;
            }
        }
        return false;
    }

    /**
     * Whether the string consists entirely of Chinese characters.
     * Checks against the code range of the CJK Unified Ideographs
     * (Chinese punctuation such as ，《》（）“”‘’、！￥ is not included).
     *
     * @param str the string to check
     * @return true if every character is in the range
     */
    public static boolean isChineseByRange(String str) {
        if (str == null) {
            return false;
        }
        char[] ch = str.toCharArray();
        for (char c : ch) {
            if (c < 0x4E00 || c > 0x9FBF) {
                return false;
            }
        }
        return true;
    }
}

If you only need to detect Chinese characters and do not care about Chinese punctuation, regular-expression matching is recommended, and it may be more efficient.
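As a possible variation on the regular-expression approach, the sketch below compiles the patterns once as constants instead of on every call, and matches on the Unicode Han script (`\p{IsHan}`) rather than the fixed 0x4E00-0x9FBF range. This is not part of the original snippet: the class name `HanScriptCheck` and the methods `hasHan` and `isAllHan` are made up for illustration, and the Han script is a broader set than the range used above (it also covers the extension blocks, including characters outside the Basic Multilingual Plane), so results can differ from the three methods in the listing. The string literals use Unicode escapes only to keep the file ASCII.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not from the original listing.
final class HanScriptCheck {
    // Compiled once and reused; Pattern objects are immutable and thread-safe.
    private static final Pattern HAS_HAN = Pattern.compile("\\p{IsHan}");
    private static final Pattern ALL_HAN = Pattern.compile("\\p{IsHan}+");

    // True if at least one code point belongs to the Han script.
    static boolean hasHan(String str) {
        return str != null && HAS_HAN.matcher(str).find();
    }

    // True if the string is non-empty and every code point is Han.
    static boolean isAllHan(String str) {
        return str != null && ALL_HAN.matcher(str).matches();
    }

    public static void main(String[] args) {
        String china = "\u4E2D\u56FD";              // two common Chinese characters
        String withStop = china + "\u3002";         // same, followed by an ideographic full stop
        String korean = "\uD55C\uAD6D\uC5B4";       // Hangul syllables
        String extB = new String(Character.toChars(0x20000)); // first CJK Extension B code point

        System.out.println(hasHan("Hello, " + china)); // true
        System.out.println(isAllHan(china));           // true
        System.out.println(isAllHan(withStop));        // false, punctuation is not Han script
        System.out.println(hasHan(korean));            // false
        System.out.println(hasHan(extB));              // true
    }
}
```

Because the regex engine works on code points, this version also sees characters stored as surrogate pairs, which a `char`-by-`char` loop cannot classify correctly.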
That is the example code for checking whether characters are Chinese in Java. I hope it is helpful to you.