Preface
This article introduces the Pattern and Matcher classes used by Java regular expressions. The key point to understand first is that a regular expression, which is written as a string, must be compiled into an instance of the Pattern class before it can be used. Understanding how these two classes work together is therefore essential for anyone who uses regular expressions in Java.
Let's take a look at these two classes:
1. The concept of capturing groups
Capturing groups are numbered by counting their opening parentheses from left to right, starting at 1. For example, the expression ((A)(B(C))) contains four such groups:
1 ((A)(B(C)))
2 (A)
3 (B(C))
4 (C)
Group zero always stands for the entire expression. Groups that begin with (? are pure non-capturing groups that do not capture text and do not count towards the group total (named capturing groups of the form (?<name>X) are the exception).
The captured input associated with a group is always the subsequence that the group most recently matched. If a group is evaluated a second time because of quantification, its previously captured value (if any) is retained when the second evaluation fails. For example, matching the string "aba" against the expression (a(b)?)+ leaves group two set to "b". All captured input is discarded at the beginning of each match.
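The numbering rule and the "aba" case can be checked with a short sketch. The class name CaptureGroupDemo and the helper groups() are illustrative names chosen for this example, not part of the java.util.regex API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CaptureGroupDemo {

    // Returns the text of groups 0..groupCount() after a whole-string match,
    // or an empty array if the input does not match (hypothetical helper).
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] result = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            result[i] = m.group(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Groups are numbered by their opening parenthesis, left to right.
        String[] nested = groups("((A)(B(C)))", "ABC");
        for (int i = 0; i < nested.length; i++) {
            System.out.println("group " + i + " = " + nested[i]);
        }
        // group 0 = ABC, group 1 = ABC, group 2 = A, group 3 = BC, group 4 = C

        // The second pass of (a(b)?)+ matches only "a"; (b)? does not take part,
        // so group 2 keeps the "b" captured on the first pass.
        String[] repeated = groups("(a(b)?)+", "aba");
        System.out.println("group 1 = " + repeated[1]); // a
        System.out.println("group 2 = " + repeated[2]); // b
    }
}
```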
2. Detailed explanation of Pattern and Matcher classes
Java regular expressions are implemented by the Pattern and Matcher classes in the java.util.regex package. (It helps to keep the Java API documentation open while reading this article and to look up each method as it is introduced.)
The Pattern class represents a compiled regular expression, in other words a matching pattern. Its constructor is private, so a Pattern cannot be instantiated directly; instead, it is created through the static factory method Pattern.compile(String regex).
Java code example:
Pattern p = Pattern.compile("\\w+");
p.pattern(); // returns \w+
pattern() returns the string form of the regular expression, that is, the regex argument that was passed to Pattern.compile(String regex).
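One detail that is easy to trip over is the escaping: inside a Java string literal the backslash must be doubled, so the regex \w+ is written "\\w+". The sketch below shows this; CompileDemo and sourceOf() are illustrative names, not library API.

```java
import java.util.regex.Pattern;

final class CompileDemo {

    // Compiles the regex and returns its source text via pattern() (hypothetical helper).
    static String sourceOf(String regex) {
        return Pattern.compile(regex).pattern();
    }

    public static void main(String[] args) {
        // "\\w+" in Java source is the three-character regex \w+
        Pattern p = Pattern.compile("\\w+");
        System.out.println(p.pattern());          // \w+
        System.out.println(p.pattern().length()); // 3
    }
}
```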
1.Pattern.split(CharSequence input)
Pattern has a split(CharSequence input) method, which splits the input around matches of the pattern and returns a String[]. According to the API documentation, String.split(String regex) yields the same result as Pattern.compile(regex).split(input).
Java code example:
Pattern p = Pattern.compile("\\d+");
String[] str = p.split("My QQ is: 456456 My phone is: 0532214 My email is: [email protected]");
Result:
str[0]="My QQ is: "
str[1]=" My phone is: "
str[2]=" My email is: [email protected]"
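As a small self-contained check of the same behavior, the sketch below splits a made-up string on runs of digits and compares the result with String.split. The class name SplitDemo, the helper splitOnDigits(), and the sample string are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitDemo {

    // Compiled once and reused for every call.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Splits the input around each run of digits (hypothetical helper).
    static String[] splitOnDigits(String input) {
        return DIGITS.split(input);
    }

    public static void main(String[] args) {
        String input = "id: 42 port: 8080 name: demo";
        for (String part : splitOnDigits(input)) {
            // Brackets make the surrounding spaces visible.
            System.out.println("[" + part + "]");
        }
        // [id: ]
        // [ port: ]
        // [ name: demo]

        // String.split with the same regex gives the same array.
        System.out.println(Arrays.equals(splitOnDigits(input), input.split("\\d+"))); // true
    }
}
```

Note that the spaces around each number stay in the pieces, because only the digits themselves are treated as the delimiter.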
2.Pattern.matches(String regex,CharSequence input)
This static method is a shortcut for matching a string against a regular expression once. It returns true only if the entire input matches.
Java code example:
Pattern.matches("\\d+","2223"); // returns true
Pattern.matches("\\d+","2223aa"); // returns false: the whole string must match, and aa does not
Pattern.matches("\\d+","22bb23"); // returns false: the whole string must match, and bb does not
3.Pattern.matcher(CharSequence input)
Now it is finally the Matcher class's turn. Pattern.matcher(CharSequence input) returns a Matcher object.
The Matcher class has no public constructor either, so an instance can be obtained only through the Pattern.matcher(CharSequence input) method.
The Pattern class on its own supports only simple matching operations. For more powerful and convenient matching, use Pattern together with Matcher. The Matcher class adds support for groups and for matching a pattern multiple times against the same input.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.pattern(); // returns p, the Pattern object that created this Matcher
4.Matcher.matches()/Matcher.lookingAt()/Matcher.find()
The Matcher class provides three matching methods. Each returns a boolean: true if the match succeeds and false otherwise.
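As a quick side-by-side view of how the three differ on the same input, here is a minimal sketch. MatchModesDemo and modes() are illustrative names; a fresh Matcher is created for each call so that the three results do not influence each other.

```java
import java.util.regex.Pattern;

final class MatchModesDemo {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the results of matches(), lookingAt() and find() for \d+,
    // formatted as "matches/lookingAt/find" (hypothetical helper).
    static String modes(String input) {
        boolean whole = DIGITS.matcher(input).matches();      // entire input
        boolean prefix = DIGITS.matcher(input).lookingAt();   // start of input
        boolean anywhere = DIGITS.matcher(input).find();      // any position
        return whole + "/" + prefix + "/" + anywhere;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2223", "22bb23", "aa2223", "aabb"}) {
            System.out.println(s + " -> " + modes(s));
        }
        // 2223 -> true/true/true
        // 22bb23 -> false/true/true
        // aa2223 -> false/false/true
        // aabb -> false/false/false
    }
}
```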
matches() attempts to match the entire string against the pattern and returns true only if the whole string matches.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.matches(); // returns false: bb cannot be matched by \d+, so the whole-string match fails
Matcher m2 = p.matcher("2223");
m2.matches(); // returns true: \d+ matches the entire string
Looking back at Pattern.matches(String regex,CharSequence input), it is equivalent to the following code:
Pattern.compile(regex).matcher(input).matches()
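The equivalence can be checked directly. The sketch below compares the static shortcut with a pattern that is compiled once and reused; the API documentation notes that reusing a compiled pattern is more efficient when the same expression is applied many times. StaticMatchesDemo, viaStatic() and viaCompiled() are illustrative names.

```java
import java.util.regex.Pattern;

final class StaticMatchesDemo {

    // Compiled once, reused for every call to viaCompiled().
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Recompiles the regex on every call.
    static boolean viaStatic(String input) {
        return Pattern.matches("\\d+", input);
    }

    // Same result, without recompiling.
    static boolean viaCompiled(String input) {
        return DIGITS.matcher(input).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"2223", "2223aa", "22bb23"}) {
            System.out.println(s + " -> " + viaStatic(s) + " / " + viaCompiled(s));
        }
        // 2223 -> true / true
        // 2223aa -> false / false
        // 22bb23 -> false / false
    }
}
```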
lookingAt() attempts to match the beginning of the string against the pattern. It returns true only if a match starts at the first character; unlike matches(), it does not require the whole string to match.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.lookingAt(); // returns true: \d+ matches the leading 22
Matcher m2 = p.matcher("aa2223");
m2.lookingAt(); // returns false: \d+ cannot match the leading aa
find() scans the string for the next substring that matches the pattern; the match can be anywhere in the string.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("22bb23");
m.find(); // returns true
Matcher m2 = p.matcher("aa2223");
m2.find(); // returns true
Matcher m3 = p.matcher("aa2223bb");
m3.find(); // returns true
Matcher m4 = p.matcher("aabb");
m4.find(); // returns false
5.Matcher.start()/Matcher.end()/Matcher.group()
After a matching operation with matches(), lookingAt(), or find(), you can use these three methods to obtain more detailed information about the match.
start() returns the index of the first character of the matched substring in the string.
end() returns the index just past the last character of the matched substring in the string.
group() returns the matched substring.
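Because end() points just past the match, the matched text is always input.substring(start(), end()). A minimal sketch of that relationship follows; SpanDemo and firstSpan() are illustrative names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpanDemo {

    // Returns {start, end} of the first match, or an empty array if there is none
    // (hypothetical helper).
    static int[] firstSpan(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new int[0];
        }
        return new int[] {m.start(), m.end()};
    }

    public static void main(String[] args) {
        String input = "aaa2223bb";
        int[] span = firstSpan("\\d+", input);
        System.out.println(span[0]);                           // 3
        System.out.println(span[1]);                           // 7
        // start is inclusive and end is exclusive, exactly like substring().
        System.out.println(input.substring(span[0], span[1])); // 2223
    }
}
```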
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("aaa2223bb");
m.find(); // matches 2223
m.start(); // returns 3
m.end(); // returns 7, the index just after 2223
m.group(); // returns 2223
Matcher m2 = p.matcher("2223bb");
m2.lookingAt(); // matches 2223
m2.start(); // returns 0: lookingAt() only matches at the beginning of the string, so start() is always 0 after it
m2.end(); // returns 4
m2.group(); // returns 2223
Matcher m3 = p.matcher("2223");
m3.matches(); // matches the entire string (with "2223bb" it would return false, and start() would then throw)
m3.start(); // returns 0
m3.end(); // returns 4, the length of the string, because matches() must match all of it
m3.group(); // returns 2223
That covers the use of the methods above. Next, let's look at how regular expression groups are used in Java.
start(), end(), and group() each have an overload, start(int i), end(int i), and group(int i), that works on a specific group. The Matcher class also has groupCount(), which returns the number of capturing groups in the pattern.
Java code example:
Pattern p = Pattern.compile("([a-z]+)(\\d+)");
Matcher m = p.matcher("aaa2223bb");
m.find(); // matches aaa2223
m.groupCount(); // returns 2, because the pattern has 2 groups
m.start(1); // returns 0, the index where the substring matched by group 1 starts
m.start(2); // returns 3
m.end(1); // returns 3, the index just past the last character matched by group 1
m.end(2); // returns 7
m.group(1); // returns aaa, the substring matched by group 1
m.group(2); // returns 2223, the substring matched by group 2
Now let's try a slightly more advanced matching operation. Suppose a piece of text contains many numbers scattered among other characters, and we need to extract all of them. With Java regular expressions this is simple.
Java code example:
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("My QQ is: 456456 My phone is: 0532214 My email is: [email protected]");
while (m.find()) {
    System.out.println(m.group());
}
Output (123 is the run of digits inside the e-mail address):
456456
0532214
123
If you replace the above while() loop with
while (m.find()) {
    System.out.println(m.group());
    System.out.print("start:" + m.start());
    System.out.println(" end:" + m.end());
}
Then the output begins:
456456
start:10 end:16
0532214
start:30 end:37
followed by 123 and the start and end offsets of those digits within the e-mail address.
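To see the offsets on a string whose contents are fully known, here is a minimal sketch of the same loop that collects each match together with its start and end. The class name FindAllDemo, the helper describeMatches(), and the sample string are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindAllDemo {

    // Collects every match as "text start:S end:E" (hypothetical helper).
    static List<String> describeMatches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // Each successful find() moves on to the next match and updates
        // group(), start() and end().
        while (m.find()) {
            out.add(m.group() + " start:" + m.start() + " end:" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describeMatches("\\d+", "id 42, port 8080, code 7")) {
            System.out.println(line);
        }
        // 42 start:3 end:5
        // 8080 start:12 end:16
        // 7 start:23 end:24
    }
}
```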
As this shows, each successful matching operation updates the values returned by start(), end(), and group() so that they describe the most recently matched substring, and their overloads start(int i), end(int i), and group(int i) are updated in the same way.
Note: start(), end(), and group() can be used only after a matching operation has succeeded, that is, after matches(), lookingAt(), or find() has returned true. Otherwise java.lang.IllegalStateException is thrown.
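In practice this means checking the boolean result before asking for details. The sketch below shows one way to guard the call, and also confirms that the exception is thrown when the guard is skipped. MatchStateDemo and both helper names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatchStateDemo {

    // Calls group() only after find() has returned true; returns null otherwise
    // (hypothetical helper).
    static String firstMatchOrNull(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns true if group() throws IllegalStateException after a failed find()
    // (hypothetical helper, for demonstration only).
    static boolean groupAfterFailedFindThrows(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            return false; // the match succeeded, so group() would be legal
        }
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatchOrNull("\\d+", "aa22"));           // 22
        System.out.println(firstMatchOrNull("\\d+", "aabb"));           // null
        System.out.println(groupAfterFailedFindThrows("\\d+", "aabb")); // true
    }
}
```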
Summary
That is all for this article. I hope it is of some help in your study or work. If you have any questions, you can leave a message to discuss them. Thank you for your support of Wulin.com.