I have written an introductory article on regular expressions before, and I thought I knew them reasonably well, but today I fell into another pitfall, probably because I was not careful enough. This post focuses on grouping in JavaScript regular expressions. If you are not yet comfortable with JS regular expressions, it is worth reviewing the basics first.
Grouping is widely used in regular expressions. As I understand it, a group is a pair of brackets (), and each pair of brackets represents one group.
Grouping can be divided into capturing groups and non-capturing groups.
Capturing grouping
With a capturing group, functions such as match and exec return the text matched by each group as the second, third, and later items of the result array. Let's look at an example first:
var reg = /test(\d+)/;
var str = 'new test001 test002';
console.log(str.match(reg)); // ["test001", "001", index: 4, input: "new test001 test002"]
In this code, (\d+) is a group (some people call it a sub-pattern; the two terms mean the same thing). In the example above, test001 is the full match of the whole pattern.
The group result is the part of that full match (test001) that was matched by the sub-pattern \d+, which here is 001.
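As a small sketch of how captured groups line up with the array returned by match and exec, here is a pattern with two groups. The pattern, the sample string, and the variable names are illustrative assumptions, not part of the example above.

```javascript
// Hypothetical pattern with two capturing groups.
var reg = /test(\d+)-(\w+)/;
var str = 'new test001-alpha test002-beta';

var viaMatch = str.match(reg);
var viaExec = reg.exec(str);

// Index 0 is the full match; indexes 1, 2, ... are the groups,
// numbered by the position of their opening bracket.
console.log(viaMatch[0]); // test001-alpha
console.log(viaMatch[1]); // 001
console.log(viaMatch[2]); // alpha

// Without the g flag, match and exec return the same data.
console.log(viaExec[1] === viaMatch[1]); // true
```

Only the first occurrence is reported here because the pattern has no g flag.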
But this is the case I ran into today:
var reg = /test(\d)+/;
var str = 'new test001 test002';
console.log(str.match(reg)); // ["test001", "1", index: 4, input: "new test001 test002"]
The only difference is that (\d+) became (\d)+. The full match is still test001, but the result of the first group is different.
Let's look at how they differ.
In (\d+), the quantifier is inside the group. Quantifiers are greedy by default, that is, they match as much as possible.
So \d+ matches all of 001, and the brackets around it make that whole run one group, so the first group holds 001.
Now look at (\d)+ in the second example. It is also greedy: the group matches 0, then the second 0, and finally 1, three repetitions in total.
The overall match looks the same as in the first example, but here the group (\d) matches only a single digit each time it repeats.
I used to assume that the group would therefore hold the first digit matched, 0, but that is wrong. Because the quantifier is greedy, the group repeats as many times as possible, and each repetition overwrites the previous capture.
So the group (\d) ends up holding the last digit matched, 1.
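If both the whole digit run and the last single digit are needed, one possible approach is to nest the repeated group inside an outer group. This is only a sketch; the variable names are made up for illustration.

```javascript
var str = 'new test001 test002';

// Quantifier outside the group: the capture is overwritten on every
// repetition, so only the last digit survives.
var repeated = str.match(/test(\d)+/);
console.log(repeated[1]); // 1

// Nesting the repeated group inside an outer group keeps both:
// group 1 holds the whole digit run, group 2 the last single digit.
var nested = str.match(/test((\d)+)/);
console.log(nested[1]); // 001
console.log(nested[2]); // 1
```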
With a non-greedy (lazy) quantifier, it matches as few times as possible:
var reg = /test(\d)+?/;
var str = 'new test001 test002';
console.log(str.match(reg)); // ["test0", "0", index: 4, input: "new test001 test002"]
Here the group (\d) holds 0, and the full match is only test0 rather than test001. More digits could be matched, but the lazy quantifier stops after the single repetition it needs.
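The following sketch puts the greedy and lazy forms side by side. The third pattern, with a literal 1 after the quantifier, is an added illustration showing that a lazy quantifier still repeats when the rest of the pattern requires it.

```javascript
var str = 'new test001 test002';

var greedy = str.match(/test(\d)+/);
var lazy = str.match(/test(\d)+?/);
console.log(greedy[0], greedy[1]); // test001 1
console.log(lazy[0], lazy[1]);     // test0 0

// The trailing literal 1 forces the lazy group to repeat once more:
// it captures the first 0, then the second 0, and then 1 can match.
var lazyThenOne = str.match(/test(\d)+?1/);
console.log(lazyThenOne[0], lazyThenOne[1]); // test001 0
```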
Non-capturing grouping
var reg = /test(?:\d)+/;
var str = 'new test001 test002';
console.log(str.match(reg)); // ["test001", index: 4, input: "new test001 test002"]
A non-capturing group is for places where a pair of brackets is needed, for example to apply a quantifier or an alternation, but you do not want the group to be captured, that is, you do not want it to appear in the results of functions such as match and exec.
Adding ?: right after the opening bracket, as in (?:pattern), turns the group into a non-capturing group.
As a result, the array returned by match contains no entry for that group; in the example above, the second item is missing.
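A side effect worth seeing in code is that a non-capturing group does not take up a group number. The alternation pattern below is an assumed example, chosen only to show the numbering shift.

```javascript
var str = 'new test001 test002';

// Both pairs of brackets capture: the digits land in group 2.
var capturing = str.match(/(new|old) test(\d+)/);
console.log(capturing[1], capturing[2]); // new 001

// The alternation still needs brackets, but ?: keeps it out of the
// result, so the digits move up to group 1.
var nonCapturing = str.match(/(?:new|old) test(\d+)/);
console.log(nonCapturing[1]);     // 001
console.log(nonCapturing.length); // 2
```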
This article focused on the difference between (\d+) and (\d)+, which is the pit I stepped into today. If you find any mistakes, please feel free to correct them.