A Brief Look at JavaScript Regular Expression Syntax

Author：Eve Cole Update Time：2025-08-01 02:16:01

There are two ways to define regular expressions in JavaScript.

1. RegExp constructor

var pattern = new RegExp("[bc]at","i");

The constructor takes two arguments: the pattern to match, given as a string, and an optional string of flags.

2. Literal

var pattern = /[bc]at/i;
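As a quick check that the two forms describe the same pattern, the sketch below compares their source and flags. The variable names are illustrative only. It also shows one practical difference: in the string passed to the constructor, a backslash has to be doubled so that the regular expression engine still sees it.

```javascript
// Both forms build the same pattern: "b" or "c" followed by "at", ignoring case.
const fromConstructor = new RegExp("[bc]at", "i");
const fromLiteral = /[bc]at/i;

const sameSource = fromConstructor.source === fromLiteral.source;
const sameFlags = fromConstructor.flags === fromLiteral.flags;

// In a string pattern the backslash must be escaped: "\\d+" reaches the engine as \d+.
const digitsFromString = new RegExp("\\d+");
const digitsFromLiteral = /\d+/;
const sameDigitsSource = digitsFromString.source === digitsFromLiteral.source;

console.log(sameSource, sameFlags, sameDigitsSource);
console.log(fromConstructor.test("Cat"), fromLiteral.test("BAT"));
```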

The three most commonly used flags are:

g: global mode. The pattern is applied to the whole string instead of stopping at the first match;

i: ignore case. Letter case in both the pattern and the string is ignored when determining matches;

m: multiline mode. ^ and $ match at the start and end of every line, not only at the start and end of the whole string.
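The effect of each flag is easiest to see by running the same pattern with different flag combinations. The following is a small sketch; the sample string and variable names are made up for illustration.

```javascript
const flagText = "Cat\nbat\ncat";

// Without g, match() reports only the first match ("Cat" is skipped: wrong case).
const firstOnly = flagText.match(/[bc]at/)[0];

// With g, match() returns every match in the string.
const allLower = flagText.match(/[bc]at/g);

// Adding i makes "Cat" match as well.
const allAnyCase = flagText.match(/[bc]at/gi);

// Without m, ^ matches only at the very start of the string.
const startOnly = flagText.match(/^[bc]at/gi);

// With m, ^ also matches right after each line break.
const everyLine = flagText.match(/^[bc]at/gim);

console.log(firstOnly, allLower, allAnyCase, startOnly, everyLine);
```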

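The two ways of creating a regular expression once differed in instance sharing: in ECMAScript 3, a regular expression literal always shared the same RegExp instance, whereas every call to the constructor created a new one. Since ECMAScript 5, each evaluation of a literal also creates a new RegExp instance, so in current engines the two forms behave the same in this respect.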
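In engines that follow ECMAScript 5 or later, every evaluation of a regular expression literal yields a fresh RegExp object; the shared-instance behavior belonged to ECMAScript 3. A minimal way to check this in your own environment is sketched below. The helper functions makeLiteral and makeConstructed are hypothetical names used only for this example.

```javascript
// Hypothetical helpers: each call evaluates the literal or the constructor again.
function makeLiteral() {
  return /cat/g;
}
function makeConstructed() {
  return new RegExp("cat", "g");
}

// In ES5+ engines both comparisons are true: every evaluation yields a new object.
const literalsAreDistinct = makeLiteral() !== makeLiteral();
const constructedAreDistinct = makeConstructed() !== makeConstructed();

// State such as lastIndex therefore belongs to one instance only.
const firstInstance = makeLiteral();
firstInstance.test("cat cat"); // a successful test() with g advances lastIndex to 3
const secondInstance = makeLiteral();
const lastIndexes = [firstInstance.lastIndex, secondInstance.lastIndex];

console.log(literalsAreDistinct, constructedAreDistinct, lastIndexes);
```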

Metacharacters

Metacharacters are characters with special meanings. The main metacharacters of regular expressions are:

( [ { \ ^ $ | ) ? * + .

Metacharacters have different meanings in different combinations.

Predefined special characters

Character classes

Simple class

Normally, each character in a regular expression corresponds to one character in the string, but we can use [] to build a simple class that represents a set of characters sharing some feature. For example:

[abc] matches any single character listed in the brackets: a, b, or c.

Negated class

Since [] can build a class, it is natural to want the opposite class, one that excludes the characters in the brackets. This is called a negated class. For example, [^abc] matches any single character that is not a, b, or c.
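A short sketch of a simple class and its negated counterpart, using a made-up sample word. Note that each class matches exactly one character at a time; the g flag is what collects all of them.

```javascript
const classWord = "cabbage";

// [abc] matches one character that is a, b, or c.
const inClass = classWord.match(/[abc]/g);

// [^abc] matches one character that is none of a, b, c.
const notInClass = classWord.match(/[^abc]/g);

// A string with no a, b, or c does not match the simple class at all.
const noMatch = /[abc]/.test("xyz");

console.log(inClass, notInClass, noMatch);
```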

Range class

Sometimes listing characters one by one is too tedious when they all belong to the same kind. In that case we can use a hyphen "-" to denote everything within a closed interval. For example, [a-z] matches any lowercase letter.

Likewise, any digit from 0 to 9 can be matched with [0-9].
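The range classes can be tried out as follows. This is only a sketch with an invented sample string; it also shows that several ranges may be combined inside one pair of brackets.

```javascript
const rangeText = "Room 42b";

// [a-z] matches one lowercase letter; the uppercase "R" is not included.
const lowerLetters = rangeText.match(/[a-z]/g);

// [0-9] matches one digit.
const rangeDigits = rangeText.match(/[0-9]/g);

// Ranges can be combined in a single class.
const alphanumerics = rangeText.match(/[a-zA-Z0-9]/g);

console.log(lowerLetters, rangeDigits, alphanumerics);
```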

Predefined classes

Besides the classes we built above, regular expressions provide several commonly used predefined classes for matching common kinds of characters, as follows:

Character	Equivalent class	Meaning
.	[^\n\r]	Any character except line feed and carriage return
\d	[0-9]	Digit characters
\D	[^0-9]	Non-digit characters
\s	[ \t\n\x0B\f\r]	Whitespace characters
\S	[^ \t\n\x0B\f\r]	Non-whitespace characters
\w	[a-zA-Z_0-9]	Word characters (letters, digits and underscores)
\W	[^a-zA-Z_0-9]	Non-word characters
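To see the predefined classes at work, the sketch below runs several of them against one invented sample string that contains a space, a tab, and a line feed.

```javascript
const classSample = "Order #42:\tok_1\n";

// \d picks out individual digits.
const sampleDigits = classSample.match(/\d/g);

// \w+ picks out runs of word characters (letters, digits, underscore).
const sampleWords = classSample.match(/\w+/g);

// \s matches the space, the tab, and the line feed.
const whitespaceCount = classSample.match(/\s/g).length;

// \W matches everything that is not a word character: " ", "#", ":", tab, line feed.
const nonWordCount = classSample.match(/\W/g).length;

// The dot does not match a line feed.
const dotMatchesNewline = /./.test("\n");

console.log(sampleDigits, sampleWords, whitespaceCount, nonWordCount, dotMatchesNewline);
```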

Quantifiers

The methods above match characters one to one. If a character appears several times in a row, matching it that way becomes very tedious, so we need a way to match repeated characters directly. Regular expressions provide quantifiers for this, as follows:

Character	Meaning
?	Zero or one time (at most once)
+	One or more times (at least once)
*	Zero or more times (any number of times)
{n}	Exactly n times
{n,m}	From n to m times
{n,}	At least n times
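Each quantifier in the table can be checked with test(). In this sketch the patterns are wrapped in ^ and $ so that the whole string has to match; the sample words are invented for illustration.

```javascript
const quantifierChecks = {
  // ? : the "u" may appear zero or one time.
  optionalOnce: [/^colou?r$/.test("color"), /^colou?r$/.test("colour"), /^colou?r$/.test("colouur")],
  // + : at least one "o" is required.
  oneOrMore: [/^go+gle$/.test("ggle"), /^go+gle$/.test("google")],
  // * : the "o" may be missing entirely.
  zeroOrMore: [/^go*gle$/.test("ggle"), /^go*gle$/.test("gooogle")],
  // {n} : exactly three digits.
  exactly: [/^\d{3}$/.test("123"), /^\d{3}$/.test("1234")],
  // {n,m} : two or three digits.
  between: [/^\d{2,3}$/.test("1"), /^\d{2,3}$/.test("12"), /^\d{2,3}$/.test("1234")],
  // {n,} : two digits or more.
  atLeast: [/^\d{2,}$/.test("1"), /^\d{2,}$/.test("123456")],
};

console.log(quantifierChecks);
```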

Greedy and non-greedy modes

For a quantifier such as {n,m}, should it match n times or m times? This depends on the matching mode. By default, quantifiers match as many characters as possible, which is called greedy mode, for example:

 var num = '123456789'; num.match(/\d{2,4}/g); // ["1234", "5678"] (the lone trailing 9 cannot satisfy {2,4})

For the corresponding non-greedy mode, just add "?" after the quantifier, for example {n,m}?, which matches as few characters as possible, as follows:

 var num = '123456789'; num.match(/\d{2,4}?/g); // ["12", "34", "56", "78"] (the lone trailing 9 is not matched)
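The difference between the two modes is often most visible with + or * next to a delimiter. The sketch below uses an invented snippet of markup; it is meant only to illustrate greedy versus non-greedy matching, not as a way to parse HTML.

```javascript
const markup = "<b>bold</b>";

// Greedy: .+ runs to the last ">" it can reach.
const greedyMatch = markup.match(/<.+>/)[0];

// Non-greedy: .+? stops at the first ">" that lets the pattern succeed.
const lazyMatch = markup.match(/<.+?>/)[0];

// With g, the non-greedy pattern finds each tag separately.
const lazyAll = markup.match(/<.+?>/g);

console.log(greedyMatch, lazyMatch, lazyAll);
```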

Grouping

On their own, quantifiers apply only to a single character. What if we want to repeat a whole sequence of characters? In regular expressions, parentheses turn a sequence of characters into a single group.

If we want to match the word apple appearing 4 times in a row, we can write (apple){4}.

If you want to match apple or orange appearing 4 times, you can insert the pipe character "|", for example:

(apple|orange){4}
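A brief sketch of how grouping changes what a quantifier applies to. The patterns are anchored with ^ and $ so that the whole string must match, and the test strings are invented.

```javascript
// Without a group, {4} applies only to the final "e".
const withoutGroup = /^apple{4}$/.test("appleeee");

// With a group, {4} repeats the whole word.
const withGroup = /^(apple){4}$/.test("appleappleappleapple");

// With alternation, each of the four repetitions may be either word.
const mixedFour = /^(apple|orange){4}$/.test("appleorangeorangeapple");
const mixedTwo = /^(apple|orange){4}$/.test("appleorange");

console.log(withoutGroup, withGroup, mixedFour, mixedTwo);
```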

If a regular expression contains several pairs of parentheses, i.e. several groups, the match result also captures and numbers what each group matched, for example:

(apple)\d+(orange)

If we do not want a certain group to be captured, we just put a question mark and a colon immediately after its opening parenthesis, for example:

(?:apple)\d+(orange)
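The numbering of captured groups can be inspected through the array returned by match(). In this sketch the input string is invented; index 0 holds the whole match and the following indexes hold the captured groups in order.

```javascript
const fruitText = "apple123orange";

// Two capturing groups: the result holds the full match plus two captures.
const capturing = Array.from(fruitText.match(/(apple)\d+(orange)/));

// (?:apple) still has to match, but it is not captured or numbered,
// so "orange" becomes group 1.
const nonCapturing = Array.from(fruitText.match(/(?:apple)\d+(orange)/));

console.log(capturing, nonCapturing);
```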

Boundaries

Regular expressions also provide several commonly used boundary matchers:

Character	Meaning
^	Matches at the start of the input
$	Matches at the end of the input
\b	Word boundary: a position between a word character [a-zA-Z_0-9] and a non-word character
\B	Non-word boundary

A word boundary matches a position, not a character: on one side of the position is a word character, and on the other side is a non-word character or the beginning or end of the string.
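Because boundaries match positions rather than characters, their effect is easiest to see with replace(). The sample strings below are invented for illustration.

```javascript
const boundaryText = "cat concat category";

// ^ and $ tie the pattern to the start or end of the string.
const startsWithCat = [/^cat/.test("cat food"), /^cat/.test("a cat")];
const endsWithCat = /cat$/.test("a cat");

// \bcat\b matches "cat" only as a whole word.
const wholeWord = boundaryText.replace(/\bcat\b/g, "DOG");

// \Bcat matches "cat" only when a word character precedes it.
const insideWord = boundaryText.replace(/\Bcat/g, "DOG");

console.log(startsWithCat, endsWithCat, wholeWord, insideWord);
```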

Lookahead

A lookahead checks whether a specific pattern does or does not come next, without including it in the match.

Expression	Meaning
exp1(?=exp2)	Matches exp1 only when it is followed by exp2
exp1(?!exp2)	Matches exp1 only when it is not followed by exp2

See an example:

apple(?=orange)

 (/apple(?=orange)/).test('appleorange123'); //true
 (/apple(?=orange)/).test('applepear345'); //false

Let's take a look at another example:

apple(?!orange)

 (/apple(?!orange)/).test('appleorange123'); //false
 (/apple(?!orange)/).test('applepear345'); //true
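The test() calls above only report whether a match exists. To see that the text checked by a lookahead is not part of the match itself, a sketch along these lines may help (the sample strings are invented):

```javascript
// The lookahead requires "orange" to follow, but the match is just "apple".
const matchedText = "appleorange123".match(/apple(?=orange)/)[0];

// Since the lookahead consumes nothing, "orange" can still be matched right after it.
const notConsumed = /apple(?=orange)orange/.test("appleorange");

// Negative lookahead: only the "apple" that is not followed by "orange" is replaced.
const replaced = "applepear appleorange".replace(/apple(?!orange)/g, "APPLE");

console.log(matchedText, notConsumed, replaced);
```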
