Regular expression character matching
These are some notes taken after reading “Javascript Regular Expression Mini Book”.
Regular expressions are matching patterns: they match either characters or positions.
These notes mainly cover matching characters; I am still learning how matching positions works.
Two kinds of fuzzy matching:
1. Horizontal fuzzy matching: the length of the string that a regular expression can match is not fixed. This is achieved with quantifiers. For example, {m,n} means that the preceding character appears at least m times and at most n times.
For example, /ab{2,5}c/ matches a string whose first character is “a”, followed by 2 to 5 “b” characters, and finally the character “c”.
For example: (You can try it manually and think about the results you will get)
var regex = /ab{2,5}c/g;
var string = "abc abbc abbbc abbbbc abbbbbc abbbbbbc";
console.log( string.match(regex) );
g is a modifier that means global matching: it finds, in order, every substring of the target string that satisfies the pattern, rather than stopping at the first one.
2. Vertical fuzzy matching: at a given position, the matched string does not have to contain one specific character; several characters are possible. This is achieved with character groups.
For example, /a[123]b/ matches a string whose first character is “a”, whose second character is any one of “1”, “2”, or “3” (exactly one of them), and whose third character is “b”.
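The character group above can be tried out with a short sketch. The sample string and variable names here are made up for illustration; the second half also shows what changes when the g modifier is left off.

```javascript
// Illustrative input: only a1b, a2b and a3b fit /a[123]b/.
const sample = "a0b a1b a2b a3b a4b a12b";

// With g: every match, in order. [123] stands for exactly one character,
// which is why "a12b" is not matched.
const allMatches = sample.match(/a[123]b/g);
console.log(allMatches); // => ["a1b", "a2b", "a3b"]

// Without g: only the first match, plus details such as its index.
const firstMatch = sample.match(/a[123]b/);
console.log(firstMatch[0], firstMatch.index); // => a1b 4
```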
Quantifier (repetition)
1. Common abbreviations:
(1) {m,} means at least m times.
(2) {m} means exactly m times.
(3) ? is equivalent to {0,1}: the character appears once or not at all.
(4) + is equivalent to {1,}: the character appears at least once.
(5) * is equivalent to {0,}: the character appears any number of times, including not at all.
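As a quick check of these equivalences, the sketch below runs each shorthand next to its explicit {m,n} form on the same made-up input; each pair should produce the same matches.

```javascript
// Illustrative input with zero, one, two and three "b" characters.
const input = "ac abc abbc abbbc";

// ? versus {0,1}: zero or one "b".
const optionalShort = input.match(/ab?c/g);
const optionalLong = input.match(/ab{0,1}c/g);
console.log(optionalShort); // => ["ac", "abc"]

// + versus {1,}: one or more "b".
const plusShort = input.match(/ab+c/g);
const plusLong = input.match(/ab{1,}c/g);
console.log(plusShort); // => ["abc", "abbc", "abbbc"]

// * versus {0,}: any number of "b", including none.
const starShort = input.match(/ab*c/g);
const starLong = input.match(/ab{0,}c/g);
console.log(starShort); // => ["ac", "abc", "abbc", "abbbc"]

// {m}: exactly m times.
const exactlyTwo = input.match(/ab{2}c/g);
console.log(exactlyTwo); // => ["abbc"]
```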
2. Greedy matching and lazy matching
(1) Greedy matching: /\d{2,5}/ matches 2 to 5 consecutive digits and takes as many as it can.
var regex = /\d{2,5}/g;
var string = "123 1234 12345 123456";
console.log( string.match(regex) );
// => ["123", "1234", "12345", "12345"]
(2) Lazy matching: /\d{2,5}?/ also accepts 2 to 5 digits, but as soon as 2 are enough it stops trying to take more.
var regex = /\d{2,5}?/g;
var string = "123 1234 12345 123456";
console.log( string.match(regex) );
// => ["12", "12", "34", "12", "34", "12", "34", "56"]
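Any quantifier can be made lazy by appending a question mark ({m,n}?, {m,}?, ??, +?, *?). The sketch below contrasts greedy + with lazy +? on a made-up tag string, where the difference is easy to see.

```javascript
// Illustrative input: two tags around some text.
const html = "<b>bold</b>";

// Greedy: .+ grabs as much as possible, so the match runs
// from the first "<" to the last ">".
const greedy = html.match(/<.+>/g);
console.log(greedy); // => ["<b>bold</b>"]

// Lazy: .+? stops at the first ">" that lets the pattern succeed,
// so each tag is matched separately.
const lazy = html.match(/<.+?>/g);
console.log(lazy); // => ["<b>", "</b>"]
```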
Character group
1. Range representation:
(1) Use the hyphen “-” to abbreviate a range: for example, [123456abcdefGHIJKLM] can be written as [1-6a-fG-M].
(2) Note: to match a literal hyphen with a character group, put the hyphen at the beginning of the group, put it at the end, or escape it as \-.
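A small sketch of the hyphen rule, using made-up inputs: the three safe placements behave the same, while an unescaped hyphen between two characters is read as a range.

```javascript
// Illustrative input containing literal hyphens.
const expr = "a-z 1-9";

// Hyphen first, last, or escaped: each group matches "a", "z" or "-".
const hyphenFirst = expr.match(/[-az]/g);
const hyphenLast = expr.match(/[az-]/g);
const hyphenEscaped = expr.match(/[a\-z]/g);
console.log(hyphenFirst); // => ["a", "-", "z", "-"]

// Unescaped in the middle, the hyphen means "every letter from a to z",
// so "b" matches and the literal "-" does not.
const asRange = "a-z b".match(/[a-z]/g);
console.log(asRange); // => ["a", "z", "b"]
```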
2. Excluded character groups: for example, [^abc] means that the character can be anything except a, b, or c. The caret ^ in the first position of the group means negation, and ranges can be negated in the same way, as in [^0-9].
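The following sketch, with made-up inputs, shows a negated group, a negated range, and the fact that the caret only negates when it comes first in the group.

```javascript
// Negated group: anything except a, b or c.
const notAbc = "abcdef".match(/[^abc]/g);
console.log(notAbc); // => ["d", "e", "f"]

// Negated range: anything except a digit.
const notDigit = "a1b2c3".match(/[^0-9]/g);
console.log(notDigit); // => ["a", "b", "c"]

// A caret that is not in first position is an ordinary character.
const literalCaret = "a^b".match(/[a^]/g);
console.log(literalCaret); // => ["a", "^"]
```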
3. Common abbreviations:
(1) \d means [0-9]. A single digit.
(2) \D means [^0-9]. Any character except a digit.
(3) \w means [0-9a-zA-Z_]. Digits, uppercase and lowercase letters, and the underscore. Also known as word characters.
(4) \W means [^0-9a-zA-Z_]. Non-word characters.
(5) \s means [ \t\v\n\r\f]. Whitespace characters, including the space, horizontal tab, vertical tab, line feed, carriage return, and form feed. (In JavaScript it also matches some other Unicode whitespace, such as the non-breaking space.)
(6) \S means [^ \t\v\n\r\f]. Non-whitespace characters.
(7) . means [^\n\r\u2028\u2029]. The wildcard, which represents almost any character. The exceptions are the line feed, carriage return, line separator, and paragraph separator.
To match any character at all, you can use [\d\D], [\w\W], [\s\S], or [^].
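To see the shorthand groups in action, here is a minimal sketch on a made-up line of text. The last part contrasts the dot, which skips line breaks, with [^] and [\d\D], which do not.

```javascript
// Illustrative input ending in a line feed.
const line = "id_42 = 3.14\n";

const digits = line.match(/\d/g);
console.log(digits); // => ["4", "2", "3", "1", "4"]

const words = line.match(/\w+/g);
console.log(words); // => ["id_42", "3", "14"]

const blanks = line.match(/\s/g);
console.log(blanks.length); // => 3 (two spaces and the line feed)

// The dot does not match "\n"; [^] and [\d\D] match every character.
const twoLines = "a\nb";
const withDot = twoLines.match(/./g);
const withEmptyNegation = twoLines.match(/[^]/g);
const withComplement = twoLines.match(/[\d\D]/g);
console.log(withDot.length, withEmptyNegation.length); // => 2 3
```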
Multi-choice branch
A single pattern can do horizontal and vertical fuzzy matching; a multi-choice branch lets the match choose one of several sub-patterns.
Specific form: (p1|p2|p3), where p1, p2, and p3 are sub-patterns.
Pay attention to the following pitfall.
var regex = /good|goodbye/g;
var string = "goodbye";
console.log( string.match(regex) );
// => ["good"]
The result of the example above is “good”.
var regex = /goodbye|good/g;
var string = "goodbye";
console.log( string.match(regex) );
// => ["goodbye"]
This example gets “goodbye”.
The conclusion is that the branch structure is also lazy: branches are tried from left to right, and once an earlier branch matches, the later ones are not tried.
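The sketch below repeats the ordering experiment on a longer made-up string, then shows why the parentheses in (p1|p2|p3) matter: without them, the alternation splits the whole pattern. The strings and names are illustrative only.

```javascript
// Order of branches decides which one wins at a given position.
const greeting = "goodbye, good night";
const shortFirst = greeting.match(/good|goodbye/g);
console.log(shortFirst); // => ["good", "good"]
const longFirst = greeting.match(/goodbye|good/g);
console.log(longFirst); // => ["goodbye", "good"]

// Parentheses limit the alternation to "a" or "e" ...
const grouped = "gray grey".match(/gr(a|e)y/g);
console.log(grouped); // => ["gray", "grey"]

// ... without them, the choice is between "gra" and "ey".
const ungrouped = "gray grey".match(/gra|ey/g);
console.log(ungrouped); // => ["gra", "ey"]
```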