What is a Regular Expression#
A regular expression is a pattern, written in a compact notation, that describes rules for matching text.
Testing Regular Expressions#
Online testing tool: wegester
Metacharacters#
Code | Description |
---|---|
. | Matches any character except a newline |
\w | Matches a letter, digit, or underscore; in Unicode-aware engines, also a Chinese character |
\s | Matches any whitespace character |
\d | Matches a digit |
\b | Matches the beginning or end of a word |
^ | Matches the beginning of a string |
$ | Matches the end of a string |
Example:
Target | Regular Expression |
---|---|
The word hi, followed somewhere later by the word Lucy | \bhi\b.*\bLucy\b |
Starts with 0, followed by two digits, then a hyphen "-", and finally eight digits | 0\d\d-\d\d\d\d\d\d\d\d or 0\d{2}-\d{8} |
Words starting with the letter a | \ba\w*\b |
One or more consecutive digits | \d+ |
Words of exactly six characters | \b\w{6}\b |
The entire string is 5 to 12 digits | ^\d{5,12}$ |
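The patterns in the table above can be tried out with a short script. This is a minimal sketch that assumes Python's built-in `re` module; the sample strings are invented for illustration, and other engines may treat `\w` differently.

```python
import re

# Sample strings below are made up for illustration.
greeting = re.search(r"\bhi\b.*\bLucy\b", "hi there, I think Lucy said hi")
print(greeting.group())  # hi there, I think Lucy

a_words = re.findall(r"\ba\w*\b", "an apple and a banana")
print(a_words)  # ['an', 'apple', 'and', 'a']

six_letter = re.findall(r"\b\w{6}\b", "simple pattern syntax")
print(six_letter)  # ['simple', 'syntax']

# ^ and $ tie the pattern to the start and end of the string.
valid_number = bool(re.search(r"^\d{5,12}$", "1234567"))
too_short = bool(re.search(r"^\d{5,12}$", "1234"))
print(valid_number, too_short)  # True False

# For str patterns, Python's \w is Unicode-aware; re.ASCII restricts it.
unicode_words = re.findall(r"\w+", "hello 世界")
ascii_words = re.findall(r"\w+", "hello 世界", flags=re.ASCII)
print(unicode_words, ascii_words)  # ['hello', '世界'] ['hello']
```

Note that `\bhi\b` skips the "hi" inside "think", because there is no word boundary around it.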
Character Escaping#
A metacharacter such as . or \ is matched literally by putting a backslash in front of it.
Example:
Target | Regular Expression |
---|---|
deerchao.cn | deerchao\.cn |
C:\Windows | C:\\Windows |
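The effect of escaping is easy to check in code. The following sketch assumes Python's `re` module and invented test strings; `re.escape` is a standard helper that adds the backslashes for you.

```python
import re

# An unescaped "." matches any character, so "deerchaoXcn" slips through.
loose = re.search(r"deerchao.cn", "deerchaoXcn")
strict = re.search(r"deerchao\.cn", "deerchaoXcn")
print(loose is not None, strict is not None)  # True False

# A literal backslash is written as \\ inside the pattern.
path = re.search(r"C:\\Windows", r"C:\Windows\System32")
print(path.group())  # C:\Windows

# re.escape builds the escaped pattern from a literal string.
escaped = re.escape("deerchao.cn")
print(escaped)  # deerchao\.cn
```

Raw string literals (`r"..."`) are used so that Python itself does not interpret the backslashes before the regex engine sees them.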
Repetition#
Code/Syntax | Description |
---|---|
* | Matches zero or more times |
+ | Matches one or more times |
? | Matches zero or one time |
{n} | Matches exactly n times |
{n,} | Matches n or more times |
{n,m} | Matches between n and m times |
Example:
Regular Expression | Target |
---|---|
Windows\d+ | Windows followed by one or more digits |
^\w+ | The first word of a line, or of the whole string, depending on whether multiline mode is on |
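As a rough illustration of the quantifiers above, here is a sketch using Python's `re` module. The input strings and the `colou?r` pattern are assumptions made for the example, not part of the original tables.

```python
import re

# + requires at least one digit, so the bare "Windows" is skipped.
versions = re.findall(r"Windows\d+", "Windows98 Windows Windows2000 Windows7")
print(versions)  # ['Windows98', 'Windows2000', 'Windows7']

# Without MULTILINE, ^ only matches at the start of the string.
text = "first line\nsecond line"
first_only = re.findall(r"^\w+", text)
each_line = re.findall(r"^\w+", text, flags=re.MULTILINE)
print(first_only, each_line)  # ['first'] ['first', 'second']

# ? makes the preceding character optional.
spellings = re.findall(r"colou?r", "color colour")
print(spellings)  # ['color', 'colour']

# {n,m} bounds the repeat count: 2 to 4 digits here.
in_range = re.fullmatch(r"\d{2,4}", "123") is not None
out_of_range = re.fullmatch(r"\d{2,4}", "12345") is not None
print(in_range, out_of_range)  # True False
```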
Character Classes#
Example:
Regular Expression | Target |
---|---|
[aeiou] | Any one English vowel letter |
[0-9] | One digit |
\(?0\d{2}[) -]?\d{8} | Several formats of phone numbers |
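The sketch below tries the character classes from the table in Python's `re` module. The phone pattern is written with the opening parenthesis escaped as `\(`, and the sample numbers are invented. It also shows that this pattern is permissive: it accepts some malformed inputs.

```python
import re

vowels = re.findall(r"[aeiou]", "regular")
print(vowels)  # ['e', 'u', 'a']

# \( is a literal parenthesis; [) -]? allows one optional separator.
phone = re.compile(r"\(?0\d{2}[) -]?\d{8}")

well_formed = ["(010)88886666", "022-22334455", "02912345678"]
accepted = [s for s in well_formed if phone.fullmatch(s)]
print(accepted == well_formed)  # True

# The pattern does not pair the parentheses, so these also match.
malformed = ["010)12345678", "(022-87654321"]
wrongly_accepted = [s for s in malformed if phone.fullmatch(s)]
print(wrongly_accepted)  # ['010)12345678', '(022-87654321']
```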
Branch Conditions#
Alternatives separated by | are tried from left to right. As soon as one alternative matches, the remaining ones are not tried, so the order of the alternatives matters.
Example:
Regular Expression | Target |
---|---|
0\d{2}-\d{8}|0\d{3}-\d{7} | A three-digit area code with an eight-digit local number, or a four-digit area code with a seven-digit local number |
\d{5}-\d{4}|\d{5} | A US ZIP code: either five digits, or nine digits with a hyphen after the fifth |
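To see why the order of alternatives matters, compare the ZIP code pattern with its branches swapped. This is a small sketch assuming Python's `re` module; `zip_swapped` is a hypothetical variant written only for the comparison, and the sample numbers are invented.

```python
import re

zip_pattern = r"\d{5}-\d{4}|\d{5}"   # longer alternative first
zip_swapped = r"\d{5}|\d{5}-\d{4}"   # hypothetical reordering

full = re.search(zip_pattern, "12345-6789").group()
cut_short = re.search(zip_swapped, "12345-6789").group()
print(full)       # 12345-6789
print(cut_short)  # 12345

# Either branch of the phone pattern can satisfy the match.
phone = r"0\d{2}-\d{8}|0\d{3}-\d{7}"
three_digit_area = re.fullmatch(phone, "010-12345678") is not None
four_digit_area = re.fullmatch(phone, "0376-2233445") is not None
print(three_digit_area, four_digit_area)  # True True
```

With the branches swapped, `\d{5}` succeeds first and the engine never tries the nine-digit form.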
Grouping#
You can use parentheses to specify subexpressions (also known as groups).
Example:
Regular Expression | Target |
---|---|
(\d{1,3}\.){3}\d{1,3} | Loose IPv4 address match; also accepts invalid addresses such as 256.300.888.999 |
((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?) | IPv4 address with each part limited to 0-255 |
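The difference between the two IP address patterns can be checked directly. The sketch below assumes Python's `re` module, writes the separator as an escaped dot (`\.`), and uses invented sample addresses. The names `loose`, `strict`, and `unescaped` are labels chosen for this example.

```python
import re

loose = r"(\d{1,3}\.){3}\d{1,3}"
strict = r"((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)"

# Both accept an ordinary address.
loose_ok = re.fullmatch(loose, "192.168.0.1") is not None
strict_ok = re.fullmatch(strict, "192.168.0.1") is not None
print(loose_ok, strict_ok)  # True True

# Only the loose pattern accepts out-of-range parts.
loose_bad = re.fullmatch(loose, "256.300.888.999") is not None
strict_bad = re.fullmatch(strict, "256.300.888.999") is not None
print(loose_bad, strict_bad)  # True False

# With an unescaped dot the group matches any character after the digits,
# so a plain run of digits is accepted as well.
unescaped = r"(\d{1,3}.){3}\d{1,3}"
digits_accepted = re.fullmatch(unescaped, "12345678") is not None
print(digits_accepted)  # True
```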
Negation#
Code | Description |
---|---|
\W | Matches any character that \w does not match, that is, anything other than a letter, digit, underscore, or Chinese character |
\S | Matches any character that is not a whitespace character |
\D | Matches any character that is not a digit |
\B | Matches a position that is not the beginning or end of a word |
[^x] | Matches any character except x |
[^aeiou] | Matches any character except the vowels a, e, i, o, u |
Example:
Regular Expression | Target |
---|---|
\S+ | A run of one or more non-whitespace characters |
<a[^>]+> | Text enclosed in angle brackets that starts with the letter a |
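The negated forms can be exercised the same way. This is a minimal sketch assuming Python's `re` module; the sample strings, including the HTML fragment, are invented for illustration.

```python
import re

# \S+ splits on any run of whitespace.
tokens = re.findall(r"\S+", "no  white\tspace")
print(tokens)  # ['no', 'white', 'space']

# \D removes everything that is not a digit.
digits_only = re.sub(r"\D", "", "010-1234 5678")
print(digits_only)  # 01012345678

# [^aeiou] keeps every character that is not one of the listed vowels.
consonants = re.findall(r"[^aeiou]", "regular")
print(consonants)  # ['r', 'g', 'l', 'r']

# \B only matches inside a word, so the standalone "hi" is skipped.
inner_hi = re.search(r"\Bhi\B", "hi think")
print(inner_hi.start())  # 4

# [^>]+ consumes everything up to the closing angle bracket.
tags = re.findall(r"<a[^>]+>", '<p>See <a href="x.html">this</a></p>')
print(tags)  # ['<a href="x.html">']
```

The last pattern is only a loose match: it would also accept other tags whose names begin with a, such as `<abbr>`.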