Regular Expression

Metacharacter	Meaning
`.`	Matches any character except newline (unless “dotall”/“singleline” mode is enabled).
`^`	Matches the start of a line/string.
`$`	Matches the end of a line/string.
`\b`	Word boundary (transition between `\w` and `\W`).
`\B`	Non-word boundary (opposite of `\b`).
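`\d`	Digit, equivalent to `[0-9]` in ASCII-only matching (Unicode-aware flavors may also match digits from other scripts).
`\D`	Non-digit, equivalent to `[^0-9]` in ASCII-only matching.
`\w`	Word character, equivalent to `[A-Za-z0-9_]` in ASCII-only matching (Unicode-aware flavors may also match non-ASCII letters and digits).
`\W`	Non-word character, equivalent to `[^A-Za-z0-9_]` in ASCII-only matching.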
`\s`	Whitespace (space, tab, newline, vertical tab, form feed, etc.).
`\S`	Non-whitespace (anything not matched by `\s`).
`\t`	Tab character.
`\n`	Newline character (LF).
`\r`	Carriage return (CR).
`\f`	Form feed.
`\v` (flavor-specific)	Vertical tab in most flavors; in PCRE, Perl, and Java 8+ it matches any vertical whitespace character instead.
`\\`	Literal backslash.
`\.`	Literal dot.
`\*`	Literal asterisk.
`\+`	Literal plus.
`\?`	Literal question mark.
`\{` `\}`	Literal braces.
`\[` `\]`	Literal square brackets.
`\(` `\)`	Literal parentheses.
`\|`	Literal pipe (unescaped, `|` is the alternation operator).
`\^`	Literal caret (unescaped, `^` is the start-of-string anchor).
`\$`	Literal dollar sign (unescaped, `$` is the end-of-string anchor).
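
The difference between a metacharacter and its escaped form is easiest to see side by side. The following is a minimal sketch using Python's `re` module, chosen only for illustration; the sample strings are made up, and the tokens shown behave the same way in most other flavors.

```python
import re

candidates = ["1.5", "1x5"]

# Unescaped dot matches any character, so "1x5" is accepted as well.
loose = re.compile(r"1.5")
# Escaped dot matches only a literal ".".
strict = re.compile(r"1\.5")

loose_hits = [s for s in candidates if loose.fullmatch(s)]
strict_hits = [s for s in candidates if strict.fullmatch(s)]

# re.escape backslash-escapes the metacharacters in arbitrary text,
# so the text can be embedded in a pattern and matched literally.
literal = re.escape("a+b|c")
escaped_match = re.fullmatch(literal, "a+b|c") is not None

# \b marks a transition between \w and \W, so the "cat" inside
# "concat" is skipped while the standalone words are found.
words = re.findall(r"\bcat\b", "cat concat cat.")

print(loose_hits, strict_hits, escaped_match, words)
```

When the text to match comes from user input, escaping it programmatically is usually safer than escaping by hand.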

Quantifier	Meaning
`*`	Match the preceding token zero or more times (greedy).
`+`	Match the preceding token one or more times (greedy).
`?`	Match the preceding token zero or one time (makes it optional; also used for lazy quantifiers).
`{n}`	Match the preceding token exactly n times.
`{n,}`	Match the preceding token n or more times (no upper limit).
`{n,m}`	Match the preceding token at least n but not more than m times.
`*?`, `+?`, `??`, `{n,m}?`	The “lazy” or “non-greedy” versions of the above quantifiers (match as few characters as possible).
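
Greedy and lazy quantifiers differ only in how much they consume before the rest of the pattern is tried. The sketch below, written for Python's `re` module with invented sample text, contrasts the two and also shows an optional token and a bounded `{n,m}` repetition.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the end of the string, then backs off to the last ">".
greedy = re.findall(r"<.+>", text)
# Lazy: .+? stops at the first ">" that lets the match succeed.
lazy = re.findall(r"<.+?>", text)

# ? makes the preceding token optional.
optional = re.findall(r"colou?r", "color colour")

# {2,4} accepts runs of two to four digits; \b rejects longer runs.
bounded = re.findall(r"\b\d{2,4}\b", "7 42 2024 123456")

print(greedy, lazy, optional, bounded)
```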

Syntax	Matches
`[abc]`	Any one of the characters `a`, `b`, or `c`.
`[^abc]`	Any character except `a`, `b`, or `c`.
`[a-z]`	Any lowercase letter from `a` to `z`.
`[A-Z]`	Any uppercase letter from `A` to `Z`.
`[0-9]`	Any digit from `0` to `9`.
`[A-Za-z0-9_]`	Any “word” character (often equivalent to `\w`).
`[[:alnum:]]` (POSIX)	Letters and digits. Supported in POSIX-style flavors (e.g., GNU grep, PCRE); valid only inside a bracket expression.
`[[:space:]]`, `[[:digit:]]`, etc.	POSIX character classes (flavor support varies).
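
As a rough illustration of the bracket syntax above, here is a small Python sketch with made-up input strings. Python's `re` module does not support POSIX classes such as `[[:alnum:]]`, so the last line spells out an explicit ASCII range as a stand-in.

```python
import re

# Capture the hexadecimal digits that follow a "0x" prefix.
hex_values = re.findall(r"\b0x([0-9A-Fa-f]+)\b", "0xFF and 0x1a2B")

# A negated class matches every character that is NOT listed.
slug = re.sub(r"[^A-Za-z0-9_]", "-", "a b.c")

# Explicit ASCII stand-in for POSIX [[:alnum:]] (no underscore).
alnum_runs = re.findall(r"[A-Za-z0-9]+", "abc-123_x")

print(hex_values, slug, alnum_runs)
```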

Anchor	Meaning
`^`	Start of string (or line, with multiline mode).
`$`	End of string (or line, with multiline mode).
`\A`	Start of string (ignores multiline mode).
`\z` or `\Z`	End of string (`\z` is absolute; `\Z` allows an optional trailing newline).
`\b`	Word boundary (between `\w` and `\W`).
`\B`	Non-word boundary (opposite of `\b`).
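
The effect of multiline mode on `^` is sketched below in Python, using a made-up three-line log. One flavor difference matters here: Python's `\Z` is an absolute end-of-string anchor, so it behaves like the `\z` described in the table rather than like `\Z` in .NET or PCRE.

```python
import re

log = "Error: disk full\nInfo: ok\nError: timeout"

# Without multiline mode, ^ matches only at the start of the string.
first_only = re.findall(r"^Error", log)
# With multiline mode, ^ also matches after every newline.
every_line = re.findall(r"^Error", log, flags=re.MULTILINE)
# \A ignores multiline mode and still matches only at the very start.
start_abs = re.findall(r"\AError", log, flags=re.MULTILINE)

# $ tolerates a single trailing newline; Python's \Z does not.
dollar_before_newline = re.search(r"end$", "the end\n") is not None
absolute_end_rejects = re.search(r"end\Z", "the end\n") is None

print(first_only, every_line, start_abs)
```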

Construct	Meaning
`(?= … )`	Positive lookahead: what follows must match `…`.
`(?! … )`	Negative lookahead: what follows must not match `…`.
`(?<= … )`	Positive lookbehind: what precedes must match `…`.
`(?<! … )`	Negative lookbehind: what precedes must not match `…`.
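
Lookarounds check the surrounding text without consuming it, which is easier to see in a short example. This is a minimal Python sketch with invented sample strings; note that Python's `re` module requires lookbehind patterns to have a fixed width, a restriction that some other flavors relax.

```python
import re

# Positive lookahead: digits followed by "px"; "px" is not part of the match.
px = re.findall(r"\d+(?=px)", "10px 20em 30px")

# Negative lookahead: accept only strings that contain no digit anywhere.
no_digits = [p for p in ["secret", "secr3t"] if re.fullmatch(r"(?!.*\d).+", p)]

prices = "cost $40, 15 items"
# Positive lookbehind: digits immediately preceded by "$".
after_dollar = re.findall(r"(?<=\$)\d+", prices)
# Negative lookbehind: whole numbers NOT preceded by "$".
not_after_dollar = re.findall(r"(?<!\$)\b\d+", prices)

print(px, no_digits, after_dollar, not_after_dollar)
```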

Sequence	Meaning
`\d`	Digit (same as `[0-9]` in ASCII-only matching).
`\D`	Non-digit (same as `[^0-9]` in ASCII-only matching).
`\w`	Word character (`[A-Za-z0-9_]` in ASCII-only matching).
`\W`	Non-word character (`[^A-Za-z0-9_]` in ASCII-only matching).
`\s`	Whitespace `[ \t\r\n\f\v]`.
`\S`	Non-whitespace.
`\t`, `\n`, `\r`, `\f`	Tab, newline, carriage return, form feed.
`\0`	Null character (U+0000).
`\xhh`	Character with hex code `hh` (two hex digits).
`\uhhhh`	Unicode character with code `hhhh` (four hex digits). (Flavor-specific: .NET, Java, JavaScript, Python, etc. In JavaScript, the `u` flag additionally enables the `\u{h…}` form for longer code points.)
`\cX`	Control character (e.g., `\cA` is U+0001).
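`\Q … \E`	In some flavors (like Java, Perl, and PCRE), `\Q` starts a quoting section (treat everything up to `\E` literally). Not supported in .NET, JavaScript, or Python, which provide escaping functions instead.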
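
Several rows in this table are flavor-dependent, so the following Python sketch should be read as one flavor's behavior rather than a general rule. It assumes Python 3 `str` patterns, where `\d` is Unicode-aware unless the `re.ASCII` flag is set, and it uses `re.escape` as a stand-in for a `\Q … \E` quoting section.

```python
import re

# \xhh and \uhhhh name a character by its code point.
hex_escape = re.fullmatch(r"\x41", "A") is not None
unicode_escape = re.fullmatch(r"\u00e9", "\u00e9") is not None  # e with acute accent

# U+0663 is ARABIC-INDIC DIGIT THREE: matched by \d by default,
# but not when ASCII-only matching is requested.
arabic_three = "\u0663"
unicode_digit = re.fullmatch(r"\d", arabic_three) is not None
ascii_digit = re.fullmatch(r"\d", arabic_three, flags=re.ASCII) is not None

# re.escape quotes arbitrary text so that it is matched literally.
quoted = re.escape("1+1=2?")
literal_hit = re.search(quoted, "is 1+1=2? yes") is not None

print(hex_escape, unicode_escape, unicode_digit, ascii_digit, literal_hit)
```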

Flag	Meaning	Typical Syntax
`i`	Case-insensitive matching.	In .NET: `RegexOptions.IgnoreCase` or `(?i)pattern`.
`m`	Multiline mode: `^` and `$` match start/end of lines, not just string start/end.	In .NET: `RegexOptions.Multiline` or `(?m)pattern`.
`s`	Single-line or dot-all mode: `.` matches newline as well.	In .NET: `RegexOptions.Singleline` or `(?s)pattern`.
`x`	Free-spacing/comment mode: ignores whitespace and allows `# comments`.	In .NET: `RegexOptions.IgnorePatternWhitespace` or `(?x)pattern`.
`U`	Ungreedy mode (PCRE): swap greedy and lazy behaviors (rarely used).	In PCRE: `(?U)pattern`.
`u`	Unicode mode (flavor-specific).	In JavaScript: the `/u` flag. .NET patterns are Unicode-aware by default, so no flag is needed.
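
The table's syntax column uses .NET names; as a cross-check, here is a hedged Python sketch of the same four modes. The flag names `re.DOTALL` and `re.VERBOSE` are Python's equivalents of the `s` and `x` modes, the sample strings are invented, and Python has no counterpart to PCRE's `U` flag.

```python
import re

# Inline (?i) at the start of the pattern enables case-insensitive matching.
case_insensitive = re.findall(r"(?i)error", "Error ERROR error")

# By default "." does not match a newline; DOTALL ("s" mode) lifts that limit.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", flags=re.DOTALL)

# VERBOSE ("x" mode) ignores pattern whitespace and allows # comments.
date = re.compile(
    r"""
    (\d{4})  # year
    -
    (\d{2})  # month
    -
    (\d{2})  # day
    """,
    re.VERBOSE,
)
parts = date.fullmatch("2024-01-31").groups()

print(case_insensitive, dot_default, dot_all is not None, parts)
```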

¶ Regular Expression - A Comprehensive Guide

¶ 1. Introduction

¶ 2. Live Tester

¶ 3. Practical Syntax

¶ 3.1 Literal Characters and Simple Matches

¶ 3.2 Character Classes ([]) and Ranges

¶ Example 1: Matching a Four-Digit Year

¶ Example 2: Extracting Log Levels

¶ 3.3 Quantifiers: Repetition Control

¶ Example 3: Matching Timestamps

¶ Example 4: Optional File Extension

¶ 3.4 Anchors: Positioning Matches

¶ Example 5: Lines Starting with “Error”

¶ 3.5 Groups and Capturing

¶ Example 6: Capturing Timestamps and Levels

¶ 3.6 Common Metacharacters and Escape Sequences

¶ 3.7 Practical Example: Extracting Email Addresses

¶ Testing It

¶ 3.8 Practical Example: Validating North American Phone Numbers

¶ 3.9 Practical Example: Parsing Log Files

¶ 3.10 Testing and Debugging Regex

¶ 4. Comprehensive Reference

¶ 4.1 Basic Metacharacters

¶ 4.2 Quantifiers

¶ Example: Greedy versus Lazy

¶ 4.3 Character Classes and Sets

¶ Example: Matching Hexadecimal Digits

¶ 4.4 Alternation (|) and Grouping

¶ Example: Matching Multiple File Extensions

¶ 4.5 Anchors and Boundaries

¶ 4.6 Lookarounds (Zero-Width Assertions)

¶ Example: Matching Passwords without Digits

¶ 4.7 Backreferences

¶ Example: Matching HTML/XML Opening and Closing Tags

¶ 4.8 Common Shorthands and Escape Sequences

¶ 4.9 Quantifier Possessive (+ after quantifier) and Atomic Groups (Flavor-Specific)

¶ 4.10 Flags (Modifiers) and Modes

¶ Example: Inline Flags

¶ 4.11 Replacement Patterns

¶ Example: Swap “Last, First” to “First Last”

¶ 5. Conclusion

¶ References
