C# Regex Character Classes

Summary: in this tutorial, you will learn how to use character classes in regular expressions using C#.

Introduction to the C# Regex Character Classes

Character classes define a set of characters. For example:

  • Digits (0 to 9)
  • Letters (a to z)
  • Whitespace (tab, space, …)

Character classes allow you to match characters from a specified set of characters.

\d: digit character class

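The \d represents the digit character class, which matches any single digit. In .NET, \d matches any Unicode decimal digit by default, which includes 0 to 9; with RegexOptions.ECMAScript, it matches only 0 to 9. The following example uses the \d character class to find all digits in a string: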

using System.Text.RegularExpressions;
using static System.Console;


var text = "3 Awesome New Features in C# 12";
var pattern = @"\d";

var matches = Regex.Matches(text, pattern);

foreach (var match in matches)
{
    WriteLine(match);
}

Output:

3
1
2
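
One .NET detail is worth checking for yourself: by default, \d is not limited to the ASCII digits 0 to 9. The following is a minimal sketch that compares the default behavior with the RegexOptions.ECMAScript option. The sample text is made up for illustration and is not part of the example above:

```csharp
using System.Text.RegularExpressions;
using static System.Console;

// Sample text (an assumption for illustration): an ASCII digit and
// the Arabic-Indic digit three (U+0663).
var text = "3 and \u0663";
var pattern = @"\d";

// Default: \d matches any Unicode decimal digit.
var unicodeCount = Regex.Matches(text, pattern).Count;

// ECMAScript mode: \d matches only the ASCII digits 0-9.
var asciiCount = Regex.Matches(text, pattern, RegexOptions.ECMAScript).Count;

WriteLine($"Default: {unicodeCount}, ECMAScript: {asciiCount}");
```

With this input, the default count is 2 and the ECMAScript count is 1. If you only want to accept 0 to 9, for example when validating input, the ECMAScript option is one way to do it.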

To match two digits, you use \d\d like this:

using System.Text.RegularExpressions;
using static System.Console;


var text = "3 Awesome New Features in C# 12";
var pattern = @"\d\d";

var matches = Regex.Matches(text, pattern);

foreach (var match in matches)
{
    WriteLine(match);
}

Output:

12

In this example, \d\d matches 12 but not 3, because 3 is a single digit that is not followed by another digit.

Note that you'll learn how to use quantifiers to write this pattern more concisely, for example \d{2}.
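
As a quick preview, the following sketch runs both forms against the same text and shows that they find the same match. The variable names longForm and shortForm are chosen here for illustration only:

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var text = "3 Awesome New Features in C# 12";

// \d{2} means "exactly two digits in a row", the same as \d\d.
var longForm = Regex.Matches(text, @"\d\d");
var shortForm = Regex.Matches(text, @"\d{2}");

foreach (Match match in shortForm)
{
    WriteLine(match.Value);
}

// Both patterns find the same number of matches.
WriteLine(longForm.Count == shortForm.Count);
```

It prints 12, followed by True.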

\w: the word character class

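The \w represents the word character class, which matches a single word character: a letter, a digit, or an underscore (_). In .NET, \w is Unicode-aware by default, so it also matches letters and digits outside the ASCII range; with RegexOptions.ECMAScript, it matches only [a-zA-Z_0-9].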

The following example shows how to use the \w character class to match all word characters from a string:

using System.Text.RegularExpressions;
using static System.Console;


var text = "C# is awesome";
var pattern = @"\w";

var matches = Regex.Matches(text, pattern);

foreach (var match in matches)
{
    WriteLine(match);
}

Output:

C
i
s
a
w
e
s
o
m
e
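
In .NET, \w is not restricted to ASCII unless you ask for it. The following sketch compares the default behavior with the RegexOptions.ECMAScript option. The sample text is made up for illustration, and the variable names are hypothetical:

```csharp
using System.Linq;
using System.Text.RegularExpressions;
using static System.Console;

// Sample text (an assumption for illustration); \u00EF is the letter ï.
var text = "na\u00EFve_1!";

// Default: \w matches Unicode letters and digits plus the underscore.
var unicodeWord = string.Concat(
    Regex.Matches(text, @"\w").Cast<Match>().Select(m => m.Value));

// ECMAScript mode: \w is limited to [a-zA-Z_0-9].
var asciiWord = string.Concat(
    Regex.Matches(text, @"\w", RegexOptions.ECMAScript)
         .Cast<Match>()
         .Select(m => m.Value));

WriteLine(unicodeWord);
WriteLine(asciiWord);
```

The first string keeps the letter ï, while the second one drops it. Neither string contains the exclamation mark, because it is not a word character in either mode.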

\s: whitespace character class

The \s represents the whitespace character class, which matches a single whitespace character such as a space, tab, newline, carriage return, form feed, or vertical tab. The following example uses the \s character class to match whitespace characters in a string:

using System.Text.RegularExpressions;
using static System.Console;


var text = "C# is awesome!";
var pattern = @"\s";

var matches = Regex.Matches(text, pattern);

WriteLine($"{matches.Count} matches found");

foreach (var match in matches)
{
    WriteLine(match);
}

Output:

2 matches found


The pattern returns two matches, one for each space in the string. Each match is printed as a line that contains only a space, so it looks blank.
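
Because whitespace is invisible when printed, it can help to print where each match was found instead. The following sketch uses made-up sample text and prints the index and character code of every \s match:

```csharp
using System.Text.RegularExpressions;
using static System.Console;

// Sample text (an assumption for illustration) with a tab, a newline, and a space.
var text = "a\tb\nc d";

var matches = Regex.Matches(text, @"\s");

foreach (Match match in matches)
{
    // Print the position and the character code, since the
    // whitespace itself is invisible in console output.
    WriteLine($"index {match.Index}: U+{(int)match.Value[0]:X4}");
}
```

It reports three matches: the tab (U+0009) at index 1, the newline (U+000A) at index 3, and the space (U+0020) at index 5.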

Inverse character classes

Inverse character classes are also called negated character classes. They allow you to match any character that is not in a specified set of characters. For example, the inverse of the digit character class matches any single character except a digit.

The following table displays character classes and their inverse versions:

Character class    Inverse character class    Description of the inverse
\d                 \D                         Matches any character that is not a digit
\w                 \W                         Matches any character that is not a word character
\s                 \S                         Matches any character that is not whitespace

For example, the following uses the \D character class to match the non-digit characters of a phone number:

using System.Text.RegularExpressions;
using static System.Console;


var phone = "+1-(408)-555-6666";
var pattern = @"\D";

var matches = Regex.Matches(phone, pattern);


foreach (var match in matches)
{
    WriteLine(match);
}

Output:

+
-
(
)
-
-

To convert the phone number from the +1-(408)-555-6666 format to 14085556666, you replace every non-digit character with an empty string using the Replace() method of the Regex class:

using System.Text.RegularExpressions;
using static System.Console;


var phone = "+1-(408)-555-6666";
var pattern = @"\D";

var result = Regex.Replace(phone, pattern, "");

WriteLine(result);

Output:

14085556666
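
A character class and its inverse split the input between them: every character matches exactly one of the two. The following sketch checks this for \d and \D on the same phone number. The variable names are chosen here for illustration:

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var phone = "+1-(408)-555-6666";

// Each character matches exactly one of \d and \D,
// so the two counts add up to the length of the string.
var digitCount = Regex.Matches(phone, @"\d").Count;
var nonDigitCount = Regex.Matches(phone, @"\D").Count;

WriteLine($"{digitCount} digits + {nonDigitCount} non-digits = {phone.Length} characters");
```

It prints 11 digits + 6 non-digits = 17 characters. The 11 digits are the ones that Replace() keeps, and the 6 non-digits are the ones it removes.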

Summary

  • A character class defines a set of characters.
  • \d represents the digit character class.
  • \w represents the word character class.
  • \s represents the whitespace character class.
  • \D, \W, and \S are the inverse character classes of \d, \w, and \s. Each one matches any character that is not in the corresponding set.