Lazy Quantifiers

Summary: in this tutorial, you will learn how to use lazy quantifiers to find the shortest match in an input string.

Introduction to lazy quantifiers

Lazy quantifiers, also known as non-greedy quantifiers, modify the behavior of regular expression quantifiers so that they match as little as possible. A lazy quantifier consumes as few characters as it can while still allowing the rest of the pattern to match.

By default, quantifiers in regular expressions are greedy, meaning they match as much as possible. However, lazy quantifiers work in the opposite way. They match as little as possible while still allowing the overall pattern to be satisfied.

Lazy quantifiers are denoted by appending a question mark (?) to the standard quantifiers.
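To make the difference concrete, here is a minimal sketch that runs a greedy pattern and its lazy counterpart against the same input. The variable names greedy and lazy are illustrative only, and the input string is the one used in the example later in this tutorial:

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var html = """<input type="submit" values="Send">""";

// Greedy: .+ runs to the end of the string, then backtracks to the last quote.
var greedy = Regex.Match(html, "\".+\"").Value;

// Lazy: .+? expands one character at a time and stops at the first closing quote.
var lazy = Regex.Match(html, "\".+?\"").Value;

WriteLine(greedy); // "submit" values="Send"
WriteLine(lazy);   // "submit"
```

The greedy version swallows everything between the first and the last quote, while the lazy version ends at the nearest closing quote.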

The following table shows the greedy quantifiers, lazy quantifiers, and the meanings of the lazy quantifiers:

Greedy Quantifiers    Lazy Quantifiers    Lazy Quantifier Meaning
*                     *?                  Match zero or more occurrences (as few as possible)
+                     +?                  Match one or more occurrences (as few as possible)
?                     ??                  Match zero or one occurrence (preferably zero)
{n}                   {n}?                Match exactly n occurrences
{n,}                  {n,}?               Match n or more occurrences (as few as possible)
{n,m}                 {n,m}?              Match between n and m occurrences (as few as possible)
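The following short sketch illustrates a few rows of the table on a digit string. The input "12345" and the variable names are chosen purely for illustration:

```csharp
using System.Text.RegularExpressions;
using static System.Console;

var input = "12345";

// Greedy range: takes as many digits as allowed (up to 4).
var greedyRange = Regex.Match(input, @"\d{2,4}").Value;

// Lazy range: stops as soon as the minimum of 2 digits is reached.
var lazyRange = Regex.Match(input, @"\d{2,4}?").Value;

// {n}? behaves like {n}, because the count is fixed.
var lazyExact = Regex.Match(input, @"\d{3}?").Value;

// ?? prefers zero occurrences, so the first match is empty.
var lazyOptional = Regex.Match(input, @"\d??").Value;

WriteLine(greedyRange);         // 1234
WriteLine(lazyRange);           // 12
WriteLine(lazyExact);           // 123
WriteLine(lazyOptional.Length); // 0
```

Note that laziness only changes the result when the quantifier has a choice about how many occurrences to take.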

Lazy quantifiers example

The following example illustrates how to use a lazy quantifier to extract attribute values of an input tag:

using System.Text.RegularExpressions;
using static System.Console;


var html = """<input type="submit" values="Send">""";

var pattern = """ 
              ".+?"
              """;

var matches = Regex.Matches(html, pattern);
foreach (var match in matches)
{
    WriteLine(match);
}

Output:

"submit"
"Send"

How it works.

  1. The program begins with the necessary using directive to include the Regex class from the System.Text.RegularExpressions namespace.
  2. The program then includes using static System.Console; to allow the usage of the WriteLine method without explicitly specifying the Console class.
  3. The HTML string is defined as html using a raw string (""") that contains an HTML input element with the attribute type="submit" and values="Send".
  4. The regular expression pattern is defined as pattern using a raw string with triple quotes (""") to avoid escaping the double quotes (") inside the regular expression. The pattern ".+?" matches each attribute value of the input HTML tag, including the surrounding quotes. The lazy quantifier +? makes each match stop at the nearest closing quote, so the match is as small as possible.
  5. The program uses the Regex.Matches() method to find all matches of the pattern in the HTML string html. It takes the html string and the pattern as arguments and returns a collection of Match objects representing the matches found.
  6. The program then iterates over each Match object in the matches collection using a foreach loop and uses the WriteLine method to print each match to the console.
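As a possible variation on the steps above, the same lazy quantifier can be placed inside a capturing group so that the values come back without the surrounding quotes. This is only a sketch: the pattern (\w+)="(.+?)" and the pairs list are assumptions introduced here for illustration, not part of the original example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;
using static System.Console;

var html = """<input type="submit" values="Send">""";

// Group 1 (\w+) captures the attribute name.
// Group 2 (.+?) lazily captures the value between the quotes.
var pattern = "(\\w+)=\"(.+?)\"";

var pairs = new List<string>();
foreach (Match match in Regex.Matches(html, pattern))
{
    pairs.Add($"{match.Groups[1].Value} = {match.Groups[2].Value}");
}

foreach (var pair in pairs)
{
    WriteLine(pair);
}
// type = submit
// values = Send
```

Without the lazy quantifier, group 2 would extend to the last quote in the tag and the loop would produce a single, incorrect pair.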

Summary

  • A lazy quantifier in regular expressions matches as little as possible while still satisfying the pattern.