C# AsParallel

Summary: in this tutorial, you’ll learn how to use the C# AsParallel() method to run LINQ queries in parallel.

Introduction to C# Parallel LINQ

Processing a large data set takes time. To speed up the calculation, you can divide the data set into smaller chunks and process them simultaneously.

To do that, you use Parallel LINQ, or PLINQ for short. PLINQ allows you to execute LINQ queries in parallel across multiple CPU cores.

PLINQ automatically divides the large data into smaller chunks, distributes them across multiple CPU cores, and aggregates the result back into a single result set.
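
To make the three stages concrete, here is a minimal sketch that performs them by hand and compares the outcome with a PLINQ query. The ChunkedSum() helper and the chunk size are made up for illustration, the Chunk() method assumes .NET 6 or later, and this is only a conceptual picture, not how PLINQ is implemented internally:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: divide, process in parallel, then combine.
// This only illustrates the idea; it is not PLINQ's actual algorithm.
static long ChunkedSum(int[] data, int chunkSize)
{
    // 1. Divide the data into smaller chunks.
    int[][] chunks = data.Chunk(chunkSize).ToArray();

    // 2. Process each chunk on a thread-pool thread.
    Task<long>[] tasks = chunks
        .Select(chunk => Task.Run(() => chunk.Sum(x => (long)x)))
        .ToArray();

    // 3. Aggregate the partial results into a single result.
    return Task.WhenAll(tasks).Result.Sum();
}

int[] numbers = Enumerable.Range(1, 1000).ToArray();

long manual = ChunkedSum(numbers, 100);
long plinq = numbers.AsParallel().Sum(x => (long)x);

Console.WriteLine(manual == plinq); // True
```

With PLINQ, the single AsParallel() call replaces all of the manual bookkeeping in ChunkedSum().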

PLINQ is useful for CPU-bound operations that require large-scale data processing or complex computations. But it may not be suitable for I/O-bound operations like reading files or accessing data through an API.
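
As a rough illustration of a CPU-bound workload, the following sketch counts prime numbers sequentially and in parallel. The IsPrime() helper and the range size are assumptions chosen for this example. Both queries return the same count, and any speedup depends on your machine:

```csharp
using System;
using System.Linq;

// Hypothetical CPU-bound helper: trial division, deliberately simple.
static bool IsPrime(int n)
{
    if (n < 2) return false;
    for (int i = 2; (long)i * i <= n; i++)
    {
        if (n % i == 0) return false;
    }
    return true;
}

var numbers = Enumerable.Range(2, 200_000);

// Same query, once on a single thread and once spread across cores.
int sequentialCount = numbers.Count(n => IsPrime(n));
int parallelCount = numbers.AsParallel().Count(n => IsPrime(n));

Console.WriteLine(sequentialCount == parallelCount); // True
```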

Since PLINQ executes the queries in parallel, you need to ensure your code is thread-safe and doesn’t introduce race conditions.
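
The following sketch shows one way a race condition can creep in, along with two possible ways to avoid it. The method names are hypothetical. The unsafe version may or may not lose updates on a given run, which is exactly what makes such bugs hard to spot:

```csharp
using System;
using System.Linq;
using System.Threading;

// Unsafe: several threads update the same variable without synchronization,
// so some increments can be lost.
static int UnsafeCount(int n)
{
    int count = 0;
    Enumerable.Range(0, n).AsParallel().ForAll(_ => count++);
    return count;
}

// Safe: Interlocked.Increment() performs each update atomically.
static int InterlockedCount(int n)
{
    int count = 0;
    Enumerable.Range(0, n).AsParallel().ForAll(_ => Interlocked.Increment(ref count));
    return count;
}

// Often simplest: let a PLINQ operator aggregate, with no shared state at all.
static int OperatorCount(int n) => Enumerable.Range(0, n).AsParallel().Count();

Console.WriteLine($"Unsafe:      {UnsafeCount(100_000)} (may be less than 100000)");
Console.WriteLine($"Interlocked: {InterlockedCount(100_000)}");
Console.WriteLine($"Operator:    {OperatorCount(100_000)}");
```

When a built-in operator such as Count() or Sum() can express the aggregation, it is usually the easier choice because there is no shared variable to protect.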

Creating a parallel query using the C# AsParallel() method

To create a parallel query, you follow these steps:

First, start with a standard LINQ query that operates on an IEnumerable<T> or IQueryable<T> data source:

IEnumerable<T> source = ...

Second, call the AsParallel() extension method on the data source to create a parallel query:

ParallelQuery<T> query = source.AsParallel();

The AsParallel() method returns a parallel query.

Third, call an operator to execute the query and get the result. For example:

var result = query.Sum();
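
Put together, the three steps might look like the following sketch. The data source (the numbers 1 to 100) and the Where() filter are assumptions added for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Step 1: a regular data source (the numbers 1 to 100).
IEnumerable<int> source = Enumerable.Range(1, 100);

// Step 2: opt in to parallel execution.
ParallelQuery<int> query = source.AsParallel();

// Step 3: an operator such as Sum() runs the query and returns the result.
int result = query.Where(x => x % 2 == 0).Sum();

Console.WriteLine(result); // 2550
```

Operators such as Where() that follow AsParallel() bind to the PLINQ versions, so the filtering runs in parallel as well.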

The following program demonstrates how to create a simple parallel query that returns the sum of a sequence of integers:

using static System.Console;
using System.Diagnostics;

static void Measure(Func<int> f, string name)
{
    // StartNew() already starts the stopwatch
    var stopwatch = Stopwatch.StartNew();

    f();

    stopwatch.Stop();
    WriteLine($"The method {name} took {stopwatch.ElapsedMilliseconds} ms to run");
}

static int SequentialSum()
{
    return Enumerable.Range(0, 101)
        .Select(x => {
            Thread.Sleep(10);
            return x;
        })
        .Sum();
}

static int ParallelSum()
{
    return Enumerable.Range(0, 101).AsParallel()
        .Select(x => {
            Thread.Sleep(10);
            return x;
        })
        .Sum();
}

Measure(SequentialSum, "SequentialSum");
Measure(ParallelSum, "ParallelSum");

How it works.

First, define the Measure method that takes two parameters: the parameter f of the type Func<int> and the parameter name of the type string.

The Func<int> is a delegate type that represents a function that takes no arguments and returns an integer value. The name parameter represents the method name.

The Measure() method measures the execution time of a function f using a Stopwatch object:

static void Measure(Func<int> f, string name)
{
    // StartNew() already starts the stopwatch
    var stopwatch = Stopwatch.StartNew();

    f();

    stopwatch.Stop();

    WriteLine($"The method {name} took {stopwatch.ElapsedMilliseconds} ms to run");
}

Second, define the SequentialSum() method that calculates the sum of the numbers from 0 to 100 sequentially:

static int SequentialSum()
{
    return Enumerable.Range(0, 101)
        .Select(x => {
            Thread.Sleep(10);
            return x;
        })
        .Sum();
}

Note that we use Thread.Sleep(10) to simulate a time-consuming operation for each element.

Third, define the ParallelSum() method that calculates the sum of the numbers from 0 to 100 in parallel:

static int ParallelSum()
{
    return Enumerable.Range(0, 101).AsParallel()
        .Select(x => {
            Thread.Sleep(10);
            return x;
        })
        .Sum();
}

The ParallelSum() method uses the AsParallel() method to convert the query to a parallel query and returns the sum of the numbers in the sequence.

Finally, execute both the SequentialSum() and ParallelSum() methods and measure the time they take:

Measure(SequentialSum, "SequentialSum");
Measure(ParallelSum, "ParallelSum");

Output:

The method SequentialSum took 1578 ms to run
The method ParallelSum took 293 ms to run

The output shows that executing the LINQ query in parallel is faster compared to running it sequentially in this case.
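
The Measure() method discards the value returned by f(), so the program never checks that both approaches compute the same sum. As a sketch, a hypothetical generic variant named MeasureAndReturn() (not part of the original program) could return the value so you can compare the two results. The Thread.Sleep() call is left out here to keep the example short:

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical variant of Measure() that also returns the computed value.
static T MeasureAndReturn<T>(Func<T> f, string name)
{
    var stopwatch = Stopwatch.StartNew();
    T value = f();
    stopwatch.Stop();

    Console.WriteLine($"The method {name} took {stopwatch.ElapsedMilliseconds} ms to run");
    return value;
}

int sequential = MeasureAndReturn(() => Enumerable.Range(0, 101).Sum(), "SequentialSum");
int parallel = MeasureAndReturn(() => Enumerable.Range(0, 101).AsParallel().Sum(), "ParallelSum");

// Both queries add the numbers from 0 to 100.
Console.WriteLine(sequential == parallel); // True
```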

Important notes on PLINQ

As mentioned earlier, parallel LINQ queries typically run faster than regular LINQ queries only when the data set is large or the computation for each element is expensive.

If you have a small data set with simple calculations, parallel queries may run slower than regular LINQ queries because PLINQ adds overhead for partitioning the data, coordinating threads, and merging the results.

For example, if you remove the Thread.Sleep(10) call from the lambda expression in the Select() method, you’ll see that the SequentialSum() method runs faster than the ParallelSum() method:

using static System.Console;
using System.Diagnostics;

static void Measure(Func<int> f, string name)
{
    // StartNew() already starts the stopwatch
    var stopwatch = Stopwatch.StartNew();

    f();

    stopwatch.Stop();
    
    WriteLine($"The method {name} took {stopwatch.ElapsedMilliseconds} ms to run");
}

static int SequentialSum()
{
    return Enumerable.Range(0, 101)
        .Select(x => {
            //Thread.Sleep(10);
            return x;
        })
        .Sum();
}

static int ParallelSum()
{
    return Enumerable.Range(0, 101).AsParallel()
        .Select(x => {
            //Thread.Sleep(10);
            return x;
        })
        .Sum();
}

Measure(SequentialSum, "SequentialSum");
Measure(ParallelSum, "ParallelSum");

Output:

The method SequentialSum took 2 ms to run
The method ParallelSum took 74 ms to run

Summary

  • PLINQ allows you to execute LINQ queries in parallel across multiple CPU cores.
  • Use the AsParallel() method to convert a LINQ query into a parallel query.