Optimizing Sum, Count, Min, Max and Average with LINQ

LINQ is a great tool for C# programmers: it lets you use familiar syntax against a variety of back-end systems without having to learn another language or paradigm for many query operations.

Ensuring that those queries still perform well can be a bit of a chore, and one set that fares quite badly is the aggregate operations when you want more than one of them.

Multiple sequential queries (bad)

var count = db.Invoices.Count();
var total = db.Invoices.Sum(i => i.Paid);
var average = db.Invoices.Average(i => i.Paid);

This will issue three separate requests. There is nothing a LINQ provider can do to optimize that pattern, as they are three discrete statements.

Background

If we wanted these values by country we could do this in LINQ:

var a = db.Invoices.GroupBy(i => i.Country)
      .Select(g => new { Country = g.Key,
           Count = g.Count(),
           Total = g.Sum(i => i.Paid),
           Average = g.Average(i => i.Paid) });

This gets us everything in a single statement, broken down by country. In SQL this is:

SELECT Country, COUNT(*), SUM(Paid), AVG(Paid)
    FROM Invoices GROUP BY Country

Many data sources, including SQL, are happy to provide aggregate values without a GROUP BY, so how do we generate that from LINQ?

There is no GroupBy overload that omits the key selector, so we need to fake it. Because of the way many LINQ providers optimize out parts of the expression tree, we can:

Single optimized query (good)

Replacing the property in a GroupBy with a constant value gives us an optimized single query:

var a = db.Invoices.GroupBy(i => 1)
    .Select(g => new { Count = g.Count(),
               Total = g.Sum(i => i.Paid),
               Average = g.Average(i => i.Paid) });

Here are the providers I’ve tried:

  • LINQ to Objects (Works although constant is likely evaluated)
  • LINQ to SQL (Works although passes 1 parameter to SQL)
  • Entity Framework 6 (Works although query is a little obscure)
  • ElasticLINQ (Works and optimizes out totally)
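To try the pattern without a database, here is a minimal sketch that runs against LINQ to Objects. The `Invoice` and `InvoiceTotals` records and the `InvoiceStats.Summarize` helper are hypothetical names made up for this illustration; against a real provider you would more likely keep the anonymous type used above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity and result shape (not from any library).
public record Invoice(string Country, decimal Paid);
public record InvoiceTotals(int Count, decimal Total, decimal Average);

public static class InvoiceStats
{
    // Grouping by a constant puts every invoice into one group, so all
    // three aggregates are produced by a single query.
    // Returns null for an empty source, because no group is produced at all.
    public static InvoiceTotals Summarize(IEnumerable<Invoice> invoices) =>
        invoices
            .GroupBy(i => 1)
            .Select(g => new InvoiceTotals(
                g.Count(),
                g.Sum(i => i.Paid),
                g.Average(i => i.Paid)))
            .SingleOrDefault();
}

public static class Program
{
    public static void Main()
    {
        // In-memory data standing in for db.Invoices.
        var invoices = new List<Invoice>
        {
            new("UK", 500m),
            new("UK", 1500m),
            new("US", 1000m),
        };

        var totals = InvoiceStats.Summarize(invoices);
        if (totals != null)
            Console.WriteLine($"{totals.Count} {totals.Total} {totals.Average}");
    }
}
```

Note that the query yields a sequence rather than a single object, and with no invoices there is no group and therefore no result, which is why the sketch ends with `SingleOrDefault()` and checks for null.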

Count+Where optimizations

Counts that take a predicate, or that follow a Where, can be combined in the same way. Suppose we have:

var high = db.Invoices.Count(i => i.Paid >= 1000);
var low = db.Invoices.Where(i => i.Paid < 1000).Count();
var sum = db.Invoices.Sum(i => i.Paid);

Then we can express this as:

var a = db.Invoices.GroupBy(i => 1)
    .Select(g => new { High = g.Count(i => i.Paid >= 1000),
                   Low = g.Count(i => i.Paid < 1000),
                   Sum = g.Sum(i => i.Paid) });
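The combined predicate counts can be checked the same way with LINQ to Objects. This is only a sketch: `Invoice`, `PaidBreakdown` and `InvoiceStats.Breakdown` are hypothetical names for illustration, and how a given provider translates a `Count` with a predicate inside a group is not something this example demonstrates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity and result shape (not from any library).
public record Invoice(string Country, decimal Paid);
public record PaidBreakdown(int High, int Low, decimal Sum);

public static class InvoiceStats
{
    // One constant-key group, with each Count given its own predicate.
    // Returns null for an empty source, because no group is produced at all.
    public static PaidBreakdown Breakdown(IEnumerable<Invoice> invoices) =>
        invoices
            .GroupBy(i => 1)
            .Select(g => new PaidBreakdown(
                g.Count(i => i.Paid >= 1000),
                g.Count(i => i.Paid < 1000),
                g.Sum(i => i.Paid)))
            .SingleOrDefault();
}

public static class Program
{
    public static void Main()
    {
        // In-memory data standing in for db.Invoices.
        var invoices = new List<Invoice>
        {
            new("UK", 500m),
            new("UK", 1500m),
            new("US", 1000m),
        };

        var result = InvoiceStats.Breakdown(invoices);
        if (result != null)
            Console.WriteLine($"High={result.High} Low={result.Low} Sum={result.Sum}");
    }
}
```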

[)amien

2 responses to Optimizing Sum, Count, Min, Max and Average with LINQ

  2. Damien Guard

    LINQ to SQL does support that syntax, not sure about other providers but could be worth trying.

  3. Victor Kornov

    I’m wondering if you can do

    var invoices = db.Invoices;
    var result = invoices.Select(x => new { count = invoices.Count(), sum = ... });