Optimizing Sum, Count, Min, Max and Average with LINQ
- 📦 .NET
- 🏷️ Elasticsearch, LINQ, Entity Framework, C#
LINQ is a great tool for C# programmers: it lets you use familiar syntax with a variety of back-end systems without having to learn another language or paradigm for many query operations.
Ensuring that those queries still perform well can be a bit of a chore, and one case that performs quite badly is aggregate operations when you want more than one value.
Multiple sequential queries (bad)
var count = db.Invoices.Count();
var total = db.Invoices.Sum(i => i.Paid);
var average = db.Invoices.Average(i => i.Paid);
This issues three separate requests. There is nothing a LINQ provider can do to optimize that pattern, as these are three discrete statements.
Background
If we wanted these values by country we could do this in LINQ:
var a = db.Invoices.GroupBy(i => i.Country)
    .Select(g => new { Country = g.Key,
                       Count = g.Count(),
                       Total = g.Sum(i => i.Paid),
                       Average = g.Average(i => i.Paid) });
This gets us everything in a single statement, broken down by country. In SQL this is:
SELECT Country, COUNT(*), SUM(Paid), AVG(Paid)
FROM Invoices GROUP BY Country
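To see the shape of the grouped result without a database, here is a minimal sketch that runs the same query with LINQ to Objects. The in-memory `invoices` array and its sample values are assumptions standing in for `db.Invoices`; a real provider would translate the query rather than evaluate it in memory.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for db.Invoices.
var invoices = new[]
{
    new { Country = "UK", Paid = 100m },
    new { Country = "UK", Paid = 300m },
    new { Country = "US", Paid = 50m },
};

// One result row per country, each carrying all three aggregates.
var byCountry = invoices
    .GroupBy(i => i.Country)
    .Select(g => new
    {
        Country = g.Key,
        Count = g.Count(),
        Total = g.Sum(i => i.Paid),
        Average = g.Average(i => i.Paid)
    })
    .OrderBy(r => r.Country)
    .ToList();

foreach (var row in byCountry)
    Console.WriteLine($"{row.Country}: count={row.Count}, total={row.Total}, average={row.Average}");
```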
Many data sources, including SQL, are happy to provide aggregate values without a GROUP BY, so how do we generate that from LINQ?
There is no GroupBy overload that works without a key selector, so we need to fake one, and because of the way many LINQ providers optimize out parts of the expression tree, we can.
Single optimized query (good)
Replacing the property in a GroupBy with a constant value gives us an optimized single query:
var a = db.Invoices.GroupBy(i => 1)
    .Select(g => new { Count = g.Count(),
                       Total = g.Sum(i => i.Paid),
                       Average = g.Average(i => i.Paid) });
Here are the providers I’ve tried:
- LINQ to Objects (Works although constant is likely evaluated)
- LINQ to SQL (Works although passes 1 parameter to SQL)
- Entity Framework 6 (Works although query is a little obscure)
- ElasticLINQ (Works and optimizes out totally)
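The constant-key query returns a sequence with at most one element, so it still needs unwrapping. The sketch below shows one way to do that using LINQ to Objects. The in-memory `invoices` array is an assumption standing in for `db.Invoices`, and `Min` and `Max` are added only to illustrate that the pattern extends to the other aggregates. Note that an empty source produces no group at all; a provider that translates this to a GROUP BY will probably behave the same way, but that is worth checking against your own back end.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for db.Invoices.
var invoices = new[]
{
    new { Country = "UK", Paid = 100m },
    new { Country = "UK", Paid = 300m },
    new { Country = "US", Paid = 50m },
};

// Grouping on a constant puts every row into one group, so the query
// yields at most one result element.
var summary = invoices
    .GroupBy(i => 1)
    .Select(g => new
    {
        Count = g.Count(),
        Total = g.Sum(i => i.Paid),
        Average = g.Average(i => i.Paid),
        Min = g.Min(i => i.Paid),
        Max = g.Max(i => i.Paid)
    })
    .SingleOrDefault();

// With no matching rows there is no group, so the result is null
// rather than a row of zeros.
var empty = invoices
    .Where(i => i.Country == "FR")
    .GroupBy(i => 1)
    .Select(g => new { Count = g.Count(), Total = g.Sum(i => i.Paid) })
    .SingleOrDefault();

Console.WriteLine(summary);
Console.WriteLine(empty is null ? "no invoices" : empty.ToString());
```

`SingleOrDefault` is used here instead of `Single` so that the empty case returns null instead of throwing.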
Count+Where optimizations
If we are performing counts with a predicate, or counts on top of a Where, we can optimize these too. Take these three separate queries:
var high = db.Invoices.Count(i => i.Paid >= 1000);
var low = db.Invoices.Where(i => i.Paid < 1000).Count();
var sum = db.Invoices.Sum(i => i.Paid);
We can express them as a single query:
var a = db.Invoices.GroupBy(i => 1)
    .Select(g => new { High = g.Count(i => i.Paid >= 1000),
                       Low = g.Count(i => i.Paid < 1000),
                       Sum = g.Sum(i => i.Paid) });
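As a quick check that the combined form gives the same numbers as the separate queries, here is a small LINQ to Objects sketch. The `invoices` array and its values are assumptions for illustration only; how a given provider translates the predicate counts will vary.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for db.Invoices.
var invoices = new[]
{
    new { Paid = 1500m },
    new { Paid = 1000m },
    new { Paid = 999m },
    new { Paid = 200m },
};

// Both conditional counts and the sum are computed in one pass over a
// single constant-key group.
var stats = invoices
    .GroupBy(i => 1)
    .Select(g => new
    {
        High = g.Count(i => i.Paid >= 1000),
        Low = g.Count(i => i.Paid < 1000),
        Sum = g.Sum(i => i.Paid)
    })
    .Single();

// The separate queries, for comparison.
var high = invoices.Count(i => i.Paid >= 1000);
var low = invoices.Where(i => i.Paid < 1000).Count();
var sum = invoices.Sum(i => i.Paid);

Console.WriteLine($"high={stats.High}, low={stats.Low}, sum={stats.Sum}");
Console.WriteLine(stats.High == high && stats.Low == low && stats.Sum == sum);
```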
[)amien
2 responses to Optimizing Sum, Count, Min, Max and Average with LINQ
LINQ to SQL does support that syntax, not sure about other providers but could be worth trying.
I’m wondering if you can do