Include for LINQ to SQL (and maybe other providers)
It’s quite common that when you issue a query you’re going to want to join some additional tables.
In LINQ this can be a big issue as associations are properties and it’s easy to end up issuing a query every time you hit one. This is referred to as the SELECT N+1 problem and tools like EF Profiler can help you find them.
An example
Consider the following section of C# code that displays a list of blog posts and also wants the author name.
foreach(Post post in db.Posts)
Console.WriteLine("{0} {1}", post.Title, post.Author.Name);
This code looks innocent enough and will issue a query like “SELECT * FROM [Posts]” but iterating over the posts causes the lazy-loading of the Author property to trigger and each one may well issue a query similar to “SELECT * FROM [Authors] WHERE [AuthorID] = 1”.
In the case of LINQ to SQL it’s not always an extra load as it will check the posts AuthorID foreign key in its internal identity map (cache) to see if it’s already in-memory before issuing a query to the database.
LINQ to SQL LoadWith
Most object-relational mappers have a solution for this – Entity Framework’s ObjectQuery has an Include operator (that alas takes a string), and NHibernate has a fetch mechanism. LINQ to SQL has LoadWith which is used like this:
var db = new MyDataContext();
var dlo = new DataLoadOptions();
dlo.LoadWith<Posts>(p => p.Blog);
db.LoadOptions = dlo;
This is a one-time operation for the lifetime of this instance of the data context which can be inflexible and LoadWith has at least one big bug with inheritance issuing multiple joins.
A flexible alternative
This got me thinking and I came up with a useful extension method to provide Include-like facilities on-demand in LINQ to SQL (and potentially other LINQ providers depending on what they support) in .NET 4.0.
public static IEnumerable<T> Include<T, TInclude>(this IQueryable<T> query, Expression<Func<T, TInclude>> sidecar) {
var elementParameter = sidecar.Parameters.Single();
var tupleType = typeof(Tuple<T, TInclude>);
var sidecarSelector = Expression.Lambda<Func<T, Tuple<T, TInclude>>>(
Expression.New(tupleType.GetConstructor(new[] { typeof(T), typeof(TInclude) }),
new Expression[] { elementParameter, sidecar.Body },
tupleType.GetProperty("Item1"), tupleType.GetProperty("Item2")), elementParameter);
return query.Select(sidecarSelector).AsEnumerable().Select(t => t.Item1);
}
To use simply place at the end of your query and specify the property you wish to eager-load, e.g.
var oneInclude = db.Posts.Where(p => p.Published).Include(p => p.Blog));
var multipleIncludes = db.Posts.Where(p => p.Published).Include(p => new { p.Blog, p.Template, p.Blog.Author }));
This technique only works for to-one relationships not to-many. It is also quite untested so evaluate it properly before using it.
How it works
How it works is actually very simple – it projects into a Tuple that contains the original item and all additional loaded elements and then just returns the query back the original item. It is a dynamic version of:
var query = db.Posts.Where(p => p.Published)
.Select(p => new Tuple<Post, Blog>(p, p.Blog))
.AsEnumerable()
.Select(t => t.Item1);
This is why it has to return IEnumerable
[)amien
12 responses