Archive for linq tag

Include for LINQ to SQL (and maybe other providers)

It’s quite common that when you issue a query you’re going to want to join some additional tables.

In LINQ this can be a big issue as associations are properties and it’s easy to end up issuing a query every time you hit one. This is referred to as the SELECT N+1 problem and tools like LINQ to SQL Profiler can help you find them.

An example

Consider the following section of C# code that displays a list of blog posts and also wants the author name.

foreach(Post post in db.Posts)
  Console.WriteLine("{0} {1}", post.Title, post.Author.Name);

This code looks innocent enough and will issue a query like “SELECT * FROM [Posts]” but iterating over the posts causes the lazy-loading of the Author property to trigger and each one may well issue a query similar to “SELECT * FROM [Authors] WHERE [AuthorID] = 1″.

In the case of LINQ to SQL it’s not always an extra load as it will check the posts AuthorID foreign key in its internal identity map (cache) to see if it’s already in-memory before issuing a query to the database.

LINQ to SQL’s LoadWith

Most object-relational mappers have a solution for this – Entity Framework’s ObjectQuery has an Include operator (that alas takes a string), and NHibernate has a fetch mechanism. LINQ to SQL has LoadWith which is used like this:

var db = new MyDataContext();
var dlo = new DataLoadOptions();
dlo.LoadWith<Posts>(p => p.Blog);
db.LoadOptions = dlo;

This is a one-time operation for the lifetime of this instance of the data context which can be inflexible and LoadWith has at least one big bug with inheritance issuing multiple joins.

A flexible alternative

This got me thinking and I came up with a useful extension method to provide Include-like facilities on-demand in LINQ to SQL (and potentially other LINQ providers depending on what they support) in .NET 4.0.

public static IEnumerable<T> Include<T, TInclude>(this IQueryable<T> query, Expression<Func<T, TInclude>> sidecar) {
   var elementParameter = sidecar.Parameters.Single();
   var tupleType = typeof(Tuple<T, TInclude>);
   var sidecarSelector =  Expression.Lambda<Func<T, Tuple<T, TInclude>>>(
      Expression.New(tupleType.GetConstructor(new[] { typeof(T), typeof(TInclude) }),
         new Expression[] { elementParameter, sidecar.Body  },
         tupleType.GetProperty("Item1"), tupleType.GetProperty("Item2")), elementParameter);
   return query.Select(sidecarSelector).AsEnumerable().Select(t => t.Item1);
}

To use simply place at the end of your query and specify the property you wish to eager-load, e.g.

var oneInclude = db.Posts.Where(p => p.Published).Include(p => p.Blog));
var multipleIncludes = db.Posts.Where(p => p.Published).Include(p => new { p.Blog, p.Template, p.Blog.Author }));

This technique only works for to-one relationships not to-many. It is also quite untested so evaluate it properly before using it.

How it works

How it works is actually very simple – it projects into a Tuple that contains the original item and all additional loaded elements and then just returns the query back the original item. It is a dynamic version of:

var query = db.Posts.Where(p => p.Published)
   .Select(p => new Tuple<Post, Blog>(p, p.Blog))
   .AsEnumerable()
   .Select(t => t.Item1);

This is why it has to return IEnumerable<T> and belong at the end (and the use of Tuple is why it is .NET 4.0 only although that should be easy enough to change). Not all LINQ providers will necessarily register the elements with their identity map to prevent SELECT N+1 on lazy-loading but LINQ to SQL does :)

[)amien

Client-side properties and any remote LINQ provider

David Fowler on the ASP.NET team and I have been bouncing ideas about on how to solve an annoyance using LINQ:

If you write properties on the client you can’t use them in remote LINQ operations.

The problem occurs because these properties can’t be translated and sent to the server as they have been compiled into intermediate language (IL) and not LINQ expression trees that are required for translation by IQueryable implementations. There is nothing available in .NET to let us reverse-engineer the IL back into the methods and syntax that would allow us to translate the intended operation into a remote query.

This means you end up having to write your query in two parts; firstly the part the server can do, a ToList or AsEnumerable call to force that to happen and bring the intermediate results down to the client, and then the operations that can only be evaluated locally. This can hurt performance if you want to reduce or transform the result set significantly.

What we came up (David, Colin Meek and myself) is a provider-independent way of declaring properties just once so they can be used in both scenarios. Computed properties for LINQ to SQL, LINQ to Entities and anything else LINQ enabled with little effort and it works great on .NET 3.5 SP1 :)

Before example

Here we have extended the Employee class to add Age and FullName. We only wanted to people with “da” in their name but we are forced to pull down everything to the client in order to the do the selection.

partial class Employee {
	public string FullName {
		get { return Forename + " " + Surname; }
	}

	public int Age {
		get { return DateTime.Now.Year - BirthDate.Year -
			(((DateTime.Now.Month < BirthDate.Month)
			|| DateTime.Now.Month == BirthDate.Month && DateTime.Now.Day < BirthDate.Day) ? 1 : 0));
		}
	}
}
...
var employees = db.Employees.ToList().Where(e => e.FullName.Contains("da")).GroupBy(e => e.Age);

After example

Here using our approach it all happens server side… and works on both LINQ to Entities and LINQ to SQL.

partial class Employee {
    private static readonly CompiledExpression<Employee,string> fullNameExpression
     = DefaultTranslationOf<Employee>.Property(e => e.FullName).Is(e => e.Forename + " " + e.Surname);
    private static readonly CompiledExpression<Employee,int> ageExpression
     = DefaultTranslationOf<Employee>.Property(e => e.Age).Is(e => DateTime.Now.Year - e.BirthDate.Value.Year - (((DateTime.Now.Month < e.BirthDate.Value.Month) || (DateTime.Now.Month == e.BirthDate.Value.Month && DateTime.Now.Day < e.BirthDate.Value.Day)) ? 1 : 0)));

    public string FullName {
        get { return fullNameExpression.Evaluate(this); }
    }

    public int Age {
        get { return ageExpression.Evaluate(this); }
    }
}
...
var employees = db.Employees.Where(e => e.FullName.Contains("da")).GroupBy(e => e.Age).WithTranslations();

Getting started

Usage considerations

The caveats to the usage technique shown above is you need to ensure your class has been initialized before you write queries to it (check out alternatives below) and obviously the expression you register for a property must be able to be translated to the remote store so you will need to constrain yourself to the methods and operators your IQueryable provider supports.

There are a few alternative ways to use this rather than the specific examples above.

Registering the expressions

You can register the properties in the class itself as shown in the examples which means the properties themselves can evaluate the expressions without any reflection calls. Alternatively if performance is less critical you can register them elsewhere and have the methods look up their values dynamically via reflection. e.g.

...
DefaultTranslationOf<Employee>.Property(e => e.FullName).Is(e => e.Forename + " " + e.Surname);
var employees = db.Employees.Where(e => e.FullName.Contains("da")).GroupBy(e => e.Age).WithTranslations();
...
partial class Employee {
    public string FullName { get { return DefaultTranslationOf<Employees>.Evaluate<string>(this, MethodInfo.GetCurrentMethod());} }
}

If performance of the client-side properties is critical then you can always have them as regular get properties with the full code in there at the expense of having the calculation duplicated, once in IL in the property and once as an expression for the translation.

Different maps for different scenarios

Sometimes certain parts of your application may want to run with different translations for different scenarios, performance etc. No problem!

The WithTranslations method normally operates against the default translation map (accessed with DefaultTranslationOf) but there is also another overload that takes a TranslationMap you can build for specific scenarios, e.g.

var myTranslationMap = new TranslationMap();
myTranslationMap.Add<Employees, string>(e => e.Name, e => e.FirstName + " " + e.LastName);
var results = (from e in db.Employees where e.Name.Contains("martin") select e).WithTranslations(myTranslationMap).ToList();

How it works

CompiledExpression<T, TResult>

The first thing we needed to do was get the user-written client-side “computed” properties out of IL and back into expression trees so we could translate them. Given that we also want to evaluate them on the client we need to compile them at run time so CompiledExpression exists which just takes an expression of Func<T, TResult>, compiles it and allows evaluation of objects against the compiled version.

ExpressiveExtensions

This little class provides both the WithTranslations extensions methods and the internal TranslatingVisitor that unravels the property accesses into their actual registered Func<T, TResult> expressions via the TranslationMap so that the underlying LINQ provider can deal with that instead.

TranslationMap

We need to have a map of properties to compiled expressions and for that purpose TranslationMap exists. You can create a TranslationMap by hand and pass it in to WithTranslations if you want to programmatically create them at runtime or have different ones for different scenarios but generally you will want to use…

DefaultTranslationOf

This helper class lets you register properties against the default TranslationMap we use when nothing is passed to WithTranslations. It also allows you to lookup what is already registered so you can evaluate to that although there is a small reflection performance penalty for that:

public int Age { get { return DefaultTranslationOf<Employees>.Evaluate<int>(this, MethodInfo.GetCurrentMethod()); } }

Have fun!

[)amien

Using LINQ to foreach over an enum in C#

I can’t be the only person in the world who wants to foreach over the values of an enum otherwise Enum.GetValues(Type enumType) wouldn’t exist in the framework. Alas it didn’t get any generics love in .NET 2.0 and unhelpfully returns an array.

Thanks to the power of LINQ you can do this:

foreach(CustomerTypes customerType in Enum.GetValues(typeof(CustomerTypes)).Cast<CustomerTypes>())

That is okay, but this is more concise:

foreach(CustomerTypes customerType in Enums.Get<CustomerTypes>())

The tiny class to achieve that is, of course:

using System.Collections.Generic;
using System.Linq;

public static class Enums {
	public static IEnumerable<T> Get<T>() {
		return System.Enum.GetValues(typeof(T)).Cast<T>();
	}
}

Great.

[)amien

Language Integrated Query: An introduction presentation online

This evening’s presentation on Language Integrated Query (LINQ) is now available from my presentations page.

The event went well with 12 developers (it’s a small island) and took a less time to produce because I didn’t write a script for but slides to lead us through areas I wanted to cover.

This means the on-line presentation is of less useful than previous ones however I am pondering the transcript/audio-track option (in my best radio voice of course).

We are now considering topics for February and with something a bit special for March.

Stay tuned!

[)amien