LINQ to SQL details, issues and patterns

LINQ to SQL (formerly called DLINQ) is a simple object-relational mapper (ORM) scheduled for .NET Framework 3.5/Visual Studio 2007 (Orcas).

On projects with new data I’m keen on keeping the tables and classes as similar as possible and so the limited functionality of LINQ to SQL really appeals to me.

Sure for larger applications or where legacy data does not inspire great classes then a more advanced ORM with better mapping facilities can outweigh the disadvantages and learning curve imposed by another layer of abstraction. Microsoft have that covered with LINQ to Entities which will ship post-Orcas because of the immaturity of the the designer and advanced scenarios.

While prototyping I’ve come up with a few oddities that deserve documenting somewhere. Here they are in not-quite-FAQ format.

What SQL is this query or operation generating?

Set the Log property of your DataContext object to Console.Out.

What objects are generated and how do I extend them?

MyDataContext : DataContext

Represents a users view of the database and is generated by designer/SQLMetal exposing a property for each table being mapped.

Can be extended by writing your own partial class or sub classing.

MyEntity / T

Represents an entity for each row in the associated table and is generated by designer/SQLMetal exposing a property for each field being mapped which is also decorated with attributes that specify the underlying field name and type.

Can be extended by writing your own partial class, sub classing or providing a super-class.

No classes are generated for the tables themselves, instead the generic class Table<T> uses the attribute information from the entity objects (cached via various metadata classes such as MetaTable).

What patterns does LINQ to SQL use?

Fortunately, ActiveRecord is ignored and a more flexible approach is used. It basically provides one class for the database and one for each type of entity (because of inheritance, there could be more than one type of entity per table).

I’ve yet to delve deep enough into the classes and patterns to find good matches but a brief look reveals the following possibilities:

Registry: System.Data.Linq.DataContext
Unit of Work: System.Data.Linq.DataContext
Identity Map: System.Data.Linq.IdentityManager
Table Data Gateway: System.Data.Linq.Table<T>
Query Object: System.Data.Linq.DataQuery<T>
Metadata Mapping: System.Data.Linq.Provider.Meta*, System.Data.Linq.*Attribute, System.Data.Linq.Mapping.*

How do I persist changes?

Call the SubmitChanges method on your DataContext object.

What causes the error “Row not found or changed” when submitting changes?

By default LINQ to SQL creates UPDATE statements that include every field as it was when read in the WHERE clause and not just the primary key as you might expect. The upshot is that if the record has changed since it was loaded it will not find it.

However, there are also a couple of scenarios where it won’t find the record even if it hasn’t changed:

If the data range/precision of the .NET type can’t exactly hold the value in the SQL table (e.g. DateTime)
If the table schema doesn’t exactly match the mapping (e.g. a column’s null-ability changing)

If in doubt remove the table from your LINQ schema diagram and drag a fresh copy back in from your database using the Server Explorer pane.

Why does the OnPropertyChanging event use the PropertyChanged event handler delegate?

OnPropertyChanging does seem to be using the PropertyChangedEvent and PropertyChangedEventArgs that OnPropertyChanged use.

Being that PropertyChangingEvent and PropertyChangingEventArgs exist I would assume this is a bug within the designer/SQLMetal in beta 2.

What database vendors are supported?

SQL Server 2000, 2005, 2008 and SQL-CE are currently included.

Why do queries against a table return entities that do not match the query?

This occurs when entity objects in memory are out-of-step with data in the database.

The query is executed against the SQL server and for each matching record it either creates and entity object or uses the cached one it already has.

Therefore it is possible your result set will:

not include matching entities because the database indicates they do not (database changed or do not exist)
include entities that do not match because the database indicated they did (database changed or object changed)

Why is memory consumption so high?

LINQ to SQL tends to burn more memory than you might be expecting because of the multiple objects and aggressive caching in place.

Web applications, web services and middle-tiers often share objects between potential users and requests. It is worth bearing in mind that a DataContext is not well suited to sharing between users because:

Entities are cached indefinitely
- High memory use as every entity is eventually loaded to memory
- Entities to be out-of-date if the data can be updated elsewhere (triggers, imports, background jobs)
DataContext.CommitChanges persists all objects changed via that DataContext
- Difficult to determine a safe point to commit User A’s completed entities without committing User B’s incomplete entities
DataContext’s Table<T> creation is not thread safe
- Accessing a table for the first time on two threads could cause the DataContext to create one instance of a Table<T> while creating another for the same <T> and resulting in the latter being kept but the former being returned to one of the callee’s.

A DataContext is therefore best suited for either a single-user application, per-session or per-request.

[)amien

8 responses

Information is only used to show your comment. See my Privacy Policy.
Kro12/08/2008
"Is many to may relationship possible in LINQ?"
You have to use an intermediary table.
XXX20/06/2008
Is many to may relationship possible in LINQ?
John Rusk08/10/2007
Hi Damien,
You mentioned queries returning objects that do not match the query. It's a shame that MS haven't taken advantage of LINQ to solve this (or at least, partially solve it). They should be able to run the criteria again in memory (after the SQL) to strip out the non-matching objects. I tried to describe this here http://dotnet.agilekiwi.com/blog/index.html (under the heading "option 4"), but unfortunately the post got a little too wordy! :-)
Damien Guard18/05/2007
There are certainly some considerations to be made with the DataContext especially relating to how entities are cached and changes not saved back until the commit is processed.
I'm afraid I don't have any more specific information right now - I hope Microsoft's PAG will produce a best-practices document on using LINQ to SQL in web environments.
[email protected]17/05/2007
? use type datasets right now as a dataaccess layer representing my business entities and DlinQ seems really sexy with the partial class approach.
But the thing is what about the deployment problems it can cause?everything is getting coupled with the datacontext object, will that bring more problem than the solution in a transactionless(transaction free) web enviroment?
Matt Warren16/05/2007
LINQ to SQL does actual use version numbers. Each entity can either use a version number or some or all of its fields to do the optimistic concurrency check. If a version number exists in the database, LINQ to SQL will use only that field. Otherwise you can specify in the mapping which fields are used. The designer and SQLMetal both 'guess' at these for you. They will choose to not use CLOB and BLOB fields for optimistic concurrency.
steve16/05/2007
Sorry about the CRs, I have to write my comments in a separate text editor because the live comment preview slows to a crawl beyond a few lines :/
steve16/05/2007
"By default LINQ to SQL creates UPDATE statements that include every field as it was when read in the WHERE clause and not just the primary key as you might expect. The upshot is that if the record has changed since it was loaded it will not find it."
An interesting approach to optimistic locking - most people use version number fields or some derivative. There's a problem with their approach to this - CLOB and BLOB columns and partitioning. You can't use this technique on tables with large object columns, and any vertical partitioning will make this approach very inefficient. The latter is perhaps less common but BLOBs certainly aren't.
Maybe they just ignore BLOB columns and expect you to put an always-updated field in there (like a timestamp or version number) - although that kind of defeats the object since you might as well just use that one field in that case.
One thing that often irritates me about MS frameworks is that they a) reinvent
the wheel, b) always seem to be in beta, unfinished, or about to be replaced
with a new interface-breaking version. If you want to make something stable you
want something that's been around in the real world for a while and had the
kinks ironed out already. LINQ looks nice in theory but I'm willing to bet it
will have plenty of problems in the short term. I suspect their inability to stick to one framework / interface is to do with their background as a product company that constantly needs to push out something new and buzzwordy for people to buy. J2EE might have become less sexy these days, but at least it has solutions for things like this which have been used and proven over many years, without being end-of-lined all the time to make way for the next sexy acronym. Stablility isn't sexy, but I know which I prefer when it comes to writing production code.