Parameterising the IN clause of an SQL SELECT in .NET

October 2006 – June 2010 .NET (, ) • 1,447 views • 5 responses

I’m a fan of parameterized queries with a strong dislike for building SQL (or other magic strings). Eencoding /escaping errors, localization formatting problems and injection can run rumpant when you think everything is a string.

Problem

Today I found myself a quandary as I needed to SELECT records based on a list of values I had. e.g.

SELECT * FROM Products WHERE ProductCode IN ('ABC123', 'DEF456', 'GHI789')

At first glance the relevant parameterized version might look like:

SELECT * FROM Products WHERE ProductCode IN (@productlist)

The problem here however is that if you put a comma-separate list of items into a string parameter named @productlist then it sends this to the database server:

SELECT * FROM Products WHERE ProductCode IN ('ABC123, DEF456, GHI789')

That’s not what we want at all.

This hack isn’t pretty and it has some limitations:

  • Only works with named parameters
  • Could upset some DB providers that don’t like having command parameters removed or command text modified
  • Parameter name to replace must be totally unique – i.e. not exist as a subset of another parameter name
  • Only preserves the basic IDbCommand declared properties

It should however work across DB providers and types.

Usage

The previous example would mean we use exactly the expected parameterized version with @productlist in place.

Add the parameter as you’d expect but instead of assigning a string/numeric to it assign something IEnumerable.

Finally call this method against the command and parameter before you execute it for the ‘magic’ to happen:

The ‘magic’

public void ExpandDbArrayParameter(IDbCommand cmd, IDbDataParameter parameter) {
    if (parameter.Value is IEnumerable) {
        int index = 0;
        StringBuilder newParameterSQL = new StringBuilder();
        foreach(Object value in (IEnumerable) parameter.Value) {
            String valueParameterName = String.Format("{0}{1}", parameter.ParameterName, ++index);
            IDataParameter valueParameter = cmd.CreateParameter();
            valueParameter.DbType = parameter.DbType;
            valueParameter.Direction = parameter.Direction;
            valueParameter.ParameterName = valueParameterName;
            valueParameter.SourceColumn = parameter.SourceColumn;
            valueParameter.SourceVersion = parameter.SourceVersion;
            valueParameter.Value = value;
            cmd.Parameters.Add(valueParameter);

            if (index == 1)
                newParameterSQL.Append(valueParameterName);
            else
                 newParameterSQL.Append("," + valueParameterName);
        }
        cmd.Parameters.Remove(parameter);
        cmd.CommandText = cmd.CommandText.Replace(parameter.ParameterName, newParameterSQL.ToString());
    }
}

[)amien

5 responses  

  1. Steve on October 26th, 2006

    Although you still get the benefits of the security of a parameterised query, I think (if this works like JDBC) you’ll lose all the performance benefits of a precompiled statement.

    It kind of depends whether the contents of the IN are completely unpredictable, or if there is a preset list of combinations. If it’s the latter, I’d tend to enumerate those combinations in a sort of ‘criteria table’ and use an id from that to perform IN queries. If it’s completely flexible and unpredictable then you can use a transaction-scope temporary table (many databases support these and they’re held in memory, not disk therefore they’re very fast & cheap) which you populate with the list of criteria and use for the IN. No statement recompilation or hacks needed.

  2. Damien Guard on October 26th, 2006

    Yeah I realise this will blow the precompilation/optimisation steps as each command text becomes slightly different.

    If there are more than a handful of entries then a temporary table might well be a suitable option.

    The other option is if you know you won’t use more than n then to put n parameters into the IN list and fill the unused ones with NULL.

    Which avoids the temporary table and optimises better but obviously with the restriction of a maximum number of elements.

    [)amien

  3. Steve on October 26th, 2006

    Yep, that’s certainly an option, assuming that Null doesn’t have a special meaning in this instance (which it sometimes does). Sometimes it’s hard to find ‘safe’ values :)

  4. Lee Wallace on October 26th, 2006

    Possibly one of the geekiest blog postings I’ve ever read.

  5. Damien Guard on October 29th, 2006

    I think the post on WeakReference<T> was geekier.

    [)amien

Leave your response

  1. (kept private)