Parameterising the IN clause of an SQL SELECT in .NET

I’m a fan of parameterized queries with a strong dislike for building SQL (or other magic strings). Encoding, escaping errors, localization formatting problems and injection can run rampant when you think everything is a string.

Problem

Today I found myself a quandary as I needed to SELECT records based on a list of values I had. e.g.

SELECT * FROM Products WHERE ProductCode IN ('ABC123', 'DEF456', 'GHI789')

At first glance the relevant parameterized version might look like:

SELECT * FROM Products WHERE ProductCode IN (@productlist)

The problem here however is that if you put a comma-separate list of items into a string parameter named @productlist then it sends this to the database server:

SELECT * FROM Products WHERE ProductCode IN ('ABC123, DEF456, GHI789')

That’s not what we want at all.

This hack isn’t pretty and it has some limitations:

  • Only works with named parameters
  • Could upset some DB providers that don’t like having command parameters removed or command text modified
  • Parameter name to replace must be totally unique – i.e. not exist as a subset of another parameter name
  • Only preserves the basic IDbCommand declared properties

It should however work across DB providers and types.

Usage

The previous example would mean we use exactly the expected parameterized version with @productlist in place.

Add the parameter as you’d expect but instead of assigning a string/numeric to it assign something IEnumerable.

Finally call this method against the command and parameter before you execute it for the ‘magic’ to happen:

The ‘magic’

public void ExpandDbArrayParameter(IDbCommand cmd, IDbDataParameter parameter) {
  if (parameter.Value is IEnumerable) {
    int index = 0;
    StringBuilder newParameterSQL = new StringBuilder();
    foreach(Object value in (IEnumerable) parameter.Value) {
      String valueParameterName = String.Format("{0}{1}", parameter.ParameterName, ++index);
      IDataParameter valueParameter = cmd.CreateParameter();
      valueParameter.DbType = parameter.DbType;
      valueParameter.Direction = parameter.Direction;
      valueParameter.ParameterName = valueParameterName;
      valueParameter.SourceColumn = parameter.SourceColumn;
      valueParameter.SourceVersion = parameter.SourceVersion;
      valueParameter.Value = value;
      cmd.Parameters.Add(valueParameter);

      if (index == 1)
        newParameterSQL.Append(valueParameterName);
      else
        newParameterSQL.Append("," + valueParameterName);
      }
      cmd.Parameters.Remove(parameter);
      cmd.CommandText = cmd.CommandText.Replace(parameter.ParameterName, newParameterSQL.ToString());
    }
}

[)amien

6 responses to Parameterising the IN clause of an SQL SELECT in .NET

  1. Avatar for

    Information is only used to show your comment. See my Privacy Policy.

  2. Avatar for Roemer

    Pay attention, a string is also an IEnumerable (as char)! So make sure to exclude those in the first if.

  3. Avatar for Lee Wallace
    Lee Wallace

    Possibly one of the geekiest blog postings I’ve ever read.

  4. Avatar for Steve

    Yep, that’s certainly an option, assuming that Null doesn’t have a special meaning in this instance (which it sometimes does). Sometimes it’s hard to find ‘safe’ values :)

  5. Avatar for Damien Guard

    Yeah I realise this will blow the precompilation/optimisation steps as each command text becomes slightly different.

    If there are more than a handful of entries then a temporary table might well be a suitable option.

    The other option is if you know you won’t use more than n then to put n parameters into the IN list and fill the unused ones with NULL.

    Which avoids the temporary table and optimises better but obviously with the restriction of a maximum number of elements.

  6. Avatar for Steve

    Although you still get the benefits of the security of a parameterised query, I think (if this works like JDBC) you’ll lose all the performance benefits of a precompiled statement.

    It kind of depends whether the contents of the IN are completely unpredictable, or if there is a preset list of combinations. If it’s the latter, I’d tend to enumerate those combinations in a sort of ‘criteria table’ and use an id from that to perform IN queries. If it’s completely flexible and unpredictable then you can use a transaction-scope temporary table (many databases support these and they’re held in memory, not disk therefore they’re very fast & cheap) which you populate with the list of criteria and use for the IN. No statement recompilation or hacks needed.