Posts tagged with code-c

Model binding form posts to immutable objects

I’ve been working on porting over my blog to a static site generator and fired up an Azure Function to handle the form-comment to PR process to enable user comments to still be part of the site without using a 3rd party commenting system - more on that in a future post - and found the ASP.NET model binding for form posts distinctly lacking.

It’s been great getting back into .NET and brushing up some skills making the code clear, short and reusable. What I wanted was a super-clear action on my controller that tried to collect, validate and sanitize the data then if all was well create the pull request or report errors.

Ideally it would look like this;

[FunctionName("PostComment")]
public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Anonymous, "post")] HttpRequestMessage request) {
    var form = await request.Content.ReadAsFormDataAsync();
    if (TryCreateComment(form, out Comment comment, out var errors))
        await CreateCommentAsPullRequest(comment);
    return request.CreateResponse(errors.Any()
      ? HttpStatusCode.BadRequest : HttpStatusCode.OK, String.Join("\n", errors));
}

To do that however we need a function capable of creating the Comment class from the form post. Sure you can manually do it field by field but that’s not very reusable, repetitive and of course no fun. The Comment class is also - like all good little objects - immutable.

Creating a function to do this is simple with a little bit of reflection;

private static object ConvertParameter(string parameter, Type targetType) {
    return String.IsNullOrWhiteSpace(parameter)
           ? null : TypeDescriptor.GetConverter(targetType).ConvertFrom(parameter);
}

private static bool TryCreateCommentFromForm(NameValueCollection form, out Comment comment, out List<string> errors) {
    var constructor = typeof(Comment).GetConstructors()[0];
    var values = constructor.GetParameters()
                            .ToDictionary(p => p.Name, p => ConvertParameter(form[p.Name], p.ParameterType)
                                      ?? (p.HasDefaultValue ? p.DefaultValue : new MissingRequiredValue()));
    errors = values.Where(p => p.Value is MissingRequiredValue)
                   .Select(p => $"Form value missing for '{p.Key}'").ToList();
    comment = errors.Any() ? null : (Comment)constructor.Invoke(values.Values.ToArray());
    return !errors.Any();
}

What this does is grab the constructor for the Comment and try to find keys in the form that match the parameter name. Any that are missing are reported as errors unless they have a default value in which case that is used. MissingRequiredValue is just an empty object to act as a sentinel. The use of TypeDescriptor.GetConverter means it should be quite happy handling ints, decimals, urls etc.

The constructor for Comment specifies which fields are required and the parameter names must match the form field names by convention. Any value that is optional has a default value that the constructor will happily fill in a sensible default for.

public Comment(string post_id, string message, string author, string email, 
    DateTime? date = null, Uri url = null, int? id = null, string gravatar = null) {
    this.post_id = pathValidChars.Replace(post_id, "-");
    this.message = message;
    this.author = author;
    this.email = email;
    this.date = date ?? DateTime.UtcNow;
    this.url = url;
    this.id = id ?? new { this.post_id, this.author, this.message, this.date }.GetHashCode();
    this.gravatar = gravatar ?? EncodeGravatar(email);
}

I’ll post more of the form commenting system source soon once it’s a bit better tested and I look into anti-spam integration. Ideally I’ll also provide an AWS Lambda variant of the code so you can choose (or load balance) comment posting and almost certainly get what you need on the free tier. For now the Jekyll rendering templates and WordPress exporter are available.

[)amien

Table per hierarchy in Azure Table Storage

If you’re coming from an ORM background to Azure Table Storage you might be wondering how to map class hierarchies to tables.

Documentation on the topic is hard to find unless you know the magic class name EntityResolver which you can discover by digging into the Azure Client for .NET source code.

Let’s say we have a basic blog style system (minimal fields shown):

public class Content {
  public string Id { get; set; }
  public string Title { get; set }
}

public class BlogPost : Content {
  public List<string> Topics { get; set; }
}

public class Page : Content {
  public String Slug { get; set; }
}

The trick is to create an instance of EntityResolver where T is your base class, e.g. Content. Strangely EntityResolver’s signature requires T implement new() so you can’t make your base class abstract.

Firstly we need to add to our base class some kind of identifier for the type – in ORM terms this is referred to as a discriminator. Then we’d override that in the sub-types to ensure new instances get the correct type set on insertion.

public class Content {
  public string Id { get; set; }
  public string Title { get; set }
}

public class BlogPost : Content {
  public List<string> Topics { get; set; }
}

public class Page : Content {
  public String Slug { get; set; }
}

Let’s say we want to store all of these in a table called ‘content’. We would typically write a small helper class to handle the cloud table and storage, e.g.

public class Content {
  public string Id { get; set; }
  public string Title { get; set }
  public virtual string ContentType { get; set; }
}

public class BlogPost : Content {
  public List<string> Topics { get; set; }
  public override string ContentType {
    get { return "blog"; }
    set { }
  }
}

public class Page : Content {
  public String Slug { get; set; }
  public override string ContentType {
    get { return "page"; }
    set { }
  }
}

With just that change you can actually start inserting rows into Azure Table Storage but querying them back will always result in Content types and saving those back will result in data loss!

We can however help the CloudTable client materialize the correct results by creating an EntityResolver:

EntityResolver<Content> contentResolver(partitionKey, rowKey, timestamp, properties, etag) {
  var contentType = properties["ContentType"].StringValue;
  switch (contentType) {
    case "blog": return new BlogPost();
    case "page": return new Page();
    default: throw new NotSupportedException(String.Format("Unknown ContentType '{0}'", contentType));
  }
}

Which is then passed into operations that materialize results. Note that some signatures don’t accept a resolver so find one that does even if it means supplying a default OperationContent. For example:

var query = table.CreateQuery<Content>().Where(c => c.PartitionKey == yearMonth);
var results = query.ExecuteQuery(query.AsTableQuery(), contentResolver, myRequestOptions, myOperationContext);

Given that these entity resolvers are essential to correctly materializing your results without data loss it’s worth wrapping the CloudTable client with the necessary setup/table-creation/entity resolver.

[)amien

LINQ to SQL tips and tricks #2

A few more useful and lesser-known techniques for using LINQ to SQL.

Take full control of the TSQL

There are times when LINQ to SQL refuses to cook up the TSQL you wanted either because it doesn’t support the feature or because it has a different idea what makes an optimal query.

In either case the Translate method allows you to deliver your own TSQL to LINQ to SQL to process as if it were its own with execution, materialization and identity mapping still honored. For example:

var db = new PeopleContext();
if (db.Connection.State == System.Data.ConnectionState.Closed)
    db.Connection.Open();
var cmd = db.GetCommand(db.Persons.Where(p => p.CountryID == 1));
cmd.CommandText = cmd.CommandText.Replace("[People] AS [t0]", "[People] AS [t0] WITH (NOLOCK)");
var results = db.Translate<Person>(cmd.ExecuteReader());

Complex stored procedures

When working with stored procedures the LINQ to SQL designer and SQLMetal tools need a way of figuring out what the return type will be. In order to do this without actually running the stored procedure itself they use the SET FMTONLY command set to ON so that SQL Server will just parse the stored procedure instead.

Unfortunately this parsing does not extend to dynamic SQL or temporary tables so you must change the return type from the scalar integer to one of the known entity types by hand. You could use the following command at the start to let it run regardless given the subsequent warning.

SET FMTONLY OFF

If your stored procedure can not safely handle being called at any time with null parameters set the return type by hand instead.

Cloning an entity

There are many reasons you might want to clone an entity – you may want to create many similar ones, you could want to keep it around longer than the DataContext it came from – whatever your reason implementing a Clone method can be a pain but taking advantage of the DataContractSerializer can make light work of this providing your DBML is set to enable serialization.

If you use discriminator sub-classing you will need to either ensure your type is cast to its concrete type or use my L2ST4 templates for now as .NET 3.5 SP1 doesn’t emit the necessary KnownType attributes to make this automatically happen (fixed in .NET 4.0). Add a simple method to serialize in-memory like this:

public static T Clone<T>(T source) {
    var dcs = new System.Runtime.Serialization.DataContractSerializer(typeof(T));
    using (var ms = new System.IO.MemoryStream()) {
        dcs.WriteObject(ms, source);
        ms.Seek(0, System.IO.SeekOrigin.Begin);
        return (T)dcs.ReadObject(ms);
    }
}

And then to clone simply:

var source = myQuery.First();
var cloned = Clone(source);

Be aware that this comes with a little overhead in the serialization and de-serialization process.

If that is a problem for you then why not grab those templates and make your entities implement ICloneable!

Check out part 3 of LINQ to SQL tips

[)amien

Avoiding SQL injection

Back in ’98 I was developing an extranet site for a local company when I realized that it would be open for exploit if somebody put single quotes in text fields. It was early in the development cycle so I fixed it and moved on, unable to find out how other people were avoiding the problem.

It turned out many were not and it became a well-known exploit called SQL injection. Unfortunately there are many developers who don’t know or appreciate the problem, and it is this:

If you build SQL by appending strings and data without correct encoding your application can be exploited. These exploits can range from exposing sensitive information, through to modification and deletion of data.

This problem is very real and applies to:

  • All SQL statements, not just SELECT
  • All database systems, not just MS SQL or MySQL
  • All programming environments, not just C#, PHP or ASP
  • All data, most essentially that obtained from end-users, regardless of client-side checking

Let’s walk through an example and see how it works and what can be done to avoid it.

Example: User login

We have a user-name and password from a web form and want to get the users ID from the database, or nothing if it wasn’t valid. We want to send something like this SQL statement to our database.

SELECT UserID FROM Users WHERE UserName='Bob' AND Password='test'

And so a developer might do something like this (in C# using .Net);

var dr = connection.Execute("SELECT UserID FROM Users WHERE UserName='" + Request("UserName") + "' AND Password='" + Request("Password") + '");
if (dr.Read()) userId = dr.GetInt32(dr.GetOrdinal("UserID"));

The problem here is that if there is a ‘ in the form fields it effectively breaks out of the selection criteria and allows the end user to add extra criteria or even commands to what we are sending to the database server. Should they enters the following into the password form field…

' OR ''='

Then our code above will send the following SQL to the database:

SELECT UserID FROM Users WHERE UserName='aaa' AND Password='' OR ''=''

Which will return every record in the database and our code will let him log in as the first user it finds – normally a developer or administrator account. Ouch!

Bad solution: Encode it yourself

One solution often adopted is to always ensure all string input has a single-quote replaced by two single-quotes, which is what SQL server expects if you really want to send it a single quote.

This solution fails in that it doesn’t handle numbers or dates and falls apart in that both numbers and dates are often regionally formatted.

Good solution: Let the DB client encode it

A much better solution is to use your environment to perform all the proper encoding for you. As well as protecting you from such exploits you’ll also avoid localization problems where the string representation of something on your client is interpreted differently in your database. This can be a real problem in the UK where dates formatted by a UK web-server are sent to a misconfigured SQL server expecting US formatting and the days and months become transposed without error.

var cmd = new SqlCommand("SELECT UserID FROM Users WHERE UserName=@UserName AND Password=@Password");
cmd.Parameters.Add(new SqlParameter("@UserName", System.Data.SqlDbType.NVarChar, 255, Request("UserName")))
cmd.Parameters.Add(new SqlParameter("@Password", System.Data.SqlDbType.NVarChar, 255, Request("Password")))
dr = cmd.ExecuteReader();
if (dr.Read()) userId = dr.GetInt32(dr.GetOrdinal("UserID"));

Okay there is no doubt this is a little longer but it takes care of all our encoding and localization worries, and if for example you want to insert a lot of data into the database, creating the command and parameters once, then just setting the parameters for each insert (and executing it) will run faster than lots of string building inserts…

Moral of the story

Always encode data properly. If you can use the provided methods and functions to do so. If none are provided grab the specification and find out all the special characters used. Learn what encoding and escape sequences are used and apply them properly.

A few places where data should be encoded properly:

  • HTML – Obviously < and > have special meanings and need to be escaped. ASP.Net’s controls will take care of this if you set the .Text or .InnerText properties but set .InnerHTML at your own peril. Old ASP has the Server.HTMLEncode() function.
  • URL – A whole host of rules but the query string is often modified in code. Use URLEncode() or something similar especially if you want XHTML compliance too.
  • XML – Again a whole host of rules for what is valid data. Either use an XML object to write out your data (MSXML, Xerces etc) or maybe even store it in [[CDATA sections.
  • CSV – Even comma-separated value files have encoding rules. What do you need to do if text is going to have a ” in a field. What happens if a number contains a comma! Find out or use a well regarded library to do it for you.

Notes about the example

A better login system would not allow the web server direct access to sensitive data such as the user table. All access to sensitive information should be through SP’s that enforce those restrictions.

Such a login system would therefore call a stored procedure that logged the attempted, decided if it was valid, and locked out the user if too many incorrect attempts. I’ll blog that if anyone is interested.

Even if you don’t want to do that, returning a single field is better achieved by using ExecuteScalar() and forgetting a data reader.

Microsoft have a developer how-to on injection.

[)amien