Table per hierarchy in Azure Table Storage

If you’re coming from an ORM background to Azure Table Storage, you might be wondering how to map class hierarchies to tables.

Documentation on the topic is hard to find unless you know the magic type name EntityResolver, which you can discover by digging into the Azure Client for .NET source code.

Let’s say we have a basic blog-style system (minimal fields shown):

public class Content {
  public string Id { get; set; }
  public string Title { get; set; }
}

public class BlogPost : Content {
  public List<string> Topics { get; set; }
}

public class Page : Content {
  public String Slug { get; set; }
}

The trick is to create an instance of EntityResolver<T> where T is your base class, e.g. Content. Strangely, EntityResolver<T>’s signature requires that T implement new(), so you can’t make your base class abstract.

Firstly we need to add to our base class an identifier for the type (in ORM terminology, a type discriminator). Then we override that in the sub-types to ensure new instances get the correct type set on insertion.

public class Content {
  public string Id { get; set; }
  public string Title { get; set; }
  public virtual string ContentType { get; set; }
}

public class BlogPost : Content {
  public List<string> Topics { get; set; }
  public override string ContentType {
    get { return "blog"; }
    set { }
  }
}

public class Page : Content {
  public String Slug { get; set; }
  public override string ContentType {
    get { return "page"; }
    set { }
  }
}

Let’s say we want to store all of these in a table called ‘content’. We would typically write a small helper class to handle the cloud table and storage.
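
A helper of the kind just mentioned might look like the sketch below. It is only an illustration: the ContentStore name and its Save method are made up for this example, the hierarchy is assumed to derive from TableEntity so it has partition and row keys, and it assumes a version of the WindowsAzure.Storage package that exposes the synchronous methods.

```csharp
using System;
using Microsoft.WindowsAzure.Storage;        // assumes the WindowsAzure.Storage NuGet package
using Microsoft.WindowsAzure.Storage.Table;

// Assumption: the hierarchy derives from TableEntity to gain PartitionKey/RowKey.
public class Content : TableEntity {
  public string Title { get; set; }
  public virtual string ContentType { get; set; }
}

public class Page : Content {
  public string Slug { get; set; }
  public override string ContentType { get { return "page"; } set { } }
}

// Hypothetical helper: owns the 'content' table and hides the storage plumbing.
public class ContentStore {
  private readonly CloudTable table;

  public ContentStore(string connectionString) {
    var account = CloudStorageAccount.Parse(connectionString);
    table = account.CreateCloudTableClient().GetTableReference("content");
    table.CreateIfNotExists(); // creates the table on first use
  }

  // Inserts the entity, or replaces an existing row that has the same keys.
  public void Save(Content item) {
    table.Execute(TableOperation.InsertOrReplace(item));
  }
}
```

Against the local storage emulator, usage could be as simple as `new ContentStore("UseDevelopmentStorage=true").Save(new Page { PartitionKey = "2014-05", RowKey = "about", Title = "About", Slug = "about" });`.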

With the ContentType discriminator in place, you can start inserting rows into Azure Table Storage. Querying them back, however, yields plain Content instances, and saving those back again results in data loss!
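
To see where the loss comes from, here is a small SDK-free sketch. It models a row as a dictionary with one column per public property, which is only a stand-in for what a table client does; ToRow and FromRowAsBase are hypothetical names invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class Content {
  public string Title { get; set; }
  public virtual string ContentType { get; set; }
}

public class Page : Content {
  public string Slug { get; set; }
  public override string ContentType { get { return "page"; } set { } }
}

public static class RoundTripDemo {
  // Stand-in for saving: one column per public property of the runtime type.
  public static Dictionary<string, string> ToRow(Content item) {
    var row = new Dictionary<string, string>();
    foreach (var prop in item.GetType().GetProperties()) {
      row[prop.Name] = (string)prop.GetValue(item, null);
    }
    return row;
  }

  // Stand-in for materializing without a resolver: always builds the base type.
  public static Content FromRowAsBase(Dictionary<string, string> row) {
    return new Content { Title = row["Title"], ContentType = row["ContentType"] };
  }

  public static void Main() {
    var saved = ToRow(new Page { Title = "About", Slug = "about" });
    var reloaded = ToRow(FromRowAsBase(saved));
    Console.WriteLine(saved.ContainsKey("Slug"));    // True
    Console.WriteLine(reloaded.ContainsKey("Slug")); // False: the Slug column is gone
  }
}
```

The reloaded object still claims to be a "page", but it has nowhere to hold Slug, so a replace-style write of that object drops the column.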

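We must help the CloudTable client materialize the correct results by creating an EntityResolver. The resolver picks the concrete type from the discriminator and is then responsible for populating the instance it returns:

EntityResolver<Content> contentResolver = (partitionKey, rowKey, timestamp, properties, etag) => {
  var contentType = properties["ContentType"].StringValue;
  Content entity;
  switch (contentType) {
    case "blog": entity = new BlogPost(); break;
    case "page": entity = new Page(); break;
    default: throw new NotSupportedException(String.Format("Unknown ContentType '{0}'", contentType));
  }
  // The resolver must populate the entity itself (assumes Content derives from TableEntity).
  entity.PartitionKey = partitionKey;
  entity.RowKey = rowKey;
  entity.Timestamp = timestamp;
  entity.ETag = etag;
  entity.ReadEntity(properties, null);
  return entity;
};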

Pass this resolver into the operations that materialize results. Note that some overloads don’t accept a resolver, so find one that does, even if it means supplying a default OperationContext. For example:

var query = table.CreateQuery<Content>().Where(c => c.PartitionKey == yearMonth);
var results = table.ExecuteQuery(query.AsTableQuery(), contentResolver, myRequestOptions, myOperationContext);

Given that these entity resolvers are essential to correctly materializing your results without data loss, it’s worth wrapping the CloudTable client with the necessary setup/table-creation/entity resolver.
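
One way such a wrapper could look is sketched below. The ContentTable class and its GetByPartition method are hypothetical names for this example, Content is again assumed to derive from TableEntity, and the synchronous WindowsAzure.Storage methods are assumed to be available. The point is that the resolver is supplied once, so callers never reach a query path that forgets it.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage.Table; // assumes the WindowsAzure.Storage NuGet package

// Assumption: the base type derives from TableEntity.
public class Content : TableEntity {
  public string Title { get; set; }
  public virtual string ContentType { get; set; }
}

// Hypothetical wrapper: bundles table creation with the entity resolver.
public class ContentTable {
  private readonly CloudTable table;
  private readonly EntityResolver<Content> resolver;

  public ContentTable(CloudTable table, EntityResolver<Content> resolver) {
    this.table = table;
    this.resolver = resolver;
    this.table.CreateIfNotExists(); // table setup happens in one place
  }

  // Every read goes through the resolver, so sub-types come back intact.
  public IEnumerable<Content> GetByPartition(string partitionKey) {
    var filter = TableQuery.GenerateFilterCondition(
        "PartitionKey", QueryComparisons.Equal, partitionKey);
    var query = new TableQuery<Content>().Where(filter);
    return table.ExecuteQuery<Content, Content>(query, resolver);
  }
}
```

Taking the resolver as a constructor argument keeps the wrapper independent of the concrete sub-types, so adding a new content type only means extending the resolver.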

[)amien
