Posts tagged with azure

WordPress to Jekyll part 2 - Comments & commenting

Part of my series on migrating from WordPress to Jekyll.

  1. My history & reasoning
  2. Comments & commenting
  3. Site search
  4. Categories & tags
  5. Hosting & building

I do enjoy discussion and debate, whether I'm designing software or writing articles. Many times the comments have explored the subject further or offered corrections, additional insights and tips. For me they are vital on my blog, so I was somewhat disappointed that Jekyll provides nothing out of the box to handle them.

Third-party solutions like Disqus exist, but they require that you either pay a subscription or have ads inlined with the comments. That $9/month adds up, and the alternative of injecting ads onto my blog just to support comment infrastructure doesn't sit right with me.

Storing comments

So what does Jekyll have that we could build upon?

Well, one very useful feature is the ability to process 'site data' held in YAML files as a kind of data source for generating content via the Liquid templating language.

So, if we store each comment in a file named _data/{blog_post_slug}/{comment_id}.yml with this format:

id: 12345
name: Damien Guard
email: damieng@gmail.com
gravatar: dc72963e7279d34c85ed4c0b731ce5a9
url: https://damieng.com
date: 2007-12-18 18:51:55
message: "This is a great solution for 'dynamic' comments on a static blog!"

Then we have a model where we can gather all the comments for a post by traversing a single folder and sorting the results.

By using one-file-per-comment we also make deleting, approving and managing comments as easy as possible.

Rendering comments

Now we can create test data and attempt rendering. I created three Jekyll includes that match my WordPress theme:

  • Render an individual comment (comment.html)
  • Show a form to accept a new comment (new-comment.html)
  • Loop over individual comments for a post (comments.html)

All three includes are available for you to copy into your Jekyll _includes folder.

The simplest option is to then just include the comments.html file. For example, my blog post template file looks like this:

---
layout: default
---
<div class="post {{ page.class }}">
  {% include item.html %}
  {{ page.content }}
  {% include comments.html %}
</div>

You'll also need to add the following line to your Jekyll _config.yml. This works around a couple of Liquid restrictions the comment-sorting logic runs into - most notably that a template has no way to create a new empty array (a sketch follows below).

emptyArray: []
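
To show why, here is a minimal sketch of the kind of gathering and sorting comments.html might do. This is not the actual include - the data path and variable names are assumptions - but it illustrates the restrictions: site.data entries are hashes rather than arrays, and Liquid cannot conjure a new array to collect them into, hence starting from emptyArray:

{% assign comments = site.emptyArray %}
{% for entry in site.data[page.slug] %}
  {% assign comments = comments | push: entry[1] %}
{% endfor %}
{% assign comments = comments | sort: 'date' %}
{% for comment in comments %}
  {% include comment.html comment=comment %}
{% endfor %}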

Exporting comments from WordPress

The next step is getting all the comments out of your existing system. I was using WordPress, so I created a simple PHP script that extracts them all into individual files with the right metadata and structure.

  • Upload this file to your site
  • Access export-blog-comments.php via your browser and wait for it to complete
  • Download the /comments/ folder over SSH and then remove it and the export-blog-comments.php from your server
  • Copy the /comments/ folder into your Jekyll _data/ folder

Disqus users should check out Phil Haack’s Disqus exporter!

Accepting new comments with an Azure function

We can now render existing comments but what about accepting new ones?

At a minimum we need to accept an HTTP form post and commit a new YAML file. Ideally we'd also have some validation, a redirect to a thanks page, and the new YAML file wrapped in a pull request or other moderation facility. Merging the PR will cause a site rebuild and publish the new comment :)

Platform and choices

I chose:

  1. GitHub to host my blog and comments as I use it for my code projects
  2. Azure Function App for the form-post-to-pull-request - details below
  3. C# for the function - a great language I know with good libs

I went with Azure Function Apps for a few reasons:

  • They accept HTTP/HTTPS directly without configuring an “API Gateway”
  • Comment posting is a short-lived operation that happens quite infrequently
  • Free monthly grants of 1 million executions/400,000 GB-s should mean no charge
  • Taking a second or two to spin up the function should be fine in the user's context

(Disclaimer: I have a free MSDN subscription that includes Azure credits as part of my ASP Insider membership although I do not expect this solution to use any of it)

Other platforms

You could easily port this to another C#-capable environment - or port the solution entirely to another language.

If you have a lot of comments you could run the function on three platforms and round-robin the DNS to take advantage of the free usage tiers on each.

How it works

The form receiver function for comments relies on a couple of libraries to deal with YAML and GitHub but is otherwise self-explanatory. What it does (sketched in code after the list) is:

  1. Receives the form post over HTTP/HTTPS
  2. Attempts to create an instance of the Comment class by mapping form keys to constructor args
  3. Emits errors if any constructor args are missing (unless they have a default)
  4. Creates a new branch against your default branch using the GitHub OctoKit.NET library
  5. Creates a commit to the new branch with the Comment serialized to YML using YamlDotNet
  6. Creates a pull request to merge the branch with an informative title and body
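
To make steps 4-6 concrete, here is a rough sketch of how they might look with Octokit.NET and YamlDotNet. It is illustrative rather than the actual function source - the method name, the branch naming scheme and the 'master' default branch are assumptions:

// using Octokit; using YamlDotNet.Serialization;
private static async Task CreateCommentAsPullRequest(Comment comment) {
    var github = new GitHubClient(new ProductHeaderValue("jekyll-blog-comments")) {
        Credentials = new Credentials(Environment.GetEnvironmentVariable("GitHubToken"))
    };

    // The PullRequestRepository app setting holds e.g. "damieng/my-blog"
    var parts = Environment.GetEnvironmentVariable("PullRequestRepository").Split('/');
    var owner = parts[0];
    var repo = parts[1];

    // Step 4: branch from the tip of the default branch
    var defaultBranch = await github.Repository.Branch.Get(owner, repo, "master");
    var branchName = "comment-" + comment.id;
    await github.Git.Reference.Create(owner, repo,
        new NewReference("refs/heads/" + branchName, defaultBranch.Commit.Sha));

    // Step 5: commit the comment serialized as YAML into _data/{post}/{id}.yml
    var yaml = new SerializerBuilder().Build().Serialize(comment);
    await github.Repository.Content.CreateFile(owner, repo,
        "_data/" + comment.post_id + "/" + comment.id + ".yml",
        new CreateFileRequest("Comment by " + comment.author, yaml, branchName));

    // Step 6: open a pull request so the comment can be moderated before merging
    await github.PullRequest.Create(owner, repo,
        new NewPullRequest("Comment by " + comment.author, branchName, "master"));
}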

Installation

Installation requires a few steps, but the deployment will then update automatically whenever you update your fork.

  1. Fork the jekyll-blog-comments-azure repo
  2. Create a Function App in the Azure portal (I went with consumption plan on Windows)
  3. Go to Deployment Options, tap Setup and choose GitHub
  4. Authorize it to your GitHub account
  5. Configure Project to your fork of jekyll-blog-comments-azure
  6. Configure Branch to master

You will also need to set up two Application Settings for your function so it can create the necessary pull requests:

  • GitHubToken should be a personal access token with repo rights
  • PullRequestRepository should contain the org and repo name, e.g. damieng/my-blog

The final step is to modify your Jekyll _config.yml so it knows where to post the form. For example:

comments:
  receiver: https://damiengapp.azurewebsites.net/api/PostComment
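
For reference, here is a minimal sketch of the kind of form new-comment.html might render against that receiver. The markup and field names here are assumptions - they must match whatever the receiving function binds to:

<form method="post" action="{{ site.comments.receiver }}">
  <input type="hidden" name="post_id" value="{{ page.slug }}">
  <input name="author" placeholder="Your name" required>
  <input name="email" type="email" placeholder="Your email" required>
  <input name="url" type="url" placeholder="Your website (optional)">
  <textarea name="message" required></textarea>
  <button type="submit">Post comment</button>
</form>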

You should now be able to post a comment on your blog and see it turn up as a pull request against your repository!

Extra steps

  • You can have post authors' replies highlighted differently
  • Threaded comments could be supported - feel free to send a pull request or I’ll get to this in time
  • Anti-spam measures will likely need to be improved at some point - right now this is just client-side in JS that requires a second ‘Confirm comment’ click

In Part 3 of the series I’ll go into how I implemented my site search with Algolia!

[)amien

Model binding form posts to immutable objects

I've been working on porting my blog over to a static site generator and fired up an Azure Function to handle the form-comment-to-PR process, so that user comments can still be part of the site without using a 3rd-party commenting system (more on that in a future post). Along the way I found the ASP.NET model binding for form posts distinctly lacking.

It's been great getting back into .NET and brushing up some skills while keeping the code clear, short and reusable. What I wanted was a super-clear action on my controller that tried to collect, validate and sanitize the data, then, if all was well, create the pull request or report errors.

Ideally it would look like this:

[FunctionName("PostComment")]
public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Anonymous, "post")] HttpRequestMessage request) {
    var form = await request.Content.ReadAsFormDataAsync();
    if (TryCreateCommentFromForm(form, out Comment comment, out var errors))
        await CreateCommentAsPullRequest(comment);
    return request.CreateResponse(errors.Any()
      ? HttpStatusCode.BadRequest : HttpStatusCode.OK, String.Join("\n", errors));
}

To do that, however, we need a function capable of creating the Comment class from the form post. Sure, you could do it manually field by field, but that's repetitive, not reusable, and of course no fun. The Comment class is also - like all good little objects - immutable.

Creating a function to do this is simple with a little bit of reflection:

private static object ConvertParameter(string parameter, Type targetType) {
    return String.IsNullOrWhiteSpace(parameter)
           ? null : TypeDescriptor.GetConverter(targetType).ConvertFrom(parameter);
}

private static bool TryCreateCommentFromForm(NameValueCollection form, out Comment comment, out List<string> errors) {
    var constructor = typeof(Comment).GetConstructors()[0];
    var values = constructor.GetParameters()
                            .ToDictionary(p => p.Name, p => ConvertParameter(form[p.Name], p.ParameterType)
                                      ?? (p.HasDefaultValue ? p.DefaultValue : new MissingRequiredValue()));
    errors = values.Where(p => p.Value is MissingRequiredValue)
                   .Select(p => $"Form value missing for '{p.Key}'").ToList();
    comment = errors.Any() ? null : (Comment)constructor.Invoke(values.Values.ToArray());
    return !errors.Any();
}

What this does is grab the constructor for Comment and try to find keys in the form that match the parameter names. Any that are missing are reported as errors unless they have a default value, in which case that is used. MissingRequiredValue is just an empty object that acts as a sentinel. The use of TypeDescriptor.GetConverter means it should be quite happy handling ints, decimals, URLs, etc.
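
For completeness, that sentinel really can be just an empty class:

class MissingRequiredValue { }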

The constructor for Comment specifies which fields are required, and the parameter names must match the form field names by convention. Any optional parameter has a default value, or the constructor fills in something sensible.

public Comment(string post_id, string message, string author, string email,
    DateTime? date = null, Uri url = null, int? id = null, string gravatar = null) {
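    // pathValidChars (a Regex) and EncodeGravatar are members of Comment not shown in this excerpt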
    this.post_id = pathValidChars.Replace(post_id, "-");
    this.message = message;
    this.author = author;
    this.email = email;
    this.date = date ?? DateTime.UtcNow;
    this.url = url;
    this.id = id ?? new { this.post_id, this.author, this.message, this.date }.GetHashCode();
    this.gravatar = gravatar ?? EncodeGravatar(email);
}
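
As a quick hypothetical usage - the form values here are made up, and TryCreateCommentFromForm is the method above:

var form = new NameValueCollection {        // System.Collections.Specialized
    ["post_id"] = "wordpress-to-jekyll",
    ["message"] = "Great series!",
    ["author"] = "Jane Doe",
    ["email"] = "jane@example.com"
};
if (TryCreateCommentFromForm(form, out var comment, out var errors))
    Console.WriteLine("Comment bound successfully");
else
    errors.ForEach(Console.WriteLine);      // e.g. "Form value missing for 'message'"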

I'll post more of the form commenting system source soon once it's a bit better tested and I've looked into anti-spam integration. Ideally I'll also provide an AWS Lambda variant of the code so you can choose (or load-balance) comment posting and almost certainly get what you need on the free tier. For now, the Jekyll rendering templates and WordPress exporter are available.

[)amien

Differences between Azure Functions v1 and v2 in C#

I’ve been messing around in the .NET ecosystem again and have jumped back in with Azure Functions (similar to AWS Lambda) to get my blog onto 99% static hosting. I immediately ran into the API changes between v1 and v2 (currently in beta).

These changes came about because v1 was based around .NET 4.6 using WebAPI 2, while v2 is based on ASP.NET Core, which uses MVC 6. There are some guides around to converting, but none in the pure context of Azure Functions.

I’ll illustrate with a PageViewCount sample that uses Table Storage to retrieve and update a simple page count.

v1 (.NET 4.6.1 / WebAPI 2)

[FunctionName("PageView")]
public static async Task<HttpResponseMessage> Run(
    [HttpTrigger(AuthorizationLevel.Anonymous, "get")]HttpRequestMessage req, TraceWriter log) {
    var page = req.RequestUri.ParseQueryString()["page"];
    if (String.IsNullOrEmpty(page))
        return req.CreateErrorResponse(HttpStatusCode.BadRequest, "'page' parameter missing.");

    var table = Helpers.GetTableReference("PageViewCounts");
    var pageView = await table.RetrieveAsync<PageViewCount>("damieng.com", page)
        ?? new PageViewCount(page) { ViewCount = 0 };
    var operation = pageView.ViewCount == 0
        ? TableOperation.Insert(pageView)
        : TableOperation.Replace(pageView);
    pageView.ViewCount++;
    await table.ExecuteAsync(operation);

    return req.CreateResponse(HttpStatusCode.OK, new { viewCount = pageView.ViewCount });
}

v2 (ASP.NET Core / MVC 6)

[FunctionName("PageView")]
public static async Task<IActionResult> Run(
    [HttpTrigger(AuthorizationLevel.Anonymous, "get")]HttpRequest req, TraceWriter log) {
    var page = req.Query["page"];
    if (String.IsNullOrEmpty(page))
       return new BadRequestObjectResult("'page' parameter missing.");

    var table = Helpers.GetTableReference("PageViewCounts");
    var pageView = await table.RetrieveAsync<PageViewCount>("damieng.com", page)
        ?? new PageViewCount(page) { ViewCount = 0 };
    var operation = pageView.ViewCount == 0
        ? TableOperation.Insert(pageView)
        : TableOperation.Replace(pageView);
    pageView.ViewCount++;
    await table.ExecuteAsync(operation);

    return new OkObjectResult(new { viewCount = pageView.ViewCount });
}

Differences

The main differences are that:

  1. Return types are IActionResult/ObjectResult objects rather than extension methods against HttpRequestMessage (easier to mock/create custom ones)
  2. Input is the HttpRequest object rather than HttpRequestMessage (easier to get query parameters)

If you get the error 'Can not create abstract class' when executing your function, you are mixing versions - using the types from one runtime in the environment of the other.

Helpers

Both classes above use a small helper class to take care of Table Storage, which doesn't have the nicest API to use. A wrapper much like a data context, ensuring the right types go to the right table names, might be an even better option (a rough sketch follows the helper below).

static class Helpers {
    public static CloudStorageAccount GetCloudStorageAccount() {
        var connection = ConfigurationManager.AppSettings["DamienGTableStorage"];
        return connection == null ? CloudStorageAccount.DevelopmentStorageAccount : CloudStorageAccount.Parse(connection);
    }

    public static CloudTable GetTableReference(string name) {
        return GetCloudStorageAccount().CreateCloudTableClient().GetTableReference(name);
    }

    public static async Task<T> RetrieveAsync<T>(this CloudTable cloudTable, string partitionKey, string rowKey)
        where T : TableEntity {
        var tableResult = await cloudTable.ExecuteAsync(TableOperation.Retrieve<T>(partitionKey, rowKey));
        return (T)tableResult.Result;
    }
}
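
As a rough sketch of that data-context idea - the class and property names are illustrative only:

static class BlogTables {
    // Each property pins an entity type to its table name so callers can't mix them up
    public static CloudTable PageViewCounts => Helpers.GetTableReference("PageViewCounts");
}

// Usage: await BlogTables.PageViewCounts.RetrieveAsync<PageViewCount>("damieng.com", page);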

To compile

If you want to compile this - or maybe you were just looking for code for a simple page counter - here's the missing TableEntity class:

public class PageViewCount : TableEntity
{
    public PageViewCount(string pageName)
    {
        PartitionKey = "damieng.com";
        RowKey = pageName;
    }

    public PageViewCount() { }
    public int ViewCount { get; set; }
}

[)amien

Table per hierarchy in Azure Table Storage

If you’re coming from an ORM background to Azure Table Storage you might be wondering how to map class hierarchies to tables.

Documentation on the topic is hard to find unless you know the magic class name EntityResolver which you can discover by digging into the Azure Client for .NET source code.

Let’s say we have a basic blog style system (minimal fields shown):

public class Content : TableEntity {
  public string Id { get; set; }
  public string Title { get; set; }
}

public class BlogPost : Content {
  public List<string> Topics { get; set; }
}

public class Page : Content {
  public string Slug { get; set; }
}

The trick is to create an instance of EntityResolver<T> where T is your base class, e.g. Content. Strangely, EntityResolver<T>'s signature requires that T implement new(), so you can't make your base class abstract.

Firstly we need to add to our base class some kind of identifier for the type – in ORM terms this is referred to as a discriminator. Then we’d override that in the sub-types to ensure new instances get the correct type set on insertion.

public class Content : TableEntity {
  public string Id { get; set; }
  public string Title { get; set; }
  public virtual string ContentType { get; set; }
}

public class BlogPost : Content {
  public List<string> Topics { get; set; }
  public override string ContentType {
    get { return "blog"; }
    set { }
  }
}

public class Page : Content {
  public string Slug { get; set; }
  public override string ContentType {
    get { return "page"; }
    set { }
  }
}

Let's say we want to store all of these in a single table called 'content'. We would typically write a small helper class to handle the cloud table setup and storage.
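
A minimal sketch of such a helper - the connection-string setting name is an assumption:

static class ContentTable {
    public static CloudTable Get() {
        // Assumed app setting holding the storage connection string
        var connection = ConfigurationManager.AppSettings["StorageConnection"];
        var account = connection == null
            ? CloudStorageAccount.DevelopmentStorageAccount
            : CloudStorageAccount.Parse(connection);
        var table = account.CreateCloudTableClient().GetTableReference("content");
        table.CreateIfNotExists();
        return table;
    }
}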

With just the discriminator change you can actually start inserting rows into Azure Table Storage, but querying them back will always materialize plain Content instances - and saving those back will result in data loss!

We can, however, help the CloudTable client materialize the correct types by creating an EntityResolver:

EntityResolver<Content> contentResolver = (partitionKey, rowKey, timestamp, properties, etag) => {
  Content content;
  var contentType = properties["ContentType"].StringValue;
  switch (contentType) {
    case "blog": content = new BlogPost(); break;
    case "page": content = new Page(); break;
    default: throw new NotSupportedException(String.Format("Unknown ContentType '{0}'", contentType));
  }
  // The resolver replaces the default materialization entirely, so populate the entity ourselves
  content.PartitionKey = partitionKey;
  content.RowKey = rowKey;
  content.Timestamp = timestamp;
  content.ETag = etag;
  content.ReadEntity(properties, null);
  return content;
};

This resolver is then passed into the operations that materialize results. Note that some signatures don't accept a resolver, so find one that does, even if it means supplying a default OperationContext. For example:

var query = table.CreateQuery<Content>().Where(c => c.PartitionKey == yearMonth);
var results = table.ExecuteQuery(query.AsTableQuery(), contentResolver, myRequestOptions, myOperationContext);

Given that these entity resolvers are essential to materializing your results correctly without data loss, it's worth wrapping the CloudTable client with the necessary setup, table creation and entity resolution.

[)amien