WordPress to Jekyll part 2 - Comments & commenting

Part of my series on migrating from WordPress to Jekyll.

  1. My history & reasoning
  2. Comments & commenting
  3. Site search
  4. Categories & tags
  5. Hosting & building

I do enjoy discussion and debate whether designing software or writing articles. Many times the comments have explored the subject further or offered corrections or additional insights and tips. For me, they are vital on my blog so I was somewhat disappointed that Jekyll provides nothing out of the box to handle them.

Third-party solutions like Disqus exist, but they require you to either pay a subscription or have ads inlined with the comments. That $9/month adds up, and the alternative of injecting ads onto my blog just to support comment infrastructure doesn’t sit right with me.

Storing comments

So what does Jekyll have that we could build upon?

Well, one very useful feature is the ability to process ‘site data’ held in YML files as a kind of data source for generating content via the Liquid templating language.

So, if we store each comment in a file named _data/{blog_post_slug}/{comment_id}.yml with this format:

id: 12345
name: Damien Guard
email: damieng@gmail.com
gravatar: dc72963e7279d34c85ed4c0b731ce5a9
url: https://damieng.com
date: 2007-12-18 18:51:55
message: "This is a great solution for 'dynamic' comments on a static blog!"

Then we have a model where we can gather all the ones that respond to a post by traversing a single folder and performing some sorting.

By using one-file-per-comment we also make deleting, approving and managing comments as easy as possible.

Rendering comments

Now we can create test data and attempt rendering. I created three Jekyll includes that match my WordPress theme. They are:

  • Render an individual comment (comment.html)
  • Show a form to accept a new comment (new-comment.html)
  • Loop over individual comments for a post (comments.html)

All three includes are available for you to copy into your Jekyll _includes folder.

The simplest option is to then just include the comments.html file. For example, my blog post template file looks like this:

---
layout: default
---
<div class="post {{ page.class }}">
  {% include item.html %}
  {{ page.content }}
  {% include comments.html %}
</div>

You’ll also need to add the following line to your Jekyll _config.yml. This is required so my sort function can work due to a couple of restrictions in Jekyll.

emptyArray: []

Exporting comments from WordPress

The next step is getting all the comments out of your existing system. I was using WordPress, so I created a simple PHP script that extracts them all into individual files with the right metadata and structure.

  • Upload this file to your site
  • Access export-blog-comments.php via your browser and wait for it to complete
  • Download the /comments/ folder over SSH and then remove it and the export-blog-comments.php from your server
  • Copy the /comments/ folder into your Jekyll _data/ folder

Disqus users should check out Phil Haack’s Disqus exporter!

Accepting new comments with an Azure function

We can now render existing comments but what about accepting new ones?

At a minimum we need to accept an HTTP form post and commit a new YML file. Ideally this comes with some validation, a redirect to a thanks page, and the new YML file arriving in a pull request or other moderation facility. Merging the PR will cause a site rebuild and publish the new comment :)

Platform and choices

I chose:

  1. GitHub to host my blog and comments as I use it for my code projects
  2. Azure Function App for the form-post-to-pull-request - details below
  3. C# for the function - a great language I know with good libs

I went with Azure Function Apps for a few reasons:

  • They accept HTTP/HTTPS directly without configuring an “API Gateway”
  • Comment posting is a short-lived operation that happens quite infrequently
  • Free monthly grants of 1 million executions/400,000 GB-s should mean no charge
  • Taking a second or two to spin up the function should be fine in the user’s context

(Disclaimer: I have a free MSDN subscription that includes Azure credits as part of my ASP Insider membership although I do not expect this solution to use any of it)

Other platforms

You could easily port this to another C#-capable environment - or port the solution entirely to another language.

If you have a lot of comments you could run the function on three platforms and round-robin the DNS to take advantage of the free usage tiers on each.

How it works

The form receiver function for comments relies on a couple of libraries to deal with YML and GitHub but is otherwise self-explanatory. What it does is:

  1. Receives the form post over HTTP/HTTPS
  2. Attempts to create an instance of the Comment class by mapping form keys to constructor args
  3. Emits errors if any constructor args are missing (unless they have a default)
  4. Creates a new branch against your default using the GitHub OctoKit.NET library
  5. Creates a commit to the new branch with the Comment serialized to YML using YamlDotNet
  6. Creates a pull request to merge the branch with an informative title and body

Installation

Installation requires a few steps, but after that the function can update itself whenever you update your fork.

  1. Fork the jekyll-blog-comments-azure repo
  2. Create a Function App in the Azure portal (I went with consumption plan on Windows)
  3. Go to Deployment Options, tap Setup and choose GitHub
  4. Authorize it to your GitHub account
  5. Configure Project to your fork of jekyll-blog-comments-azure
  6. Configure Branch to master

You will also need to set up two Application Settings for your function so it can create the necessary pull requests. They are:

  • GitHubToken should be a personal access token with repo rights
  • PullRequestRepository should contain the org and repo name, e.g. damieng/my-blog

The final step is to modify your Jekyll _config.yml so it knows where to post the form. For example:

comments:
  receiver: https://damiengapp.azurewebsites.net/api/PostComment

You should now be able to post a comment on your blog and see it turn up as a pull request against your repository!

Extra steps

  • You can have the post author’s replies highlighted differently
  • Threaded comments could be supported - feel free to send a pull request or I’ll get to this in time
  • Anti-spam measures will likely need to be improved at some point - right now this is just client-side in JS that requires a second ‘Confirm comment’ click

In Part 3 of the series I’ll go into how I implemented my site search with Algolia!

[)amien

WordPress to Jekyll part 1 - My history and reasoning

Part of my series on migrating from WordPress to Jekyll.

  1. My history & reasoning
  2. Comments & commenting
  3. Site search
  4. Categories & tags
  5. Hosting & building

It’s hard to believe it was 13 years ago back in a cold December on the little island of Guernsey when I decided to start blogging. I’d had a static site with a few odd musings on it since 2000 but this was to be conversational, regularly updated and with more technical content. Blogspot seemed the easiest way to get started.

Briefly hosted at home

Within 18 months of regular blogging I’d moved over to Subtext which, being a .NET app, required Windows hosting, so I threw it on a small Shuttle PC on my home DSL. This is where I started using it as an experiment for CSS and web techniques, but within a year I’d had my 1MB DSL brought to its knees twice through articles being featured on Boing Boing.

I did however contribute a little to the project and started chatting with the maintainer - Phil Haack - who I’d end up meeting when we both joined Microsoft years later and is a friend to this day.

Landing on WordPress

(Image: DamienG theme in 2008)

In 2007 I migrated to a PHP-based CMS that was making a name for itself called WordPress. My blog would remain on WordPress for 10 years across shared hosting, VMs and dedicated servers.

One server was caught in an explosion at the ISP; another time my site got pwned through a WordPress vulnerability. I switched themes several times before creating my own super-light MootStrap theme based around the Bootstrap 2 layout and nav bar. I messed with wp-SuperCache trying to improve performance and scalability before switching out the PHP engine for HHVM, as well as using NGINX instead of Apache and MariaDB instead of MySQL, all in an attempt to eke out a bit of extra performance.

While my theme lives on today - for now at least - MootStrap and PHP are no more as I switched over to the Jekyll static site generator earlier this month after a long meandering journey to get there.

Why Jekyll?

I’ve had a lot of success with Jekyll on some other sites I run. Hosting it on GitHub Pages or S3 with CloudFront brings a lot of benefits:

  1. Cost - S3 and CloudFront cost pennies rather than $40+ a month
  2. Security - there’s no code running to be exploited, no WordPress plug-in back-doors
  3. Speed - CloudFront is a geo-distributed CDN and S3 is no slouch either
  4. Editing - text files are easier to process, find and manipulate, and Markdown is much easier to write

The price aspect is definitely worth mentioning again. With the occasional bursts in traffic my site hosting generally worked out around $40 a month for a decent VM. On AWS I’m expecting it to max out at $3 despite these improvements and benefits.

Of course, the other part of the reason is that static site generators are interesting and I like to play.

Some challenges

Jekyll is a static site generator. That is, you run the tool somewhere and it produces plain HTML files with zero server-side code left in them. By its very nature it is not going to have support for:

  • Comments - No way to accept or render them
  • Search - No site search facility
  • URL control - Difficult to match the paging/tags/categories with default plugins

Surprisingly however there are blog-friendly facilities where static generation can support it, specifically:

[)amien

Comma-separated parameter values in WebAPI

The model binding mechanism in ASP.NET is pretty slick - it’s clever, highly extensible and built on the TypeDescriptor system for all sorts of re-use, which lets you get out of having to write boilerplate code to map between CLR objects and their web representations.

One surprising thing, however, is that out of the box neither WebAPI nor MVC supports comma-separated parameter values when bound to an array, e.g.

public class MyController : Controller {
    public string Page([FromUri]int[] ids) {
        return String.Join(" ; ", ids);
    }
}

This will only return 1 ; 2 ; 3 when supplied with /my/page?ids=1&ids=2&ids=3; if you instead give it /my/page?ids=1,2,3 it will fail.

The reason for this is likely that there isn’t a standard for this at all, and that the former - supported - scenario maps to what forms do when they post multiple value selections such as those in a select list box. The latter, however, is much more readable, is expected by some client frameworks and is supported by some other web frameworks such as the Java Spring MVC framework.

Of course that extensible system lets us easily extend this behavior so that we can support both transparently and, interestingly enough, even mix-and-match on the same URL. So, for example:

/my/page?ids=1,2&ids=3 will now return 1 ; 2 ; 3 in our example.

Although this supports both types, if you are currently using commas in your number format this would break your app, e.g. ?ids=1,200&ids=3,500 would have been correctly received as 1200, 3500 but would now be incorrectly received as 1, 200, 3, 500.
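The ambiguity is easy to see in isolation. Here is a minimal sketch of the split-then-convert step on its own, with no ASP.NET involved; CommaSplitDemo and ParseIds are hypothetical names for illustration and are not part of DamienGKit.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;

public static class CommaSplitDemo {
    // Splits every raw query value on commas and converts each piece to an int,
    // mirroring the Split + TypeDescriptor conversion the binder performs.
    public static int[] ParseIds(IEnumerable<string> rawValues) {
        var converter = TypeDescriptor.GetConverter(typeof(int));
        return rawValues
            .SelectMany(v => v.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
            .Select(s => (int)converter.ConvertFromInvariantString(s))
            .ToArray();
    }
}
```

Calling CommaSplitDemo.ParseIds(new[] { "1,2", "3" }) gives 1, 2, 3, while ParseIds(new[] { "1,200", "3,500" }) gives 1, 200, 3, 500: once the comma is a separator, nothing in the input says it was meant as a thousands mark.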

CommaSeparatedArrayModelBinder class

The source is available in the DamienGKit project but also here.

Out of the box it supports integer types and Guids, although you could extend it to floats and decimals – again just be careful with that formatting!

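using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Web.Http;
using System.Web.Http.Controllers;
using System.Web.Http.ModelBinding;

public class CommaSeparatedArrayModelBinder : IModelBinder {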
    private static readonly Type[] supportedElementTypes = {
        typeof(int), typeof(long), typeof(short), typeof(byte),
        typeof(uint), typeof(ulong), typeof(ushort), typeof(Guid)
    };

    public bool BindModel(HttpActionContext actionContext, ModelBindingContext bindingContext) {
        if (!IsSupportedModelType(bindingContext.ModelType)) return false;
        var valueProviderResult = bindingContext.ValueProvider.GetValue(bindingContext.ModelName);
        var stringArray = valueProviderResult?.AttemptedValue
            ?.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        if (stringArray == null) return false;
        var elementType = bindingContext.ModelType.GetElementType();
        if (elementType == null) return false;

        bindingContext.Model = CopyAndConvertArray(stringArray, elementType);
        return true;
    }

    private static Array CopyAndConvertArray(IReadOnlyList<string> sourceArray, Type elementType) {
        var targetArray = Array.CreateInstance(elementType, sourceArray.Count);
        if (sourceArray.Count > 0) {
            var converter = TypeDescriptor.GetConverter(elementType);
            for (var i = 0; i < sourceArray.Count; i++)
                targetArray.SetValue(converter.ConvertFromString(sourceArray[i]), i);
        }
        return targetArray;
    }

    internal static bool IsSupportedModelType(Type modelType) {
        return modelType.IsArray && modelType.GetArrayRank() == 1
                && modelType.HasElementType
                && supportedElementTypes.Contains(modelType.GetElementType());
    }

}

public class CommaSeparatedArrayModelBinderProvider : ModelBinderProvider {
    public override IModelBinder GetBinder(HttpConfiguration configuration, Type modelType) {
        return CommaSeparatedArrayModelBinder.IsSupportedModelType(modelType)
            ? new CommaSeparatedArrayModelBinder() : null;
    }
}

To register

It’s necessary to register ModelBinderProviders with your ASP.NET application at start-up - usually in the WebApiConfig.cs file.

public static class WebApiConfig {
    public static void Register(HttpConfiguration config) {
        // All your usual configuration up here
        config.Services.Insert(typeof(ModelBinderProvider), 0, new CommaSeparatedArrayModelBinderProvider());
    }
}

[)amien

Model binding form posts to immutable objects

I’ve been working on porting my blog over to a static site generator and fired up an Azure Function to handle the form-comment to PR process, so user comments can remain part of the site without a 3rd party commenting system (more on that in a future post). Along the way I found the ASP.NET model binding for form posts distinctly lacking.

It’s been great getting back into .NET and brushing up some skills making the code clear, short and reusable. What I wanted was a super-clear action on my controller that tried to collect, validate and sanitize the data then if all was well create the pull request or report errors.

Ideally it would look like this:

[FunctionName("PostComment")]
public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Anonymous, "post")] HttpRequestMessage request) {
    var form = await request.Content.ReadAsFormDataAsync();
    if (TryCreateCommentFromForm(form, out Comment comment, out var errors))
        await CreateCommentAsPullRequest(comment);
    return request.CreateResponse(errors.Any()
      ? HttpStatusCode.BadRequest : HttpStatusCode.OK, String.Join("\n", errors));
}

To do that, however, we need a function capable of creating the Comment class from the form post. Sure, you can do it manually field by field, but that’s repetitive, not very reusable and of course no fun. The Comment class is also - like all good little objects - immutable.

Creating a function to do this is simple with a little bit of reflection:

private static object ConvertParameter(string parameter, Type targetType) {
    return String.IsNullOrWhiteSpace(parameter)
           ? null : TypeDescriptor.GetConverter(targetType).ConvertFrom(parameter);
}

private static bool TryCreateCommentFromForm(NameValueCollection form, out Comment comment, out List<string> errors) {
    var constructor = typeof(Comment).GetConstructors()[0];
    var values = constructor.GetParameters()
                            .ToDictionary(p => p.Name, p => ConvertParameter(form[p.Name], p.ParameterType)
                                      ?? (p.HasDefaultValue ? p.DefaultValue : new MissingRequiredValue()));
    errors = values.Where(p => p.Value is MissingRequiredValue)
                   .Select(p => $"Form value missing for '{p.Key}'").ToList();
    comment = errors.Any() ? null : (Comment)constructor.Invoke(values.Values.ToArray());
    return !errors.Any();
}

What this does is grab the constructor for the Comment and try to find keys in the form that match the parameter names. Any that are missing are reported as errors unless they have a default value, in which case that is used. MissingRequiredValue is just an empty object to act as a sentinel. The use of TypeDescriptor.GetConverter means it should be quite happy handling ints, decimals, URLs etc.
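To try the technique outside of an Azure Function, here is a minimal self-contained sketch of the same idea. It is not the code from the project: Note and FormBinder are hypothetical names, the form is a plain dictionary rather than a NameValueCollection, and the arguments are collected in a positional array because Dictionary enumeration order is documented as undefined.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;

// Hypothetical immutable type standing in for Comment.
public sealed class Note {
    public string Author { get; }
    public int Rating { get; }
    public Uri Url { get; }

    public Note(string author, int rating, Uri url = null) {
        Author = author;
        Rating = rating;
        Url = url;
    }
}

public static class FormBinder {
    // Sentinel marking a required constructor argument with no form value.
    private static readonly object Missing = new object();

    public static bool TryCreate<T>(IDictionary<string, string> form,
        out T result, out List<string> errors) where T : class {
        var constructor = typeof(T).GetConstructors()[0];
        var parameters = constructor.GetParameters();
        var args = new object[parameters.Length];

        // Fill each constructor argument by position: converted form value,
        // declared default, or the Missing sentinel.
        for (var i = 0; i < parameters.Length; i++) {
            var p = parameters[i];
            form.TryGetValue(p.Name, out var raw);
            args[i] = string.IsNullOrWhiteSpace(raw)
                ? (p.HasDefaultValue ? p.DefaultValue : Missing)
                : TypeDescriptor.GetConverter(p.ParameterType).ConvertFromInvariantString(raw);
        }

        errors = parameters.Where((p, i) => ReferenceEquals(args[i], Missing))
                           .Select(p => $"Form value missing for '{p.Name}'").ToList();
        result = errors.Any() ? null : (T)constructor.Invoke(args);
        return !errors.Any();
    }
}
```

A form containing author and rating produces a Note with a null Url, while a form with only author returns false with the single error "Form value missing for 'rating'". Note that a value that is present but malformed (say rating=abc) makes the converter throw rather than adding an error, so a real receiver would probably want to catch that.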

The constructor for Comment specifies which fields are required, and by convention the parameter names must match the form field names. Any optional parameter defaults to null, and the constructor happily fills in a sensible value for it.

public Comment(string post_id, string message, string author, string email,
    DateTime? date = null, Uri url = null, int? id = null, string gravatar = null) {
    this.post_id = pathValidChars.Replace(post_id, "-");
    this.message = message;
    this.author = author;
    this.email = email;
    this.date = date ?? DateTime.UtcNow;
    this.url = url;
    this.id = id ?? new { this.post_id, this.author, this.message, this.date }.GetHashCode();
    this.gravatar = gravatar ?? EncodeGravatar(email);
}
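The EncodeGravatar helper is not shown here. A minimal sketch of what it might look like, assuming it follows Gravatar’s documented scheme of MD5-hashing the trimmed, lower-cased email address; the body below is my guess for illustration, and GravatarSketch is a hypothetical class name.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class GravatarSketch {
    // Assumed behavior: lower-case hex MD5 of the trimmed, lower-cased email.
    public static string EncodeGravatar(string email) {
        var normalized = email.Trim().ToLowerInvariant();
        using (var md5 = MD5.Create()) {
            var hash = md5.ComputeHash(Encoding.UTF8.GetBytes(normalized));
            // BitConverter yields "DC-72-..."; strip the dashes and lower-case it.
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

The result is a 32-character hex string like the gravatar value in the comment files, and hashing here means the rendered page never needs the raw email address.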

I’ll post more of the form commenting system source soon once it’s a bit better tested and I look into anti-spam integration. Ideally I’ll also provide an AWS Lambda variant of the code so you can choose (or load balance) comment posting and almost certainly get what you need on the free tier. For now the Jekyll rendering templates and WordPress exporter are available.

[)amien