Posts in category development - page 3

WordPress to Jekyll part 3 - Site search

Site search is a feature that WordPress got right. Analytics also tells me it is popular. A static site is at a disadvantage, but we have some options to address that.

Considering options

My first consideration was to use Google Site Search but, alas, it was deprecated last year. There are alternative options, but few are free. I agree that people should be paid for their services, something has to keep the lights on, but a small personal blog with no income stream can’t justify this cost.

My next thought was to generate reverse-index JSON files during building. Client-side JavaScript would utilize them as the user types in the search box. It’s an idea I might come back to, but the migration had already taken longer than I anticipated, and I like to ship fast and often :)
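For the curious, here is a minimal sketch of what that reverse-index idea might look like. It is not something I built: the function names, the post shape and the prefix matching on the last term are all assumptions made for illustration. The build step would write the result of buildIndex out as JSON, and the browser would load it and call search on each keystroke.

```javascript
// Build step (hypothetical): map each lower-cased word to the URLs of the posts containing it.
function buildIndex(posts) {
  const index = Object.create(null) // no prototype, so words like "constructor" are safe keys
  for (const post of posts) {
    const text = (post.title + ' ' + post.content).toLowerCase()
    const words = new Set(text.match(/[a-z0-9]+/g) || [])
    for (const word of words) {
      if (!index[word]) index[word] = []
      index[word].push(post.url)
    }
  }
  return index
}

// Client side: return the URLs that contain every term typed so far.
// The last term is treated as a prefix so results appear while the user is still typing.
function search(index, query) {
  const terms = query.toLowerCase().match(/[a-z0-9]+/g) || []
  if (terms.length === 0) return []
  const keys = Object.keys(index)
  let matches = null
  terms.forEach((term, i) => {
    const isLast = i === terms.length - 1
    const urls = new Set()
    for (const key of keys) {
      if (isLast ? key.startsWith(term) : key === term)
        for (const url of index[key]) urls.add(url)
    }
    // Intersect with the URLs that matched the earlier terms
    matches = matches === null ? urls : new Set([...matches].filter(u => urls.has(u)))
  })
  return [...matches]
}

// Made-up sample posts
const index = buildIndex([
  { url: '/jekyll-search', title: 'Site search', content: 'Searching a Jekyll site' },
  { url: '/jekyll-comments', title: 'Comments', content: 'Comments on a Jekyll site' }
])
console.log(search(index, 'jekyll sea')) // [ '/jekyll-search' ]
```

A real version would also want ranking and some handling of very common words, which is part of why a hosted service was the quicker route.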

Algolia

I soon came across Algolia, which provides a simple API and a free tier. Crucially, they also supply a Jekyll plug-in to generate the necessary search indexes! Awesome.

Set-up was a breeze! Algolia have a specific, useful guide to indexing with Jekyll. Once you sign up, you’ll need to configure indexing and integrate it with your site.

Index integration

Firstly, install the jekyll-algolia gem, making sure to specify it in your Gemfile.

Then configure your Jekyll _config.yml so it knows what to index and where as well as what document attributes are significant:

algolia:
  application_id: {your-algolia-app-id}
  index_name: {your-algolia-index-name}
  settings:
    searchableAttributes:
      - title
      - excerpt_text
      - headings
      - content
      - categories
      - tags
    attributesForFaceting:
      - type
      - searchable(categories)
      - searchable(tags)
      - searchable(title)

Finally, you’ll need to run the indexing. You need to ensure the environment variable ALGOLIA_API_KEY is set to your private Admin API Key from your Algolia API Keys page, then run the following command after your static content is generated:

bundle exec jekyll algolia

Site integration

Wiring up the search box can be a little overwhelming as they have so many clients, options and APIs available. I went with a design that presents the results as you type.

This design uses two of their libraries - the search lite and the search helper - plus some code to wire it up to my search box and render the results in a drop-down list. I’ll probably tweak the result format further and maybe consider wiring up to the API directly, as two libraries for such a simple use case seems like overkill.

<script src="https://cdn.jsdelivr.net/npm/algoliasearch@3/dist/algoliasearchLite.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/algoliasearch-helper@2.26.0/dist/algoliasearch.helper.min.js"></script>
<script>
  let searchForm = document.getElementById('search-form')
  let hits = document.getElementById('hits')
  let algolia = algoliasearch('{your-algolia-app-id}', '{your-algolia-search-token}')
  let helper = algoliasearchHelper(algolia, '{your-algolia-index-name}',
    { hitsPerPage: 10, maxValuesPerFacet: 1, getRankingInfo: false })
  helper.on('result', searchCallback)

  function runSearch() {
    let term = document.getElementById('s').value
    if (term.length > 0)
      helper.setQuery(term).search()
    else
      searchForm.classList.remove('open')
  }

  function searchCallback(results) {
    if (results.hits.length === 0) {
      hits.innerHTML = '<li><a>No results!</a></li>'
    } else {
      renderHits(results)
      searchForm.classList.add('open')
    }
    let credits = document.createElement('li');
    credits.innerHTML = "<img src=\"https://www.algolia.com/static_assets/images/press/downloads/search-by-algolia.svg\" onclick=\"window.open('https://www.algolia.com', '_blank')\" />"
    hits.appendChild(credits)
  }

  function renderHits(results) {
    hits.innerHTML = ''
    for (let i = 0; i < results.hits.length; i++) {
      let li = document.createElement('li')
      let title = document.createElement('a')
      title.innerHTML = results.hits[i]._highlightResult.title.value
      title.href = results.hits[i].url
      li.appendChild(title)
      hits.appendChild(li)
    }
  }
</script>
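As for wiring up to the API directly, a rough sketch of how that might start is below. It is untested against my own index and rests on my reading of Algolia's REST search endpoint (a POST to /1/indexes/{index}/query on the {app-id}-dsn.algolia.net host, with the application id and search key sent as headers). The buildSearchRequest name is made up for this example.

```javascript
// Build the arguments for fetch() against Algolia's REST search endpoint.
// Assumption: endpoint shape and header names are as described in the lead-in above.
function buildSearchRequest(appId, searchKey, indexName, term) {
  return {
    url: 'https://' + appId + '-dsn.algolia.net/1/indexes/' +
      encodeURIComponent(indexName) + '/query',
    options: {
      method: 'POST',
      headers: {
        'X-Algolia-Application-Id': appId,
        'X-Algolia-API-Key': searchKey,
        'Content-Type': 'application/json'
      },
      // Search parameters travel as a URL-encoded string inside a JSON body
      body: JSON.stringify({
        params: 'query=' + encodeURIComponent(term) + '&hitsPerPage=10'
      })
    }
  }
}

// Browser usage (not executed here):
//   const req = buildSearchRequest('{your-algolia-app-id}', '{your-algolia-search-token}',
//     '{your-algolia-index-name}', term)
//   fetch(req.url, req.options).then(r => r.json()).then(searchCallback)
```

If the JSON response carries the same hits array the helper passes along, the existing searchCallback and renderHits functions should work unchanged, but I have not verified that.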

Analytics

I’m a big proponent of analytics when used purely for engineering improvement, and Algolia provides a useful dashboard to let you know how performance is doing, what topics are being searched for and what searches might not be returning useful content.

I’ll dig through that when I have a little more time.

[)amien

Note: I do not receive any compensation from Algolia either directly or via any referral program. I’m just a happy user.

WordPress to Jekyll part 2 - Comments & commenting

I do enjoy discussion and debate, whether designing software or writing articles. Many times the comments have explored the subject further or offered corrections or additional insights and tips. They are vital on my blog, and I was disappointed that Jekyll provides nothing out of the box to handle them.

Third-party solutions like Disqus exist, but they require that you either pay a subscription or have ads inlined with the comments. That $9/month adds up. The alternative of injecting ads onto my blog to support comment infrastructure doesn’t sit right with me.

Storing comments

So what does Jekyll have that we could build upon?

One useful feature is the ability to process ‘site data’ held in YML files as a data source for generating content via the Liquid templating language.

So, we can store each comment in a file named _data/{blog_post_slug}/{comment_id}.yml with this format:

id: 12345
name: Damien Guard
email: damieng@gmail.com
gravatar: dc72963e7279d34c85ed4c0b731ce5a9
url: https://damieng.com
date: 2007-12-18 18:51:55
message: "This is a great solution for 'dynamic' comments on a static blog!"

This gives us a model where we can gather all the ones that respond to a post by traversing a single folder and performing some sorting.

Using one file per comment, we also make deleting, approving and managing comments as easy as possible.
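To make the model concrete, here is a small sketch of the naming scheme and the sorting. On the blog itself this happens in Liquid at build time, so the JavaScript below only illustrates the data model. The function names and sample comments are invented for the example.

```javascript
// Path for a comment file, following the naming scheme described above.
function commentPath(postSlug, commentId) {
  return '_data/' + postSlug + '/' + commentId + '.yml'
}

// Order a post's comments oldest-first without modifying the input.
// Dates in 'YYYY-MM-DD HH:MM:SS' form sort correctly as plain strings,
// so no date parsing is needed.
function sortComments(comments) {
  return [...comments].sort((a, b) => (a.date < b.date ? -1 : a.date > b.date ? 1 : 0))
}

// Made-up sample data standing in for the files in one post's folder
const sampleComments = [
  { id: 12346, name: 'Reader', date: '2007-12-19 09:00:00', message: 'Agreed!' },
  { id: 12345, name: 'Damien Guard', date: '2007-12-18 18:51:55', message: 'First!' }
]
console.log(sortComments(sampleComments).map(c => c.id)) // [ 12345, 12346 ]
console.log(commentPath('my-post', 12345)) // _data/my-post/12345.yml
```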

Rendering comments

Now we can create test data and attempt rendering. I created three Jekyll includes that match my WordPress theme. They are:

  • Render an individual comment (comment.html)
  • Show a form to accept a new comment (new-comment.html)
  • Loop over individual comments for a post (comments.html)

I’ve provided all three includes, which you can copy to your Jekyll _includes folder.

The simplest option is to include the comments.html file. For example, my blog post template file looks like this:

---
layout: default
---
<div class="post {{ page.class }}">
  {% include item.html %}
  {{ page.content }}
  {% include comments.html %}
</div>

You’ll also need to add the following line to your Jekyll _config.yml. My sort function needs it because of a couple of restrictions in Jekyll.

emptyArray: []

Exporting comments from WordPress

The next step is getting the comments out of your existing system. I created a PHP script that extracts my WordPress comments into individual files with the right metadata and structure.

  • Upload this file to your site
  • Access export-blog-comments.php via your browser and wait for it to complete
  • Download the /comments/ folder over SSH and then remove it and the export-blog-comments.php from your server
  • Copy the /comments/ folder into your Jekyll _data/ folder

Disqus users should check out Phil Haack’s Disqus exporter!

Accepting new comments with an Azure function

We can now render existing comments, but what about accepting new ones?

At a minimum, we need to accept an HTTP form post and commit a new YML file. That requires validation, a redirect to a thanks page, and somewhere to put the new YML file. I decided on a pull request for ease-of-use and to act as moderation. Merging the PR causes a site rebuild that publishes the new comment. :)

Platform and choices

I chose:

  1. GitHub to host my blog and comments as I use it for my code projects
  2. Azure Function App for the form-post-to-pull-request - details below
  3. C# for the function - a great language I know with useful libraries

I went with Azure Function Apps for a few reasons:

  • They accept HTTP/HTTPS directly without configuring an “API Gateway”
  • Comment posting is a short-lived operation that happens infrequently
  • Free monthly grants of 1 million executions/400,000 GB-s should mean no charge
  • Taking a second or two to spin-up the function should be fine in this context
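As a back-of-the-envelope check on those free grants, here is a small calculation. The memory size (128 MB) and duration (2 seconds) per comment post are guesses on my part rather than measured figures, and freeExecutions is just a name for this example.

```javascript
// How many executions fit in the monthly free grants quoted above?
// The limit is whichever runs out first: the execution count or the GB-seconds.
function freeExecutions(memoryGb, seconds) {
  const byCompute = 400000 / (memoryGb * seconds) // GB-s grant / GB-s per execution
  return Math.min(1000000, Math.floor(byCompute))
}

// Assumed: 128 MB (0.125 GB) for 2 seconds = 0.25 GB-s per comment post
console.log(freeExecutions(0.125, 2)) // 1000000
```

Under those assumptions, the execution count runs out before the compute grant does, and a personal blog is unlikely to get near either.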

(Disclaimer: I have a free MSDN subscription that includes Azure credits as part of my ASP Insider membership, although I do not expect this solution to consume any of it)

Other platforms

You could easily port this to another C#-capable environment - or port the solution entirely to another language.

If you receive many comments, you could run the function on three platforms and round-robin the DNS to take advantage of the free usage tiers on each.

How it works

The form receiver function for comments relies on a couple of libraries to deal with YML and GitHub but is otherwise self-explanatory. What it does is:

  1. Receives the form post over HTTP/HTTPS
  2. Attempts to create an instance of the Comment class by mapping form keys to constructor args
  3. Emits errors if any constructor args are missing (unless they have a default)
  4. Creates a new branch against your default using the GitHub OctoKit.NET library
  5. Creates a commit to the new branch with the Comment serialized to YML using YamlDotNet
  6. Creates a pull request to merge the branch with an informative title and body
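Steps 2 and 3 can be sketched as follows. The real function is C# and maps form keys to constructor arguments. This JavaScript version only illustrates the idea, and the field names, the "null means required" convention and the error text are assumptions for the example rather than the actual implementation.

```javascript
// Hypothetical field list modelled on the comment file format.
// null marks a required field; anything else is the default used when the field is missing.
const commentFields = { postId: null, message: null, name: null, email: '', url: '' }

// Map submitted form values onto a comment, collecting an error for every
// required field that is missing or blank.
function createComment(form) {
  const comment = {}
  const errors = []
  for (const [field, defaultValue] of Object.entries(commentFields)) {
    const value = String(form[field] || '').trim()
    if (value) comment[field] = value
    else if (defaultValue !== null) comment[field] = defaultValue
    else errors.push('Form value missing for ' + field)
  }
  return errors.length > 0 ? { errors } : { comment }
}

console.log(createComment({ name: 'Damien' }).errors)
// [ 'Form value missing for postId', 'Form value missing for message' ]
```

Only when no errors come back would the function go on to create the branch, commit and pull request.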

Installation

Installation requires a few steps, but afterwards the function updates whenever you update your fork.

  1. Fork the jekyll-blog-comments-azure repo
  2. Create a Function App in the Azure portal (I went with consumption plan on Windows)
  3. Go to Deployment Options, tap Setup and choose GitHub
  4. Authorize it to your GitHub account
  5. Configure Project to your fork of jekyll-blog-comments-azure
  6. Configure Branch to master

You also need to set up two Application Settings for your function so it can create the necessary pull requests. They are:

  • GitHubToken should be a personal access token with repo rights
  • PullRequestRepository should contain the org and repo name, e.g. damieng/my-blog

The final step is to modify your Jekyll _config.yml so it knows where to post the form. For example:

comments:
  receiver: https://damiengapp.azurewebsites.net/api/PostComment

You should now be able to post a comment on your blog and see it turn up as a pull request against your repository!

Extra steps

  • You can have the post author’s replies highlighted differently
  • Threaded comments could be supported - feel free to send a pull request
  • Anti-spam measures need to improve at some point - currently, this is just client-side in JS that requires a second ‘Confirm comment’ click

In Part 3 of the series, I’ll go into how I implemented my site search with Algolia!

[)amien

WordPress to Jekyll part 1 - My history and reasoning

It’s hard to believe it was 13 years ago, back in a cold December on the little island of Guernsey, when I decided to start blogging. I’d had a static site with a few odd musings since 2000, but this was to be more regularly updated and with technical content. Blogspot seemed the easiest way to get started.

Briefly hosted at home

Within 18 months of regular blogging, I’d moved over to Subtext, which, being a .NET app, required Windows hosting, so I threw it on a small Shuttle PC on my home DSL. I started using it as an experiment for CSS and web techniques but, within a year, I’d had my 1MB DSL brought to its knees twice through articles featured on BoingBoing.

I did contribute a little to the project and started chatting with the maintainer, Phil Haack. I’d meet him years later when we both joined Microsoft, then again at GitHub (small world).

Landing on WordPress

(Image: DamienG theme in 2008)

In 2007, I migrated to a PHP-based CMS that was making a name for itself called WordPress. My blog would remain on WordPress for 10 years across shared hosting, VMs, and dedicated servers.

One server got caught in an explosion at the ISP. Another time my site got pwned through a WordPress vulnerability. I switched themes several times before creating my own home-grown super-light MootStrap theme based around the BootStrap 2 layout and navbar. I messed with wp-SuperCache, attempting to improve performance and scalability before switching out the PHP engine for HHVM. Then I switched Apache out for NGINX and MySQL for MariaDB in an attempt to eke out a bit of extra performance.

While my theme lives on today - for now at least - MootStrap and PHP are no more as I switched over to the Jekyll static site generator earlier this month after a long meandering journey to get there.

Why Jekyll?

I’ve had success with Jekyll on some other sites I run. Hosting it on GitHub Pages or S3 with CloudFront brings many benefits:

  1. Cost - S3 and CloudFront cost pennies rather than $40+ a month
  2. Security - there’s no code running to exploit, no WordPress plug-in back-doors
  3. Speed - CloudFront is a geo-distributed CDN, and S3 is no slouch either
  4. Editing - text files are simple to process, find and manipulate, and Markdown is much more fluent to type

The price aspect is worth mentioning again. With the occasional bursts in traffic, my site hosting generally worked out around $40 a month for a decent VM. On AWS, I’m expecting it to max out at $3 despite these improvements and benefits.

Of course, another reason is that static-site generators are interesting, and I like to play.

Some challenges

Jekyll is a static site generator. You run the tool somewhere, and it produces plain HTML files with zero server-side code left in them. By its very nature, it is not going to have support for:

  • Comments - No way to accept or render them
  • Search - No site search facility
  • URL control - Difficult to match the paging/tags/categories with default plugins

Despite this, there are some blog-friendly plug-ins, specifically:

[)amien

Comma-separated parameter values in WebAPI

The model binding mechanism in ASP.NET is pretty slick - it’s highly extensible and built on TypeDescriptor for re-use that lets you avoid writing boilerplate code to map between CLR objects and their web representations.

One surprise, however, is that out of the box, neither WebAPI nor MVC supports comma-separated parameter values when bound to an array, e.g.

public class MyController : ApiController {
    public string Page([FromUri]int[] ids) {
        return String.Join(" ; ", ids);
    }
}

This will only return 1 ; 2 ; 3 when supplied with /my/page?ids=1&ids=2&ids=3. If you instead give it /my/page?ids=1,2,3 it will fail.

The reason is likely that there is no standard for this, and that the former - supported - scenario maps to what forms do when they post multiple value selections, such as those in a select list box. The latter is much more readable, is expected by some client frameworks, and is supported by some other web frameworks such as the Java Spring MVC framework.

Of course, that extensible system lets us easily extend this behaviour. We can support both transparently and, interestingly enough, even mix-and-match on the same URL. So, for example:

/my/page?ids=1,2&ids=3 will now return 1 ; 2 ; 3 in our example.

Although this supports both types, if you are currently using commas in your number format this would break your app, e.g. ?ids=1,200&ids=3,500 would have been correctly received as 1200, 3500 but would now be incorrectly received as 1, 200, 3, 500.
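The splitting behaviour, and the pitfall with thousands separators, can be illustrated outside ASP.NET with a few lines of JavaScript. This is only a sketch of the parsing rule, not the binder itself, and parseIds is a name invented for the example.

```javascript
// Collect every "ids" value from a query string, split each on commas,
// drop empty entries and convert the rest to integers.
function parseIds(queryString) {
  return new URLSearchParams(queryString)
    .getAll('ids')
    .flatMap(value => value.split(','))
    .filter(part => part.length > 0)
    .map(part => parseInt(part, 10))
}

console.log(parseIds('ids=1,2&ids=3'))       // [ 1, 2, 3 ]
console.log(parseIds('ids=1,200&ids=3,500')) // [ 1, 200, 3, 500 ]
```

The second call shows the breaking change: a comma that used to be a thousands separator is now a delimiter.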

CommaSeparatedArrayModelBinder class

My DamienGKit project contains the source, but I’ll also present it here.

Out of the box, this supports integer types and GUIDs and could be extended for floats and decimals – just be careful with that formatting!

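using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Web.Http;
using System.Web.Http.Controllers;
using System.Web.Http.ModelBinding;

public class CommaSeparatedArrayModelBinder : IModelBinder {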
    private static readonly Type[] supportedElementTypes = {
        typeof(int), typeof(long), typeof(short), typeof(byte),
        typeof(uint), typeof(ulong), typeof(ushort), typeof(Guid)
    };

    public bool BindModel(HttpActionContext actionContext, ModelBindingContext bindingContext) {
        if (!IsSupportedModelType(bindingContext.ModelType)) return false;
        var valueProviderResult = bindingContext.ValueProvider.GetValue(bindingContext.ModelName);
        var stringArray = valueProviderResult?.AttemptedValue
            ?.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        if (stringArray == null) return false;
        var elementType = bindingContext.ModelType.GetElementType();
        if (elementType == null) return false;

        bindingContext.Model = CopyAndConvertArray(stringArray, elementType);
        return true;
    }

    private static Array CopyAndConvertArray(IReadOnlyList<string> sourceArray, Type elementType) {
        var targetArray = Array.CreateInstance(elementType, sourceArray.Count);
        if (sourceArray.Count > 0) {
            var converter = TypeDescriptor.GetConverter(elementType);
            for (var i = 0; i < sourceArray.Count; i++)
                targetArray.SetValue(converter.ConvertFromString(sourceArray[i]), i);
        }
        return targetArray;
    }

    internal static bool IsSupportedModelType(Type modelType) {
        return modelType.IsArray && modelType.GetArrayRank() == 1
                && modelType.HasElementType
                && supportedElementTypes.Contains(modelType.GetElementType());
    }

}

public class CommaSeparatedArrayModelBinderProvider : ModelBinderProvider {
    public override IModelBinder GetBinder(HttpConfiguration configuration, Type modelType) {
        return CommaSeparatedArrayModelBinder.IsSupportedModelType(modelType)
            ? new CommaSeparatedArrayModelBinder() : null;
    }
}

To register

It’s necessary to register ModelBinderProviders with your ASP.NET application at start-up - usually in the WebApiConfig.cs file.

public static class WebApiConfig {
    public static void Register(HttpConfiguration config) {
        // All your usual configuration up here
        config.Services.Insert(typeof(ModelBinderProvider), 0, new CommaSeparatedArrayModelBinderProvider());
    }
}

[)amien