Posts tagged with wordpress - page 3

Moving home

I have been planning on moving my blog off my little Windows Shuttle PC at home onto a hosted service for some time and the latest flurry of activity followed by DSL line meltdown was enough to give me the nudge I needed to get the job done.

Rob Conery provided a useful .NET/SubSonic app that made the transition from Subtext about as painless as possible. The one obvious pain that remains is going with a PHP-based solution when I know .NET is a better technology.

I simply felt the .NET blogging engines didn’t give me what I want right now and yes, I know I should be contributing to them to get them where I want them but I’m just so busy on various projects that if I was coding a blog in the evenings I wouldn’t be writing on it. Hopefully the great, and no doubt equally busy, guys behind those engines will forgive my little foray into WordPress for a while.

The non-blog parts of the web site (yes, there are some, with downloads, fonts, cursors, little tools and a mini-biography) will be integrated with the site shortly and the theme will probably gradually change to something more me. I also want to add a few extra things, the tag cloud and identicons for a start.

The title of this post also has a second meaning… yes, I’ve put an offer in on a house and will hopefully be taking possession in around 6 weeks providing nothing goes wrong.

Your invite to the house warming party will be in the post…

[)amien

Importing BlogML into WordPress

I’ve been trying to get my content out of Subtext and into WordPress. It shouldn’t be a difficult process, but Subtext only supports the blog-independent BlogML format, and whilst WordPress supports a number of import formats, BlogML isn’t one of them. For export, WordPress only supports its own WXR format, although the BlogML guys have an exporter available.

The first idea was to put together an XSL transform to convert BlogML to WXR.

BlogML format

BlogML posts look like this, although as of 1.9.3 Subtext fails to populate the views attribute and doesn’t output the user-email at all. It also doesn’t include a field for a commenter’s IP address. These two limitations mean no Gravatars or Identicons at the other end right now.

<post id="1" date-created="2006-04-24T04:07:00" date-modified="2006-04-25T11:55:00" approved="true"
    post-url="https://damieng.com/blog/archive/2008/01/30/Test.aspx" type="normal" hasexcerpt="false" views="0">
  <title type="text"><![CDATA[This is a test]]></title>
  <content type="text"><![CDATA[Just testing content]]></content>
  <post-name type="text"><![CDATA[ThisIsATest]]></post-name>
  <categories>
    <category ref="1" />
  </categories>
  <authors>
    <author ref="1" />
  </authors>
</post>

WXR format

WXR posts are extended RSS items and, annoyingly, don’t have a field for view counts at all.

<item>
  <title>This is a test</title>
  <link>https://damieng.com/blog/archive/2008/01/30/Test.aspx</link>
  <pubDate>Mon, 24 Apr 2006 04:07:00 +0000</pubDate>
  <dc:creator>Damien Guard</dc:creator>
  <guid isPermaLink="false">https://damieng.com/blog/archive/2008/01/30/Test.aspx</guid>
  <description></description>
  <content:encoded><![CDATA[Just testing content]]></content:encoded>
  <wp:post_id>1</wp:post_id>
  <wp:post_date>2006-04-24 04:07:00</wp:post_date>
  <wp:post_date_gmt>2006-04-24 04:07:00</wp:post_date_gmt>
  <wp:comment_status>open</wp:comment_status>
  <wp:ping_status>open</wp:ping_status>
  <wp:post_name>about</wp:post_name>
  <wp:status>publish</wp:status>
  <wp:post_parent>0</wp:post_parent>
  <wp:menu_order>0</wp:menu_order>
  <wp:post_type>post</wp:post_type>
</item>

Convert BlogML to WXR using XSLT

There are a few things to bear in mind when using this transform:

  • Link and guid tags are populated but WordPress seems to ignore them. Will investigate soon!
  • Time-zone conversion does not take place – hand-code +offsets in the XSLT to deal with your zone.
  • Track-backs not yet considered.
  • Default namespace in BlogML is not handled – remove the xmlns="…" declaration from your BlogML file before transforming.
  • HTML within comments is not supported – when I enabled this WordPress treated the HTML as text.
  • Embedded attachments are not supported.
  • Edit the primary site link at channel/link in the transformed file to match your site – BlogML doesn’t include it.
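
For the default-namespace point in the list above, a small script can strip the declaration rather than editing the file by hand. This is only a minimal sketch in Python: it assumes the whole export fits in memory and that the default declaration sits on the root blog element, and the function name strip_default_namespace is my own invention.

```python
import re

def strip_default_namespace(blogml: str) -> str:
    """Remove the first default xmlns declaration (single or double quoted).

    Prefixed declarations such as xmlns:xs are left alone because the
    pattern requires '=' immediately after 'xmlns'.
    """
    pattern = r"""\s+xmlns=(?:"[^"]*"|'[^']*')"""
    return re.sub(pattern, "", blogml, count=1)

# A cut-down root element in the style of a BlogML export (illustrative only).
sample = (
    '<blog root-url="www.damieng.com/blog/" '
    'xmlns="http://www.blogml.com/2006/01/BlogML" '
    'xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<title type="text">damieng</title></blog>'
)
print(strip_default_namespace(sample))
```

Read your export into a string, pass it through the function and write the result back out before running the transform.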

Multiple authors and categories should work just fine so throw this file and your BlogML export through an XSLT processor and presto, WXR content ready for import.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                              xmlns:dc="http://purl.org/dc/elements/1.1/"
                              xmlns:wp="http://wordpress.org/export/1.0/"
                              xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <xsl:output method="xml" indent="yes" cdata-section-elements="content:encoded"/>

  <xsl:template match="/">
    <rss version="2.0"
        xmlns:content="http://purl.org/rss/1.0/modules/content/"
        xmlns:wfw="http://wellformedweb.org/CommentAPI/"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:wp="http://wordpress.org/export/1.0/">
      <xsl:apply-templates />
    </rss>
  </xsl:template>

  <xsl:template match="blog">
    <channel>
      <title><xsl:value-of select="title"/></title>
      <link>http://addsitehere</link>
      <description><xsl:value-of select="sub-title"/></description>
      <pubDate>
        <xsl:call-template name="topubdate">
          <xsl:with-param name="date" select="@date-created" />
        </xsl:call-template>
      </pubDate>
      <generator>DamienG's BlogML to WordPress transform</generator>
      <language>en</language>
      <xsl:apply-templates />
    </channel>
  </xsl:template>

  <xsl:template match="blog/categories/category">
    <wp:category>
      <wp:category_nicename>
        <xsl:value-of select="title"/>
      </wp:category_nicename>
      <wp:category_parent></wp:category_parent>
      <wp:posts_private>0</wp:posts_private>
      <wp:links_private>0</wp:links_private>
      <wp:cat_name>
        <xsl:value-of select="@description"/>
      </wp:cat_name>
    </wp:category>
  </xsl:template>

  <xsl:template match="post">
    <item>
      <title>
        <xsl:value-of select="title"/>
      </title>
      <link>
        <xsl:value-of select="@post-url"/>
      </link>
      <pubDate>
        <xsl:call-template name="topubdate">
          <xsl:with-param name="date" select="@date-created" />
        </xsl:call-template>
      </pubDate>
      <dc:creator>
        <xsl:variable name="authorref" select="authors/author/@ref" />
        <xsl:value-of select="//author[@id=$authorref]/title"/>
      </dc:creator>
      <xsl:apply-templates select="categories" />
      <guid isPermaLink="false">
        <xsl:value-of select="@post-url"/>
      </guid>
      <description></description>
      <content:encoded>
        <xsl:value-of select="content" disable-output-escaping="yes"/>
      </content:encoded>
      <wp:post_id>
        <xsl:value-of select="@id"/>
      </wp:post_id>
      <wp:post_date>
        <xsl:value-of select="translate(@date-created,'T',' ')"/>
      </wp:post_date>
      <wp:post_date_gmt>
        <xsl:value-of select="translate(@date-created,'T',' ')"/>
      </wp:post_date_gmt>
      <wp:comment_status>open</wp:comment_status>
      <wp:ping_status>open</wp:ping_status>
      <wp:post_name>
        <xsl:value-of select="post-name"/>
      </wp:post_name>
      <wp:status>
        <xsl:choose>
          <xsl:when test="@approved='true'">publish</xsl:when>
          <xsl:otherwise>draft</xsl:otherwise>
        </xsl:choose>
      </wp:status>
      <wp:post_parent>0</wp:post_parent>
      <wp:menu_order>0</wp:menu_order>
      <wp:post_type>post</wp:post_type>
      <xsl:apply-templates select="comments" />
    </item>
  </xsl:template>

  <xsl:template match="comment">
    <wp:comment>
      <wp:comment_id>
        <xsl:value-of select="@id"/>
      </wp:comment_id>
      <wp:comment_author>
        <xsl:value-of select="@user-name"/>
      </wp:comment_author>
      <wp:comment_author_email></wp:comment_author_email>
      <wp:comment_author_url>
        <xsl:value-of select="@user-url"/>
      </wp:comment_author_url>
      <wp:comment_author_IP></wp:comment_author_IP>
      <wp:comment_date>
        <xsl:value-of select="translate(@date-created,'T',' ')"/>
      </wp:comment_date>
      <wp:comment_date_gmt>
        <xsl:value-of select="translate(@date-created,'T',' ')"/>
      </wp:comment_date_gmt>
      <wp:comment_content>
        <xsl:value-of select="content"/>
      </wp:comment_content>
      <wp:comment_approved>
        <xsl:choose>
          <xsl:when test="@approved='true'">1</xsl:when>
          <xsl:otherwise>0</xsl:otherwise>
        </xsl:choose>
      </wp:comment_approved>
      <wp:comment_type></wp:comment_type>
      <wp:comment_parent>0</wp:comment_parent>
    </wp:comment>
  </xsl:template>

  <xsl:template match="post/categories/category">
    <category>
      <xsl:variable name="catref" select="@ref" />
      <xsl:value-of select="/blog/categories/category[@id=$catref]/title"/>
    </category>
  </xsl:template>

  <xsl:template name="topubdate">
    <xsl:param name="date" />
    <xsl:value-of select="substring($date,9,2)" />
    <xsl:value-of select="' '" />
    <xsl:call-template name="monthname">
      <xsl:with-param name="month" select="substring($date,6,2)" />
    </xsl:call-template>
    <xsl:value-of select="' '" />
    <xsl:value-of select="substring($date,1,4)" />
    <xsl:value-of select="' '" />
    <xsl:value-of select="substring($date,12,8)" /><xsl:text> +0000</xsl:text>
  </xsl:template>

  <xsl:template name="monthname">
    <xsl:param name="month" />
    <xsl:choose>
      <xsl:when test="$month='01'">Jan</xsl:when>
      <xsl:when test="$month='02'">Feb</xsl:when>
      <xsl:when test="$month='03'">Mar</xsl:when>
      <xsl:when test="$month='04'">Apr</xsl:when>
      <xsl:when test="$month='05'">May</xsl:when>
      <xsl:when test="$month='06'">Jun</xsl:when>
      <xsl:when test="$month='07'">Jul</xsl:when>
      <xsl:when test="$month='08'">Aug</xsl:when>
      <xsl:when test="$month='09'">Sep</xsl:when>
      <xsl:when test="$month='10'">Oct</xsl:when>
      <xsl:when test="$month='11'">Nov</xsl:when>
      <xsl:when test="$month='12'">Dec</xsl:when>
    </xsl:choose>
  </xsl:template>
  <xsl:template match="text()" />
</xsl:stylesheet>
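
If you want to sanity-check the dates the topubdate template produces, the same conversion can be done with Python's standard library. This is a sketch rather than part of the transform: it assumes, as the stylesheet does, that the BlogML timestamps are already GMT, and to_pub_date is a made-up name.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def to_pub_date(blogml_date: str) -> str:
    """Convert a BlogML timestamp such as 2006-04-24T04:07:00 to an RFC 822 date.

    The timestamp is treated as GMT, mirroring the fixed +0000 in the XSLT.
    """
    parsed = datetime.strptime(blogml_date, "%Y-%m-%dT%H:%M:%S")
    return format_datetime(parsed.replace(tzinfo=timezone.utc))

print(to_pub_date("2006-04-24T04:07:00"))  # Mon, 24 Apr 2006 04:07:00 +0000
```

The Python version includes the day name, which the XSLT template leaves out. RFC 822 treats the day name as optional, so both forms are valid.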

In conclusion

I really don’t want to give up email addresses and IP addresses, which gives me two options:

  1. Write an ASPX page that rips the content directly out of the Subtext tables and formats it as WXR bypassing BlogML
  2. Extend the Subtext export facility to add the missing fields and transform them from there

I’ll let you know where I go from here…

[)amien

From Blogger to Subtext – Export pseudo BlogML from Blogger

Getting my blog out of Blogger.com and into Subtext was not as easy as I’d hoped…

What is BlogML?

BlogML is an XML format designed to encapsulate a blog: its posts, comments and categories. Sounds great for transferring between blogs… Alas, while Subtext and many other engines support it, Blogger.com does not.

A simple category-less BlogML file without comments looks something like this:

<blog root-url="www.damieng.com/blog/" date-created="2006-04-25T01:02:25" xmlns="http://www.blogml.com/2006/01/BlogML" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <title type="text">damieng</title>
  <sub-title type="text">Random musings from Guernsey</sub-title>
  <author name="Damien Guard" email="damieng@gmail.com" />
  <posts>
    <post id="113889650370084235" date-created="2006-04-20T01:53:00" date-modified="2006-04-20T01:53:00" approved="true" post-url="https://damieng.com/blog/2006/04/hello.html">
      <title type="text">Hello</title>
      <content type="text"><![CDATA[This is a blog post<br />With HTML!]]></content>
    </post>
  </posts>
</blog>

Check out the BlogML standard itself for full details, although doing so requires registration.

Setting options on Blogger

The first thing to do is to enter Blogger.com and change the settings for your blog. Specifically, you want to go to Formatting Settings and enter 999 and Posts next to Show. If you have more than 999 posts you might have problems.

Set the Timestamp Format to 4/25/2006 10:38:00AM (obviously the date will be different, it’s the format we’re after) and set Enable Float Alignment to No.

Go to the Comments section and set the Comments Timestamp Format to the same.

Changing the Blogger template

By changing the template we can get Blogger.com to output something close to BlogML but not quite there.

Paste the following block into the template area but DO NOT hit save.

"?> <$BlogTitle$> <![CDATA[<$BlogDescription$>]]> <![CDATA[<$BlogItemTitle$>]]> <![CDATA[<$BlogItemBody$>]]> <![CDATA[<$BlogCommentBody$>]]> <![CDATA[<$BlogCommentAuthor$>]]>

Now hit the Preview button and wait. Once complete, view source and save that somewhere. Feel free now to cancel the template change.

Patching up the bad output

The output from this template isn’t BlogML yet but it’s not too far off. Cut out the junk before <?xml and after </blog> to get one step closer.
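
If the saved source is large, the trimming can be scripted. Here is a rough sketch in Python; it assumes the saved preview contains exactly one <?xml declaration and one closing </blog> tag, and trim_to_blogml is a name I picked for illustration.

```python
def trim_to_blogml(page_source: str) -> str:
    """Keep only the text from '<?xml' through the closing '</blog>' tag."""
    start = page_source.find("<?xml")
    end = page_source.rfind("</blog>")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no <?xml ... </blog> span found in the saved source")
    return page_source[start:end + len("</blog>")]

# Stand-in for the saved preview source (illustrative only).
saved = 'junk<html><?xml version="1.0"?><blog><posts/></blog></html>junk'
print(trim_to_blogml(saved))
```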

Now that just leaves us with three problems.

  1. Date/time formats are incorrect for both posts and comments
  2. Comments have no titles
  3. Comment authors are in <author> tags encoded as a CDATA hyperlink instead of user-name and user-url attributes of the <comment> tag

These are all limitations of the Blogger template system but with a short XML parser and writer you should be able to fix them up.
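
As a starting point for that fix-up, the two string conversions could look something like the sketch below. It assumes the 4/25/2006 10:38:00AM timestamp format chosen earlier and that a linked comment author arrives as a plain <a href="…">name</a> fragment. Both function names and the example.com address are hypothetical. Comment titles (problem 2) are left to whatever walks the XML, since you have to decide what the titles should say.

```python
import re
from datetime import datetime
from typing import Tuple

def blogger_to_blogml_date(stamp: str) -> str:
    """Turn a Blogger stamp like '4/25/2006 10:38:00AM' into '2006-04-25T10:38:00'."""
    parsed = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S%p")
    return parsed.strftime("%Y-%m-%dT%H:%M:%S")

# Matches an anchor tag and captures its href and its link text.
LINK = re.compile(r'<a\s[^>]*href="([^"]*)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL)

def split_comment_author(author_html: str) -> Tuple[str, str]:
    """Return (user_name, user_url) from a linked or plain author string."""
    match = LINK.search(author_html)
    if match:
        return match.group(2).strip(), match.group(1)
    # No link present: treat the whole string as the name, with no URL.
    return author_html.strip(), ""

print(blogger_to_blogml_date("4/25/2006 10:38:00AM"))
print(split_comment_author('<a href="http://example.com/">Jane</a>'))
```

The results would then be written into the date-created attributes and into the user-name and user-url attributes of each <comment>.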

[)amien