Migrating BlogML to WordPress 3.0

Update (1/18/2013): An amazing developer by the name of Patrick Daly has updated this plugin through GitHub to import tags too. Go grab it here!

Update: My plugin has been approved by WordPress! I’ve updated the documentation and links accordingly.

As part of my migration to WordPress 3.0, I needed to get all of my posts migrated over. My comments were stored via Disqus, and I’m working with them on migrating that data to the new URL slugs.

BlogEngine.Net uses BlogML as its sole export format, which made things a little tricky, since WordPress still has no built-in importer for it.

As with most things in life, we are all standing on the shoulders of giants. In this case, Aaron Lerch was my giant: he had created a great BlogML importer, which included all of the XPath information needed to import blog entries.

The only problem is that the plugin was created to work with WordPress 2.6, and the mechanism for importing content has changed since then. The ‘wp-admin/imports’ folder no longer exists in 3.0, and it may have been removed even earlier.

An initial peek at the importer code didn’t tell me much, but installing the simple RSS importer and examining its code showed me what was going on. With the newer format, you download an importer as a plugin, which registers itself with the WordPress import module upon activation. All I needed to do was put this registration wrapper around the existing code, update some documentation, and voila, a working BlogML importer!

Just as my giant opened up his code, I’m doing the same for any BlogEngine.Net, Subtext, or other BlogML users out there who are thinking of migrating to WordPress.

Before importing, you’ll want to go through your BlogML file, look for any file or image references, and update them accordingly. For instance, BlogEngine.Net uses a ‘file.axd’ HTTP handler to deliver a lot of its files. You can also update these references after the import; it is up to you.
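To get a sense of how much cleanup is involved, it can help to list every ‘file.axd’ reference before touching anything. The following is a minimal Python sketch, not part of the plugin. The function name `find_axd_references` and the sample URLs are hypothetical, and it assumes the references look like `/file.axd?file=...` inside the post content, either as raw HTML or HTML-escaped.

```python
import html
import re

# Assumption: references look like "<path>file.axd?<query>" and end at
# whitespace, a quote, an angle bracket, or a parenthesis.
AXD_REF = re.compile(r"""[^\s"'<>()]*file\.axd\?[^\s"'<>()]*""", re.IGNORECASE)


def find_axd_references(blogml_text):
    """Return the sorted, de-duplicated file.axd references in the text."""
    # Post content may be stored HTML-escaped (&lt;a href=&quot;...), so
    # unescape it first to make the quotes and brackets visible to the regex.
    unescaped = html.unescape(blogml_text)
    return sorted(set(AXD_REF.findall(unescaped)))


if __name__ == "__main__":
    # Hypothetical sample content; in practice, read your exported file.
    sample = (
        '<content type="text"><![CDATA[<a href="/file.axd?file=notes.pdf">'
        "notes</a>]]></content>"
    )
    for ref in find_axd_references(sample):
        print(ref)
```

Each reference printed is one you will need to point at wherever the file ends up in WordPress, either before or after the import.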

Using the importer is rather simple:

  1. Log into your WordPress site and go to the ‘Plugins’ section.
  2. Click the “Add New” button and search for the term “blogml”.
  3. Find the ‘blogml-importer’ plugin and install it.
  4. Activate the plugin.
  5. Go to the Tools->Import screen, and select the ‘BlogML’ option.
  6. Follow the directions to complete the process.

As an added bonus (thanks to Aaron), the import process will create a CSV file that maps the old permalinks found in the BlogML file to the new permalinks that WordPress generated. It will also give you the option of mapping posts to a current user, or to create a new user account for them.
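One use for that CSV file is generating redirects from the old URLs to the new ones. Here is a rough Python sketch. It assumes, and this is only a guess at the layout, that each row holds the old permalink in the first column and the new permalink in the second, with no header row. It also assumes the site runs on Apache, where `Redirect 301` is a standard directive. The function name `redirect_rules` and the example URLs are made up for illustration.

```python
import csv
import io
from urllib.parse import urlsplit


def redirect_rules(csv_text):
    """Turn 'old permalink, new permalink' rows into Apache redirect lines."""
    rules = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        old_url, new_url = row[0].strip(), row[1].strip()
        # Redirect matches on the URL path, so drop the old scheme and host.
        old_path = urlsplit(old_url).path
        if not old_path or not new_url:
            continue
        rules.append(f"Redirect 301 {old_path} {new_url}")
    return rules


if __name__ == "__main__":
    # Hypothetical row; replace with the contents of the generated CSV file.
    sample = (
        "http://old.example.com/post/2010/06/01/Hello-World.aspx,"
        "http://new.example.com/2010/06/hello-world/\n"
    )
    print("\n".join(redirect_rules(sample)))
```

Check the actual column order in your CSV before using the output, since the sketch simply trusts the first two columns.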

I ran this process and had all of my posts migrated over in a few seconds. Granted, I only had 65 posts, but it still saved me a lot of time. I’m still doing a bit of image/link cleanup, but that is simple to do.

One other important item to note is that categories and tags may or may not be properly imported. In my case, none of my tags came over, and my categories came over named as GUIDs. Renaming the categories didn’t take long, and I’ll simply need to retag my posts. Again, considering the time it would have taken me to re-enter all of my posts (and I know others have lots more), I’m not too worried about this.

I hope this helps other folks out there with their WordPress migrations. I’m hoping to look through the import code again and work on getting some of the category/tag issues resolved. I may even try to do some kind of auto-detection on the image/file links to provide a proper “remap” option like the one for the user import. Once the plugin is approved in the codex, I’ll update my link to point there.

Enjoy!

If you want to download the plugin, go to the direct link here.

UPDATE: If you want a standalone tool that will migrate BlogML to WordPress, check out the Blog Migration Tool I wrote over on CodePlex.

19 thoughts on “Migrating BlogML to WordPress 3.0”

  1. Hey there, thanks for this great plugin! Everything worked great for me except that my post content is all base64 encoded :-(

    Can this plugin decode the post content?

    Wondering if anyone else encountered this and what the solution could be … aside from manually copying each post into a base64 decoder (there are 70 in my blog)!

    It seems like it would be simple to write some kind of program to decode the posts in the xml file, but alas, I’m not a computer programmer ;-)
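For readers in the same situation, a program along those lines could look like the Python sketch below. It is not part of the plugin. It assumes the encoded elements are marked with a `type="base64"` attribute and that the decoded content is UTF-8. The function name `decode_base64_nodes` and the sample document are hypothetical.

```python
import base64
import xml.etree.ElementTree as ET


def decode_base64_nodes(blogml_xml):
    """Decode every element marked type="base64" and return the new XML."""
    root = ET.fromstring(blogml_xml)
    for node in root.iter():
        # Assumption: encoded elements carry type="base64".
        if node.get("type") == "base64" and node.text:
            decoded = base64.b64decode(node.text.strip())
            node.text = decoded.decode("utf-8")  # assumes UTF-8 content
            node.set("type", "text")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical one-post document; a real export has many more elements.
    encoded = base64.b64encode(b"<p>Hello</p>").decode("ascii")
    sample = (
        '<blog><posts><post id="1">'
        f'<content type="base64">{encoded}</content>'
        "</post></posts></blog>"
    )
    print(decode_base64_nodes(sample))
```

A real BlogML file declares an XML namespace, so ElementTree will write the output with a generated prefix such as `ns0:` unless the namespace is registered first. The result is still valid XML, but keep a copy of the original file and compare the two before importing.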

  2. Hi Sean, I’m Simone Chiaretta, from the core team of Subtext.
    I’ve got a BlogML file created by Subtext and it just contains posts with the content encoded in base64.
    The BlogML specs allow it, but seems like your importer cannot handle it.
    Unfortunately I’m not a PHP dev and have totally no experience with it, otherwise I would have updated the code and sent the patch.
    Do you think you could make a quick update to the plugin and allow contents from Subtext to be migrated to WordPress?
    Thx
    Simone

    1. Greetings Simone! I’ll see what I can do to get that plugin updated. I need to dust off a few things with it anyway.

      A couple of other things you might try: the link in my previous comment points to garfoot’s blog. He wrote a simple console app to do the base64 conversion in .Net, and you should be able to simply compile the code (or pass it along to a buddy to compile) and run it.

      Another option might be to try out the standalone Blog Migrator tool I recently created. I have a link to it at the bottom of the post above. I know there isn’t direct base64 code built into the app yet, but that might also help your immediate needs while I work on some updates.

      If you happen to have a sample base64 encoded Subtext dump I can test with, that would be greatly appreciated. Drop me a line with the contact form and we can move forward from there.

      1. Thank you.
        I already used both your options and I’m about to publish a blog post that explains how to do the migration.
        But if you want to test a real BlogML archive, just send me an email and I’ll send it to you
        thx

  3. I’ve got a BlogML file created by Subtext and it just contains posts with the content encoded in base64.
    The BlogML specs allow it, but seems like your importer cannot handle it.
    Unfortunately I’m not a PHP dev and have totally no experience with it, otherwise I would have updated the code and sent the patch.
    Do you think you could make a quick update to the plugin and allow contents from Subtext to be migrated to WordPress?
    Thx

    1. Excellent!!!! Thank you for taking the time to do this! I haven’t had a chance to update the code in a long long time! I’ll make sure to update my links to point to your update.

  4. Hi Sean, I’m using your BlogML importer to import a client’s 2,000+ posts from BlogEngine.NET and while the plugin works great, it’s not assigning the correct author to each post. I’d say it’s about 30% correct when I do an import. There are 5 authors, and with that many posts, it would be crazy for me to change the author manually!

    I wonder if there’s something I can do to this part of the process_post function:
    $primary_post_author_nodes = $this->xPath->match('authors/author[1]', $post);
    $post_author = $wpdb->escape($this->xPath->getAttributes($primary_post_author_nodes[0], 'ref'));

    Or if you could explain to me what ‘authors/author[1]’ does, I may be able to figure out the changes I need to make. I am assuming $primary_post_author_nodes[0] will take the first value (of the array), but I don’t know how that array is being populated.

    Thanks so much!
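As a general note on the expression itself: in XPath, `authors/author[1]` selects the first `author` element inside the `authors` child of the current node, because XPath positions start at 1. The snippet above then reads that node’s `ref` attribute. The Python sketch below shows the same selection with the standard library instead of the plugin’s PHP XPath class. The sample post and the author names are made up, and the real BlogML namespace is left out to keep it short.

```python
import xml.etree.ElementTree as ET

# Hypothetical post with two authors.
POST_XML = """
<post id="42">
  <authors>
    <author ref="alice"/>
    <author ref="bob"/>
  </authors>
</post>
"""


def primary_author_ref(post):
    """Return the ref attribute of the first listed author, or None."""
    # [1] is a 1-based position: the first <author> under <authors>.
    node = post.find("authors/author[1]")
    return node.get("ref") if node is not None else None


if __name__ == "__main__":
    post = ET.fromstring(POST_XML)
    print(primary_author_ref(post))  # prints "alice"
```

So only the first author listed on each post is ever considered, and the `ref` value still has to be matched to a WordPress user afterwards.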

    1. Greetings Mercy! Unfortunately it has been quite a while since I’ve actually looked at the code, but taking a quick glance I believe that the primary_post_author_node is the author details. It could be that the structure has changed a little bit if there is more than one author. I’m not quite sure.

      A couple of options to consider. Have you checked out the updated version that Patrick Daly created to handle tags and base64 encoding? There might be something in it that inadvertently solves the issue.

      Secondly, I wrote a standalone tool for migrating content that handles BlogML as well (located at the end of the article). It might be able to help parse the information a little better for you and give you what you need.

      Hope this helps!

  5. Thanks for taking the time to share this article; it was excellent and very informative. As a first-time visitor to your blog, I am very impressed. I found a lot of informative stuff in your article. Keep it up. Thank you.
