Moving from WordPress to Middleman, Part II: Multiple categories per post
by jgn on Thursday, February 5, 2015 in Technology, Ruby, and Middleman


NOTE
This technique is obsolete; see A Middleman Extension for Categories


One oddity of Middleman and its blog template is that its model for additional metadata is a bit weak. The documentation tells us that you can have "Custom Article Collections." The example they give is for a "category" but you can really only have one value per category attribute. In other words, you can't have multiple categories separated with a comma, which is the pattern for tags. My old WordPress blog had multiple categories per post, with nice pages listing all posts by a given category. I'd like to keep those pages working. But without a strategy for multiple categories per post, I can't refresh my old blog as I'd like.

So how to fix this? It would be possible to create an extension that imitates the "tag" support, but the class organization looks fairly complicated, and I'm not sanguine about writing an extension only to see that the class structure changes and my code starts to fail.

Another way to do this is with global helper methods. This has the downside of polluting the global space. But it has the benefit of being obvious. So here's what I did.

First I created a sample blog with a couple of articles.

middleman init middleman-with-multiple-categories-per-post --template=blog
cd middleman-with-multiple-categories-per-post
middleman article '"Going to Boston by plane"'
middleman article '"Going to Chicago by train"'

In this case, I want both articles to be in the category "Travel" while the Boston trip is also in the category "Boston" and the Chicago trip is also in the category "Chicago." So for the first article, I hand edit the front matter to look like this:

---
title: Going to Boston by plane
date: 2015-02-05 01:35 UTC
tags:
categories: Travel, Boston
---
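As a quick check on why this front matter needs splitting at all, here is a minimal sketch that parses the same keys with Ruby's standard YAML library. It assumes Middleman reads front matter as ordinary YAML; the date line is left out so the sketch does not depend on how timestamps are parsed.

```ruby
require "yaml"

# The body of the front matter, as it appears between the --- markers.
front_matter = <<~YAML
  title: Going to Boston by plane
  tags:
  categories: Travel, Boston
YAML

data = YAML.safe_load(front_matter)

# An unquoted, comma-separated value is one String, not a list.
puts data["categories"].inspect
# A key with no value comes back as nil.
puts data["tags"].inspect
```

So the comma-separated value arrives as a single String, and it is up to our own code to take it apart.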

Now, let's work outside-in. In my layout.erb I am going to want to be able to list out all of my categories. The code is going to look something like this:

<h2>Categories</h2>
<ul>
  <% all_categories.each do |category, articles| %>
    <li><%= link_to "#{category} (#{articles.size})", category_path(category) %></li>
  <% end %>
</ul>

all_categories and category_path are the methods I need. all_categories is going to return a Hash where each key is a category name, and each value is the collection of articles filed under that category. category_path is going to return a dasherized, all-lower-case URL for that category. If we add this chunk into our layout.erb right before our display of tags, it will fail because neither method exists yet, so let's implement the methods as helpers.

Here's one way to do it. Put this in config.rb:

helpers do
  def categories(page)
    category_array(page.data[:categories])
  end

  def category_path(category)
    "/category/#{category.parameterize}.html"
  end

  def all_categories
    @all_categories ||= Hash[all_categories_unsorted.sort]
  end

  def category_array(categories)
    (categories || "Uncategorized").split(/,\s*/)
  end

  private

  def all_categories_unsorted
    Hash.new { [] }.tap do |all_categories|
      blog.articles.each do |article|
        categories(article).each do |ac|
          all_categories[ac] <<= article
        end
      end
    end
  end
end

The magic here is in the private method all_categories_unsorted and the fragment blog.articles. When the static page is rendered, Middleman exposes the blog method, which has an attribute articles that you can iterate through. What we are doing here is building up the Hash of key/value pairs, each representing a category and the collection of all articles referencing it (by the way, another way to build up this Hash is to use inject, but I like this way). Hash.new { [] } uses a block to say that the default value of a Hash entry should be a new Array. The block only returns that Array; it does not store it. That is why we use the <<= operator: all_categories[ac] <<= article is shorthand for all_categories[ac] = all_categories[ac] << article, so if there is no value yet we append to a fresh empty Array and store the result, and if there is a value we append (<<) our new article to it. That's just Ruby 101, right? Meanwhile, it would be nice to sort the Hash by key. To do that, we wrap the all_categories_unsorted method with a public one called all_categories that does the sort (you should be able to figure out the Hash[all_categories_unsorted.sort] idiom).
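To see the Hash-building idiom outside of Middleman, here is a small sketch in plain Ruby. The Article struct is a hypothetical stand-in for Middleman's article objects, and its categories field is already an Array; neither is part of the blog code above.

```ruby
# Hypothetical stand-in for a Middleman article; only the fields used here.
Article = Struct.new(:title, :categories)

articles = [
  Article.new("Going to Chicago by train", ["Travel", "Chicago"]),
  Article.new("Going to Boston by plane",  ["Travel", "Boston"]),
]

# The default block returns a fresh Array but does not store it,
# so a bare << on a missing key is silently lost.
unsorted = Hash.new { [] }
unsorted["Lost"] << "never stored"
puts unsorted.key?("Lost")

articles.each do |article|
  article.categories.each do |name|
    # Expands to: unsorted[name] = unsorted[name] << article
    unsorted[name] <<= article
  end
end

# Hash#sort gives [key, value] pairs ordered by key; Hash[] rebuilds a Hash.
sorted = Hash[unsorted.sort]
puts sorted.keys.inspect

# The inject variant mentioned in the text, for comparison.
via_inject = articles.inject({}) do |memo, article|
  article.categories.each { |name| (memo[name] ||= []) << article }
  memo
end
puts via_inject == unsorted
```

The inject version has to return memo from the block on every pass, which is one reason the tap version can read a little more easily.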

The value for the categories metadata tag is taken apart by the method categories, which takes the value from the page (e.g., "Travel, Chicago") and hands it to category_array to split it up into an Array. As a nicety, when no category is specified, we use a default of "Uncategorized" -- this made it a little easier to migrate my old WordPress blog, which had a few posts without categories. To get category_path to come out right, we use the parameterize method, which does the dasherizing / lowercasing for us -- it's a standard part of Middleman, because it is a standard part of Padrino, which supplies the view helpers.
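Here is a sketch of how the splitting and the path helper behave on a few inputs, runnable without Middleman. category_array is copied from config.rb above. rough_slug is a hypothetical plain-Ruby approximation written only for this sketch; it is not the real parameterize and may differ from it on accented or unusual characters.

```ruby
# Copied from the helpers in config.rb.
def category_array(categories)
  (categories || "Uncategorized").split(/,\s*/)
end

# Hypothetical approximation of parameterize for this sketch only:
# lowercase, collapse runs of non-alphanumerics to "-", trim stray dashes.
def rough_slug(name)
  name.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
end

# Same shape as category_path above, using the stand-in slug.
def category_path(category)
  "/category/#{rough_slug(category)}.html"
end

puts category_array("Travel, Chicago").inspect
puts category_array("Travel,Boston").inspect
puts category_array(nil).inspect
puts category_path("New York")

# Whitespace before a comma is not stripped by the split.
puts category_array("Travel , Boston").inspect
```

The last call shows one limitation of the simple split: a space before the comma stays attached to the category name, so "Travel " and "Travel" would count as different categories.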

So that actually looks pretty good. You'll see a nice list of categories.

Unfortunately the links go nowhere. If you click on a link you'll get a "file not found." Oops. Let's fix that.

Middleman provides a hook to do extra stuff after "all loading and parsing of extensions is complete" (I know this because I read the source). It's referenced, though not really explained, in the discussion of the Middleman Sitemap.

We define our ready method in config.rb:

ready do
  sitemap.resources
    .map { |r| category_array(r.data["categories"]) }
    .flatten
    .uniq
    .each do |category|
    proxy category_path(category), "category.html", locals: { category: category }
  end
end

Here we go through all of the resources and grab the value of "categories" for each one, turning it into a little array with our helper method category_array. Then we flatten the result to make one big Array, uniq it, and use Middleman's proxy method to tie the incoming request for a path to a template file (which we will create in a moment). Notice that to get the path for the incoming category we leverage the same category_path helper we used from layout.erb and defined above.
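The map / flatten / uniq pipeline can be tried on its own with plain data. In this sketch the resources are hypothetical Hashes standing in for each sitemap resource's front matter; real Middleman resources are richer objects.

```ruby
# Copied from the helpers in config.rb.
def category_array(categories)
  (categories || "Uncategorized").split(/,\s*/)
end

# Hypothetical stand-ins for the front matter data of three resources.
resources = [
  { "categories" => "Travel, Boston" },
  { "categories" => "Travel, Chicago" },
  {}, # a resource with no categories key, e.g. a page without front matter
]

names = resources
  .map { |data| category_array(data["categories"]) } # one small Array per resource
  .flatten                                            # one big Array
  .uniq                                               # each category once

puts names.inspect
```

One thing this makes visible: any resource without a categories value contributes "Uncategorized", so if the real sitemap behaves like these stand-ins, an Uncategorized page is proxied even when every article has a category.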

Now all we need is category.html.erb. Here's what it looks like (put this in source/):

<h1>Articles in category '<%= category %>'</h1>

<ul>
  <% all_categories[category].each do |article| %>
    <li><%= link_to article.title, article %> <span>(<%= article.date.strftime('%B %e, %Y') %>)</span></li>
  <% end %>
</ul>

We're almost done. There is one last thing. If you try to build your static blog with middleman build, it will fail, because Middleman will try to render the category.html.erb template directly, even though we only use it for our proxies (it fails because it can't find the category local variable, which the proxy normally injects into the page). To skip rendering the raw page, add an ignore call in your configure block for :build, like so (I omitted a bunch of comments):

configure :build do
  ignore '/category.html'
end

A repo with commits for each code block above is at https://github.com/jgn/middleman-with-multiple-categories-per-post.