Hakyll Pt. 3 – Generating RSS and Atom XML Feeds

2019-01-23 (updated: 2019-01-24) [ photo credit robertwpearce ]

This is part 3 of a multipart series in which we look at getting a website / blog set up with hakyll and customizing it a fair bit.

Overview

  1. Hakyll Feed Required Reading
  2. Hakyll’s Prebuilt RSS & Atom Templates
  3. Hakyll Feed Setup
  4. Creating the Atom & RSS XML Files
  5. Using Your Own RSS & Atom Templates
  6. Validating and Using Our Feeds
  7. Unexpected Issue: Setting an updated Field

Hakyll Feed Required Reading

There is already a great starter guide at https://jaspervdj.be/hakyll/tutorials/05-snapshots-feeds.html, so be sure to read this first – it might make it so you don’t have to read this blog post at all.

Hakyll’s Prebuilt RSS & Atom Templates

Thankfully, hakyll already comes with prebuilt RSS and Atom templates! You can find the source here: https://github.com/jaspervdj/hakyll/tree/master/data/templates. While you won’t need to copy, paste, or even directly use these files, you should look them over to see what fields they expect. There are two levels to be aware of: the feed itself and each individual feed item.

Feed-Level

The feed itself looks for the fields below. You’ll provide the first five through a FeedConfiguration that we’ll discuss in a moment, and hakyll supplies the rest. Here are the fields the atom.xml and rss.xml templates expect:

  • title (title of feed)
  • description (description of feed)
  • authorName (feed author name)
  • authorEmail (feed author email)
  • root (your website)
  • updated (feed last updated at; should be done for you)
  • body (feed body; should be done for you)
  • url (path to the XML file; based off of a create ["rss.xml"] function that we’ll discuss)

Feed Item-Level

Each feed item, or entry, expects the following:

  • title (title of the entry)
  • root (your website)
  • url (path to resource)
  • published (published date; "%Y-%m-%dT%H:%M:%SZ" format; should be done for you via hakyll’s dateField context)
  • updated (updated date; "%Y-%m-%dT%H:%M:%SZ" format; should be done for you, unless you provide your own)

Now that you know what sort of data are expected, let’s begin.

Hakyll Feed Setup

As is introduced in the required hakyll feed reading, we need to create a FeedConfiguration. If you’d like to see the FeedConfiguration data constructor, you can view it here: https://github.com/jaspervdj/hakyll/blob/f3a17454fae3b140ada30ebef13f508179f4cd0d/lib/Hakyll/Web/Feed.hs#L63-L75.
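For orientation, here is a minimal sketch of such a configuration. The record fields are hakyll’s; the name myFeedConfiguration follows the guide, and every value is a placeholder to swap for your own details.

```haskell
import Hakyll (FeedConfiguration (..))

-- All values are placeholders (assumptions for illustration only).
myFeedConfiguration :: FeedConfiguration
myFeedConfiguration = FeedConfiguration
    { feedTitle       = "Example Blog"               -- title
    , feedDescription = "Posts from an example blog" -- description
    , feedAuthorName  = "Jane Doe"                   -- authorName
    , feedAuthorEmail = "jane@example.com"           -- authorEmail
    , feedRoot        = "https://example.com"        -- root
    }
```

These five values become the title, description, authorName, authorEmail and root template fields listed earlier.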

We should next figure out what we want our “feed context” to consist of. The official hakyll feed guide (linked above) builds it by combining postCtx with bodyField "description".

This will enable you to include the body of your post as the description, but if you provide your own description field in your posts, then this step isn’t necessary. For the time being, let’s make our own feedCtx function that sticks to the guide’s approach.
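A minimal sketch of that feedCtx might look like the following. The postCtx shown here is only a stand-in modelled on hakyll’s default site template; in a real site it would be the post context from the previous article.

```haskell
import Hakyll

-- Stand-in post context (an assumption): replace with your site's own postCtx.
postCtx :: Context String
postCtx = dateField "date" "%B %e, %Y" <> defaultContext

-- Feed context: everything a post already exposes, plus the post body
-- made available under the "description" key.
feedCtx :: Context String
feedCtx = postCtx <> bodyField "description"
```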

If you’re unsure of what postCtx is, I recommend checking out the previous article or viewing the source of this site: https://github.com/rpearce/robertwpearce.com/blob/858163216f445eb8b6ab3b4304b022b64814b6f8/site.hs#L131-L136.

Creating the Atom & RSS XML Files

Here is what the official hakyll feed guide recommends:
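Reconstructed as a sketch (the guide linked above is the authoritative version and may differ in detail), the recommended rule looks roughly like this. It assumes postCtx and myFeedConfiguration are in scope and that the posts rule saves a "content" snapshot with saveSnapshot, as the guide describes.

```haskell
-- Fragment: lives inside the `hakyll $ do` block of site.hs
-- (with the OverloadedStrings extension enabled, as in the default site).
create ["atom.xml"] $ do
    route idRoute
    compile $ do
        let feedCtx = postCtx `mappend` bodyField "description"
        -- Take the 10 most recent posts from the saved "content" snapshots.
        posts <- fmap (take 10) . recentFirst
            =<< loadAllSnapshots "posts/*" "content"
        renderAtom myFeedConfiguration feedCtx posts
```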

This is great! However, if we want to generate both an atom.xml feed and an rss.xml feed, we’ll end up with almost duplicated code:
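To make the duplication concrete, here is a sketch of the two rules side by side. It assumes feedCtx and myFeedConfiguration are defined at the top level of site.hs.

```haskell
-- Fragment: lives inside the `hakyll $ do` block of site.hs.
create ["atom.xml"] $ do
    route idRoute
    compile $ do
        posts <- fmap (take 10) . recentFirst
            =<< loadAllSnapshots "posts/*" "content"
        renderAtom myFeedConfiguration feedCtx posts

create ["rss.xml"] $ do
    route idRoute
    compile $ do
        posts <- fmap (take 10) . recentFirst
            =<< loadAllSnapshots "posts/*" "content"
        renderRss myFeedConfiguration feedCtx posts -- the only real difference
```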

It looks like all the feed compilation is exactly the same except for the renderAtom and renderRss functions that come bundled with hakyll. With this in mind, let’s write our own feed compiler and reduce as much boilerplate as we reasonably can.

To start out, let’s see what we want our top-level end result to be:
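As a sketch of the goal, each rule shrinks to a route plus a single compile call. Here feedCompiler is the helper this post goes on to write; it is not part of hakyll.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

main :: IO ()
main = hakyll $ do
    -- ... the site's other rules (posts, templates, and so on) go here ...

    create ["atom.xml"] $ do
        route idRoute
        compile (feedCompiler renderAtom)

    create ["rss.xml"] $ do
        route idRoute
        compile (feedCompiler renderRss)
```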

While we could potentially abstract this further, this leaves wiggle room for customizing the route should you ever need to.

This feedCompiler is a function that we need to write that will house the missing logic. Let’s look at its type:
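Written out long-hand, a signature along these lines would fit. The body is a placeholder so that the snippet compiles on its own; only the type matters at this step.

```haskell
import Hakyll

-- The single argument is a render function shaped like renderAtom or
-- renderRss; the result is the compiled feed.
feedCompiler
    :: (FeedConfiguration -> Context String -> [Item String] -> Compiler (Item String))
    -> Compiler (Item String)
feedCompiler = undefined -- placeholder body, to be filled in later
```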

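The first four types in that signature (FeedConfiguration, Context String, [Item String] and Compiler (Item String)) together describe the type shared by renderAtom and renderRss (they’re the same). For reading’s sake, let’s set that to a type alias called FeedRenderer: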
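A sketch of the alias follows. The two bindings at the end exist only so the compiler can check that both built-in renderers fit it; their names are made up for this example.

```haskell
import Hakyll

-- One name for the type that renderAtom and renderRss share.
type FeedRenderer =
    FeedConfiguration
    -> Context String
    -> [Item String]
    -> Compiler (Item String)

-- Hypothetical bindings: they typecheck only if the alias matches hakyll's renderers.
atomRenderer, rssRenderer :: FeedRenderer
atomRenderer = renderAtom
rssRenderer  = renderRss
```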

And now we can define our feedCompiler in a slightly cleaner way:
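One way to write it, as a sketch: load the post snapshots, keep the ten most recent, and hand them to whichever renderer was passed in. It assumes the FeedRenderer alias, myFeedConfiguration and feedCtx described above are in scope, and that posts save a "content" snapshot.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

feedCompiler :: FeedRenderer -> Compiler (Item String)
feedCompiler renderer =
    renderer myFeedConfiguration feedCtx
        =<< fmap (take 10) . recentFirst
        =<< loadAllSnapshots "posts/*" "content"
```

The pipeline reads bottom to top: loadAllSnapshots runs first, then the sort and take, then the renderer.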

Using Your Own RSS & Atom Templates

Thanks to Abhinav Sarkar on lobste.rs, I was pointed to a pull request, https://github.com/jaspervdj/hakyll/pull/652, that allows hakyll users to use their own feed templates. Here is some example usage from the PR:
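The PR’s own example isn’t reproduced here. As a stand-in, this sketch assumes a hakyll release in which renderAtomWithTemplates takes two Template values (feed template first, then item template); older releases may differ, so check the Hakyll.Web.Feed docs for your version. The template paths are hypothetical, and the sketch assumes a rule such as the default site’s `match "templates/*" $ compile templateBodyCompiler` already compiles them.

```haskell
-- Fragment: lives inside the `hakyll $ do` block of site.hs.
-- Assumes feedCtx and myFeedConfiguration are in scope.
create ["atom.xml"] $ do
    route idRoute
    compile $ do
        posts <- fmap (take 10) . recentFirst
            =<< loadAllSnapshots "posts/*" "content"
        -- Hypothetical template locations; use wherever your copies live.
        feedTemplate <- loadBody "templates/atom.xml"
        itemTemplate <- loadBody "templates/atom-item.xml"
        renderAtomWithTemplates
            feedTemplate itemTemplate myFeedConfiguration feedCtx posts
```

The RSS version would swap in renderRssWithTemplates and the matching RSS templates.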

Validating and Using Our Feeds

If you’ve made it this far and have successfully generated and published your atom.xml and/or rss.xml files, see if they’re valid! Head to https://validator.w3.org/feed/ and see if yours validate.

You can check out your new feed in an RSS/Atom feed reader such as the browser plugin FeedBro or any others.

Unexpected Issue: Setting an updated Field

I ran into a feed validation problem where, in a few posts, I manually set the updated field to a date – not datetime – and thus invalidated my feed. The value 2017-06-30 needed to be in the "%Y-%m-%dT%H:%M:%SZ" format, or 2017-06-30T00:00:00Z. This led me down a rabbit hole that ended in me essentially repurposing the dateField code from hakyll (https://github.com/jaspervdj/hakyll/blob/c85198d8cb6ce055c788e287c7f2470eac0aad36/lib/Hakyll/Web/Template/Context.hs#L273-L321). While I tried to use parseTimeM and formatTime from Data.Time.Format in my own way, I couldn’t make it as simple as I wanted, thus leading to me giving up and using what was already there. Here’s what I did:
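What follows is a reconstruction in the spirit of hakyll’s dateField code rather than a verbatim copy of the site’s source. It assumes postCtx is in scope, and the list of formats is an illustrative subset of the ones hakyll tries.

```haskell
import Control.Monad    (msum)
import Data.Time.Clock  (UTCTime)
import Data.Time.Format (TimeLocale, defaultTimeLocale, formatTime, parseTimeM)
import Hakyll

feedCtx :: Context String
feedCtx = mconcat
    [ updatedField -- the addition; listed first so it gets first say on "updated"
    , postCtx
    , bodyField "description"
    ]

-- Re-exposes the "updated" metadata value in the datetime format feeds expect.
updatedField :: Context String
updatedField = field "updated" $ \item -> do
    let locale = defaultTimeLocale
    time <- getUpdatedUTC locale (itemIdentifier item)
    return $ formatTime locale "%Y-%m-%dT%H:%M:%SZ" time

-- Looks up "updated" in the item's metadata and tries each format in turn.
getUpdatedUTC
    :: (MonadMetadata m, MonadFail m) => TimeLocale -> Identifier -> m UTCTime
getUpdatedUTC locale ident = do
    metadata <- getMetadata ident
    let tryFormat fmt =
            lookupString "updated" metadata >>= parseTimeM True locale fmt
    maybe notFound return $ msum (map tryFormat formats)
  where
    notFound = fail $
        "getUpdatedUTC: could not parse an updated time for " ++ show ident
    formats =
        [ "%a, %d %b %Y %H:%M:%S %Z"
        , "%Y-%m-%dT%H:%M:%S%Z"
        , "%Y-%m-%d %H:%M:%S%Z"
        , "%Y-%m-%d"
        , "%B %e, %Y %l:%M %p"
        , "%B %e, %Y"
        , "%b %d, %Y"
        ]
```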

Woah! We need to break down what’s happening here.

feedCtx

The addition to feedCtx comes before postCtx because contexts combined with mappend are tried from left to right, and the first one that yields a value for updated wins. We want first rights to transforming the updated field, so it needs to come first.

updatedField

This function is a Context that leans on hakyll’s field function to say that we want to work with the updated field and then do some Monad stuff with time. The tl;dr is that we send the field’s current value off in order to get a UTCTime value back, and then we format it to be the way we need it.

getUpdatedUTC

It’s really not as bad as it looks! The root of this function does two things:

  1. looks up the updated value in the metadata
  2. tries to parse it using a bunch of different formats

If it can’t do these things, it simply fails.
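To try the parse-then-format idea without a hakyll site, here is a small pure sketch that uses only the time package. normalizeUpdated is a hypothetical helper name, and the two formats are a trimmed-down list for illustration.

```haskell
import Control.Monad    (msum)
import Data.Time.Clock  (UTCTime)
import Data.Time.Format (defaultTimeLocale, formatTime, parseTimeM)

-- Try each known input format in turn; on the first success, re-render the
-- time in the "%Y-%m-%dT%H:%M:%SZ" shape that feeds expect.
normalizeUpdated :: String -> Maybe String
normalizeUpdated raw = render <$> msum (map parse formats)
  where
    parse fmt = parseTimeM True defaultTimeLocale fmt raw :: Maybe UTCTime
    render    = formatTime defaultTimeLocale "%Y-%m-%dT%H:%M:%SZ"
    formats   = ["%Y-%m-%dT%H:%M:%S%Z", "%Y-%m-%d"]

-- ghci> normalizeUpdated "2017-06-30"
-- Just "2017-06-30T00:00:00Z"
-- ghci> normalizeUpdated "not a date"
-- Nothing
```

A date-only value picks up midnight UTC, which is exactly the 2017-06-30T00:00:00Z from the validation problem above.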


Yes, I could have simply written my updated field in the correct format. But where’s the fun in that? I would hate for my feed to silently invalidate itself over something so simple!

Wrapping Up

Whew! We dove into generating Atom & RSS XML feeds with hakyll, uncovered a nice refactor opportunity via feedCompiler, learned how to validate our feeds, and ultimately learned how a seemingly harmless updated date could prevent us from having a totally valid feed!


Thank you for reading!
Robert