Hakyll Pt. 3 – Generating RSS and Atom XML Feeds

Info

Summary: Generate rss.xml and atom.xml feeds for your hakyll site.
Shared: 2019-01-23
Revised: 2023-02-11 @ 16:00 UTC

This is part 3 of a multipart series on getting a website or blog set up with hakyll and customizing it a fair bit.

Overview

  1. Hakyll Feed Required Reading
  2. Hakyll’s Prebuilt RSS & Atom Templates
  3. Hakyll Feed Setup
  4. Creating the Atom & RSS XML Files
  5. Using Your Own RSS & Atom Templates
  6. Validating and Using Our Feeds
  7. Unexpected Issue: Setting an updated Field

Hakyll Feed Required Reading

There is already a great starter guide at https://jaspervdj.be/hakyll/tutorials/05-snapshots-feeds.html, so be sure to read this first – it might make it so you don’t have to read this blog post at all.

Hakyll’s Prebuilt RSS & Atom Templates

Thankfully, hakyll already comes with prebuilt RSS and Atom templates! You can find the source here: https://github.com/jaspervdj/hakyll/tree/master/data/templates. While you won’t need to copy, paste, or even directly use these files, you should look them over to see what fields they are expecting. There are two levels to be aware of: the feed itself and each individual feed item.

Feed-Level

The feed itself is looking for the following, and you’ll provide these through a FeedConfiguration that we’ll discuss in a moment. Here are the fields the atom.xml and rss.xml templates are expecting:

Feed Item-Level

Each feed item, or entry, expects the following:


Now that you know what sort of data are expected, let’s begin.

Hakyll Feed Setup

As the required hakyll feed reading explains, we need to create a FeedConfiguration. If you’d like to see the FeedConfiguration data constructor, you can view it here: https://github.com/jaspervdj/hakyll/blob/f3a17454fae3b140ada30ebef13f508179f4cd0d/lib/Hakyll/Web/Feed.hs#L63-L75.

feedConfiguration :: FeedConfiguration
feedConfiguration =
    FeedConfiguration
        { feedTitle       = "My Blog"
        , feedDescription = "Posts about x, y & z"
        , feedAuthorName  = "My Name"
        , feedAuthorEmail = "[email protected]"
        , feedRoot        = "https://example.com"
        }

Next, we should figure out what we want our “feed context” to consist of. The official hakyll feed guide (linked above) suggests:

let feedCtx = postCtx `mappend` bodyField "description"

-- which can be abbreviated to

let feedCtx = postCtx <> bodyField "description"

This will enable you to include the body of your post as the description, but if you provide your own description field in your posts, then this step isn’t necessary. For the time being, let’s make our own feedCtx function that sticks to the official guide’s approach.

feedCtx :: Context String
feedCtx = postCtx <> bodyField "description"

If you’re unsure of what postCtx is, I recommend checking out the previous article or viewing the source of this site: https://github.com/rpearce/robertwpearce.com/blob/858163216f445eb8b6ab3b4304b022b64814b6f8/site.hs#L131-L136.

Creating the Atom & RSS XML Files

Here is what the official hakyll feed guide recommends:

create ["atom.xml"] $ do
    route idRoute
    compile $ do
        let feedCtx = postCtx `mappend` bodyField "description"
        posts <- fmap (take 10) . recentFirst =<<
            loadAllSnapshots "posts/*" "content"
        renderAtom myFeedConfiguration feedCtx posts

This is great! However, if we want to generate both an atom.xml feed and an rss.xml feed, we’ll end up with nearly duplicated code:

create ["rss.xml"] $ do
    route idRoute
    compile $ do
        let feedCtx = postCtx `mappend` bodyField "description"
        posts <- fmap (take 10) . recentFirst =<<
            loadAllSnapshots "posts/*" "content"
        renderRss myFeedConfiguration feedCtx posts

It looks like all the feed compilation is exactly the same except for the renderAtom and renderRss functions that come bundled with hakyll. With this in mind, let’s write our own feed compiler and reduce as much boilerplate as we reasonably can.

To start out, let’s see what we want our top-level end result to be:

create ["atom.xml"] $ do
    route idRoute
    compile (feedCompiler renderAtom)


create ["rss.xml"] $ do
    route idRoute
    compile (feedCompiler renderRss)

While we could potentially abstract this further, this leaves wiggle room for customizing the route for whatever reason you may want to.

This feedCompiler is a function that we need to write that will house the missing logic. Let’s look at its type:

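feedCompiler :: (FeedConfiguration
                 -> Context String
                 -> [Item String]
                 -> Compiler (Item String))
                -> Compiler (Item String)

The parenthesized first argument is the type shared by renderAtom and renderRss (they’re the same). Without the parentheses, the signature would describe a function of four separate arguments rather than a function that takes a renderer. For reading’s sake, let’s give the renderer type an alias called FeedRenderer: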

type FeedRenderer =
    FeedConfiguration
    -> Context String
    -> [Item String]
    -> Compiler (Item String)

And now we can define our feed compiler in a slightly cleaner way:

feedCompiler :: FeedRenderer -> Compiler (Item String)
feedCompiler renderer =
    renderer feedConfiguration feedCtx
        =<< fmap (take 10) . recentFirst
        =<< loadAllSnapshots "posts/*" "content"
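
To see how these pieces fit together, here is a minimal sketch of a whole site.hs. It rests on a few assumptions that are not from this post: posts live in posts/*, a templates/post.html template exists, and postCtx is just a date field plus defaultContext. The one step worth noticing is saveSnapshot "content", which the official guide covers: loadAllSnapshots can only find post bodies if the posts rule saved a snapshot under that same name.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Hakyll

-- Shared shape of renderAtom and renderRss.
type FeedRenderer =
    FeedConfiguration
    -> Context String
    -> [Item String]
    -> Compiler (Item String)

main :: IO ()
main = hakyll $ do
    match "templates/*" $ compile templateBodyCompiler

    -- Assumed layout: posts/* rendered with templates/post.html.
    match "posts/*" $ do
        route $ setExtension "html"
        compile $
            pandocCompiler
                -- Save the rendered body so the feeds can load it later.
                >>= saveSnapshot "content"
                >>= loadAndApplyTemplate "templates/post.html" postCtx
                >>= relativizeUrls

    create ["atom.xml"] $ do
        route idRoute
        compile (feedCompiler renderAtom)

    create ["rss.xml"] $ do
        route idRoute
        compile (feedCompiler renderRss)

-- Assumed post context; yours may carry more fields.
postCtx :: Context String
postCtx = dateField "date" "%B %e, %Y" <> defaultContext

feedCtx :: Context String
feedCtx = postCtx <> bodyField "description"

-- Placeholder values; replace them with your own.
feedConfiguration :: FeedConfiguration
feedConfiguration =
    FeedConfiguration
        { feedTitle       = "My Blog"
        , feedDescription = "Posts about x, y & z"
        , feedAuthorName  = "My Name"
        , feedAuthorEmail = "me@example.com"
        , feedRoot        = "https://example.com"
        }

-- Load the 10 most recent post snapshots and hand them to the renderer.
feedCompiler :: FeedRenderer -> Compiler (Item String)
feedCompiler renderer =
    renderer feedConfiguration feedCtx
        =<< fmap (take 10) . recentFirst
        =<< loadAllSnapshots "posts/*" "content"
```

Saving the snapshot before applying the post template means the feed gets only the post body, not the surrounding page layout.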

Using Your Own RSS & Atom Templates

Thanks to Abhinav Sarkar on lobste.rs, I was pointed to a pull request, https://github.com/jaspervdj/hakyll/pull/652, that allows hakyll users to use their own feed templates. Here is some example usage from the PR:

customRenderAtom :: FeedConfiguration -> Context String -> [Item String] -> Compiler (Item String)
customRenderAtom config context items = do
  atomTemplate     <- unsafeCompiler $ readFile "templates/atom.xml"
  atomItemTemplate <- unsafeCompiler $ readFile "templates/atom-item.xml"
  renderAtomWithTemplates atomTemplate atomItemTemplate config context items

Validating and Using Our Feeds

If you’ve made it this far and have successfully generated and published your atom.xml and/or rss.xml files, see if they’re valid! Head to https://validator.w3.org/feed/ and see if yours validate.

You can check out your new feed in an RSS/Atom feed reader such as the browser plugin FeedBro or any others.

Unexpected Issue: Setting an updated Field

I ran into a feed validation problem where, in a few posts, I manually set the updated field to a date – not datetime – and thus invalidated my feed. The value 2017-06-30 needed to be in the "%Y-%m-%dT%H:%M:%SZ" format, that is, 2017-06-30T00:00:00Z. This led me down a rabbit hole that ended in me essentially repurposing the dateField code from hakyll (https://github.com/jaspervdj/hakyll/blob/c85198d8cb6ce055c788e287c7f2470eac0aad36/lib/Hakyll/Web/Template/Context.hs#L273-L321). While I tried to use parseTimeM and formatTime from Data.Time.Format in my own way, I couldn’t make it as simple as I wanted, so I gave up and used what was already there. Here’s what I did:

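-- Imports this snippet relies on, in addition to Hakyll itself
import           Control.Monad    (msum)
import qualified Data.Time.Clock  as Clock
import           Data.Time.Format (TimeLocale, defaultTimeLocale, formatTime, parseTimeM)


feedCtx :: Context String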
feedCtx =
    updatedField <> -- THIS IS NEW
    postCtx      <>
    bodyField "description"


updatedField :: Context String
updatedField = field "updated" $ \i -> do
    let locale = defaultTimeLocale
    time <- getUpdatedUTC locale $ itemIdentifier i
    return $ formatTime locale "%Y-%m-%dT%H:%M:%SZ" time


-- The MonadFail constraint is needed for `fail` on GHC 8.8 and later
getUpdatedUTC :: (MonadMetadata m, MonadFail m) => TimeLocale -> Identifier -> m Clock.UTCTime
getUpdatedUTC locale id' = do
    metadata <- getMetadata id'
    let tryField k fmt = lookupString k metadata >>= parseTime' fmt
    maybe empty' return $ msum [tryField "updated" fmt | fmt <- formats]
  where
    empty'     = fail $ "Hakyll.Web.Template.Context.getUpdatedUTC: " ++ "could not parse time for " ++ show id'
    parseTime' = parseTimeM True locale
    formats    =
        [ "%a, %d %b %Y %H:%M:%S %Z"
        , "%Y-%m-%dT%H:%M:%S%Z"
        , "%Y-%m-%d %H:%M:%S%Z"
        , "%Y-%m-%d"
        , "%B %e, %Y %l:%M %p"
        , "%B %e, %Y"
        , "%b %d, %Y"
        ]

Woah! We need to break down what’s happening here.

feedCtx

The addition to feedCtx comes before our postCtx because of how mappend (<>) combines contexts: when more than one context can supply a field, the one that comes first wins. We want first rights to transforming the updated field, so updatedField needs to come first.
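
To make the ordering concrete, here is a toy model of that behaviour. ToyContext, toyField, toyPostCtx and toyFeedCtx are hypothetical names invented for this sketch; hakyll’s real Context is richer (it works in the Compiler monad and receives the item), but the assumption here is that it combines lookups in the same left-biased way.

```haskell
import Control.Applicative ((<|>))

-- Toy stand-in for hakyll's Context: a lookup from field name to value.
-- This is NOT hakyll's actual type, only a simplified model of it.
newtype ToyContext = ToyContext { lookupField :: String -> Maybe String }

-- Combine two contexts: ask the left one first, fall back to the right one.
instance Semigroup ToyContext where
    ToyContext f <> ToyContext g = ToyContext $ \key -> f key <|> g key

-- A context that knows exactly one field.
toyField :: String -> String -> ToyContext
toyField name value =
    ToyContext $ \key -> if key == name then Just value else Nothing

-- Plays the role of postCtx: it exposes the raw, date-only updated value.
toyPostCtx :: ToyContext
toyPostCtx = toyField "updated" "2017-06-30" <> toyField "title" "Hello"

-- Plays the role of feedCtx: the reformatted updated value is listed first.
toyFeedCtx :: ToyContext
toyFeedCtx = toyField "updated" "2017-06-30T00:00:00Z" <> toyPostCtx
```

Looking up "updated" in toyFeedCtx yields the reformatted value because it is listed first, while "title" still falls through to toyPostCtx.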

updatedField

This function is a Context that leans on hakyll’s field function to say that we want to work with the updated field and then do some Monad stuff with time. The tl;dr is that we send the field’s current value off in order to get a UTCTime value back, and then we format it to be the way we need it.

getUpdatedUTC

It’s really not as bad as it looks! The root of this function does two things:

  1. looks up the updated value in the metadata
  2. tries to parse it using a bunch of different formats

If it can’t do these things, it simply fails.
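
The parse-then-format idea can also be tried outside of hakyll. The sketch below uses only the time package; normalizeUpdated is a hypothetical helper written for this illustration (it is not part of hakyll or of this site’s code), and it tries a shortened list of the formats shown above.

```haskell
import Control.Monad    (msum)
import Data.Time.Clock  (UTCTime)
import Data.Time.Format (defaultTimeLocale, formatTime, parseTimeM)

-- Hypothetical helper: try each known format in turn and, if one matches,
-- re-render the time in the datetime format the feed needs.
normalizeUpdated :: String -> Maybe String
normalizeUpdated raw =
    formatTime defaultTimeLocale "%Y-%m-%dT%H:%M:%SZ" <$> parsed
  where
    -- msum over Maybe keeps the first format that parses successfully.
    parsed :: Maybe UTCTime
    parsed = msum [parseTimeM True defaultTimeLocale fmt raw | fmt <- formats]

    formats =
        [ "%Y-%m-%dT%H:%M:%S%Z"
        , "%Y-%m-%d %H:%M:%S%Z"
        , "%Y-%m-%d"
        , "%B %e, %Y"
        ]
```

With this, normalizeUpdated "2017-06-30" gives Just "2017-06-30T00:00:00Z", since a date-only value parses to midnight UTC, and a string that matches none of the formats gives Nothing. The hakyll version above differs mainly in that it reads the value from the post metadata and fails the compiler instead of returning Nothing.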


Yes, I could have simply written my updated field in the correct format. But where’s the fun in that? I would hate for my feed to silently invalidate itself over something so simple!

Wrapping Up

Whew! We dove into generating Atom & RSS XML feeds with hakyll, uncovered a nice refactor opportunity via feedCompiler, learned how to validate our feeds, and ultimately learned how a seemingly harmless updated date could prevent us from having a totally valid feed!

Next up:


Thank you for reading!
Robert