Hakyll Pt. 3 – Generating RSS and Atom XML Feeds
This is part 3 of a multipart series where we will look at getting a website / blog set up with hakyll and customizing it a fair bit.
- Pt. 1 – Setup & Initial Customization
- Pt. 2 – Generating a Sitemap XML File
- Pt. 3 – Generating RSS and Atom XML Feeds
- Pt. 4 – Copying Static Files For Your Build
- Pt. 5 – Generating Custom Post Filenames From a Title Slug
- Pt. 6 – Pure Builds With Nix
- (wip) Pt. 7 – Customizing Markdown Compiler Options
Overview
- Hakyll Feed Required Reading
- Hakyll’s Prebuilt RSS & Atom Templates
- Hakyll Feed Setup
- Creating the Atom & RSS XML Files
- Unexpected Issue: Setting an updated Field
- Using Your Own RSS & Atom Templates
- Validating and Using Our Feeds
Hakyll Feed Required Reading
There is already a great starter guide at https://jaspervdj.be/hakyll/tutorials/05-snapshots-feeds.html, so be sure to read this first – it might make it so you don’t have to read this blog post at all.
Hakyll’s Prebuilt RSS & Atom Templates
Thankfully, hakyll already comes with prebuilt RSS and Atom templates! You can find the source here: https://github.com/jaspervdj/hakyll/tree/master/data/templates. While you won’t need to copy, paste, or even directly use these files, you should look them over to see which fields they expect. There are two levels to be aware of: the feed itself and each individual feed item.
Feed-Level
The feed itself is looking for the following, and you’ll provide these through a FeedConfiguration that we’ll discuss in a moment. Here are the fields the atom.xml and rss.xml templates are expecting:
* title (title of feed)
* description (description of feed)
* authorName (feed author name)
* authorEmail (feed author email)
* root (your website)
* updated (feed last updated at; should be done for you)
* body (feed body; should be done for you)
* url (path to the XML file; based off of a create ["rss.xml"] function that we’ll discuss)
Feed Item-Level
Each feed item, or entry, expects the following:
* title (title of the entry)
* root (your website)
* url (path to resource)
* published (published date; "%Y-%m-%dT%H:%M:%SZ" format; should be done for you via hakyll’s dateField context)
* updated (updated date; "%Y-%m-%dT%H:%M:%SZ" format; should be done for you, unless you provide your own)
Now that you know what sort of data are expected, let’s begin.
Hakyll Feed Setup
As is introduced in the required hakyll feed reading, we need to create a FeedConfiguration. If you’d like to see the FeedConfiguration data constructor, you can view it here: https://github.com/jaspervdj/hakyll/blob/f3a17454fae3b140ada30ebef13f508179f4cd0d/lib/Hakyll/Web/Feed.hs#L63-L75.
feedConfiguration :: FeedConfiguration
feedConfiguration =
    FeedConfiguration
        { feedTitle       = "My Blog"
        , feedDescription = "Posts about x, y & z"
        , feedAuthorName  = "My Name"
        , feedAuthorEmail = "[email protected]"
        , feedRoot        = "https://example.com"
        }
We should next figure out what we want our “feed context” to consist of. The official hakyll feed guide (linked above) suggests:
let feedCtx = postCtx `mappend` bodyField "description"
-- which can be abbreviated to
let feedCtx = postCtx <> bodyField "description"
This will enable you to include the body of your post as the description, but if you provide your own description field in your posts, then this step isn’t necessary. For the time being, let’s make our own top-level feedCtx that sticks to what the official guide does.
feedCtx :: Context String
feedCtx = postCtx <> bodyField "description"
If you’re unsure of what postCtx is, I recommend checking out the previous article or viewing the source of this site: https://github.com/rpearce/robertwpearce.com/blob/858163216f445eb8b6ab3b4304b022b64814b6f8/site.hs#L131-L136.
Creating the Atom & RSS XML Files
Here is what the official hakyll feed guide recommends:
create ["atom.xml"] $ do
    route idRoute
    compile $ do
        let feedCtx = postCtx `mappend` bodyField "description"
        posts <- fmap (take 10) . recentFirst =<<
            loadAllSnapshots "posts/*" "content"
        renderAtom myFeedConfiguration feedCtx posts
This is great! However, if we want to generate both an atom.xml feed and an rss.xml feed, we’ll end up with almost duplicated code:
create ["rss.xml"] $ do
    route idRoute
    compile $ do
        let feedCtx = postCtx `mappend` bodyField "description"
        posts <- fmap (take 10) . recentFirst =<<
            loadAllSnapshots "posts/*" "content"
        renderRss myFeedConfiguration feedCtx posts
It looks like all the feed compilation is exactly the same except for the renderAtom and renderRss functions that come bundled with hakyll. With this in mind, let’s write our own feed compiler and reduce as much boilerplate as we reasonably can.
To start out, let’s see what we want our top-level end result to be:
create ["atom.xml"] $ do
    route idRoute
    compile (feedCompiler renderAtom)

create ["rss.xml"] $ do
    route idRoute
    compile (feedCompiler renderRss)
While we could potentially abstract this further, keeping each rule separate leaves wiggle room for customizing the route, should you ever want to.
This feedCompiler is a function that we need to write that will house the missing logic. Let’s look at its type:
feedCompiler :: (FeedConfiguration
                 -> Context String
                 -> [Item String]
                 -> Compiler (Item String))
             -> Compiler (Item String)
The parenthesized first argument is itself a function, and its type is the type of both renderAtom and renderRss (they’re the same). The parentheses matter: without them, this would read as a function of four separate arguments. For reading’s sake, let’s give that function type an alias called FeedRenderer:
type FeedRenderer =
    FeedConfiguration
    -> Context String
    -> [Item String]
    -> Compiler (Item String)
And now we can define our feed compiler in a slightly cleaner way:
feedCompiler :: FeedRenderer -> Compiler (Item String)
feedCompiler renderer =
    renderer feedConfiguration feedCtx
        =<< fmap (take 10) . recentFirst
        =<< loadAllSnapshots "posts/*" "content"
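One thing worth spelling out: loadAllSnapshots "posts/*" "content" can only find snapshots that the posts rule has actually saved, which is what the official guide covers. The following is a minimal sketch of such a rule, not this site’s real configuration. The template path templates/post.html and the use of defaultContext are assumptions for illustration, so substitute your own template and postCtx.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Hakyll

main :: IO ()
main = hakyll $ do
    -- Templates must be compiled before they can be applied.
    match "templates/*" $ compile templateBodyCompiler

    match "posts/*" $ do
        route $ setExtension "html"
        compile $
            pandocCompiler
                -- Save the rendered post body *before* the page template is
                -- applied, so the feed gets the post content only.
                >>= saveSnapshot "content"
                -- Assumed template path and context; use your own here.
                >>= loadAndApplyTemplate "templates/post.html" defaultContext
                >>= relativizeUrls
```

The snapshot name ("content") only has to match between saveSnapshot and loadAllSnapshots; saving it before loadAndApplyTemplate keeps the surrounding page layout out of the feed entries.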
Using Your Own RSS & Atom Templates
Thanks to Abhinav Sarkar on lobste.rs, I was pointed to a pull request, https://github.com/jaspervdj/hakyll/pull/652, that allows hakyll users to use their own feed templates. Here is some example usage from the PR:
customRenderAtom :: FeedConfiguration -> Context String -> [Item String] -> Compiler (Item String)
customRenderAtom config context items = do
    atomTemplate <- unsafeCompiler $ readFile "templates/atom.xml"
    atomItemTemplate <- unsafeCompiler $ readFile "templates/atom-item.xml"
    renderAtomWithTemplates atomTemplate atomItemTemplate config context items
Validating and Using Our Feeds
If you’ve made it this far and have successfully generated and published your atom.xml and/or rss.xml files, see if they’re valid! Head to https://validator.w3.org/feed/ and see if yours validate.
You can check out your new feed in an RSS/Atom feed reader such as the browser plugin FeedBro or any others.
Unexpected Issue: Setting an updated Field
I ran into a feed validation problem where, in a few posts, I had manually set the updated field to a date (not a datetime) and thus invalidated my feed. The value 2017-06-30 needed to be in the "%Y-%m-%dT%H:%M:%SZ" format, that is, 2017-06-30T00:00:00Z. This led me down a rabbit hole that ended in me essentially repurposing the dateField code from hakyll (https://github.com/jaspervdj/hakyll/blob/c85198d8cb6ce055c788e287c7f2470eac0aad36/lib/Hakyll/Web/Template/Context.hs#L273-L321).
While I tried to use parseTimeM and formatTime from Data.Time.Format in my own way, I couldn’t make it as simple as I wanted, so I gave up and used what was already there. Here’s what I did:
import Control.Monad (msum)
import qualified Data.Time.Clock as Clock
import Data.Time.Format (TimeLocale, defaultTimeLocale, formatTime, parseTimeM)

feedCtx :: Context String
feedCtx =
    updatedField <> -- THIS IS NEW
    postCtx <>
    bodyField "description"

-- Re-renders the post's "updated" metadata value in the feed's datetime format.
updatedField :: Context String
updatedField = field "updated" $ \i -> do
    let locale = defaultTimeLocale
    time <- getUpdatedUTC locale $ itemIdentifier i
    return $ formatTime locale "%Y-%m-%dT%H:%M:%SZ" time

-- On GHC 8.8+, where fail belongs to MonadFail, add MonadFail m to this constraint.
getUpdatedUTC :: MonadMetadata m => TimeLocale -> Identifier -> m Clock.UTCTime
getUpdatedUTC locale id' = do
    metadata <- getMetadata id'
    let tryField k fmt = lookupString k metadata >>= parseTime' fmt
    maybe empty' return $ msum [tryField "updated" fmt | fmt <- formats]
  where
    empty' = fail $ "Hakyll.Web.Template.Context.getUpdatedUTC: " ++ "could not parse time for " ++ show id'
    parseTime' = parseTimeM True locale
    formats =
        [ "%a, %d %b %Y %H:%M:%S %Z"
        , "%Y-%m-%dT%H:%M:%S%Z"
        , "%Y-%m-%d %H:%M:%S%Z"
        , "%Y-%m-%d"
        , "%B %e, %Y %l:%M %p"
        , "%B %e, %Y"
        , "%b %d, %Y"
        ]
Woah! We need to break down what’s happening here.
feedCtx
The addition to feedCtx comes before our postCtx because mappend (<>) on contexts is left-biased: when a template asks for the updated field, the contexts are tried from left to right and the first one that can supply the field wins. We want first rights to transforming the updated field, so updatedField needs to come first.
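To make the ordering argument concrete, here is a toy model of left-biased context lookup. ToyContext, toyField and the sample values are all made up for illustration and are not hakyll’s actual Context type; the sketch only assumes that combining two contexts means "try the left one, then fall back to the right one".

```haskell
import Control.Applicative ((<|>))

-- A toy stand-in for a context: a lookup from field name to maybe-a-value.
newtype ToyContext = ToyContext { lookupKey :: String -> Maybe String }

-- Combining two contexts: try the left one first, fall back to the right one.
instance Semigroup ToyContext where
    ToyContext f <> ToyContext g = ToyContext (\k -> f k <|> g k)

-- A context that knows exactly one field.
toyField :: String -> String -> ToyContext
toyField key value = ToyContext (\k -> if k == key then Just value else Nothing)

-- Two hypothetical contexts that both define "updated".
normalized, raw :: ToyContext
normalized = toyField "updated" "2017-06-30T00:00:00Z"
raw        = toyField "updated" "2017-06-30" <> toyField "title" "My Post"

-- Whichever context sits on the left supplies "updated".
updatedFirst, updatedLast :: Maybe String
updatedFirst = lookupKey (normalized <> raw) "updated"
updatedLast  = lookupKey (raw <> normalized) "updated"
```

With normalized on the left, the lookup returns the full datetime; with raw on the left, the bare date shadows it. Fields that only one side knows about, such as title here, are unaffected by the order.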
updatedField
This is a Context that leans on hakyll’s field function to say that we want to supply the updated field ourselves, computing its value inside the Compiler monad. The tl;dr is that we pass the item’s identifier to getUpdatedUTC in order to get a UTCTime value back, and then we format it the way the feed needs it.
getUpdatedUTC
It’s really not as bad as it looks! The root of this function does two things:
1. looks up the updated value in the metadata
2. tries to parse it using a bunch of different formats
If it can’t do these things, it simply fails.
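Stripped of the hakyll parts, the parse-then-reformat idea can be tried out on its own with just the time package. This is a small sketch rather than the code used on this site: parseUpdated and normalizeUpdated are names invented here, and the format list is only a subset of the one above.

```haskell
import Control.Monad (msum)
import Data.Time.Clock (UTCTime)
import Data.Time.Format (defaultTimeLocale, formatTime, parseTimeM)

-- A subset of the formats from getUpdatedUTC, tried in order.
formats :: [String]
formats =
    [ "%Y-%m-%dT%H:%M:%S%Z"
    , "%Y-%m-%d %H:%M:%S%Z"
    , "%Y-%m-%d"
    , "%B %e, %Y"
    ]

-- Returns the first successful parse, or Nothing if no format matches.
parseUpdated :: String -> Maybe UTCTime
parseUpdated raw =
    msum [parseTimeM True defaultTimeLocale fmt raw | fmt <- formats]

-- Parses a loosely formatted date and re-renders it in the feed's format.
normalizeUpdated :: String -> Maybe String
normalizeUpdated =
    fmap (formatTime defaultTimeLocale "%Y-%m-%dT%H:%M:%SZ") . parseUpdated
```

For example, normalizeUpdated "2017-06-30" gives Just "2017-06-30T00:00:00Z", since a date with no time of day parses as midnight UTC, while a string that matches none of the formats gives Nothing. The hakyll version does the same thing, except that it fails the build instead of returning Nothing.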
Yes, I could have simply written my updated field in the correct format. But where’s the fun in that? I would hate for my feed to silently invalidate itself over something so simple!
Wrapping Up
Whew! We dove into generating Atom & RSS XML feeds with hakyll, uncovered a nice refactor opportunity via feedCompiler, learned how to validate our feeds, and ultimately learned how a seemingly harmless updated date could prevent us from having a totally valid feed!
Next up:
- Pt. 4 – Copying Static Files For Your Build
- Pt. 5 – Generating Custom Post Filenames From a Title Slug
- (wip) Pt. 6 – Customizing Markdown Compiler Options
Thank you for reading!
Robert