Fighting With The Past

I am an avid user of RSS and I love my RSS app to death. It brings me a steady stream of news and information every day from the sources that I personally curate; a stream uncorrupted by the ad-funded Internet we have now. It is my breath of fresh air every morning before I suit up with adblockers and venture into the cancer-ridden wasteland of ads and content filler. The premise is simple: no-BS news, fully in text with some images sprinkled in. No article wastes your time or attention. If need be, click the link at the bottom to read the full article. So a few months ago, an idea lit up in my mind and I thought "Gee, wouldn't it be neat if my website was readable via RSS?" and I got to work.

Browsing RSS Specs

As with most endeavours, everything seemed simple on the surface. Browsing through the RSS specifications, everything looked fine and dandy. An RSS feed is just an itemized list of your latest posts alongside some metadata about your site. Just generate a new XML file after every new post and serve it, and you're set for the day, right? Oh, how wrong I was.
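
To make "an itemized list plus some metadata" concrete, here is a minimal sketch of that naive first idea in Python. It is not the script behind this site; the function name build_feed, the dictionary keys, and the example.com data are all made up for illustration. It only emits the three channel elements RSS 2.0 requires, followed by one item per post.

```python
import xml.etree.ElementTree as ET


def build_feed(site, posts):
    """Return a minimal RSS 2.0 document as a string (illustrative only)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")

    # Site-level metadata: RSS 2.0 requires title, link and description.
    for tag in ("title", "link", "description"):
        ET.SubElement(channel, tag).text = site[tag]

    # One <item> per post, in whatever order the caller passes them.
    for post in posts:
        item = ET.SubElement(channel, "item")
        for tag in ("title", "link", "pubDate"):
            ET.SubElement(item, tag).text = post[tag]

    return ET.tostring(rss, encoding="unicode")


# Hypothetical data, not taken from the real site.
site = {
    "title": "Example Blog",
    "link": "https://example.com/",
    "description": "Posts from an example blog",
}
posts = [
    {
        "title": "Second post",
        "link": "https://example.com/posts/2.html",
        "pubDate": "Mon, 16 Jan 2023 00:00:00 GMT",
    },
    {
        "title": "First post",
        "link": "https://example.com/posts/1.html",
        "pubDate": "Sun, 15 Jan 2023 00:00:00 GMT",
    },
]

feed_xml = build_feed(site, posts)
```

Note what is missing: the post bodies themselves. Getting those into the feed is where the trouble starts.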

Static Pages And XML

To put it simply, my posts are handwritten in HTML and are not dynamically generated with some CMS. This means that I had to find a way to convert HTML to an XML-kosher format somehow. Thus the hunt began. In the end, I found Tidy, a tool that can clean up my messy HTML documents into XHTML. XHTML is XML-friendly, but it wasn't the end-all. I only needed the body, not the metadata. This was easily achievable with xmllint and XPath. With the body prepped and ready, the tricky part was that, while syntactically conformant, HTML tags do not work when dropped straight into the feed as markup. I wrapped the body in a CDATA section and went on my way.
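
The CDATA step can be sketched in a few lines of Python. This is an illustration under assumptions, not the actual pipeline (which runs Tidy and xmllint from a shell script): the helper names wrap_cdata and description_element are hypothetical, and the body is assumed to be already extracted as a string. The one subtlety is that a CDATA section cannot contain the sequence "]]>", so any occurrence has to be split across two sections.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(markup):
    """Wrap markup in a CDATA section so an XML parser treats it as text."""
    # "]]>" would end the section early, so split it across two sections.
    safe = markup.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"


def description_element(body_html):
    """Build an RSS <description> element carrying the post body."""
    return "<description>" + wrap_cdata(body_html) + "</description>"


# Hypothetical post body; the real one would come out of the XHTML <body>.
body_html = '<p>Hello, <a href="https://example.com/">world</a>!</p>'
element = description_element(body_html)

# An XML parser sees the body as plain character data, not as child tags.
parsed = ET.fromstring(element)
```

After parsing, the element has no child elements and its text is the original HTML string, which is what a feed reader then renders.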

I Will Not Regret This, Will I?

The last piece of the puzzle was the metadata for the posts. I went with a JSON file as a temporary database for the posts. This is a solution that is bound to bite me back in the future, but who cares about future me, right? It works. The duct tape will do for now.
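
For a rough idea of what a JSON file as a post database can look like, here is a small Python sketch. The field names (title, link, date) and the load_posts helper are assumptions made for this example, not the actual schema. It parses the list, derives the RFC 822 style date that RSS expects in pubDate, and sorts the newest post first. Dates are assumed to be plain ISO dates and are treated as midnight UTC.

```python
import json
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical contents of the metadata file; the real schema may differ.
RAW_JSON = """
[
  {"title": "First post",  "link": "https://example.com/posts/1.html", "date": "2023-01-15"},
  {"title": "Second post", "link": "https://example.com/posts/2.html", "date": "2023-01-16"}
]
"""


def load_posts(raw_json):
    """Parse post metadata and add an RSS-style pubDate to each entry."""
    posts = json.loads(raw_json)
    for post in posts:
        # Assumption: dates are ISO "YYYY-MM-DD", interpreted as midnight UTC.
        published = datetime.fromisoformat(post["date"]).replace(tzinfo=timezone.utc)
        post["pubDate"] = format_datetime(published, usegmt=True)
    # ISO dates sort correctly as strings; newest first for the feed.
    return sorted(posts, key=lambda post: post["date"], reverse=True)


posts = load_posts(RAW_JSON)
```

The catch is that every new post means editing this file by hand and keeping it in sync with the HTML, which is exactly the kind of duct tape that tends to come loose.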

Conclusion

These three paragraphs cover weeks of reading up on XML and RSS and thinking about the solution. While it works, this is less than ideal. I will be wrangling three data formats with a Bash script that is becoming increasingly unwieldy. Reading up on XML has also enlightened me, if not misguided me, into thinking that I need to write my posts as XML documents moving forward. If I don't redesign it, this setup is a ticking time bomb waiting to blow up in my face. Time to think really hard. Thanks for reading.


No man ever steps in the same river twice, for it's not the same river and he's not the same man.
- Heraclitus