Making this website

Written on 2019-07-24

I've wanted my own corner of the internet for a while now. Programming is one of my passions, so building a website wouldn't be a problem. But I'm a student without an income, so I can't afford to pay for a server. Luckily, there are services that host static websites for free. I chose GitLab Pages because GitLab is open source. Other alternatives that I know of are GitHub Pages (proprietary software) and Netlify.

Having chosen my platform, I needed to start building my website. Because I'm lazy, I searched for the easiest option, and at the time that sounded like a static website generator. I thought Hugo was cool. I grabbed a template, customized it to make it look less like a template (but I didn't spend too much time on that) and put it online.

I had a blog. I could write the posts in markdown. Everything was nice and cool.

But I didn't like it.

You see, Hugo required me to learn its templating language to be able to really understand that template and further customize it or to create my own theme. Why did I choose Hugo in the first place? Because I'm lazy, remember? I didn't want to spend two days reading documentation to be able to generate my own theme.

So I wrote my own thing.

I saw other people build cool things with only shell scripts. I thought a shell script would be enough to generate a website: I only need to put strings in files - a complicated programming language made no sense here. Plus I had pipes. I love pipes.

I liked the idea of composing a full HTML document from little reusable parts. Those are sometimes called partials. Mine are "cogs". My website would put together a bunch of cogs to build a working webpage.

"Templating" is done - next come the posts. I thought Markdown was a great format for writing blog posts [1]. Plain old HTML would also work but it's not that pretty, let's be honest. And

<a href="https://www.website.com">link text</a>
is a lot more to write than
[link text](https://www.website.com)

(Did I say I'm lazy?)

After I duckduckwent on the web and searched for a Markdown to HTML converter, I found comrak. Comrak was exactly what I needed because I could pipe some Markdown into it and it would write the HTML to standard output.

Markdown to HTML was all figured out, but I needed a way to describe some metadata for each post. Thanks to Stack Overflow I found some beautiful sed & awk combo code that reads the YAML I extract from each post with more sed and loads it into bash variables (please don't yell at me because I used eval).
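
To make the metadata step a bit more concrete, here is a minimal sketch of the same idea. It is not the Stack Overflow snippet and not the get_yaml / parse_yaml used by the builder. It assumes the post starts with "---" on line 1, the front matter ends at the next "---" line, and every entry is a flat "key: value" pair. Both function names are hypothetical, and instead of eval it looks up one key at a time.

```shell
# Sketch only. Assumes line 1 of the post is "---", the front matter ends at
# the next "---" line, and every entry is a flat "key: value" pair.
# Both function names are hypothetical.

# Print the front matter lines, without the --- markers.
extract_front_matter() {
	sed -n '2,$ { /^---$/q; p; }' "$1"
}

# Print the value stored under one key, e.g. front_matter_value post.md title
front_matter_value() {
	extract_front_matter "$1" | awk -v key="$2" '
		index($0, key ":") == 1 {
			value = substr($0, length(key) + 2)
			sub(/^[ \t]+/, "", value)
			print value
			exit
		}'
}

# Try it on a throwaway post.
post=$(mktemp)
printf '%s\n' '---' 'title: Making this website' 'date: 2019-07-24' '---' 'Post body.' > "$post"

title=$(front_matter_value "$post" title)
date=$(front_matter_value "$post" date)
echo "$date - $title"

rm -f "$post"
```

Looking up keys one by one is slower than loading everything with a single eval, but nothing from the post is ever executed as shell code.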

All I needed now was to combine all those things in a script to create the final product.

Here it is.

I wrote some functions to turn this thing into something like a DSL (a DSL that was hit in the head with a spiked bat):


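function select_file {
	# Assumption: the original body is not shown here; it presumably records the output path.
	SELECTED_FILE="$1"
}

function write_to_doc {
	# %b expands \n and friends without treating the text as a format string.
	printf '%b' "$1" >> "$SELECTED_FILE"
}

function begin {
	write_to_doc "<$1 class=\"$2\">\n"
}

function end {
	write_to_doc "</$1>\n"
}

function write_cog {
	cat "./cogs/$1" >> "$SELECTED_FILE"
	printf "\n" >> "$SELECTED_FILE"
}

function start_doc {
	select_file "$1"
	begin "!DOCTYPE html"
	begin html
	write_cog head.html
	begin body
}

function end_doc {
	end body
	end html
}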

Armed with those functions, let's build the contact page:

echo "Building contact.html"
start_doc contact.html
	write_cog header.html
	write_cog contact_main.html
	write_cog footer.html
end_doc

Simple, right? header.html and all its cog friends are just HTML files in a cogs folder. All the script does is create an HTML page with a <head> and an open <body>, paste all the cogs inside the body and close the body when it's done. That's it - just a computer copy-pasting code.

Let's look at something a bit more complex - the Home page. It needs to display the latest 5 posts along with some other information. So how do we do that? Well, the extra info can be written as cogs - it doesn't need dynamic generation. But what do we do for the posts? First, we can list the latest 5 posts with ls -r1 | head -n 5. That works because every post's file name starts with its date. ls -1 shows each file name on its own line and ls -r reverses the sort order, so the names come out in descending order (we want to have the "biggest" date first). head -n 5 only shows the first 5 lines of what it receives. For each of those files we need to generate a link - the href is simple, just point to the post's HTML. To have the post's title as the link text we need to extract the YAML and evaluate it with all that sed & awk magic. Let's take a look at the code:

echo "Building index.html"
start_doc index.html
	write_cog header.html
	begin main "width-constrainer"
		write_cog about.html
		write_cog projects.html
		begin section posts
			write_to_doc "<h2>Recent blog posts</h2>\n"
			begin ul
				for FILE in $(ls -r1 ./posts | head -n 5); do
					YAML=$(get_yaml "./posts/$FILE")
					echo "$YAML" > yaml.tmp
					eval "$(parse_yaml yaml.tmp)"
					# Assumption: FILENAME is the post's name without its extension.
					FILENAME="${FILE%.*}"
					begin li
						write_to_doc "<time datetime=\"$date\">$date</time> - <a href=\"/posts/$FILENAME.html\">$title</a>\n"
					end li
				done
			end ul
		end section
		write_cog quick_links.html
	end main
	write_cog footer.html
end_doc
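
The ls -r1 | head -n 5 step can be tried on its own. The sketch below uses made-up file names in a throwaway directory. It only works because the date prefix is zero-padded and written year-month-day, so sorting the names alphabetically is the same as sorting them by date.

```shell
# Sketch with made-up file names in a throwaway directory.
posts=$(mktemp -d)
for day in 01 02 03 04 05 06 07; do
	: > "$posts/2019-07-$day-example.md"
done

# -1: one name per line, -r: reverse order (newest date prefix first).
# head keeps only the first 5 names.
latest=$(ls -r1 "$posts" | head -n 5)
echo "$latest"

rm -rf "$posts"
```

Of the seven posts, the two oldest ones (2019-07-01 and 2019-07-02) are the ones that get cut off.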

To write the actual post files, I remove the YAML after I evaluate it and then convert the Markdown with comrak. Here's that part, the rest is just cogs and a big for loop:

CONTENT=$(cat "./posts/$FILE" | sed '/\-\-\-/,/\-\-\-/d')
HTML=$(echo "$CONTENT" | comrak --unsafe)
write_to_doc "$HTML"

(In the sed pattern, those dashes are not actually escaped with backslash, but I had to write them like that here because comrak does not like three dashes one after another. ¯\_(ツ)_/¯ I wonder if pandoc does this too...)
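
Here is a small sketch of what that sed range does, using a made-up post. A range like /---/,/---/ starts again at every later line that contains three dashes, so a horizontal rule further down the post opens a new range. The variation at the end is an assumption about what one might want instead, not what the builder does: it starts at line 1 and stops at the first closing marker.

```shell
# Sketch only: the sample post is made up, and the front matter is assumed
# to start on line 1.
post=$(mktemp)
printf '%s\n' '---' 'title: Example' '---' 'First paragraph.' '' '---' '' 'After the rule.' > "$post"

# Range delete as in the article: the horizontal rule opens a second range
# that runs to the end of the file, so everything after it is dropped too.
naive=$(sed '/---/,/---/d' "$post")
echo "$naive"

# Variation: delete from line 1 through the first closing marker only.
body=$(sed '1,/^---$/d' "$post")
echo "$body"

rm -f "$post"
```

For posts that never use three dashes outside the front matter, both commands give the same result.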

I think I highlighted all the important stuff. Oh, I also have an autobuilder.sh script that regenerates the website if a file changes. It's a one-liner with entr and find:

find ./ -path ./public -prune -o -type f \( -iname \*.md -o -iname \*.html -o -iname \*.css -o -iname \*.js -o -iname wishlist -o -iname builder.sh \) | tail -n +2 | entr sh builder.sh
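
A note on the tail -n +2 in that one-liner: when find is given no action, it prints everything the whole expression matches, and that includes the pruned ./public directory itself. Dropping the first line of output only helps if ./public happens to come out first. The sketch below is a variation under that reading, run in a throwaway tree with made-up file names and a shortened pattern list: an explicit -print on the right-hand branch keeps the pruned directory out of the list.

```shell
# Sketch in a throwaway tree; the file names are made up.
site=$(mktemp -d)
mkdir -p "$site/public" "$site/posts" "$site/cogs"
: > "$site/public/index.html"
: > "$site/posts/2019-07-24-example.md"
: > "$site/cogs/header.html"
: > "$site/builder.sh"
: > "$site/notes.txt"

# With an explicit -print on the right-hand branch, the pruned ./public
# directory is never printed, so no tail is needed to drop it.
watched=$(cd "$site" && find . -path ./public -prune -o -type f \
	\( -iname '*.md' -o -iname '*.html' -o -iname builder.sh \) -print | LC_ALL=C sort)
echo "$watched"

rm -rf "$site"
```

The resulting list can be piped into entr sh builder.sh exactly as before.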

Building this was a great and fun experience as I learned a bit more about shell scripting, HTML, CSS (this site runs with no JavaScript! [2]) and RSS feeds (yes, I have an RSS feed), and I finally got to build my website the way I want it to be. I highly recommend trying to build your own generator, at least as a coding experiment.

Want to see the code for the whole project? This website is Free Software: get the code from GitLab.

Do you have questions or recommendations? Send me an email at brown121407@member.fsf.org or message me on Mastodon @brown121407@fosstodon.org.

  1. Markdown is, indeed, good for simple blog posts. But I decided that I can live with writing the posts in HTML, so I go with that now. It takes less time to build the site, too.
  2. That's not true anymore. I use one line of JavaScript to set the title of the pages. That's it. [3]
  3. THAT'S NOT TRUE ANYMORE! Fixed my builder script so I no longer need that JavaScript line.