
Creating A Guestbook To Crowd Source Alt Texts - Part 1 (Architecture & Setup)

Intro

It's a long-standing tradition of this blog to post a dog pic - a "DailyDogo" - every day. I have done so from day one, when our dog Napu moved in with us, and I plan to continue for the foreseeable future. However, one thing that has always bothered me is that none of these dog pics have alt text, which is important for being inclusive and accessible.

If you've never heard of alt text: it is a short description of what an image shows, used by people (not only those) relying on screen readers to understand what the image is about.

Anyway, I have not added alt text to my dog pics.[1] But I have come up with a way for YOU to help me add alt texts to DailyDogos: a guestbook that, instead of some kind of "Are you human" check, asks people to provide alt text for a random dog pic![2]

Architecture

[Architecture diagram: The frontend (/guestbook) connects to a worker providing an API through a catch-all pages function (blog.martin-haehnel.de/api). The API includes six endpoints (getAll, approve-guestbook, create, approve-alt-text, refreshCandidates, and getCandidate), shown in green. These endpoints interact with different components: a GUESTBOOK_CACHE namespace (yellow) used by getAll; a RATE_LIMIT namespace (yellow) linked to create; and three database tables (red): guestbook_entry, alt_text_suggestion, and alt_text_candidate, which store guestbook entries and alt text suggestions. The architecture is color-coded: Eleventy (black), api-Worker (green), Blog (pages project) (blue), D1 Database (red), and KV Key/Value Store (yellow). Dashed lines indicate data flow between components, showing how the endpoints interact with storage and caching.]

This site is built on Cloudflare's developer platform, meaning there isn't really a server involved that I could FTP or SSH files to, or where I'd configure an HTTP server to handle a route this way or that way. If that were the case, I would just build a small PHP API that handles reading and writing guestbook entries and alt text suggestions.

But this blog is just a static website based on Eleventy and is deployed using Cloudflare Pages, a (mostly) static site host.

I did say "(mostly)", though, because you can actually get some dynamic functionality by using something called "pages functions" and workers. And that's what we'll use to make this guestbook a reality.

Database Structure

The Cloudflare platform has a managed database service called D1, which we'll use to save our guestbook's data and alt text candidates. So we'll need to create three tables[3]:

  1. Table guestbook_entry
    • id
    • name
    • website
    • message
    • approved
  2. Table alt_text_candidate
    • id
    • post_url
    • image_url
  3. Table alt_text_suggestion
    • id
    • alt_text_suggestion
    • fk guestbook_entry.id
    • fk alt_text_candidate.id
    • approved

By having a column in the alt_text_suggestion table that references the guestbook entry created alongside it, we'll know later where a suggestion came from. That could be nice if we want to give attribution, build a leaderboard, or whatever. A suggestion is also related to its candidate - which is what I call a dog pic that has no (approved) alt text yet.
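
To give an idea of what that relationship enables, an attribution query could later look something like this (a sketch; listAttributions is a hypothetical helper, and the column names follow the migration shown further down):

interface Env {
	DB: D1Database;
}

// Hypothetical helper: list approved suggestions together with who made them
// and which image they describe.
async function listAttributions(env: Env) {
	const { results } = await env.DB.prepare(
		`SELECT g.name, s.alt_text_suggestion, c.image_url
		 FROM alt_text_suggestion s
		 JOIN guestbook_entry g ON g.id = s.guestbook_message_id
		 JOIN alt_text_candidate c ON c.id = s.alt_text_candidate_id
		 WHERE s.approved = 1`,
	).all();
	return results;
}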

Worker

We'll create a minimalistic API consisting of four endpoints: getAll, create, getCandidate and refreshCandidates.

The getAll endpoint will just do a SELECT * FROM guestbook_entry and return JSON that can then be used to render the list of guestbook entries on the frontend. For now, that's enough. In the future we may want pagination, and so we might add offset and limit parameters later. But that's later.
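
A minimal sketch of that handler (the binding names match the setup below; the cache TTL and the approved filter are my assumptions):

interface Env {
	DB: D1Database;
	GUESTBOOK_CACHE: KVNamespace;
}

// Sketch of getAll: serve from the KV cache if possible, otherwise query D1
// and cache the serialized result for a few minutes.
async function getAll(env: Env): Promise<Response> {
	const cached = await env.GUESTBOOK_CACHE.get("entries");
	if (cached) {
		return new Response(cached, { headers: { "Content-Type": "application/json" } });
	}
	const { results } = await env.DB.prepare("SELECT * FROM guestbook_entry WHERE approved = 1").all();
	const body = JSON.stringify(results);
	await env.GUESTBOOK_CACHE.put("entries", body, { expirationTtl: 300 });
	return new Response(body, { headers: { "Content-Type": "application/json" } });
}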

The create endpoint will be called when a user submits a new entry to the guestbook. It creates an INSERT statement for the guestbook_entry table. It also checks whether the request includes a new alt text suggestion and, if so, INSERTs that data into the alt_text_suggestion table, too.
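
Sketched out, with the request body shape as an assumption:

interface Env {
	DB: D1Database;
}

// Sketch of create: insert the guestbook entry and, if present, the alt text
// suggestion that was submitted alongside it.
async function create(request: Request, env: Env): Promise<Response> {
	const data = await request.json<{
		name: string;
		website?: string;
		message: string;
		altTextSuggestion?: string;
		candidateId?: number;
	}>();
	const entry = await env.DB.prepare(
		"INSERT INTO guestbook_entry (name, website, message) VALUES (?, ?, ?) RETURNING id",
	)
		.bind(data.name, data.website ?? null, data.message)
		.first<{ id: number }>();
	if (entry && data.altTextSuggestion && data.candidateId) {
		await env.DB.prepare(
			"INSERT INTO alt_text_suggestion (alt_text_suggestion, guestbook_message_id, alt_text_candidate_id) VALUES (?, ?, ?)",
		)
			.bind(data.altTextSuggestion, entry.id, data.candidateId)
			.run();
	}
	return new Response(null, { status: 201 });
}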

The getCandidate endpoint returns an entry from the alt_text_candidate table that doesn't have an approved suggestion yet. We will get slightly fancy and introduce some randomness, so that consecutive reloads of the form show different images.
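
The randomness can live in the SQL itself; a sketch using SQLite's RANDOM():

interface Env {
	DB: D1Database;
}

// Sketch of getCandidate: pick one random candidate that is still pending.
async function getCandidate(env: Env): Promise<Response> {
	const candidate = await env.DB.prepare(
		"SELECT id, post_url, image_url FROM alt_text_candidate WHERE status = 'pending' ORDER BY RANDOM() LIMIT 1",
	).first();
	return Response.json(candidate);
}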

The refreshCandidates endpoint has the most complicated logic, since it has to parse HTML (the entries of a JSON feed)[5] AND check whether the candidate is already in the database.
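
Roughly like this - the feed URL and the naive image extraction are assumptions, and the UNIQUE (post_url, image_url) constraint plus INSERT OR IGNORE takes care of the already-in-the-database check:

interface Env {
	DB: D1Database;
}

// Sketch of refreshCandidates: scan the blog's JSON feed for images and
// insert any we haven't seen yet as new candidates.
async function refreshCandidates(env: Env): Promise<Response> {
	const feed = await fetch("https://blog.martin-haehnel.de/feed.json"); // assumed feed URL
	const { items } = await feed.json<{ items: { url: string; content_html: string }[] }>();
	let added = 0;
	for (const item of items) {
		// Naive extraction of the first image; a real version might use HTMLRewriter.
		const match = item.content_html.match(/<img[^>]+src="([^"]+)"/);
		if (!match) continue;
		const result = await env.DB.prepare(
			"INSERT OR IGNORE INTO alt_text_candidate (post_url, image_url) VALUES (?, ?)",
		)
			.bind(item.url, match[1])
			.run();
		added += result.meta.changes ?? 0;
	}
	return Response.json({ added });
}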

Pages

Cloudflare Pages is what powers the blog itself. The way it works is that Cloudflare builds and redeploys the blog any time I push a new commit to the blog's GitHub repo. That part won't change, but we'll have to create a new page called /guestbook/.

The /guestbook/ page will consist of two parts:

  1. the guestbook form, including the alt text challenge
  2. the list of (approved) guestbook entries

Also, we want a clean way of accessing the API. In theory, we could just access our worker through its own URL - every worker gets a URL where it is reachable - but I think it's nicer if everything just works through https://blog.martin-haehnel.de/api/.

So in other words, the frontend JavaScript will not communicate directly with the api worker, but will instead send its requests to an /api/ catch-all endpoint. That endpoint is a Cloudflare Pages function, which in turn forwards the request to the actual worker. That way, only one visible domain is involved in serving the blog and its dynamic parts.[4]
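
The forwarding itself is tiny. A sketch, assuming a service binding to the worker named API_WORKER and a catch-all Pages function at functions/api/[[path]].ts:

interface Env {
	API_WORKER: Fetcher;
}

// Catch-all Pages function: hand every /api/* request over to the api worker.
export const onRequest: PagesFunction<Env> = (context) =>
	context.env.API_WORKER.fetch(context.request);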

Setup

Worker Repo

I started by creating a new worker through what is called C3, a CLI tool that creates new repos based on existing templates.

npm create cloudflare@latest -- --template=cloudflare/templates/d1-template

This tool will ask a couple of questions, like which directory we want to install the new worker into (cloudflare-guestbook), whether we want to use Git for version control (yes), and whether we want to deploy the new worker as is (no).

Database, KV, Secrets

Initial Migration

Let's start by adjusting the migration file. I renamed mine to 0001_init.sql and replaced its contents as follows:

-- Migration number: 0001 	 2025-02-16T22:03:36.794Z
-- Guestbook Entries
CREATE TABLE IF NOT EXISTS guestbook_entry
(
  id INTEGER PRIMARY KEY NOT NULL,
  name TEXT NOT NULL,
  website TEXT,
  message TEXT NOT NULL,
  approved BOOLEAN NOT NULL DEFAULT FALSE,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Alt Text Candidates
CREATE TABLE IF NOT EXISTS alt_text_candidate
(
  id INTEGER PRIMARY KEY NOT NULL,
  post_url TEXT NOT NULL,
  image_url TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'completed', 'removed')), -- pending = not yet processed, completed = alt text added, removed = image removed
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  CONSTRAINT unique_post_image UNIQUE (post_url, image_url)
);

-- Alt Text Suggestions
CREATE TABLE IF NOT EXISTS alt_text_suggestion
(
  id INTEGER PRIMARY KEY NOT NULL,
  alt_text_suggestion TEXT NOT NULL,
  guestbook_message_id INTEGER NULL,
  alt_text_candidate_id INTEGER NULL, -- Allow NULL to keep suggestion even if candidate is deleted
  approved BOOLEAN NOT NULL DEFAULT FALSE,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (guestbook_message_id) REFERENCES guestbook_entry (id) ON DELETE CASCADE,
  FOREIGN KEY (alt_text_candidate_id) REFERENCES alt_text_candidate (id) ON DELETE SET NULL
);

-- Indexes for Optimized Queries
CREATE INDEX idx_guestbook_approved ON guestbook_entry (approved);
CREATE INDEX idx_suggestion_candidate ON alt_text_suggestion (alt_text_candidate_id);
CREATE INDEX idx_suggestion_approved ON alt_text_suggestion (approved);
CREATE INDEX idx_candidate_image ON alt_text_candidate (image_url);
CREATE INDEX idx_candidate_status ON alt_text_candidate (status);

-- Auto-update `updated_at` on row modifications
CREATE TRIGGER update_guestbook_updated_at
  AFTER UPDATE
  ON guestbook_entry
  FOR EACH ROW
BEGIN
  UPDATE guestbook_entry SET updated_at = CURRENT_TIMESTAMP WHERE id = OLD.id;
END;

CREATE TRIGGER update_candidate_updated_at
  AFTER UPDATE
  ON alt_text_candidate
  FOR EACH ROW
BEGIN
  UPDATE alt_text_candidate SET updated_at = CURRENT_TIMESTAMP WHERE id = OLD.id;
END;

CREATE TRIGGER update_suggestion_updated_at
  AFTER UPDATE
  ON alt_text_suggestion
  FOR EACH ROW
BEGIN
  UPDATE alt_text_suggestion SET updated_at = CURRENT_TIMESTAMP WHERE id = OLD.id;
END;

Lots of stuff going on here: besides the columns from the sketch above, every table gets created_at and updated_at timestamps, there are indexes for the lookups we expect to be frequent, and triggers keep updated_at current whenever a row changes.

Creating The Resources And Configuring Bindings

Next, we'll actually create our database, apply the migration, and create a secret and a couple of KV namespaces (used for caching and rate limiting, more on that later...) through Cloudflare's CLI tool wrangler, and bind them to the worker so that it can use them. What's important here is that we want three different setups:

  1. local - where D1 and KV are just simulated locally
  2. remote preview - where we run our worker on Cloudflare's infrastructure, but not on production data
  3. prod - our actual production deployment of the guestbook
Prod

We'll start "backwards", so to speak, because some local resources are implicitly created when creating those resources on Cloudflare.

Database
npx wrangler d1 create blog-guestbook

Once the database has been created, you'll be shown a little piece of JSON:

{
  "d1_databases": [
    {
      "binding": "DB",
      "database_name": "blog-guestbook",
      "database_id": "51eab7a0-a1a3-a441-ce5b-8da45d3a61ad"
    }
  ]
}

Take note of the database_name and database_id. Next, we'll edit the wrangler.jsonc:

{
  //...
  "d1_databases": [
    {
      "binding": "DB",
      "database_id": "51eab7a0-a1a3-a441-ce5b-8da45d3a61ad", // <- update this
      "database_name": "blog-guestbook" // <- maybe update this
    }
  ],
  //...
}

Next, let's apply our migration:

npx wrangler d1 migrations apply --remote blog-guestbook

It'll ask if we actually want to apply the migration (Y). Afterwards, we should be able to see our tables with the correct columns in the Cloudflare dashboard.

KV

KV is another Cloudflare service: a simple key/value store optimized for quick reads, which makes it perfect for caching. KV stores don't have columns or any other schema - you can create/read/update/delete keys and their values, and that's it.

We'll use two "tables" (called namespaces) to store the following:

  1. RATE_LIMIT - data used to rate limit submissions to the create endpoint
  2. GUESTBOOK_CACHE - a cached version of the guestbook entries served by getAll

So let's create those namespaces:

npx wrangler kv:namespace create "RATE_LIMIT"
npx wrangler kv:namespace create "GUESTBOOK_CACHE"

After each run of the command, we get another bit of JSON to add to our wrangler.jsonc configuration file, which in the end should look something like this:

{
  //...
  "kv_namespaces": [
      {
        "binding": "RATE_LIMIT",
        "id": "91b5e789d42744aar802aace073ed882",
      },
      {
        "binding": "GUESTBOOK_CACHE",
        "id": "gg19ede5038147ccbcc2a90f65495f92",
      }
  ],
  //...
}
Secret

Because we're going to have a few routes that are not supposed to be used by anyone but the admin of the blog, we'll create a secret API key that we'll check for when these admin routes are hit.

A secret is in practice nothing other than an environment variable, except that it is masked and not part of the source code repo. You create one using wrangler:

npx wrangler secret put ADMIN_API_KEY

Wrangler will ask for the value and then save the secret. You can refer to it as if it were just a normal environment variable.[6]
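
In the worker, the check against that secret can be as simple as this (the header name is my choice):

interface Env {
	ADMIN_API_KEY: string;
}

// Sketch of the admin guard: compare a request header against the secret binding.
function isAdmin(request: Request, env: Env): boolean {
	return request.headers.get("X-Admin-Api-Key") === env.ADMIN_API_KEY;
}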

Preview Environment

Okay, moving on to the preview environment. The reason we want this environment is that even though we can - and should - develop our little worker mostly locally, it is still a good idea to test the worker in its actual environment.

Database

Let's create a preview database:

npx wrangler d1 create blog-guestbook_preview

This time we only need the id of the new database. We will extend our already existing D1 binding:

{
  //...
  "d1_databases": [
    {
      "binding": "DB",
      "database_id": "51eab7a0-a1a3-a441-ce5b-8da45d3a61ad",
      "database_name": "blog-guestbook",
      "preview_database_id": "bbe2ce46-acfc-4072-817d-5441a90c7419" //<- new
    }
  ],
  //...
}

And now we'll apply the migration:

npx wrangler d1 migrations apply --preview --remote blog-guestbook
KV

Let's add preview ids to our KV namespaces next:

npx wrangler kv:namespace create "RATE_LIMIT" --preview
npx wrangler kv:namespace create "GUESTBOOK_CACHE" --preview

In both cases we get a JSON looking like this:

{
  "kv_namespaces": [
    {
      "binding": "RATE_LIMIT",
      "preview_id": "b1ad7e6dc2a8441087a546578c8de2b6"
    }
  ]
}

Take note of the preview_id line and copy it:

{
  //...
  "kv_namespaces": [
      {
        "binding": "RATE_LIMIT",
        "id": "91b5e789d42744aar802aace073ed882",
        "preview_id": "b1ad7e6dc2a8441087a546578c8de2b6" // <- here...
      },
      {
        "binding": "GUESTBOOK_CACHE",
        "id": "gg19ede5038147ccbcc2a90f65495f92",
        "preview_id": "93c1cb783b24461fba02afb20566c34e" // <- ... and here
      }
  ],
  //...
}
Secret

The preview environment will reuse the prod secret; the big difference between prod and preview is simply the data storage. So nothing needs to be done here.

Local Environment
Database

For the local environment, the database will be created for us based on the binding configuration in the wrangler config file. We do have to apply the migration to it, though, which is done as follows:

npx wrangler d1 migrations apply --local blog-guestbook
KV

Nothing to do here. These will be created locally, based on our config from earlier.

Secret

For our local env to work, we need a not-so-secret secret for the ADMIN_API_KEY.

First, we create a .dev.vars file in the root of our repo and add something like this:

ADMIN_API_KEY=123

Coding Setup

Code-Organization And Configuration

Let's start by preparing the repo for a good editing experience by adding a .editorconfig file to the repo:

root = true

[*]
charset = utf-8
end_of_line = lf
indent_style = tab
insert_final_newline = true
max_line_length = 120
tab_width = 4
trim_trailing_whitespace = true

Next, we'll make sure that we can autoformat and lint any added code. For this, we'll use prettier and eslint.

The template we initially used to get going defined the entrypoint as /src/index.ts, so we'll add all our newly written code under src going forward as well.

Testing

Manual testing is nice, but we should also do some automated testing. Cloudflare recommends using vitest, so we'll do just that.

In the end we should have installed a couple of packages and have a vitest.config.ts in the repo root plus an additional tsconfig.json inside the test folder.[7]
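
For reference, the vitest.config.ts ends up looking roughly like this (a sketch following the vitest-pool-workers setup):

import { defineWorkersConfig } from "@cloudflare/vitest-pool-workers/config";

export default defineWorkersConfig({
	test: {
		poolOptions: {
			workers: {
				// See footnote 7: the default examples use wrangler.toml here.
				wrangler: { configPath: "./wrangler.jsonc" },
			},
		},
	},
});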

We're now ready to write tests.

Let's write a dummy test and run it to see that everything works as expected. Create a file in test named dummy.test.ts and add the following:

import { describe, expect, it } from "vitest";
import { createExecutionContext, env, waitOnExecutionContext } from "cloudflare:test";

describe("dummy", () => {
	it("should pass", () => {
		expect(1).toBe(1);
	});
	it("should be able to use cloudflare workers integration", () => {
		expect(env).toBeDefined();
		expect(createExecutionContext).toBeDefined();
		expect(waitOnExecutionContext).toBeDefined();
	});
});

We'll add more tests later, but this should show us that everything is set up correctly. You can run the test either in your IDE or on the command line:

npx vitest
Husky

To ensure that any committed code conforms to our coding style standards and passes our tests, let's also use Husky, a handy tool that makes sure any code about to be committed to source control is ready.
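
Assuming Husky v9 (earlier versions used husky-init), the setup is just:

npm install --save-dev husky
npx husky init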

This installs Husky and a git hook that tries to run npm run test, which doesn't exist at this point. So let's add a test script - and linting and formatting scripts as well - to our package.json:

{
  //...
  "scripts": {
    "prepare": "husky", // this one should already exist
    "test": "vitest run",
    "test:watch": "vitest run --watch",
    "lint": "eslint",
    "lint:fix": "eslint --fix",
    "format": "prettier . --check",
    "format:write": "prettier . --write"
  }
  //...
}

The test:watch, lint:fix and format:write scripts are not needed for Husky, but they are handy during development. And since we were already doing stuff in the file, why not add them?

Let's update the git hook to use test, lint and format. Edit the file .husky/pre-commit:

npm run test
npm run format
npm run lint

That's it. Now anytime we update the worker, we ensure that those updates don't accidentally change the code for the worse.


  1. I do add alt text to all other pictures that accompany my writing, though. ↩︎

  2. This will be skippable of course. ↩︎

  3. The following is not complete, there are creation dates, indexes and triggers for the updated_at columns in the actual migration. See below. ↩︎

  4. In theory, we could've implemented the whole guestbook using pages functions. I didn't do that because I wanted to know how workers work. But it would've been totally feasible to do so using only the pages application. ↩︎

  5. I'm not 100% sure this is a good idea, as there may have been times when there was more than one dog pic per post, but for now, we'll work with this. ↩︎

  6. Normal env vars can be set in your wrangler.jsonc. We just don't have a need for them here. ↩︎

  7. Make sure, by the way, that you change the line that references wrangler.toml - we're using a wrangler.jsonc after all. ↩︎