Using Google Sheets and Azure Functions as a headless CMS

U

One of my clients came to me with a relatively simple requirement; they needed to take online payments for their fitness event photography business.

The model is quite simple:

  • There are several different competition dates per year.
  • Competitors can purchase professional photos of themselves at a competition.
  • Each competitor will have a ‘badge number’ to identify them.
  • If purchased before the event, the price is £42.50. If after, it’s £52.50 (needs to be configurable)

I proposed a very basic order form, with the following fields.

  • Event
  • Name
  • Email Address
  • Badge Number
  • Payment Details

Currently, they don’t have a full online website, so a CMS to manage the event date list would be over-kill.

We decided to experiment with a simple Google Sheets spreadsheet to provide a simple way of editing a list of events, and what prices the photos for each event would be, turning Google Sheets into a rudimentary content management system. To do this, we used a very simple 5 column sheet:

`Title | Date | Before Event Price | After Event Price | Key`

Most columns are self explanatory.
Key will be a url friendly slug of the event title, along with date (for use in URLs later, etc…)
This is generated automatically from the Tile and Date, using a script
(Tools -> Script Editor)

This script can be used in a row by calling it, replacing the respective A/B row number:

=Slugify(A2,B2)
=Slugify(A3,B3)
=Slugify(A4,B4)
…etc

API

Now I needed a way of exposing this as an API, so the very basic order form could call on it to populate a dropdown list.
I’ll use a simple Azure Function to process the Google Sheet.

First, I created a publicly viewable link for the Google Sheet:

And appended `?tqx=out:csv&sheet={Events}` – which causes the file to be sent out as a CSV.

The Azure Function then essentially does a GET call to the shareable link, and then uses CsvHelper to parse the CSV retrieved.
Check it doesn’t contain #NAME? (If it does, retry 5 times before failing) – and returns JSON:

The full code is available on GitHub.

Caching

Right now, the Azure Function retrieves the CSV from Google Sheets every request.
This sheet rarely changes (days / weeks between) so it makes sense to cache it.

Rather than doing this within the function itself, I decided to use Cloudflare.

First, I created a subdomain – api.mattmarshevents.co.uk

And configured the Azure Function to run under that.

You’ll notice it states ’Not Secure’
It is – but I’m lazily doing it using Cloudflare – I could use LetsEncrypt for this, but that’s a separate piece of work.

I added a `Cache-Control` header to the response from the Azure Function:

req.HttpContext.Response.Headers.Add("Cache-Control", "public, max-age=600, s-maxage=600");

This lets Cloudflare know to cache the response (for 600 seconds – but this could theoretically be longer)

And that’s it- our static content is now cached on the edge, by Cloudflare, meaning it’s fast.
Typically returning in ~20ms – and importantly, not hitting the Azure Function, and not charging.

But there’s a problem:
If a change is made to the Google Sheet, we have to wait up to 600 seconds (as per our cache policy) for it to be reflected.

Clearing The Cache When Required

As co-found of of StackOverflow Jeff Attwood said:

There’s 2 hard things in computer science.

  • Naming things.
  • Cache invalidation
  • Off by one errors.
  • In our case, when the Google Sheet is edited, we ideally want to clear the Cloudflare cache.

    I noticed in the Cloudflare API, there was an ability to purge files by url:
    https://api.cloudflare.com/#zone-purge-files-by-url

    They give a curl command:

    But that’s not much help to me; I need c#
    Fortunately, for the lazy like myself, there’s https://curl.olsh.me/ – which takes a curl request, and translates it into an `HttpClient` request:

    Using that, I was able to piece together another function:

    So now, by sending a http DELETE to api.mattmarshevents.co.uk/events/cache would fire this function, and delete the items from the Cloudflare cache.

    Last piece of the puzzle, was to wire this up to be called automatically every time the Google Sheet was changed.

    I was able to do this by using another Google Script.
    More specifically, a trigger:

    Then add an ‘On Edit’ trigger:

    This calls a new Google Script function – ClearCache.

    The body of this trigger function is very simple:

    …which sends the required `DELETE` to `https://api/mattmarshevents.co.uk/events/cache`

    Keep-Alive

    It’s well documented that Azure Functions go to sleep after a period of time.
    To combat this, I added a very simple Timer Triggered function to my app, that runs every 5 minutes – that simply does nothing.
    This will prevent the function app from going to sleep.

    Cost

    Now that the solution has been up and running for about a month (and taken some orders) I took a look at how much this was costing:

    £0.03p for a month.

    Looking in Application Insights, I can see there’s been about 3000 requests to `GetEvents`

    requests
    | where name == ‘GetEvents’
    | where timestamp > ago(30d)

    … this could be reduced, by eliminating the keep-alive, and increasing the cache lifetime.

    Alternative to Azure Functions

    Originally, I looked at https://sheety.co – which on the face of it, looked ideal.
    However, there were a couple of issues:

    Firstly, my sheet has many empty rows (to allow for future population).
    These weren’t always ignored, meaning I would have 95+ empty rows.

    Secondly, and more importantly, since I’m using a calculated column (For ‘key’), these weren’t always fully evaluated before being returned (I do believe this is a big / limitation with Google Sheets, rather than Sheet, however)