For those who might not know, I curate a mailing list (cloudseclist.com) focused on the security aspects of cloud-native technologies. Although it started as a “normal” mailing list hosted on Mailchimp, I quickly realized I wanted to create something more tailored to my needs.

This blog post explains how I ended up creating a serverless mailing list in AWS, and how it works.


A Brief History of CloudSecList

Everything started in September 2019, when I thought of (somehow) publishing the most interesting security articles I stumbled upon during the week. Since I’ve always spent a good chunk of my day/week going through different sources of news (from Twitter to specialized blogs, etc.), why not share the most interesting ones with more people?

Initially, this experiment started as a page on this same website. It was ok, but multiple people I admire, like Clint Gibler and Scott Piper, suggested starting a mailing list.

So, after a week where I spent my evenings choosing a name and drafting a logo (of course), cloudseclist.com was born.

Since it was still an experiment, I opted for starting the mailing list on Mailchimp, which provided many basic features (like signup forms and, of course, mass mailing capabilities) out of the box. After a couple of iterations, though, Mailchimp started to show its limitations, which I ended up summarising in a tweet.

So in February 2020 I got fed up and decided to start working on a prototype.

That “prototype” has now turned into an implementation which has proved to be both cost-effective and reliable, since cloudseclist.com has been running on it for the past quarter.

Let’s finally have a look at how this works.


The Big Picture

The current setup of cloudseclist.com looks more or less like the diagram below.

High Level Overview

I appreciate this picture might seem daunting at first, given the number of components involved. To simplify the discussion and facilitate its analysis, we can break it down into 4 main areas, covered next:

  1. The Website (Static Hosting & Personal Emails)
  2. Development and Deployment (CI/CD)
  3. Serverless Mailing Solution
  4. Reporting

The Website (Static Hosting & Personal Emails)

Given the previous experience I had running this website (marcolancini.it) on AWS, I decided to adopt a similar approach for the showcase website of the mailing list: a static site hosted in an S3 bucket and deployed automatically via GitHub Actions.

I spoke about this kind of setup in a previous blog post (“My Blogging Stack”), but here are the details for cloudseclist.com.

Excerpt - Static Hosting & Personal Email

Domain and DNS

Although the domain name for cloudseclist.com is registered with Route53, I’ve found CloudFlare to be more customizable (and cheaper) for DNS management, so from Route53 I pointed the authoritative nameservers to the CloudFlare ones.

Static Hosting

Two S3 buckets host the content of the showcase website:

  • cloudseclist.com: an S3 bucket configured for static web hosting, with a bucket policy that allows everyone to read its content (a sketch of such a policy is shown below).
  • www.cloudseclist.com: another bucket configured for static hosting, which redirects all requests to the “main” cloudseclist.com bucket.
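
For reference, a minimal public-read bucket policy for the main bucket looks more or less like the following (a sketch; the actual policy might differ):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::cloudseclist.com/*"
        }
    ]
}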
The Showcase Website

With the two buckets set up, DNS entries in CloudFlare point two CNAMEs (cloudseclist.com and www) to the URL of the main bucket:

CNAME cloudseclist.com  cloudseclist.com.s3-website.eu-west-2.amazonaws.com
CNAME www               cloudseclist.com.s3-website.eu-west-2.amazonaws.com

I also use CloudFlare for its ability to automatically provide free TLS certificates and, most importantly, for its CDN to speed up delivery across the globe. With a bit of tweaking of the Caching options, together with a few Page Rules, I was able to obtain a decent level of caching as well (which saves me from getting charged by AWS for excessive data transfers in/out of S3):

CloudFlare Caching

(Sending/Receiving) Personal Emails

Since I wanted to manage everything from within AWS, I needed a way to be able to both receive and send emails from my @cloudseclist.com domain. For this, I’ve found the setup proposed by aws-lambda-ses-forwarder quite effective (with only a couple of tweaks needed).

The README in that repo is quite thorough but, in short, the process to set up email reception with SES is the following:

  • First of all, I had to verify both the cloudseclist.com domain and a forwarding email address (<redacted>@gmail.com) within SES.
  • Then, I had to create an S3 bucket to store incoming emails (let’s call it mailbox-bucket). This bucket has a policy that allows SES to put objects in it, and a lifecycle configured to delete objects after 90 days from creation.
  • The next step involved setting up a Lambda (let’s call it SesForwarder) that forwards every incoming email to a destination address (Gmail in my case). This can be achieved by modifying the constants in the index.js file (provided in the aws-lambda-ses-forwarder repo) to fetch emails from mailbox-bucket and forward them to <redacted>@gmail.com (see the sketch after this list).
  • Finally, I had to set up a Receipt Rule in SES, with 2 actions performed for every email incoming into SES:
    • S3 action: choose mailbox-bucket. This will allow SES to store the incoming email as an object in the specified S3 bucket.
    • Lambda action: choose the SesForwarder Lambda function. This will trigger the lambda, which, in the end, will forward the email to the destination address.
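
Based on the configuration object documented in that repo, the tweaked constants in index.js end up looking more or less like this (a sketch; the values are illustrative and the redacted address is a placeholder):

// index.js (aws-lambda-ses-forwarder) - tweaked configuration constants
var defaultConfig = {
    fromEmail: "noreply@cloudseclist.com",  // verified address used as sender of forwarded emails
    subjectPrefix: "",                      // optional prefix prepended to forwarded subjects
    emailBucket: "mailbox-bucket",          // S3 bucket where SES stores incoming emails
    emailKeyPrefix: "",                     // key prefix within that bucket
    forwardMapping: {
        // catch-all: forward anything sent to the domain to the destination address
        "@cloudseclist.com": ["<redacted>@gmail.com"]
    }
};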

Setting up outgoing emails was then just a matter of creating an SMTP user in SES, and configuring Gmail to send emails as that SMTP user.


Development and Deployment (CI/CD)

Having discussed the setup of the showcase website, and how the content is stored and delivered through AWS, it is now time to explain how I generate the website itself and write new issues of the mailing list.

Excerpt - CI/CD

Website Generation via Jekyll

I’m a fan of monorepos, so the code for all my websites (well, I mainly have 2 at the moment) is stored in a single private repository on GitHub:

❯ tree -L 1 websites/
websites
├── README.md
├── cloudseclist.com
└── marcolancini.it
2 directories, 1 file

As mentioned previously, I did blog about the setup for marcolancini.it in another post (“My Blogging Stack”), whereas here we will be focusing on cloudseclist.com:

❯ tree cloudseclist.com -L 3
cloudseclist.com
├── docker-compose.yml
├── resources
│   ├── logo
│   └── setup
│       ├── diagrams
│       ├── mailer
│       └── website
├── terraform
└── web
    ├── Dockerfile
    └── site
        ├── 404.html
        ├── Gemfile
        ├── Gemfile.lock
        ├── _config.yml
        ├── _config_dev.yml
        ├── _drafts
        ├── _includes
        ├── _layouts
        ├── _posts
        ├── _sass
        ├── _site
        ├── assets
        ├── feed.xml
        ├── index.html
        ├── past.html
        └── unsubscribe.html

I use Jekyll as a static site generator (from the directory listing above, you can see the web/site/ folder which contains the Jekyll website). The custom modifications I’ve made (apart from a custom theme) are related to the way I run Jekyll, which is via custom Docker images coordinated via docker-compose:

version: '2'
services:
    # ------------------------------------------------------------------------------------
    # WEBSITE
    # ------------------------------------------------------------------------------------
    web:
        container_name: cls_web
        restart: always
        build:
            context: ./web/
            dockerfile: Dockerfile
        volumes:
            - $PWD/web/site/:/src/website/
        ports:
            - 127.0.0.1:4000:4000
        environment:
              - VIRTUAL_HOST=127.0.0.1
              - VIRTUAL_PORT=4000
        command: jekyll serve --config _config.yml,_config_dev.yml  --host 0.0.0.0 --port 4000
  • Lines 9-11: the image for the container running Jekyll comes from a custom Dockerfile (shown below):
FROM jekyll/jekyll:latest

# Create workdir
RUN mkdir -p /src/website/
WORKDIR /src/website/

# Cache bundle install
COPY ./site/Gemfile* /src/website/
RUN chmod a+w Gemfile.lock
RUN bundle install
  • Line 13: the web/site/ folder (containing the Jekyll installation) is shared with the container, so I can make changes from Visual Studio Code on my host and have Jekyll automatically pick up the changes and render them.
  • Line 15: port 4000 of the container is exposed on the same port on localhost, so that I can access the preview at http://127.0.0.1:4000.
  • Line 19: the command to run Jekyll, with the _config_dev.yml file used to override the base setup just for local usage.

With this setup, I can create a new issue of the mailing list by adding a new markdown file in the web/site/_posts/ folder and writing the content locally as YAML front matter. Here is a skeleton template:

---
layout: post
issue: "40"
title: "📖 [The CloudSecList] Issue 40"
date: "2020-06-07"
articles:
    - {
        "title": "",
        "title_url": "",
        "description": ""
    }
    - {
        "title": "",
        "title_url": "",
        "description": ""
    }
tools:
    - {
        "title": "",
        "title_url": "",
        "description": ""
    }
aws:
    - {
        "title": "",
        "title_url": "",
        "description": ""
    }
gcp:
    - {
        "title": "",
        "title_url": "",
        "description": ""
    }
---

But how does this end up published?

Automatic Deployments via GitHub Actions

Deploying the static website is done via GitHub Actions, which allows me to automatically push the generated HTML content to S3 every time I push to the main branch. Here is what this GitHub workflow looks like:

name: CloudSecList

on:
  push:
    branches:
      - main
    paths:
      - 'cloudseclist.com/web/site/*'
      - 'cloudseclist.com/web/site/*/*'
      - 'cloudseclist.com/web/site/*/*/*'

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout repository
      uses: actions/checkout@v2
      with:
        ref: main
        fetch-depth: 1
    - name: Build the site in the jekyll/builder container
      run: |
        docker run \
        -v ${FOLDER}:/srv/jekyll -v ${FOLDER}/_site:/srv/jekyll/_site \
        jekyll/builder:latest /bin/bash -c "chmod 777 /srv/jekyll && jekyll build --future"
      env:
        FOLDER: ${{ github.workspace }}/cloudseclist.com/web/site
    - name: Deploy
      run: aws s3 sync ${FOLDER}/_site/ s3://${BUCKET} --delete
      env:
        FOLDER: ${{ github.workspace }}/cloudseclist.com/web/site
        BUCKET: cloudseclist.com
        AWS_ACCESS_KEY_ID: ${{ secrets.CLOUDSECLIST_AWS_ACCESS_KEY_ID }}
        AWS_SECRET_ACCESS_KEY: ${{ secrets.CLOUDSECLIST_AWS_SECRET_ACCESS_KEY }}
  • Lines 5-10: the workflow only runs when a file gets modified in the cloudseclist.com/web/site/* folder on the main branch. This avoids triggering the pipeline from commits on feature branches.
  • Lines 17-21: the first step simply checks out the repository.
  • Lines 22-28: the second step builds the site in the jekyll/builder container.
  • Line 30: finally, the third step syncs the generated website with the main S3 bucket, using the AWS API keys specified in lines 34-35.
Deployment via GitHub Actions

Infrastructure as Code via Terraform and GitHub Actions

Speaking of CI/CD, another component I rely on to deploy infrastructure in AWS is Terraform. Here as well, I’ve integrated Terraform with GitHub Actions, so that every time I modify a file within the cloudseclist.com/terraform/ folder, a dedicated pipeline gets triggered.

Here is what the GitHub workflow looks like:

name: "[CLOUDSECLIST] Terraform"

on:
  push:
    branches:
      - main
    paths:
      - "cloudseclist.com/terraform/*"
  pull_request:
    paths:
      - "cloudseclist.com/terraform/*"

jobs:
  terraform:
    name: "[CLOUDSECLIST] Terraform"
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          cli_config_credentials_token: ${{ secrets.TF_API_TOKEN }}

      - name: Terraform Init
        run: |
          cd ${FOLDER}
          terraform init
        env:
          FOLDER: ${{ github.workspace }}/cloudseclist.com/terraform/

      - name: Terraform Format
        run: |
          cd ${FOLDER}
          terraform fmt -check
        env:
          FOLDER: ${{ github.workspace }}/cloudseclist.com/terraform/

      - name: Terraform Plan
        id: plan
        run: |
          cd ${FOLDER}
          terraform plan -no-color
        env:
          FOLDER: ${{ github.workspace }}/cloudseclist.com/terraform/
          AWS_ACCESS_KEY_ID: ${{ secrets.CLOUDSECLIST_TF_AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.CLOUDSECLIST_TF_AWS_SECRET_ACCESS_KEY }}

      - name: Show Terraform Output
        uses: actions/github-script@0.9.0
        if: github.event_name == 'pull_request'
        env:
          STDOUT: "```${{ steps.plan.outputs.stdout }}```"
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            github.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: process.env.STDOUT
            })

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: |
          cd ${FOLDER}
          terraform apply -auto-approve
        env:
          FOLDER: ${{ github.workspace }}/cloudseclist.com/terraform/
          AWS_ACCESS_KEY_ID: ${{ secrets.CLOUDSECLIST_TF_AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.CLOUDSECLIST_TF_AWS_SECRET_ACCESS_KEY }}
  • Lines 5-11: the workflow only runs when a file gets modified in the cloudseclist.com/terraform/* folder on the main branch (or in a pull request). This avoids triggering the pipeline from commits on feature branches.
  • Lines 19-20: the first step simply checks out the repository.
  • Lines 22-25: the second step uses the hashicorp/setup-terraform action from HashiCorp to set up Terraform and configure the Terraform Cloud backend by specifying the TF_API_TOKEN secret.
  • Lines 27-49: the following steps perform the usual Terraform workflow of init-format-plan.
  • Lines 51-64: the output of the plan step gets automatically posted as a comment of the pull request (as shown in the screenshot below).
  • Lines 66-74: finally, only when the pull request gets merged, the final step performs the apply step of the Terraform workflow.
Output of Terraform Plan
Terraform Execution via GitHub Actions

Serverless Mailing Solution

We saw how everything gets deployed, so now it is time to focus on the actual logic behind the serverless mailing list.

The core of the serverless mailing solution consists mainly of 2 DynamoDB tables, 2 SQS queues, a few API Gateway endpoints, 4 Lambdas, and, of course, SES.

Excerpt - Mailing Solution

Let’s analyze it by use case.

Subscribe Users

The showcase website hosted at cloudseclist.com has a form that allows visitors to subscribe to the mailing list.

Subscription Form

Behind the scenes, the form is plugged into a REST API Gateway endpoint (/subscribe) which, upon invocation, triggers a Lambda function (lambda_subscribe). The Lambda itself performs some sanity/security checks on the inputs provided, verifies whether the email address is already subscribed, and adds it to a DynamoDB table if not.

API Gateway Lambda Integration

This main DynamoDB table contains, alongside the email address provided (also used as the partition key), a randomly generated ID, the subscription date, and the number of the last issue sent to the user.
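
As a rough sketch, lambda_subscribe boils down to something like the following (a minimal sketch, assuming the AWS SDK for JavaScript v2; table and field names are illustrative, not the actual implementation):

// lambda_subscribe - sketch: validate the input and conditionally add the user
const AWS = require("aws-sdk");
const crypto = require("crypto");
const dynamo = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event) => {
    const body = JSON.parse(event.body || "{}");
    const email = (body.email || "").trim().toLowerCase();

    // Sanity/security checks on the input provided
    if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
        return { statusCode: 400, body: "Invalid email address" };
    }

    try {
        // Conditional put: fails if the email (partition key) is already subscribed
        await dynamo.put({
            TableName: "users",
            Item: {
                email: email,                               // partition key
                id: crypto.randomBytes(16).toString("hex"), // randomly generated ID
                subscribed_at: new Date().toISOString(),    // subscription date
                last_issue: 0                               // number of the last issue sent
            },
            ConditionExpression: "attribute_not_exists(email)"
        }).promise();
    } catch (err) {
        if (err.code !== "ConditionalCheckFailedException") throw err;
        // Already subscribed: fall through and return 200 anyway
    }
    return { statusCode: 200, body: "OK" };
};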

Create new Issues

Every time a GitHub Action creates a new HTML page (issue) in the cloudseclist.com S3 bucket, a second Lambda (lambda_create_issue) gets triggered.

S3 Trigger for Lambda

The lambda:

  1. First, it verifies that the uploaded file is a well-formatted issue (filtering out other HTML pages).
  2. Then, it downloads the HTML file from S3 into a buffer, and extracts the issue name (to be used as Subject in the emails).
  3. If the release date of the issue doesn’t match the current date, the Lambda terminates to avoid resending the same issue multiple times.
  4. If the date is correct, the Lambda creates a new entry in a second DynamoDB table (issues), composed of the issue number (used as the partition key), subject, date, and a counter (user_count) which tracks the number of people the specific issue has been sent to (initialised to 0).
  5. The Lambda then loops through the DynamoDB table containing the subscribed users and, for each of them:
    • Checks if the issue has already been sent to the user.
    • Injects a personalized unsubscribe URL.
    • Creates an object (email_params) containing the email content.
    • Pushes the object to a FIFO SQS queue.
email_params = {
    ConfigurationSetName: CONFIGURATION_SET,
    Source: SENDER,
    Destination: {
        ToAddresses: [<recipient>]
    },
    Message: {
        Body: {
            Html: {
                Charset: "UTF-8",
                Data: <body>
            },
        },
        Subject: {
            Charset: "UTF-8",
            Data: <subject>
        }
    }
};
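
The enqueueing step itself (the last sub-bullet above) is roughly the following (a sketch; the queue URL and the deduplication/grouping strategy are assumptions):

// Push the email parameters to the FIFO SQS queue - sketch (AWS SDK for JavaScript v2)
const AWS = require("aws-sdk");
const sqs = new AWS.SQS();

await sqs.sendMessage({
    QueueUrl: QUEUE_URL,                                    // URL of the FIFO queue (assumed)
    MessageBody: JSON.stringify(email_params),
    MessageGroupId: String(issue_number),                   // FIFO queues require a message group ID
    MessageDeduplicationId: `${issue_number}-${recipient}`  // one message per issue/recipient pair
}).promise();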

The SQS queue is pivotal in this setup, as it decouples the creation of new issues from the actual process of sending them via SES (described in the next section). Properly configuring this SQS queue has also been the most challenging part of the setup, particularly around visibility timeouts and retention periods, and around respecting the SES sending limits so as not to rate limit myself. Luckily, a few good resources helped along the way (1, 2, 3, 4, 5).

Errors from this queue get shipped to a dead-letter queue, which, as we will see, is used to monitor the health of the pipeline.

Send Emails

Each time a message gets enqueued in the FIFO SQS queue described above, a third Lambda (lambda_send_issue) gets triggered.

The lambda:

  • Parses the email parameters from the object in the queue.
  • Checks if the issue has already been sent to the user, and terminates if so.
  • Creates the SES service object and the send promise: new AWS.SES({ apiVersion: "2010-12-01" }).sendEmail(email_params).promise();
  • Handles the promise’s fulfilled/rejected states: if SES successfully sent the email, the Lambda deletes the message from the queue and increments the counter which tracks the number of people the specific issue has been sent to (user_count).
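
Putting it together, a minimal sketch of lambda_send_issue (error handling trimmed; the queue URL, table/key names, and the issueNumberFrom helper are assumptions):

// lambda_send_issue - sketch, triggered by the FIFO SQS queue (AWS SDK for JavaScript v2)
const AWS = require("aws-sdk");
const ses = new AWS.SES({ apiVersion: "2010-12-01" });
const sqs = new AWS.SQS();
const dynamo = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event) => {
    for (const record of event.Records) {
        // Parse the email parameters from the object in the queue
        const email_params = JSON.parse(record.body);

        // (the "already sent to this user" check would go here)

        // Send the email via SES
        await ses.sendEmail(email_params).promise();

        // On success, delete the message from the queue...
        await sqs.deleteMessage({
            QueueUrl: QUEUE_URL, // assumed to come from an environment variable
            ReceiptHandle: record.receiptHandle
        }).promise();

        // ...and increment the per-issue counter in the issues table
        await dynamo.update({
            TableName: "issues",
            Key: { issue: issueNumberFrom(email_params) }, // hypothetical helper
            UpdateExpression: "ADD user_count :one",
            ExpressionAttributeValues: { ":one": 1 }
        }).promise();
    }
};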

Unsubscribe Users

Finally, the last use case consists of someone who wishes to unsubscribe from the mailing list.

Every email sent has a personalized unsubscribe link which, if clicked, brings the user to a dedicated page asking them to confirm that they wish to unsubscribe. If they do confirm, the form submits a request to an API Gateway endpoint (/unsubscribe).

Unsubscribe Page

The endpoint is integrated with a fourth Lambda, which performs some sanity/security checks on the input received and, if everything is as expected, proceeds to remove the user from the main DynamoDB table.
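
A sketch of this last Lambda follows (again, table and field names are illustrative):

// lambda_unsubscribe - sketch (AWS SDK for JavaScript v2)
const AWS = require("aws-sdk");
const dynamo = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event) => {
    const { email, id } = JSON.parse(event.body || "{}");
    if (!email || !id) {
        return { statusCode: 400, body: "Invalid request" };
    }

    try {
        // Conditional delete: only succeeds if the per-user random ID matches,
        // so an unsubscribe link only works for its legitimate recipient
        await dynamo.delete({
            TableName: "users",
            Key: { email: email },
            ConditionExpression: "id = :id",
            ExpressionAttributeValues: { ":id": id }
        }).promise();
    } catch (err) {
        if (err.code === "ConditionalCheckFailedException") {
            return { statusCode: 400, body: "Invalid request" };
        }
        throw err;
    }
    return { statusCode: 200, body: "Unsubscribed" };
};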


Reporting

Another area I had to address revolves around monitoring and reporting.

Although I am not interested in massively tracking my users with tracking links injected everywhere, I still need some basic reporting.

Excerpt - Reporting

In particular, I created 3 weekly reports that give me visibility into:

  1. Health of SES: bounces/complaints/deliveries.
  2. Engagement: number of users opening the issue each week (anonymizing who opened the email).
  3. Subscriber Count: a weekly update on the number of subscriptions/unsubscriptions for the past week.

Bounces and Complaints

SES can be configured to automatically forward any bounce/complaint/successful delivery to a specific SNS topic. An SQS queue can then be subscribed to the topic in order to buffer all these events.

I then configured a CloudWatch Event that triggers a Lambda function weekly; the Lambda consumes all the items in the queue and generates an HTML report. The report gets temporarily stored in an S3 bucket and sent to me via email.
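
The core of that Lambda is essentially a loop that drains the queue; a condensed sketch (the queue URL is an assumption, and the HTML rendering is omitted):

// Weekly reporting Lambda - sketch: drain the SQS queue buffering the SES notifications
const AWS = require("aws-sdk");
const sqs = new AWS.SQS();

async function drainQueue(queueUrl) {
    const events = [];
    while (true) {
        const res = await sqs.receiveMessage({
            QueueUrl: queueUrl,
            MaxNumberOfMessages: 10, // maximum allowed per receive call
            WaitTimeSeconds: 1
        }).promise();
        if (!res.Messages || res.Messages.length === 0) break;
        for (const msg of res.Messages) {
            // SNS wraps the original SES notification in a "Message" field
            events.push(JSON.parse(JSON.parse(msg.Body).Message));
            await sqs.deleteMessage({
                QueueUrl: queueUrl,
                ReceiptHandle: msg.ReceiptHandle
            }).promise();
        }
    }
    return events; // bounce/complaint/delivery events, ready to be rendered as HTML
}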

Bounces Report

Engagement

Automating engagement reports is quite similar to the process described above for bounces. SES can be configured (via a Configuration Set) to forward to a specific SNS topic any event involving someone opening an email. It should be noted that SES could also be configured to track clicks, but this is not something I’m interested in.

As before, an SQS queue is subscribed to the topic in order to buffer all these events. Another CloudWatch Event then triggers a Lambda function weekly, which consumes all the items in the queue and generates an HTML report. The report gets temporarily stored in an S3 bucket and sent to me via email.

Subscriber Count

For the subscriber count report, a third CloudWatch Event triggers another Lambda weekly, which performs a “diff” of the subscribed users against the previous week and generates a report with the number of new/removed users.
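
A sketch of the diff logic, assuming the previous week's subscriber list is kept as a JSON snapshot in S3 (bucket and key names are hypothetical):

// Subscriber count report - sketch: diff current subscribers against last week's snapshot
const AWS = require("aws-sdk");
const dynamo = new AWS.DynamoDB.DocumentClient();
const s3 = new AWS.S3();

async function subscriberDiff() {
    // Scan the users table for the current subscriber list (paginated)
    const current = new Set();
    let lastKey;
    do {
        const page = await dynamo.scan({
            TableName: "users",
            ProjectionExpression: "email",
            ExclusiveStartKey: lastKey
        }).promise();
        page.Items.forEach((item) => current.add(item.email));
        lastKey = page.LastEvaluatedKey;
    } while (lastKey);

    // Load last week's snapshot from S3
    const obj = await s3.getObject({ Bucket: "reports-bucket", Key: "subscribers.json" }).promise();
    const previous = new Set(JSON.parse(obj.Body.toString()));

    const added = [...current].filter((e) => !previous.has(e)).length;
    const removed = [...previous].filter((e) => !current.has(e)).length;
    return { added, removed, total: current.size };
}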


How Much Does this Cost?

Let’s start with the monthly cost of AWS services: the table and graph below show my expenditure for the AWS account hosting cloudseclist.com over the past quarter (1st March 2020 - 31st May 2020):

Month          Tax ($)  S3 ($)  SES ($)  SQS ($)  DynamoDB ($)  API Gateway ($)  Lambda ($)  SNS ($)  CloudWatch ($)  Total ($)
Mar 2020       0.03     0.02    0.00     0.10     0.03          0.01             0.00        0.00     0.00            0.19
Apr 2020       0.03     0.01    0.00     0.11     0.03          0.00             0.00        0.00     0.00            0.18
May 2020       0.01     0.01    0.00     0.10     0.04          0.00             0.00        0.00     0.00            0.16
Service Total  0.07     0.04    0.00     0.31     0.10          0.01             0.00        0.00     0.00            0.53
AWS Expenditure for the Past 3 Months

Notice: Data transfer costs are included in the services that they’re associated with, such as Amazon EC2 or Amazon S3. They aren’t represented as either a separate line item in the data table or a bar in the chart.

On average this adds up to ~$0.177 per month (~£0.14 at the current exchange rate).

On top of this, we have to add:

  • Domain name registration (cloudseclist.com): $13.5 per year ($1.125 per month).
  • GitHub actions: $0, since I’m way below the 2,000 minutes per month of the free tier.
  • CloudFlare: $0, since I’m on the free tier.

So, in total, I’m spending $1.265 per month (~£1.01), mostly coming from the domain name fees:

Monthly Cost    Total ($)
AWS Services    0.14
Domain Name     1.125
GitHub Actions  0.00
CloudFlare      0.00
Total           1.265

I’ll let you calculate the difference with the price Mailchimp charges…


Conclusion

Thank you for making it this far! I hope you found this post interesting; it described my workflow for creating and managing a serverless mailing list built on top of SES.

If something is unclear, or if I’ve overlooked some aspects, please do let me know on Twitter @lancinimarco.