Serverless Ad Blocking with Cloudflare Gateway

I’ve always wanted to set up a Pi-hole to block advertisements in my home office, but, at the same time, I didn’t want physical boxes lying around to maintain (plus, I do hate cables).

In this blog, I’ll explain how I managed to mimic the Pi-hole’s behaviour using only serverless technologies (Cloudflare Gateway, to be precise).

This post has been updated on :
  • Improved the Lists section: by having Terraform automatically split the blocklist into smaller chunks at runtime rather than committing multiple files to the repo.
  • Added the Keeping the domain list up to date section: to show how to keep the domain list automatically updated with GitHub Actions.

What is Cloudflare Gateway

Before jumping into the implementation part, a word on Cloudflare Gateway.

Cloudflare Gateway, called by Cloudflare a “Secure Web Gateway”, allows you to set up policies to inspect DNS, Network, and HTTP traffic.

Policy type | Inspected traffic | Use case
DNS policies | DNS queries | Block domains and IP addresses from resolving on your devices.
Network policies | Individual TCP/UDP/GRE packets | Block access to specific ports on your origin server, including non-HTTP resources.
HTTP policies | HTTP requests | Block specific URLs from loading, not just the domain itself.

I won’t go into the details of all its other features (like enhanced visibility and protection into SaaS applications) in this blog, but I’ll focus primarily on DNS policies. You can check the official documentation on Cloudflare Zero Trust if you are curious and want to explore the other features.

Set up your Cloudflare Teams account

Before starting, you’ll need to create a Cloudflare for Teams account to follow along with the rest of this blog post. If you don’t already have one, you can visit https://dash.teams.cloudflare.com/ and follow the setup guide. The free plan will be sufficient.

Once created, you can either click through the Settings section of the UI or use Terraform to configure the main options, since Terraform supports Cloudflare Teams with the Cloudflare Provider.

Below you can find an excerpt from my configuration:

resource "cloudflare_teams_account" "securitybite" {
  account_id = local.cloudflare_account_id

  block_page {
    enabled          = true
    name             = "Your Team Name"
    header_text      = "This website is blocked"
    footer_text      = "Some description"
    logo_path        = "https://example.com/logo.png"
    background_color = "#e8e8e8"
  }

  antivirus {
    enabled_download_phase = true
    enabled_upload_phase   = false
    fail_closed            = false
  }

  proxy {
    tcp = true
    udp = true
  }

  logging {
    redact_pii = true
    settings_by_rule_type {
      dns {
        log_all    = true
        log_blocks = false
      }
      http {
        log_all    = true
        log_blocks = false
      }
      l4 {
        log_all    = true
        log_blocks = false
      }
    }
  }

  activity_log_enabled = true
  tls_decrypt_enabled  = false

}
  • Line 1: the cloudflare_teams_account resource contains the configuration for the Secure Web Gateway (see related Terraform docs).
  • Line 2: most resources in Cloudflare’s Terraform provider are tied to a Cloudflare account via the account_id argument.
  • Lines 4-11: configuration for a custom block page (more on this below).
  • Lines 19-22: configuration block for specifying which protocols are proxied. In this case, I’ve enabled it for both TCP and UDP.
  • Lines 24-40: define whether all requests or only blocked ones are logged for the DNS, HTTP and L4 filters. Be sure to enable activity logging, at least for all DNS logs.
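
The excerpt above assumes that the Cloudflare provider is already configured and that local.cloudflare_account_id is defined somewhere in the module. A minimal sketch of that scaffolding could look like the following. The variable names (cloudflare_api_token, cloudflare_account_id) are placeholders chosen for illustration, not part of the original configuration, and the provider version is deliberately left unpinned here.

```hcl
terraform {
  required_providers {
    cloudflare = {
      source = "cloudflare/cloudflare"
      # Assumption: add a version constraint for the provider release you
      # have tested; the cloudflare_teams_* resource names used in this
      # post may differ in other major versions of the provider.
    }
  }
}

# Hypothetical input variables (names are placeholders)
variable "cloudflare_api_token" {
  type      = string
  sensitive = true
}

variable "cloudflare_account_id" {
  type = string
}

# Authenticate the provider with an API token
provider "cloudflare" {
  api_token = var.cloudflare_api_token
}

# Expose the account ID under the local name used throughout this post
locals {
  cloudflare_account_id = var.cloudflare_account_id
}
```

Keeping the account ID in a single local means every resource can reference it the same way, and the token never has to be committed to the repo.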

Connect devices to the Gateway

Next, you’ll need to configure your devices to send DNS queries to Cloudflare (or even proxy all traffic leaving the device through Cloudflare’s network). The most straightforward way to accomplish this is by installing the Cloudflare WARP client, allowing you to forward traffic from your device to Cloudflare’s edge, where Cloudflare Gateway can apply advanced filtering.

Cloudflare Zero Trust. Image courtesy of Cloudflare

Setting up Cloudflare WARP is relatively straightforward. Let’s see how.

Install the Cloudflare certificate

Although not strictly required to enable the WARP client, installing the Cloudflare root certificate on your device is helpful if you want to display a custom block page.

In fact, Gateway responds to any blocked domain with 0.0.0.0 and does not return that blocked domain’s IP address. As a result, the browser will show a default error page, and users will not be able to reach that website. This behaviour may confuse some users and make them think their Internet connection is not working.

Configuring a custom block page on the Zero Trust dashboard helps avoid this confusion.

Default vs Custom block page

The Cloudflare docs on how to install the Cloudflare certificate are pretty extensive (and include instructions for macOS, Windows, Linux, ChromeOS, iOS, and Android), but here is the short version for macOS:

  1. Download the Cloudflare certificate (.crt).
  2. Verify its fingerprint: ➜ openssl x509 -noout -fingerprint -sha256 -inform der -in <Cloudflare_CA.crt>
  3. Add the certificate to your system by installing it in the Login keychain and trusting it.
Installing the Cloudflare certificate

Install the WARP client

With the certificate now trusted, the last configuration step involves installing the WARP client on your device:

  1. Download the WARP client for your OS.
  2. In the WARP client Settings, log in to your organization’s Zero Trust instance (something like <your-team-name>.cloudflareaccess.com).
    WARP Client
  3. Verify the device’s connectivity:
    • On the WARP-enabled device, open a browser and visit any website.
    • In the Zero Trust dashboard, navigate to Logs > Gateway > DNS and make sure you can see DNS queries originating from your device.
Gateway Activity Log - DNS

A note on Chromium-based browsers

During my first attempt, I realized I couldn’t see my requests being proxied to Cloudflare. After a bit of digging, it turned out that Chrome doesn’t honour the operating system’s DNS settings by default.

To ensure requests from Chrome are proxied, go to Chrome’s Settings > Security > Use Secure DNS, and select With your current service provider.


Subscribe to CloudSecList

If you found this article interesting, you can join thousands of security professionals getting curated security-related news focused on the cloud native landscape by subscribing to CloudSecList.com.


Add policies to the Gateway

Now that devices are connected to the Gateway, we can start enforcing some policies.

With DNS policies, when a user makes a DNS request to Gateway, Gateway matches the request against the content or security categories you have set up. If the domain does not belong to any blocked categories, or if it matches an Override policy, the user’s client receives the DNS resolution and initiates an HTTP connection. You can find more details on DNS policies’ syntax on the Cloudflare docs website.

First, let’s look at a policy recommended by Cloudflare, and then I’ll describe the custom rule I created to mimic the Pi-hole behaviour.

Policy: Block security risks

Cloudflare provides a native policy (called Block all security risks) which blocks known threats (such as Command & Control, Botnet, Malware, etc.) based on Cloudflare’s threat intelligence. Too good not to enable it!

The following is its Terraform configuration:

resource "cloudflare_teams_rule" "block_malware" {
  account_id = local.cloudflare_account_id

  name        = "Block malware"
  description = "Block known threats based on Cloudflare’s threat intelligence"

  enabled    = true
  precedence = 10

  # Block all security risks
  filters = ["dns"]
  traffic = "any(dns.security_category[*] in {178 80 83 176 175 117 131 134 151 153 68})"
  action  = "block"

  rule_settings {
    block_page_enabled = true
  }
}
  • Line 11: The filter used for traffic matching. In this case, we are filtering DNS traffic.
  • Line 12: This is the actual rule, which will match any requests classified as belonging to any of those categories. The DNS Categories page of the Cloudflare docs lists all of them (for example, 178 maps to Typosquatting & Impersonation).
  • Line 13: The action to apply for any traffic that matches the filter. In this case, the request will be blocked.
  • Line 16: By setting block_page_enabled = true, we will have the Gateway return the custom block page we have previously created.

As a result, DNS queries for domains that Cloudflare labels as malicious will be blocked automatically.
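
The category IDs in the traffic expression are easy to mistype when they live inside one long string. One possible variation, shown here only as a sketch, keeps the IDs in a list and renders the expression from it. The demo_* names are hypothetical; the IDs are the ones already used in the rule above.

```hcl
locals {
  # Category IDs copied verbatim from the rule above
  demo_security_category_ids = [178, 80, 83, 176, 175, 117, 131, 134, 151, 153, 68]

  # Render the IDs as a space-separated set inside the expression
  demo_security_filter = format(
    "any(dns.security_category[*] in {%s})",
    join(" ", [for id in local.demo_security_category_ids : tostring(id)])
  )
}

output "demo_security_filter" {
  # Evaluates to:
  # any(dns.security_category[*] in {178 80 83 176 175 117 131 134 151 153 68})
  value = local.demo_security_filter
}
```

The rendered string is identical to the hand-written one, so it could be assigned to the traffic argument without changing the rule's behaviour.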

Policy: Ad blocking

Although Cloudflare has a Deceptive Ads category among its DNS Categories, this won’t encompass the whole set of ad-related domains we want to block.

Here is where the similarities with Pi-hole come into play (and also where they end): the only thing needed to mimic the Pi-hole’s behaviour is to select a blocklist suitable for our use case.

Pi-hole ships with a default list, StevenBlack’s Unified Hosts List, which I found to be a bit too extensive (with 140,919 blocklisted domains as of September 2022). Such a vast list also poses some problems in uploading this data into Cloudflare (more on this later). So I decided to look for a slightly “lighter” alternative.

After some googling, I stumbled upon The Big Blocklist Collection, which collects various lists for this purpose. From this list of lists, I opted for the AdAway default blocklist, which blocks both ad and analytics providers, with 7,320 blocklisted domains as of September 2022.

Now that we have a blocklist, we can create a Gateway Policy based on these hosts.

Lists

Adding each of these 7,320 hosts in the traffic field of a cloudflare_teams_rule rule, as we saw above, is not feasible. Luckily for us, Cloudflare provides Lists, which are lists of URLs, hostnames, or other entries to reference when creating Secure Web Gateway policies. Lists allow quickly making rules that match and take action against several items at once.

The limitation is that lists can include up to 5,000 entries for Enterprise subscriptions and 1,000 for Standard subscriptions. Hence, Pi-hole’s default blocklist of ~140,000 domains was too big for the free plan.

I ended up taking the AdAway default blocklist, committing it to my monorepo, and then splitting it into smaller chunks, each made up of up to 1,000 entries, at runtime by Terraform itself.
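
As a quick sanity check on the numbers quoted above, the number of lists needed is the entry count divided by the per-list limit, rounded up. The following sketch uses hypothetical demo_* names and can be evaluated with terraform console or a plain terraform apply, since it needs no provider.

```hcl
locals {
  demo_domain_count = 7320 # AdAway entries, as quoted above
  demo_list_limit   = 1000 # per-list limit quoted above

  # 7320 / 1000 = 7.32, rounded up to the next whole list
  demo_lists_needed = ceil(local.demo_domain_count / local.demo_list_limit)
}

output "demo_lists_needed" {
  value = local.demo_lists_needed # 8
}
```

Seven full lists of 1,000 entries plus one partial list of 320 entries cover the whole blocklist.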

AdAway list split committed to my monorepo
locals {
  # The full path of the list holding the domain list
  pihole_domain_list_file = "${path.module}/cloudflare/lists/pihole_domain_list.txt"

  # Parse the file and create a list, one item per line
  pihole_domain_list = split("\n", file(local.pihole_domain_list_file))

  # Remove empty lines
  pihole_domain_list_clean = [ for x in local.pihole_domain_list : x if x != "" ]

  # Use chunklist to split a list into fixed-size chunks
  # It returns a list of lists
  pihole_aggregated_lists = chunklist(local.pihole_domain_list_clean, 1000)

  # Get the number of lists (chunks) created
  pihole_list_count = length(local.pihole_aggregated_lists)
}

resource "cloudflare_teams_list" "pihole_domain_lists" {
  account_id = local.cloudflare_account_id

  for_each = {
    for i in range(0, local.pihole_list_count) :
      i => element(local.pihole_aggregated_lists, i)
  }

  name  = "pihole_domain_list_${each.key}"
  type  = "DOMAIN"
  items = each.value

  # TODO: Needed cause otherwise it will keep adding the same items at each apply
  lifecycle {
    ignore_changes = [items]
  }
}
  • Line 1-17: first, we must do some manipulations to create the chunked lists.
    • Line 3: a local variable which contains the full path of the file holding the domain list. Useful for future refactoring if we want to change the filename without affecting the rest of the logic.
    • Line 6: we then parse the file and create a list, one item per line.
    • Line 9: we will upload a list containing domain names, so Cloudflare will reject anything else that is not in the correct format. Hence, we are removing any potential empty lines within the file.
    • Line 13: we use the chunklist function to split the original list into fixed-size chunks, each of up to 1,000 entries. It will create as many lists as needed to split the original one.
    • Line 16: for ease, we also calculate the number of lists (chunks) created in the previous step. This number will be helpful for the for_each below.
  • Lines 22-25: the for_each creates one pihole_domain_lists resource for each chunk.
  • Line 28: we specify that the list contains domain names (not IPs, URLs, or emails).
  • Line 29: as items, we provide each chunk’s content.
  • Lines 32-34: one thing I haven’t figured out (maybe a bug in the provider?) is that, unless Terraform is forced to ignore changes to items, it tries to add the same items again at each apply. If someone knows why, please do let me know!
Cloudflare Lists
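
To see what the parsing and chunking steps produce without touching Cloudflare, the same functions can be run on a tiny inline sample. This is only a sketch: the demo_* names and the sample domains are hypothetical, and the extra trimspace call is an assumption on my part to cope with stray whitespace or Windows line endings, which the configuration above does not handle.

```hcl
locals {
  # Hypothetical file content: three domains, one blank line, a trailing newline
  demo_raw = "ads.example.com\n\ntracker.example.net\nmetrics.example.org\n"

  # One item per line, with surrounding whitespace (including \r) trimmed
  demo_lines = [for x in split("\n", local.demo_raw) : trimspace(x)]

  # Drop the empty items left by blank lines and the trailing newline
  demo_clean = [for x in local.demo_lines : x if x != ""]

  # Chunks of at most 2 entries here; the real configuration uses 1000
  demo_chunks = chunklist(local.demo_clean, 2)
}

output "demo_chunks" {
  # [["ads.example.com", "tracker.example.net"], ["metrics.example.org"]]
  value = local.demo_chunks
}
```

The last chunk is simply shorter than the others, which is why the list count has to be computed rather than hard-coded.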

Rule

Now we can easily reference the chunked lists in a Gateway rule:

locals {
  # Iterate through each pihole_domain_list resource and extract its ID
  pihole_domain_lists = [for k, v in cloudflare_teams_list.pihole_domain_lists : v.id]

  # Format the values: remove dashes and prepend $
  pihole_domain_lists_formatted = [for v in local.pihole_domain_lists : format("$%s", replace(v, "-", ""))]

  # Create filters to use in the policy
  pihole_ad_filters = formatlist("any(dns.domains[*] in %s)", local.pihole_domain_lists_formatted)
  pihole_ad_filter  = join(" or ", local.pihole_ad_filters)
}

resource "cloudflare_teams_rule" "block_ads" {
  account_id = local.cloudflare_account_id

  name        = "Block Ads"
  description = "Block Ads domains"

  enabled    = true
  precedence = 11

  # Block domain belonging to lists (defined below)
  filters = ["dns"]
  action  = "block"
  traffic = local.pihole_ad_filter

  rule_settings {
    block_page_enabled = false
  }

}
  • Line 1-11: first of all, we need to do some manipulations to create the filter as expected by Cloudflare.
    • Line 3: we start by iterating through each pihole_domain_list resource and extracting its ID.
    • Line 6: then, we need to remove dashes from these IDs (as the rule doesn’t accept them) and then prepend them with the $ sign.
    • Line 9-10: finally, we create the actual filter by putting the formatted list IDs in OR statements like "any(dns.domains[*] in $xxxx)" (where xxxx is a list ID).
  • Line 23: The filter used for traffic matching. In this case, we are filtering DNS traffic.
  • Line 24: The action to apply for any traffic that matches the filter. In this case, the request will be blocked.
  • Note how I am not setting block_page_enabled = true here, as I don’t want to show a custom block page but simply not load the ads.
Ad blocking policy (the policy name is highlighted in red because it contains a complex filter that cannot be modified by the web UI, but only via API)
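
The string manipulation above is easier to follow with concrete values. The sketch below runs the same functions on two made-up list IDs (the demo_* names and both IDs are hypothetical placeholders, not real Cloudflare list IDs) and needs no provider to evaluate.

```hcl
locals {
  # Hypothetical list IDs in UUID form
  demo_list_ids = [
    "11111111-2222-3333-4444-555555555555",
    "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  ]

  # Remove the dashes and prepend "$" to each ID
  demo_list_refs = [for v in local.demo_list_ids : format("$%s", replace(v, "-", ""))]

  # One any(...) clause per list, joined with "or"
  demo_filter = join(" or ", formatlist("any(dns.domains[*] in %s)", local.demo_list_refs))
}

output "demo_filter" {
  # any(dns.domains[*] in $11111111222233334444555555555555) or any(dns.domains[*] in $aaaaaaaabbbbccccddddeeeeeeeeeeee)
  value = local.demo_filter
}
```

With eight chunked lists, the real expression contains eight such clauses, which is why it is generated rather than written by hand.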

The final result? Take a look below.

Before vs After

Analytics

To see the top Allowed and Blocked requests across your devices, navigate to Analytics > Gateway. You can filter the data by selecting a specific location and/or time.

Gateway Analytics

In addition, you can inspect any Allowed/Blocked requests in the Gateway Activity Log.

Gateway Activity Log

Keeping the domain list up to date

The last remaining point is: with the list committed to the repo, how can we keep it automatically updated?

In this case, I opted for a GitHub Actions workflow that periodically (monthly) fetches the list upstream and commits it to the repo if it has changed:

name: 'Update Pi-Hole Domain List'

on:
  workflow_dispatch:
  schedule:
    - cron: '0 10 15 * *' # At 10:00 on day-of-month 15

env:
  FOLDER: '<redacted>/cloudflare/lists'

jobs:
  auto-update:
    runs-on: ubuntu-20.04

    permissions:
      id-token: write
      contents: write
      pull-requests: write

    steps:
      - name: 📂 Checkout Branch
        uses: actions/[email protected]

      #
      # Fetch domain list
      #
      - name: 🔗 Fetch Domain List
        working-directory: ${{ env.FOLDER }}
        run: |
          LIST_URL="https://adaway.org/hosts.txt"
          LIST_FNAME="pihole_domain_list.txt"

          echo "[*] Fetching list: ${LIST_URL} -> ${LIST_FNAME}"
          wget --quiet $LIST_URL -O $LIST_FNAME

          echo "[*] Sorting list..."
          sort -u -o $LIST_FNAME $LIST_FNAME

          echo "[*] Removing comments..."
          grep -o '^[^#]*' $LIST_FNAME > temp.txt
          mv temp.txt $LIST_FNAME

          echo "[*] Extracting domains..."
          cat $LIST_FNAME | awk '{ print $2 }' > temp.txt
          mv temp.txt $LIST_FNAME

          echo "[*] Removing localhost from list..."
          sed -i '/localhost/d' $LIST_FNAME
          sed -i '/127.0.0.1/d' $LIST_FNAME

      #
      # Commit file
      #
      - name: ↗️ Create Pull Request
        uses: peter-evans/[email protected]
        with:
          token: ${{ secrets.GITHUB_TOKEN }}
          title: 'Update Pi-hole domain list'
          branch-suffix: timestamp
          commit-message: 'Update Pi-hole domain list'
          body: ''
GitHub Action Workflow
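
Because the workflow commits whatever the upstream file contains after a few text transformations, it may be worth filtering the entries once more on the Terraform side before they reach Cloudflare. The sketch below is an optional addition rather than part of the setup described in this post: the demo_* names and the sample entries are hypothetical, and the regular expression is a deliberately simple approximation of a hostname, not a full validator.

```hcl
locals {
  # Hypothetical sample of lines as they might appear in the generated file
  demo_entries = [
    "ads.example.com",
    "localhost",
    "127.0.0.1",
    "# stray comment",
    "tracker.example.net",
  ]

  # Keep only entries that look like a dotted hostname ending in an
  # alphabetic TLD. regex() raises an error when nothing matches,
  # and can() turns that error into false.
  demo_valid_entries = [
    for x in local.demo_entries : x
    if can(regex("^([a-z0-9-]+\\.)+[a-z]{2,}$", x))
  ]
}

output "demo_valid_entries" {
  # ["ads.example.com", "tracker.example.net"]
  value = local.demo_valid_entries
}
```

In the real configuration, the same condition could be added to the comprehension that removes empty lines, so that a malformed upstream file results in fewer list items rather than a failed apply.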

Conclusions

In this post, I explained how I blocked advertisements in my home office, mimicking the Pi-hole’s behaviour, using only serverless technologies.

So far, the experience has been very positive:

  • No servers to maintain and keep up-to-date (and no cables either!)
  • Ad blocking is not tied to a single network: I can switch to a mobile network on my phone and still maintain the protection of the filters enforced by Cloudflare Gateway.
  • No noticeable performance impact.

I hope you found this post valuable and interesting, and I’m keen to get feedback on it! If you find the information shared helpful, if something is missing, or if you have ideas on improving it, please let me know on 🐣 Twitter or at 📢 feedback.marcolancini.it.

Thank you! 🙇‍♂️

Marco Lancini
Hi, I'm Marco Lancini. I am a Principal Security Engineer, advisor, investor, and writer mainly interested in cloud native technologies, security, and technical leadership...