Semgrep for Cloud Security

What is Semgrep?
Semgrep for Infrastructure as Code
- Terraform
  - Unencrypted EBS Volumes
  - Open Security Groups
- Kubernetes
Conclusions

Semgrep is an emerging static analysis tool that is gaining traction within the AppSec community. Its broad support for multiple programming languages, together with the ease with which rules can be created, makes it a powerful tool that can help AppSec teams scale their efforts to eliminate entire classes of vulnerabilities from their codebases.

But what about cloud security? In the era of Infrastructure as Code, where tools like Terraform, CloudFormation, Pulumi (and many others) are used to provision infrastructure from (de facto) source code, can we apply the same approach to eradicate classes of cloud-related vulnerabilities from a codebase?

I decided to spend part of my weekend experimenting with this, to get an idea of what Semgrep can offer cloud/platform security teams.

What is Semgrep?

Before jumping into the details, it is worth explaining what Semgrep actually is. As per their website, Semgrep is:

A fast, open-source, static analysis tool that excels at expressing code standards — without complicated queries — and surfacing bugs early at editor, commit, and CI time.

Precise rules look like the code you’re searching; no more traversing abstract syntax trees or wrestling with regexes.

The Semgrep Registry has 1,000+ rules written by the Semgrep community covering security, correctness, and performance bugs. No need to DIY unless you want to.

At a high level, Semgrep leverages Abstract Syntax Trees (ASTs) to build a model of the code you are analyzing. Unlike other tools based on ASTs, though, Semgrep lowers the barrier to entry by abstracting away the AST syntax itself.

Code as ASTs. Image courtesy of Clint Gibler.

Out of the box, Semgrep supports mainstream programming languages (e.g., Go, Java, Python, Ruby, and JavaScript) and has a library of open source rules ready to be reused.

Explaining how to use Semgrep is out of scope for this blog post, but the official documentation is really well made, and the online playground is an excellent place to start experimenting with it (without having to spend time installing anything).

Semgrep for Infrastructure as Code

As briefly mentioned earlier, the benefit that Semgrep can bring to AppSec teams is obvious (and if you are still not convinced, I recommend watching this presentation from Clint Gibler).

What I was curious to try was how well the same approach could fit a codebase made of Terraform (HCL) and YAML files, as those languages are not currently supported by Semgrep. Hence, I relied on its Generic Pattern Matching engine.

Terraform

The official semgrep-rules repository already contains a folder dedicated to Terraform.

Within this folder, we can see 7 rules already made open source, mainly focusing on Terragoat scenarios and S3 buckets.

Unencrypted EBS Volumes

Let’s start wrapping our heads around it by picking the unencrypted-ebs-volume rule. In the repo we can see a sample Terraform file (shown below):

resource "aws_ebs_volume" "web_host_storage" {
  availability_zone = "ap-southeast-2"
  encrypted         = false
  size = 1
  # ruleid: unencrypted-ebs-volume
  tags = {
    Name = "abcd-ebs"
  }
}

Quite straightforward: an aws_ebs_volume resource declares an EBS volume with encryption disabled (as can be seen from encrypted = false).

So what we want to grep for here is an occurrence of encrypted = false (or, more generally, the absence of encrypted = true), as shown in the corresponding rule:

rules:
- id: unencrypted-ebs-volume
  patterns:
    - pattern-either:
      - pattern: |
          {...}
    - pattern-not-inside: |
        resource "aws_ebs_volume" "..." {... encrypted=true ...}
    - pattern-inside: |
        resource "aws_ebs_volume" "..." {...}
  languages:
    - generic
  paths:
    include:
    - '*.tf'
  message: |
    An EBS volume is configured without encryption enabled.
  severity: WARNING

You can try this rule in the Semgrep playground: https://semgrep.dev/s/ZWrA/.
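
For contrast, here is a minimal sketch of a volume definition that this rule should leave alone, because the block contains encrypted = true and therefore falls under the pattern-not-inside clause. The resource name and tag value are placeholders of mine, not taken from the semgrep-rules repository.

```hcl
# Hypothetical compliant counterpart of the sample above
# (resource name and tag value are illustrative only).
resource "aws_ebs_volume" "web_host_storage_encrypted" {
  availability_zone = "ap-southeast-2"
  encrypted         = true # this is what the pattern-not-inside clause looks for
  size              = 1
  tags = {
    Name = "abcd-ebs-encrypted"
  }
}
```

Keep in mind that the generic engine matches text rather than Terraform semantics, so a block that sets the value indirectly (say, encrypted = var.encrypt, with a hypothetical variable) would most likely still be reported, since the literal encrypted = true never appears in it.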

Open Security Groups

As a second test, I wanted to create my first Semgrep rule to detect a Security Group open to the world (0.0.0.0/0), like the one below:

resource "aws_security_group" "allow_tls" {
  name        = "allow_tls"
  description = "Allow TLS inbound traffic"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "TLS from VPC"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["10.0.1.0/24", "0.0.0.0/0"]
  }

  tags = {
    Name = "allow_tls"
  }
}

What we want to grep for here is any occurrence of 0.0.0.0/0 within an ingress block:

rules:
- id: open-security-group
  patterns:
    - pattern-inside: ingress { ... }
    - pattern: "0.0.0.0/0"
  languages:
    - generic
  paths:
    include:
    - '*.tf'
  message: |
    A security group is allowing inbound traffic from the public internet (0.0.0.0/0).
  severity: WARNING

You can try this rule in the Semgrep playground: https://semgrep.dev/s/ne51/.

Of course, this is a very basic case, where the offending string (0.0.0.0/0) is hardcoded directly within the security group definition. The rule will have to be extended if we want to take into account cases where the CIDR is supplied indirectly, for example via variables.
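
As a sketch of such a case, consider the variation below, where the CIDR list comes from a variable. The variable name allowed_cidrs and its default value are hypothetical, chosen only for illustration. The string 0.0.0.0/0 now lives in the variable block rather than inside ingress { ... }, so the rule above would not report it.

```hcl
# Hypothetical variable; name and default are illustrative only.
variable "allowed_cidrs" {
  type    = list(string)
  default = ["0.0.0.0/0"]
}

resource "aws_security_group" "allow_tls" {
  name   = "allow_tls"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = var.allowed_cidrs # no literal CIDR here for the rule to match
  }
}
```

Catching this would mean following the variable back to its definition (or to a .tfvars file), which a purely textual pattern cannot do on its own.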

Kubernetes

Next, I wanted to create a rule more focused on Kubernetes (or, more precisely, YAML files).

Let’s take as an example the case where you want to enforce that all your Kubernetes Ingresses are private, with no public ones allowed:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: test-ingress
  annotations:
    kubernetes.io/ingress.class: public
spec:
  rules:
  - http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test
            port:
              number: 80

In this example, we want to grep for the kubernetes.io/ingress.class annotation, and ensure it has the approved value of nginx-internal:

rules:
- id: public-ingress
  patterns:
    - pattern: kubernetes.io/ingress.class
    - pattern-not-inside: |
        kubernetes.io/ingress.class: nginx-internal
  languages:
    - generic
  paths:
    include:
    - '*.yaml'
  message: |
    An Ingress has been made public.
  severity: WARNING

You can try this rule in the Semgrep playground: https://semgrep.dev/s/ErGE/.

Conclusions

I have to say that Semgrep’s extensibility and simple syntax make it very promising for cloud security teams as well. In a few hours, thanks to the official documentation and the Playground, I was able to go from absolute zero to writing my first rules.

The main question I can think of at the moment is: how much does Semgrep overlap with OPA Conftest? Although Conftest was created with cloud resources in mind, and benefits from synergies with the rest of the OPA offering (like Gatekeeper), basically everyone in the industry has at some point complained about how cumbersome the Rego language is. In my opinion, this could be a defining factor that might help expand the adoption of Semgrep among platform teams.

I’m quite curious to hear other people’s opinions on this, so please feel free to reach out to me on Twitter.

Marco Lancini