Protect Your Serverless Applications with AWS WAF and CrowdSec: Part II

In the previous article of this series, I showed you how to deploy the CrowdSec Security Engine to protect an Nginx server running on an EC2 instance behind an Application Load Balancer. In case you missed it, you can find it here. While this setup is still common, it’s a bit “old-fashioned” and can be more difficult to manage in the long term.

Many applications deployed in the cloud use the serverless approach, and CrowdSec can help you protect those as well!

Our target application infrastructure is pretty straightforward:

  • The application is served by Lambda functions behind API Gateway
  • The API Gateway will be behind a CloudFront distribution
  • The AWS WAF will protect the CloudFront distribution

The CrowdSec Security Engine and the AWS WAF Remediation Component will run on an EC2 instance to simplify deployment.

The Security Engine will read logs from the CloudFront logging bucket and make decisions about IPs. The AWS WAF Remediation Component will then manage the content of an AWS WAF WebACL to block malicious IPs.

If you want to try it for yourself, the Terraform configuration used to deploy this infrastructure is available in this GitHub repository.

Word of caution: This deployment is NOT intended for production usage, as some IAM permissions are very permissive, and access to the API Gateway is not restricted only to CloudFront.

Deploying the infrastructure

The application I’m using for this tutorial is very simple: it returns a Hello World message in a JSON body on /.

I won’t discuss the application’s deployment, as it’s outside the scope of this article, but you can check out the Terraform configuration in the GitHub repository shared above.

The main points to know are:

  • CloudFront will send its logs to an S3 bucket
  • S3 will send a notification to an SQS queue when a new file is uploaded to the bucket
  • CrowdSec will monitor the SQS queue for events and automatically read new log files as they appear in the bucket
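
To make the flow above more concrete, here is a minimal Python sketch of what a consumer of that SQS queue has to do with each message: decode the S3 event notification and extract the bucket and object key of the new log file. This is only an illustration, not CrowdSec’s actual code; the function name, bucket name, and object key are hypothetical, and the sample message is trimmed to the fields the sketch reads.

```python
import json
from urllib.parse import unquote_plus


def objects_from_s3_notification(message_body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for objects created in an S3 event notification."""
    payload = json.loads(message_body)
    created = []
    for record in payload.get("Records", []):
        # Only object-creation events point at a new log file.
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        created.append((bucket, key))
    return created


# Hypothetical message body, trimmed to the fields used above.
sample_body = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-cloudfront-logs"},
            "object": {"key": "E2EXAMPLE.2024-01-01-12.abcd1234.gz"},
        },
    }]
})

print(objects_from_s3_notification(sample_body))
```

CrowdSec handles this step itself; the sketch only shows which parts of the notification matter.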

As in the previous article, I grant the EC2 instance an IAM role that allows access to the AWS WAF APIs and, in addition, to the S3 logging bucket.

Note: If you are trying this yourself, CloudFront may take up to 4 hours to start reliably delivering logs after the initial configuration, so don’t panic if you don’t see any logs straight away.

Deploying the CrowdSec Security Engine

For the full tutorial on how to get started and install CrowdSec, check out the official documentation.

First, you need to add the CrowdSec repositories to the EC2 instance:

$ curl -s https://install.crowdsec.net | sudo sh

Once this is done, install CrowdSec:


$ sudo apt install crowdsec

Because I will be parsing CloudFront logs, I can simply install the CloudFront collection from the CrowdSec Hub:


$ sudo cscli collections install crowdsecurity/aws-cloudfront

This will download the CloudFront log parser, the various HTTP-related scenarios, and the HTTP context so you can see more information about the malicious requests detected by the Security Engine in the CrowdSec Console.

Speaking of the Console, I’ll also enroll my new installation in the CrowdSec Console.

If you do not already have an account, go to https://app.crowdsec.net, create one, and get your enroll key:


$ sudo cscli console enroll XXXXX --name crowdsec-cloudfront

Now, I’ll go back to the Console and accept my new Security Engine.

I’ll also enable context sharing:


$ sudo cscli console enable context

My next step is to configure the CrowdSec Security Engine to read the CloudFront logs from my S3 bucket.

To do this, I am going to use the S3 datasource by creating a file: /etc/crowdsec/acquis.d/s3_cloudfront.yaml


source: s3
polling_method: sqs
sqs_name: crowdsec-log-notification
sqs_format: s3notification
use_time_machine: true
region: eu-west-1
labels:
  type: aws-cloudfront
  

This configuration tells CrowdSec to monitor the SQS queue called crowdsec-log-notification for new files appearing in an S3 bucket and that the logs will be in the AWS CloudFront format.

One very important parameter is use_time_machine: by default, when processing logs in real time, the Security Engine will use the time at which it reads a line as the timestamp (as some log types do not have a timestamp in them 😢).

But in the case of CloudFront, the logs are only uploaded to the bucket every five minutes, and the Security Engine will process an entire file at once. This means using the ingestion timestamp could lead to many false positives (the Security Engine would think that five minutes’ worth of logs happened in a few seconds at most).

The use_time_machine parameter forces the Security Engine to look at the timestamp of each log line and use that instead when processing the events (this is how the replay mode works!).
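
To illustrate why this matters, here is a small Python sketch of a leaky bucket fed with the same 20 requests twice: once with the timestamps taken from the log lines, and once with the time at which the file was read. It is a simplified model, not CrowdSec’s implementation; the capacity, leak speed, and sample log lines are assumptions chosen for the example.

```python
from datetime import datetime, timedelta

CAPACITY = 10        # assumed: events the bucket holds before it overflows
LEAK_SECONDS = 10.0  # assumed: one event leaks out of the bucket every 10 seconds


def log_timestamp(line: str) -> datetime:
    """Parse the date and time fields that start a CloudFront standard log line."""
    date, time = line.split("\t")[:2]
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")


def overflows(timestamps: list[datetime]) -> bool:
    """Return True if a leaky bucket overflows for events at these times."""
    level = 0.0
    last = None
    for ts in timestamps:
        if last is not None:
            # Drain the bucket for the time elapsed since the previous event.
            elapsed = (ts - last).total_seconds()
            level = max(0.0, level - elapsed / LEAK_SECONDS)
        level += 1
        if level > CAPACITY:
            return True
        last = ts
    return False


# Synthetic log lines: one 404 every 15 seconds for about five minutes (truncated fields).
start = datetime(2024, 1, 1, 12, 0, 0)
lines = [
    f"{ts:%Y-%m-%d}\t{ts:%H:%M:%S}\tCDG50-C1\t512\t203.0.113.7\tGET\texample.cloudfront.net\t/{i}/\t404"
    for i, ts in enumerate(start + timedelta(seconds=15 * n) for n in range(20))
]

# Time machine on: use the timestamp written in each log line.
log_times = [log_timestamp(line) for line in lines]

# Time machine off: the whole file is read within a few milliseconds.
read_at = datetime(2024, 1, 1, 12, 5, 30)
ingest_times = [read_at + timedelta(milliseconds=n) for n in range(len(lines))]

print("log timestamps overflow:", overflows(log_times))
print("ingestion timestamps overflow:", overflows(ingest_times))
```

With the log timestamps, the slow client never fills the bucket. With ingestion timestamps, the same 20 lines arrive almost at once and the bucket overflows, which is exactly the kind of false positive described above.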

I can now restart the Security Engine since I am done configuring it.


$ sudo systemctl restart crowdsec

Installing the AWS WAF Remediation Component

Now, let’s install the AWS WAF Remediation Component.

Since I already set up the CrowdSec repository, I can just run the command:


$ sudo apt install crowdsec-aws-waf-bouncer

Because it runs on the same machine as CrowdSec, it gets automatically registered in the Security Engine.

I still need to configure the Remediation Component to update the content of my web ACL by editing the /etc/crowdsec/bouncers/crowdsec-aws-waf-bouncer.yaml file:


api_key: XXXXXX
api_url: "http://127.0.0.1:8080/"
update_frequency: 10s
log_media: file
log_dir: /var/log/
log_level: info
daemon: true
waf_config:
  - web_acl_name: cloudfront-webacl
    fallback_action: ban
    rule_group_name: crowdsec-rule-group
    scope: CLOUDFRONT
    ipset_prefix: crowdsec-ipset-
    capacity: 300
    

The important section is waf_config, which tells the Remediation Component to manage the content of the cloudfront-webacl ACL, create a rule group named crowdsec-rule-group, and prefix the IPSets it automatically creates with crowdsec-ipset-.

Once this configuration is done, I can restart the Remediation Component:


$ sudo systemctl restart crowdsec-aws-waf-bouncer

The Remediation Component will automatically create the required resources.

Even though I haven’t attacked my CloudFront distribution yet, if I look at the IPSets in the AWS console, I already see a lot of blocked IPs. That’s the CrowdSec community blocklist, which is redistributed automatically to every user (each set contains up to 10k IPs).
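
As a rough picture of why several IPSets show up, here is a Python sketch that splits a large blocklist into chunks that each fit in one IPSet, using the 10k figure above as the limit. It is not the Remediation Component’s code: the function name and the numeric suffix appended to the prefix are a naming scheme made up for this example, and the addresses are synthetic.

```python
import ipaddress

MAX_ADDRESSES_PER_IPSET = 10_000  # per-set limit, matching the 10k figure in the text


def split_into_ipsets(addresses, prefix="crowdsec-ipset-"):
    """Group CIDR strings into named chunks that each fit in one IP set."""
    unique = sorted(set(addresses))
    return {
        # Hypothetical naming: prefix followed by the chunk index.
        f"{prefix}{index}": unique[start:start + MAX_ADDRESSES_PER_IPSET]
        for index, start in enumerate(range(0, len(unique), MAX_ADDRESSES_PER_IPSET))
    }


# 25,000 synthetic addresses standing in for a blocklist, written in CIDR notation.
base = ipaddress.IPv4Address("10.0.0.0")
blocklist = [f"{base + offset}/32" for offset in range(25_000)]

for name, members in split_into_ipsets(blocklist).items():
    print(name, len(members))
```

A blocklist of 25,000 addresses ends up in three sets here: two full ones and a third holding the remainder.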

Time to test

Now let’s test everything!

I’ll just simulate a very dumb HTTP scanner:


$ for x in $(seq 1 100); do curl https://d2vwdcykiqlka3.cloudfront.net/$x/ ; done

After the scan is done, I need to wait a few minutes for CloudFront to make the logs available, and then I can verify CrowdSec was able to parse the logs and detect the attack properly:


$ sudo cscli metrics
...
Local API Alerts:
╭────────────────────────────┬───────╮
│           Reason           │ Count │
├────────────────────────────┼───────┤
│ crowdsecurity/http-probing │ 1     │
╰────────────────────────────┴───────╯
...

You can see that CrowdSec triggered a decision against my IP:


$ cscli decisions list
│   ID   │  Source  │    Scope:Value   │           Reason           │ Action │ Country │       AS       │ Events │     expiration     │ Alert ID │
├────────┼──────────┼──────────────────┼────────────────────────────┼────────┼─────────┼────────────────┼────────┼────────────────────┼──────────┤
│ 232673 │ crowdsec │ Ip:X.X.X.X │ crowdsecurity/http-probing │ ban    │ FR      │ 12322 Free SAS │ 14     │ 3h51m50.299984449s │ 9        │
╰────────┴──────────┴──────────────────┴────────────────────────────┴────────┴─────────┴────────────────┴────────┴────────────────────┴──────────╯

And if I try to access the CloudFront distribution, I am blocked:

You can also see the alerts CrowdSec generated in the CrowdSec Console, alongside the context that was sent.

Conclusion

In this second and last part of this AWS WAF series, I showed you how to configure CrowdSec to read logs from a CloudFront distribution and how to use the AWS WAF Remediation Component to automatically block threats in CloudFront before they can reach the application.

For this setup, I used a serverless application as an example, but obviously, you can use the exact same configuration for any service you host behind CloudFront.

I also used CloudFront’s S3 logs, which come with a 5-minute delay. CloudFront also supports pushing real-time logs to a Kinesis stream, which could be read with the CrowdSec Kinesis datasource.

Hope you enjoyed this tutorial! If you try this setup yourself, reach out on our Discord or Discourse and let me know if you have any issues or questions.
