Writing Web Application Firewall (WAF) rules can be tedious and complex, especially when dealing with convoluted YAML syntax. To solve this, we recently published a Model Context Protocol (MCP) for CrowdSec, specifically focused on helping users leverage AI to automatically write scenarios and WAF rules for the CrowdSec Security Engine.
First of all, an MCP (Model Context Protocol) is an open-standard specification that enables Large Language Models to securely and consistently connect to external data sources and local tools, essentially acting as a “USB port” for AI to interact with your files, databases, and APIs.
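To make this concrete: under the hood, MCP messages follow JSON-RPC 2.0 framing. A tool invocation from the model host looks roughly like the sketch below; the tool name and arguments are hypothetical examples, not CrowdSec's actual API.

```python
import json

# Hypothetical MCP "tools/call" request: the host asks the MCP server to run
# a tool named "validate_waf_rule" with the candidate rule as an argument.
# MCP messages are framed as JSON-RPC 2.0.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "validate_waf_rule",  # hypothetical tool name
        "arguments": {"rule_yaml": "name: crowdsecurity/example-rule"},
    },
}

payload = json.dumps(request)   # what actually goes over the wire
decoded = json.loads(payload)
print(decoded["params"]["name"])  # validate_waf_rule
```

The LLM never speaks JSON-RPC itself; the host application translates the model's tool-use intent into messages like this one.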
This tool, extracted from our internal tooling, has drastically improved our ability to create virtual patching rules for the WAF and internal detection rules for our data lake. In our context, the CrowdSec MCP lets users simply instruct their favourite LLM (ChatGPT, Claude, Gemini, and others) with something like “Hey, look up this blog post and create a WAF rule for it”, and end up with a highly accurate result.


In the era of rapidly progressing LLMs, this might sound like a normal Monday morning. But achieving consistent, production-ready WAF rules takes a few tricks; let’s not forget, there is no magic. Here is a walk-through of the process that brought us here and the lessons we learned along the way.
Why Automate WAF Rules with AI?
Let’s cut to the chase: writing WAF rules is tedious, and life is too short to write YAML. At CrowdSec, we track about 100 new vulnerabilities every month, which means writing just as many YAML WAF rules with accompanying tests, often with convoluted syntax. The initial need was clear: we wanted to increase our ability to produce reliable WAF rules while spending as little human bandwidth as possible writing those rules and scenarios.
We already leveraged LLMs to industrialise those processes and increase their velocity, but the way an MCP interacts with the LLM allowed us to take it one step further.
Why We Built a Model Context Protocol (MCP) for CrowdSec
In our first iterations, we used the OpenAI API directly with custom prompts, including examples, instructions, etc. However, even though you can set the model’s temperature through the API, the results aren’t consistent, as you might know. As a result, the generated rules were sometimes correct and sometimes not.
When things went right, everybody was happy, but when things went wrong (hallucinations, misunderstandings from the LLM), the human had to take over and start over. This was both a source of frustration and a waste of time, with every iteration providing results of varying quality. We quickly identified that a feedback loop was needed.
Last but not least, previously developed tooling was strongly tied to specific input formats, such as Nuclei templates, and we wanted natural language understanding to achieve the same result from less structured sources, such as blog posts or write-ups.
Together, these reasons made a good case for experimenting with MCPs.
Overcoming LLM Limitations in WAF Rule Generation
On tasks that sound rather simple, an LLM might fail spectacularly, especially when they involve many small steps that must be followed meticulously. Providing the LLM with tools (such as linting, syntax validation, etc.) is a great way to introduce feedback loops, giving it a chance to identify its mistakes and correct itself, and that’s what makes MCP attractive in this use case.
Feedback Loop is Key
In the context of an MCP, provided you expose enough tools, the LLM will correct itself over time when it starts hallucinating. As a concrete example, the first steps the LLM takes through the MCP to generate a WAF rule look like this:
- Get “WAF rules syntax” prompt from MCP: it covers the syntax of the WAF rules, with some do’s and don’ts.
- Submit the generated WAF rule to the “syntax validation” tool: it will validate the generated YAML based on our public YAML schemas.
Simply introducing this second step lets the LLM understand that it generated an incorrect YAML document, go back to the first step, make another attempt, and return to the second step until the rule passes validation. Providing the LLM with specific steps to follow enables iterative refinement, significantly improving the overall quality of the generated results.
In the example above, we see self-correction via a feedback loop: upon receiving the “WAF rule challenge” prompt, which aims at identifying common mistakes and suboptimal patterns, the LLM corrects itself:
> Actually, since the guidelines say only use OR or AND (not both), and the paths differ, but the injection pattern is the same, I’ll create a single rule using a URI regex to match both paths with the AND condition on the injection:
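The generate-then-validate loop described above can be sketched in a few lines. Here, `ask_llm` and `validate_rule` are hypothetical stand-ins for the model call and the MCP syntax-validation tool, and the three required keys are an assumed minimal schema, not CrowdSec's real one:

```python
# Sketch of the feedback loop: keep asking the LLM for a rule until the
# syntax-validation tool accepts it. All names here are hypothetical.
REQUIRED_KEYS = {"name", "description", "rules"}  # assumed minimal schema

def validate_rule(rule: dict) -> list[str]:
    """Return a list of schema errors; an empty list means the rule validates."""
    return [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - rule.keys())]

def ask_llm(feedback: list[str]) -> dict:
    """Stand-in for the model call; a real LLM would use the errors to retry."""
    if feedback:  # second attempt: the "LLM" fixes the reported errors
        return {"name": "crowdsecurity/example", "description": "demo", "rules": []}
    return {"name": "crowdsecurity/example"}  # first attempt is incomplete

feedback: list[str] = []
for attempt in range(1, 4):       # cap retries so the loop always terminates
    rule = ask_llm(feedback)
    feedback = validate_rule(rule)
    if not feedback:
        break                     # rule passed syntax validation

print(attempt, feedback)  # 2 []
```

The retry cap matters in practice: without it, a model that keeps producing the same broken output would loop forever.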

Let’s not forget about consistency
Another key advantage of MCP integration is consistency. While LLMs excel at many things, consistency isn’t their forte. By exposing the right tools and prompts to the LLM, we can strongly mitigate the issue:
- Sometimes, a few lines of code will succeed where a smarter LLM might erratically fail. Identifying those choke points and exposing the tools at the right time allows us to get around the LLM’s inconsistencies.
- Exposing clear steps to the LLM with various subprompts helps ensure that validation steps aren’t skipped (which might otherwise happen). Typically, forcing the LLM to analyse previously generated data with a different prompt enables it to perform self-correction and pushes consistency one step further.
Thus, MCPs sometimes allow for filling gaps in the LLM’s abilities. In the example below, the LLM generated a syntactically correct rule, but upon testing the WAF rule against an actual exploit, it realises its own mistake:

More tools for better results
However, producing a syntactically correct document isn’t enough to obtain a correct WAF rule, as there are many pitfalls:
- The rule might be too broad or too narrow.
- The rule might simply not match the exploit.
- The rule might not follow best practices.
- The rule needs to include a proper test case for regression and false-positive detection.
To achieve all this, the MCP exposes more than 20 tools that help the LLM at the various steps of the process, so that the overall workflow looks like this:

Identified limitations
While we’re overall very satisfied with the current results produced by the LLM, which drastically reduced our rule production time while increasing the consistency of the output, a few key points need to be kept in mind when designing such systems:
- LLMs get overwhelmed when there is too much data: on other features that involved interacting with verbose APIs, we repeatedly observed the LLM getting confused or lost when dealing with MCP outputs. You often need to find bypasses or shortcuts to avoid exposing the LLM to significant volumes of data, or risk inconsistent results, most likely due to context size limits.
- LLMs lack consistency: sometimes, we had to expose tools that might seem overly basic, simply because the LLM repeatedly failed on steps that seemed trivial. This was especially true when stepping away from creative tasks and instead asking the LLM to verify a document against a YAML/JSON schema.
- Prompt engineering matters: while it might sound silly, we observed very different output quality depending on how the prompt was formatted, rather than what was inside it.



