The CrowdSec FOSS Business Model: Open Source as the Digital Twin of Fair Trade

To all FOSS investors and startup founders creating enterprise-grade software: in this article, I’d like to walk you through our decision-making journey and the whys behind every choice and mistake we made. I hope this all-out disclosure will help you in your adventures.

Starting the fourth year of CrowdSec, I feel it’s time to explain our business model in detail for several reasons. First and foremost, to avoid unnecessary fear among the community members and, second, to reassure our clients and investors that a free product and a commercial offering can peacefully coexist.

Aligning the interests of our community, our investors, and our clients was possible, yet it required quite some work and sweat.

One concern is that a free product structurally eats into its paid version and will therefore be limited over time to avoid this cannibalism. Another comes from the classical “if it’s free, you’re the product.” Finally, both worlds, the business one and the Free and Open Source Software (FOSS) one, wonder whether a core component nested deeply in their systems will be maintained over time. The FOSS community thinks, “too good to stay free; they will have to monetize once they reach a certain size,” and the business thinks, “FOSS is great, but people need to eat; the project will eventually run out of funds and be deprecated. It’s a production risk.”

This is a delicate balance to strike, whatever model you’re embracing — FOSS, OpenCore, or Commercial and Open Source Software (COSS). We have seen many companies having trouble on those fronts lately and being forced to change their stance, with severe consequences for their ecosystems.

In enterprise-grade software, there are ten wrong reasons to embrace FOSS licensing and very few fundamentally good ones, leading many projects to pivot. The industry is riddled with FOSS companies that were forced to change course, from database systems to operating systems and cloud stacks. Another issue is companies trying to build a premium model based on features, relying on an open core. If your feature is good, it should be in the product, and if it is not, your community is entitled to develop it by themselves, sometimes competing with your own offering.

Open source is the digital twin of fair trade. You offer something openly, transparently, and without second thoughts, but you also clearly express what you expect in return. In the case of CrowdSec, we choose to publish, maintain, and give away a great piece of software, and in return, we get signals. We also hold the rights to create premium offerings and functionalities, which make sense only if you use the software at an enterprise level and want extra bells and whistles.

But it’s not our first FOSS dance with Thibault (our CTO), and we chose our model with caution, spending months carefully evaluating options before launching our company. For the record, I publish some FOSS content, but I’m not a FOSS monk, preaching that everything should be FOSS and that no other model should exist. I’m a pragmatic entrepreneur, and so is our crew, which, as we’ll see, is the best insurance that we’ll keep the engine free over time and the business afloat.

Why choose a FOSS license?

To tip the balance of cybercrime, we needed to create the largest-ever CTI network. To do so, we wanted to reduce friction to adoption to its minimum. Money being the mother of all frictions, the tool needed to be free. Thibault and his team also worked a lot on ease of use, documentation, one-line installs, distribution, good default settings, and visualization since they also can be sources of friction. 

We also wanted users to be able to adapt it to their context and knew for a fact that we couldn’t cover all possible combinations of logs, hardware, software, and everything in between by ourselves, so open source was a logical path. 

Free, open, well, FOSS it is then.

Then came the license debate. Well, with some 300 licenses to choose from, what could be an issue here, really? MIT, GPL, Apache, BSD 2-Clause or 3-Clause, etc. Heck, we wanted it as open as it gets. For some very subtle reasons, MIT won over BSD 3-Clause.

This MIT license is the first fail-safe mechanism to protect the community. If we ever go toward a very aggressive monetization of the Security Engine, anyone can fork it.

Some have objected in the past that our Consensus algorithm, the one filtering the tens of millions of signals we receive daily to sort out poisoning and false positives, isn’t public. That is entirely true, but not for the reason one would think. There is no secret sauce here; instead, this part of CrowdSec is as much infrastructure as it is code. The two are deeply intertwined and rely heavily on AWS microservices. Open-sourcing it wouldn’t help anyone run it at home, and without the overview effect of the network, it would be useless anyway.

Also, it’s fast-iterating code, meaning that documenting it and bringing it up to open source grade for publication would be very heavy for us at this stage. We will likely open-source it eventually, but it’s not yet time. You can still join us and our discussions to make this Consensus better. It’s not a black box, either. The team welcomes every good soul willing to help.

If the software is ever forked, well, it’ll be up to the new team to create its own curation system, but we’re pretty transparent as to how we do it ourselves. And even without this curation algorithm, CrowdSec is still the best IDPS you’ll find.

With those ingredients, we knew we could create a snowball and get the Network Effect to detect dangerous IP addresses at scale. Now … about this money thing…

The Security Engine is a means to an end, not an end in itself

How is this network going to feed those clever people behind the scenes? To achieve our goal, we needed funds from business angels and VCs, hence a solid monetization plan to put in front of a roughly 20M€ investment.

The Network Effect provides valuable CTI & blocklist data. The hundreds of thousands of servers parsing their logs and sharing the aggressions they faced are pumping out a whopping 20 million signals per day. This is the gold stash of CrowdSec: the data.

By monetizing our data, we get a sound stream of revenue, and businesses gain access to the freshest-ever real-time map of public IP addresses used by cybercriminals. Now, this is the second fail-safe to protect both companies and the community. 

Let’s imagine we go south and change the license, close the source code, and ask for a lot of money per agent running, tying it to cloud dependency, etc. A real nightmare. Well, next thing you know, the network will be shrinking quickly, people will leave, our signal stream will start to dry out, and our revenue stream along with it.

So, if we start monetizing the FOSS Security Engine, we’d lose our current income source. It’s as simple as that. Letting go of the prey for the shadow doesn’t make much sense, and I’m not sure our VCs would agree with this strategy.

The delicate art of drawing the line between paid & free

Even though we had a solid view of how we wanted to architect our income stream, this doesn’t mean we hit a home run on the first try.

The IDPS stays free, and if you handle a few instances of CrowdSec, you do not need extra features. But if you’re a business, you’d want 1-year data retention, auto-enrollment, targeted attack detection, 5-minute updates, multiple seats, alert context, multiple premium blocklists, etc.

We thought there would be a natural conversion between free & paid users. The classical cases tell you that 2.5% is a decent rate. If we develop a great top-notch enterprise offering and make it super helpful and user-friendly, we should automatically get 2.5% of users upgrading to premium plans… right? We have around 70K active users, 45K of whom have a CrowdSec Console account, and we can reach out to them. This would net us above 1000 paid customers (to date), with an average cart of 1000€ monthly. We’re golden! 
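As a back-of-the-envelope check of that naive projection, here is a minimal sketch using only the figures quoted above. The function and parameter names are illustrative and not part of any CrowdSec tooling.

```python
def naive_projection(reachable_users: int, conversion_rate: float, avg_cart_eur: int):
    """Return (paying customers, monthly revenue in EUR) for a flat conversion rate."""
    customers = round(reachable_users * conversion_rate)
    return customers, customers * avg_cart_eur


# Figures quoted above: 45K Console accounts, a 2.5% conversion rate,
# and an average cart of 1000 EUR per month.
customers, monthly_revenue = naive_projection(45_000, 0.025, 1_000)
print(customers, monthly_revenue)
```

With those inputs, the projection lands at 1,125 paying customers, which is where the “above 1000” figure comes from. It assumes the 2.5% rate applies uniformly to every Console account.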

Fine, let’s make this and test the water…

Well, the results were… not so great. What we forgot in the first place is that most users, no matter if they are business grade or tinkerers, chose us because we are free. It’s part of the deal. There is no issue with this, but the famous 2.5% doesn’t apply here. In more detail, there is no tinkerer or SMB market ready to pay. What they get for free is enough for them, and no budget is yet allocated to an upgrade. This is good in a sense; we are helpful, and it’s part of the fail-safe mechanisms we talked about earlier.

On the other hand, there is a tremendous market for large and XL enterprises. Budget is less of an issue here, and by fishing among our Console users, we found pearls working in top 1000 businesses worldwide, and they are receptive to our premium offerings. The only “issue” is that instead of a low-touch approach (enter your credit card details yourself, and you’re all set), we had to pivot toward a high-touch sales model to satisfy those accounts.

Capturing intention vs. catching attention

But there is still a catch. Big corps? Timing is entirely different.

Their SecOps teams are swamped, you don’t get anywhere close to their infrastructure engineers, their backlogs are crowded, and no matter what you bring, take a ticket and queue. So yes, they buy, but the sales cycle is around 18 months rather than 6. Also, it’s much more complicated to sell a product like CrowdSec, a versatile, low footprint, behavior-based, and crowd-powered protection system, than a CTI. Blue Ocean looks cool: “I have no competition, and I’ll set the standard”. But in reality, it is difficult to reach potential customers with a product that does not fit one specific category. 

So, if you start plowing the market and educating every client about what you do, most of them will not get it. A Blue Ocean strategy is more challenging to pull off, trust me.

The better road here is to let the user discover, adopt, extend their use, and enjoy, and then they will come back to you when they are ready. Takes time? Yes. Frustrating? Yes. But it’s also mechanical, and L/XL corps have started to use the product, integrate it, and deploy it at a vast scale. There is just no way we can accelerate their decision cycles, and no way for us to capture their intention at the exact right moment either. So, the better road here is to scoop them up when they’re ripe.

CTI, TTI, and blocklists: the real cash engine

On the other hand, we have data — precious, unique, easy to consume, and API-driven.

Our data is not “yet another OSINT low-quality feed” packaged with markup or a copy of “yet another outdated C2 server list”. Those insights from the network are unique, at scale, false-positive proof, coming from the largest CTI network on earth, growing 0.4% daily. The CrowdSec data covers more protocols, offers more diversity (from 3000+ Autonomous Systems in 180 countries), is harvested from genuine intrusion attempts (not honeypot traffic), and is as fresh as possible.

We sort this data into two buckets: Cyber Threat Intelligence (CTI) and Tactical Threat Intelligence (TTI).

CTI is all about knowing what happened in the past and how to contextualize something happening in a log pit or a SIEM, and it is a massive help for forensics. This data isn’t curated enough to be injected into an edge filtering component, but it is an enormous help in making sense of events.

At CrowdSec, the CTI database holds around 90 million IP addresses in total, of which around 50 million were seen in the last six months and 10 million in the previous 30 days. They come with 20+ fields that any system like OpenCTI, MISP, or others can parse, integrate, and make sense of.

TTI is all about blocking IP addresses that are, beyond reasonable doubt, malicious. This level of certainty could only be reached because many very diversified and well-trusted network members saw those IP addresses, and they don’t belong to any of our or your allowed lists. These blocklists of malicious IPs can be directly injected into your firewall, load balancers, reverse proxy, web servers, etc. They are actionable.
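To make that selection rule concrete, here is a toy sketch of the idea. It is not the Consensus algorithm, which is not public. It assumes an IP qualifies once enough distinct trusted members, spread over enough distinct networks, have reported it and it appears on no allowlist. The `Report` type, the thresholds, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    ip: str                 # reported IP address
    reporter_id: str        # hypothetical identifier of the reporting member
    reporter_asn: int       # autonomous system the reporter sits in
    reporter_trusted: bool  # whether the network trusts this reporter


def build_blocklist(reports, allowlist, min_reporters=3, min_asns=2):
    """Keep IPs seen by enough distinct trusted reporters in enough distinct ASNs."""
    reporters, asns = {}, {}
    for r in reports:
        # Ignore untrusted reporters and anything on an allowlist.
        if not r.reporter_trusted or r.ip in allowlist:
            continue
        reporters.setdefault(r.ip, set()).add(r.reporter_id)
        asns.setdefault(r.ip, set()).add(r.reporter_asn)
    return sorted(
        ip for ip in reporters
        if len(reporters[ip]) >= min_reporters and len(asns[ip]) >= min_asns
    )


reports = [
    Report("203.0.113.7", "a", 64500, True),
    Report("203.0.113.7", "b", 64501, True),
    Report("203.0.113.7", "c", 64502, True),
    Report("198.51.100.9", "a", 64500, True),   # only one reporter
    Report("192.0.2.1", "a", 64500, True),      # allowlisted below
    Report("192.0.2.1", "b", 64501, True),
    Report("192.0.2.1", "c", 64502, True),
    Report("203.0.113.50", "x", 64510, False),  # untrusted reporters only
    Report("203.0.113.50", "y", 64511, False),
    Report("203.0.113.50", "z", 64512, False),
]
blocklist = build_blocklist(reports, allowlist={"192.0.2.1"})
print(blocklist)
```

Only the first address survives in this example: the others fail on reporter count, allowlisting, or trust. A production system would also need to handle time decay and poisoning attempts, which this sketch ignores.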

TL;DR

This is what CrowdSec is monetizing:

The attack signals received and curated by the Network Effect.

Not the software.
Not your data.

It’s not a zero-sum game. We take from the bad guys their IP addresses and techniques and turn them into an income stream that allows the greatest number to defend themselves for free, businesses to become more resilient, and the company to thrive and pay the brains behind it. The value here is created on the backs of the bad guys, not the good ones.

This doesn’t make us saints, since we are also a business and see our interest in this, but we believe this is the fairest and most resilient business model we could adopt.

The delicate art of drawing the line between paid & free

Now, since the network generates the value, it’s only fair to share not only the FOSS Engine but also some of the signals back to those helping us make the internet a safer place.

But if we give the entire blocklist for free, we will just be abused by other CTI companies, MSPs, or MSSPs, and we’ll lose part of our income. 

To draw a fair limit, we decided to update all members daily and send them back only the IP lists that their Security Engines contribute to identifying. Long story short, if your machine runs a VOIP anti-scan scenario, you’ll get all IPs that are scanning VOIP infrastructure in return because you partake in the effort. However, you won’t get free data on, for example, IPs brute-forcing Terminal Server Endpoints (TSE). By the way, it’s fair to assume that if you don’t filter, say, terminal server brute force, you probably don’t run TSE, so you probably don’t care about the IPs hammering it.
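The sharing rule above can be sketched in a few lines. This is a minimal illustration under assumptions, not CrowdSec’s implementation: the scenario names (`voip-scan`, `tse-bruteforce`), the function name, and the sample addresses are all hypothetical.

```python
def free_feed(engine_scenarios, blocklists_by_scenario):
    """Return the IPs an engine gets back: only those tied to scenarios it runs."""
    ips = set()
    for scenario in engine_scenarios:
        # A scenario with no curated list yet simply contributes nothing.
        ips |= blocklists_by_scenario.get(scenario, set())
    return ips


# Hypothetical curated lists, keyed by the scenario that detects them.
blocklists_by_scenario = {
    "voip-scan": {"203.0.113.7", "203.0.113.8"},
    "tse-bruteforce": {"198.51.100.9"},
}

# An engine that only runs the VOIP anti-scan scenario.
voip_engine_feed = free_feed({"voip-scan"}, blocklists_by_scenario)
print(sorted(voip_engine_feed))
```

The engine in this example receives the two VOIP scanners but not the address brute-forcing terminal servers, since it does not take part in detecting that behavior.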

CrowdSec CTI has three colors: Silver, Gold, and Platinum.

The Silver-grade data is free for those partaking in the detection effort. We give it back to any Security Engine detecting a specific type of attack. If your Security Engine participates in VOIP attack detection, the blocklist of public IPs attacking VOIP infrastructure will be fed back to it for free.

Gold-grade data is what you get when you have a premium subscription to the CrowdSec Console. This data is included in your subscription fee and available to boost the remediation capacities of your Security Engines. You can also buy Gold data independently of any Security Engine, as a TTI list.

Platinum-grade subscribers get all of our CTI or TTI data, with all details regarding behaviors, MITRE classification, active times, and targeted countries (20+ fields), delivered by the minute. They also receive extra curated, Network Effect-related blocklists, some of which are enriched by our AI models, for example, our Residential proxy / VPN list. In Platinum, you can also get the entire CTI data lake or even a specific blocklist tailored to your needs.

Conclusion

This was a long one, but I felt it was a necessary disclosure that showcases how we carefully took time to align our interests with those of the community, our investors, and our clients. This is not a “how to monetize open source” writeup since every FOSS project is different. Still, I hope sharing this transparently will help everyone find their path and answer most questions about our business model choices, philosophy, ethics, and resilience.
