The Advantages of eBPF for CWPP Applications
Extended Berkeley Packet Filter (eBPF) is a framework for loading and running user-defined programs within the Linux OS kernel, to observe, change, and respond to kernel behavior without the destabilizing impact of kernel modules. eBPF provides kernel-level visibility directly from user space. This combination of visibility and stability makes the eBPF framework particularly attractive for security applications.
In this blog post, we describe how eBPF works, its significance to cloud workload protection platforms (CWPP) for machine-speed detection of OS-level runtime threats, and the benefits of such an architectural approach, namely stability, scalability, and performance. We will then summarize how SentinelOne has over the last 3 years, in close cooperation with leaders across a wide variety of verticals, crafted the most high-performing, resource-efficient, and DevOps-friendly CWPP solution on the market.
eBPF Architectural Overview
eBPF programs allow us to observe and respond to application (workload) behavior within the kernel without modifying the application code itself. This is useful for many applications, especially security applications such as cloud workload protection.
Consider the following diagram in Figure 1, modified for simplicity from the original found at ebpf.io.
Here, we have an application (for example, a CWPP agent) that runs in user space and includes an eBPF program for process-level visibility within the Linux kernel. The eBPF program itself is bytecode, though developers usually write it in a higher-level programming language whose compiler can emit eBPF bytecode. When the program is loaded into the Linux kernel, it is first verified by the eBPF Verification Engine. The program is then compiled and attached to a specific kernel event chosen by design; this is what is meant when one says that eBPF programs are “event-driven.” Whenever this event occurs, the attached program runs its observation and analysis tasks to completion and presents the results back to the application.
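To make "event-driven" concrete, here is a minimal sketch in plain Python. It is a user-space simulation, not real eBPF: the `HookRegistry` class, its `attach` and `fire` methods, and the `on_exec` handler are hypothetical names invented for illustration, and the event name is only an example. The sketch shows a program that does nothing until the event it is attached to occurs, then runs to completion and hands a result back.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Illustrative stand-in for kernel hook points; this is not a real eBPF API.
Handler = Callable[[dict], None]


class HookRegistry:
    """Maps an event name to the programs attached to it."""

    def __init__(self) -> None:
        self._hooks: Dict[str, List[Handler]] = defaultdict(list)

    def attach(self, event: str, program: Handler) -> None:
        self._hooks[event].append(program)

    def fire(self, event: str, context: dict) -> int:
        """Run every program attached to `event`; return how many ran."""
        programs = self._hooks.get(event, [])
        for program in programs:
            program(context)
        return len(programs)


results: List[str] = []  # stands in for the data handed back to user space


def on_exec(context: dict) -> None:
    # Runs to completion each time the event fires, then records a result.
    results.append(f"pid {context['pid']} executed {context['filename']}")


registry = HookRegistry()
registry.attach("sched_process_exec", on_exec)

# The attached program runs only when its own event occurs.
ran = registry.fire("sched_process_exec", {"pid": 4242, "filename": "/usr/bin/gzip"})
ignored = registry.fire("some_other_event", {"pid": 1, "filename": "/sbin/init"})
```

In this sketch the second call runs nothing, because no program is attached to that event.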
The mechanism by which information is transferred between the eBPF program and the user space application/workload is called “eBPF Maps” or simply “maps”. Now that we have a high-level overview, let’s dig in a little deeper for more complete understanding.
eBPF Safety
The eBPF Verification Engine and Just-in-Time Compiler are the means by which the eBPF framework ensures that, first and foremost, the eBPF program to be loaded and run within the kernel does not destabilize the kernel. This is Rule No. 1: Do No Harm.
Kernel Modules: The Inferior Alternative
Consider the alternative to eBPF: writing kernel modules. Kernel modules raise concerns about operational stability and complexity. While writing a kernel module does indeed allow a developer to change kernel behavior, it is a highly specialized skill, which makes staffing and retention an issue. More pointedly, using kernel modules raises the specter of two critical risk questions: (1) will my kernel module crash the machine? and (2) will it introduce a security vulnerability?
In addition to stability and security concerns, there is the matter of operational overhead: a kernel module works only for a specific Linux kernel version and distribution. Maintaining the kernel module consumes precious developer cycles and complicates operational management unnecessarily. The eBPF framework addresses each of these pain points, making kernel modules far less desirable.
Before any eBPF program is loaded into the kernel, it passes through the Verification Engine and JIT Compiler. The Verifier ensures that the program is safe to run, will not crash the system, and will not compromise data. It validates that several conditions are met:
- The process loading the eBPF program has the necessary privileges to do so.
- The eBPF program does not crash the system.
- The eBPF program runs to completion. That is, it does not loop indefinitely.
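The conditions above can be illustrated with a toy verifier. This is a simplified sketch under stated assumptions, not the kernel's algorithm: the instruction format, the `verify` function, the `Verdict` type, and the instruction limit are all hypothetical, and the real Verification Engine performs far deeper analysis. The toy model rejects unprivileged loaders and guarantees termination by refusing any backward jump.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Toy instruction: (opcode, operand). For "jmp", the operand is a relative offset.
Instruction = Tuple[str, int]


@dataclass
class Verdict:
    ok: bool
    reason: str


def verify(program: List[Instruction], loader_is_privileged: bool,
           max_instructions: int = 4096) -> Verdict:
    """Toy checks mirroring the conditions listed above (limit is arbitrary)."""
    if not loader_is_privileged:
        return Verdict(False, "loader lacks required privileges")
    if not program or len(program) > max_instructions:
        return Verdict(False, "program is empty or too large")
    for index, (opcode, operand) in enumerate(program):
        if opcode == "jmp":
            if operand < 0:
                # A backward jump could loop forever in this simple model.
                return Verdict(False, f"backward jump at instruction {index}")
            if index + 1 + operand >= len(program):
                return Verdict(False, f"jump out of bounds at instruction {index}")
    if program[-1][0] != "exit":
        return Verdict(False, "program does not end with exit")
    return Verdict(True, "accepted")


good = [("load", 0), ("jmp", 1), ("add", 1), ("exit", 0)]
looping = [("load", 0), ("add", 1), ("jmp", -2), ("exit", 0)]

accepted = verify(good, loader_is_privileged=True)
rejected_loop = verify(looping, loader_is_privileged=True)
rejected_privilege = verify(good, loader_is_privileged=False)
```

Because every jump in an accepted program moves forward and the last instruction is `exit`, execution in this toy model always reaches the end.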
Once verified, the JIT Compiler translates the program from bytecode into machine instructions, optimizing for speed of execution.
Now that the eBPF program is verified and compiled, it is attached to a kernel-level event, such that when the event occurs, the program is triggered, runs to completion, and presents its information to the user space application. This brings us to eBPF Maps, or simply “maps”.
eBPF Maps
eBPF maps are the mechanism by which information transfers between the eBPF program and the user space application. Bidirectional information flow is supported. A map is a data structure that the eBPF program and user space application can read or write.
For example, the program might be triggered on an event such as gzip of a file. The eBPF program will write some information about that event, such as the file name, file size, and gzip timestamp, to the map. It might also increment a count of the gzip operations that occur within a given period of time. If that count exceeds a certain threshold, the eBPF program can write a judgment of “MALICIOUS” to the data structure. Stated simply, the eBPF program observed behavior indicative of a ransomware attack and flagged this behavior as malicious. The user space program – in our example, a cloud workload protection (CWPP) agent – can read that map, see the malicious judgment, and take appropriate action. Because basic information processing occurs within the eBPF program, less information is passed to the user space application, which improves performance.
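The gzip example can be sketched as a small simulation. This is plain Python standing in for both sides, not real eBPF code: a dict plays the role of the map, and `shared_map`, `on_gzip`, `agent_should_respond`, the threshold, and the window length are all hypothetical names and values chosen for illustration. User space writes the configuration into the map, the simulated in-kernel handler writes the event details and the verdict, and the agent reads only the verdict.

```python
from collections import deque
from typing import Deque, Dict

# A plain dict stands in for an eBPF map shared by both sides.
shared_map: Dict[str, object] = {
    "threshold": 3,          # written by user space (hypothetical value)
    "window_seconds": 10.0,  # written by user space (hypothetical value)
    "last_event": None,      # written by the simulated in-kernel side
    "verdict": "BENIGN",     # written by the simulated in-kernel side
}

recent_timestamps: Deque[float] = deque()


def on_gzip(file_name: str, file_size: int, timestamp: float) -> None:
    """Simulated in-kernel handler: record the event, then judge the rate."""
    shared_map["last_event"] = (file_name, file_size, timestamp)
    recent_timestamps.append(timestamp)
    # Drop events that fall outside the sliding window.
    while timestamp - recent_timestamps[0] > shared_map["window_seconds"]:
        recent_timestamps.popleft()
    if len(recent_timestamps) > shared_map["threshold"]:
        shared_map["verdict"] = "MALICIOUS"


def agent_should_respond() -> bool:
    """Simulated user-space agent: read only the verdict from the map."""
    return shared_map["verdict"] == "MALICIOUS"


# Two gzip operations far apart in time stay under the threshold.
on_gzip("report.txt", 2048, 0.0)
on_gzip("notes.txt", 1024, 20.0)
quiet = agent_should_respond()

# A rapid burst inside one window pushes the count over the threshold.
for offset, name in enumerate(["a.db", "b.db", "c.db"], start=1):
    on_gzip(name, 4096, 20.0 + offset)
burst = agent_should_respond()
```

The agent never sees the individual gzip events here; it reads a single verdict, which is the point of doing the basic processing on the kernel side.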
Advantages of eBPF within CWPP
A cloud workload protection platform agent does what other security controls do not: detect and respond to runtime threats, like ransomware or zero days, in real time. This makes CWPP a vital component of a cloud defense in depth strategy. An organization can, and quite often should, have other cloud security measures in place, such as AppSec, CSPM, and more. Each plays a role in a robust cloud security strategy. A CWPP agent works alongside these other controls, to (1) provide runtime protection and (2) record workload telemetry.
As shown in Figure 2, a ransomware attack on a cloud compute instance (VM) can lock up a cloud workload in milliseconds. Note that the CWPP agent in this 1-minute video detected and stopped the ransomware attack mere moments (less than a second) after it was launched.
Try getting this real-time response from a side-scanning solution. You cannot. Side-scanning is typically run only once a day, because taking snapshots of a cloud compute instance’s storage volumes for inspection is cost-prohibitive. Moreover, a side-scan architecture lacks process-level visibility within the kernel, and process-level events are the forensic details which the SOC needs to investigate an incident, tag it, and route it to the appropriate DevOps owner. Only a behavioral, real-time CWPP agent using the eBPF framework provides the combination of real-time process-level visibility and stability, making it the preferred choice.
Increasingly, cybersecurity insurance underwriters require CWPP before they will even quote a policy. Machine-speed threats such as ransomware demand an ability to respond faster, and with higher accuracy, than human-powered technology alone. Additionally, a historical record of workload telemetry not only facilitates investigation in the event of a security incident, but also makes proactive threat hunting possible. In this way, threat actors can be stopped before they even launch an attack.
The application of the eBPF framework within a CWPP program offers several advantages, including but not limited to:
- Operational stability
- System performance
- Business agility
Operational Stability
While a kernel module can provide the kernel visibility which a CWPP application requires, running code in the kernel can be dangerous. A false move can destabilize the system (i.e., cause a kernel panic) or introduce a security vulnerability into the kernel. Neither of these outcomes is acceptable, especially where a CWPP agent is concerned. A CWPP agent that uses kernel modules can cause kernel panics that crash the VM and brick your workload. These unplanned outages threaten financial performance, order fulfillment, and customer loyalty, and they create costly, disruptive fire drills.
In stark contrast to a kernel module, the eBPF framework includes safety controls such as the Verification Engine, JIT Compiler, and more. As a result, an eBPF program that passes verification is not expected to crash the kernel. Nor can it reach into arbitrary memory space within the kernel, making it much less prone to security vulnerabilities. eBPF programs provide kernel-level visibility without the risks that come with kernel modules: no tainted kernels or panics. For these reasons, eBPF is the preferred choice for CWPP from an operational stability perspective.
System Performance / Resource Efficiency
Transferring information from the kernel to user space is slow and introduces performance overhead (CPU, memory). In contrast, the eBPF framework enables us to observe kernel behavior and perform analysis within the kernel before transferring only a subset of results back to user space. This creates a fundamental performance advantage for CWPP agents that operate in user space and use eBPF programs. eBPF provides high observability with lower overhead relative to CWPP agents with kernel modules.
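The effect of analyzing first and transferring second can be pictured with a small sketch. It is an illustration in plain Python, not a benchmark: the event shape, the two function names, and the event counts are made up, and the sketch compares only how many records would cross the kernel/user boundary under each approach.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Event = Tuple[int, str]  # (pid, syscall name); illustrative shape only


def forward_every_event(events: Iterable[Event]) -> List[Event]:
    """Baseline: each raw event is copied to user space individually."""
    return list(events)


def aggregate_before_transfer(events: Iterable[Event]) -> List[Tuple[Event, int]]:
    """Aggregate first, then hand user space one record per distinct key."""
    counts = Counter(events)
    return sorted(counts.items())


# Made-up workload: 1,000 events spread over three (pid, syscall) pairs.
events = [(101, "openat")] * 500 + [(101, "write")] * 300 + [(202, "openat")] * 200

raw_records = forward_every_event(events)
summary_records = aggregate_before_transfer(events)
```

In this made-up workload, the aggregated path hands over three summary records instead of one record per event. Real savings depend on the workload and on what the agent needs to see.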
Business Agility
Developers should be focused on innovation, not on juggling the kernel dependency hassles which kernel modules introduce. Because the agent operates from user space, DevOps teams have more flexibility to update the host OS image with less concern about that update conflicting with their CWPP agent. eBPF makes this possible. As a result, more DevOps time can be devoted to innovation, and less (much less) to maintenance concerns.
Moreover, because the CWPP agent itself uses the eBPF framework and avoids kernel modules, the vendor too is more focused on innovation. And of course the customer reaps the benefits of this virtuous cycle of agile velocity.
Singularity Cloud Workload Security
Working with Customers
At SentinelOne, we work closely with our customers, innovating and advancing existing solutions, even as we accelerate execution of our product vision. Back in 2019, a customer urged us to re-architect our Linux CWPP agent to use eBPF. The easy answer would have been to politely decline, but we are both intellectually curious and fanatical about customer success. Once we understood the benefits which eBPF would bring to our customers, we got to work. The result? SentinelOne customers around the world have the advantage of a CWPP agent that has been continuously refined over 3 years and has delivered exceptional performance.
High Performance
Independent test results prove this out. In April 2021, MITRE Engenuity published its MITRE ATT&CK benchmark results for Carbanak & FIN7, an evaluation focused on emulating financial threat groups. For the first time, MITRE ATT&CK included Linux servers in its testing. SentinelOne was the only vendor with 100% visibility across Windows devices and Linux servers (Figure 3). We had the most enriched detections (“Analytic Detections,” in MITRE’s vernacular), as shown in Figure 4. Far from “noisy,” our patented Storyline technology auto-correlates related detections to maximize signal-to-noise ratio (SNR) and streamline investigation and response.
CWPP must be real-time if it is to defend cloud workloads from runtime attack and ensure business continuity. Machine-speed attacks spread evil at machine speed. Delayed detections give the adversary the time needed – literally, only a matter of seconds – to bring a cloud workload to a grinding halt. And if not ransomware, then it’s malware quietly spreading throughout your cloud footprint. In broad brushstrokes, the wider the spread, the larger the remediation effort. Delays cost. SentinelOne delivered 100% real-time detection, with zero delays, again, as defined by MITRE. No spin, just a common language to compare apples to apples. The fewer the delays, the better.
Similarly, the 2022 MITRE Engenuity ATT&CK testing showed SentinelOne had exceptionally high performance. The Wizard Spider + Sandworm emulation also included Linux servers. Here again, SentinelOne led from the front with 99% Analytic Coverage, much more than CrowdStrike, Microsoft, or Trend Micro. Head-to-head comparisons are available at the MITRE Engenuity website.
Resource Efficiency
SecOps prefer our CWPP performance and partnership, but we recognize that it is Infrastructure & Operations who carry the costs of operating an agent, even if those costs eventually are transferred internally to the lines of business. Any application, be it a CWPP agent or otherwise, requires compute and memory resources to function, and those resources come at a cost. For deployment within a fixed and sunk cost infrastructure such as a data center, such apps take away resources that would otherwise be available for the primary business workloads; while it’s not an incremental operational expense, there is the opportunity cost of resources. For cloud IaaS, however, resources used are metered and paid for on-demand; deploying a CWPP agent may necessitate a larger cloud compute instance (e.g., moving from a t4g.medium to a t4g.large), and thereby incrementally raise its operational expense. It’s a necessary expense, to be sure, but an incremental expense nonetheless.
Therefore, we obsess about CPU and memory utilization as much as we do about performance. Our eBPF agent architecture, refined over the years, enables us to deliver exceptional security performance in a very compact footprint. Check out this blog post about advancements made in Linux and K8s Agents v22.3. And in July 2022, we announced support for AWS Graviton3, the most recent AWS ARM processor generation, which provides further benefits in compute, power, etc.
Additionally, if you are running containerized workloads, a single SentinelOne CWPP K8s agent per K8s worker node protects the host, all its pods, and all their containers. Deployed as a DaemonSet, our agent scales automatically to ensure your business workload is defended even under peak demand.
DevOps Friendly
In addition to working closely with customers as partners and delivering performance leadership, we recognize that organizations went to the cloud to go faster, not slower. Innovate swiftly, operate securely. Singularity Cloud Workload Security solves the agility/security paradox by simplifying deployment, automating scalability with workload demand, and of course, operating entirely in user space.
- Automated deployment fits within standard DevOps provisioning methods, including CloudFormation, Terraform, Helm, and a host of others.
- We support 13 leading Linux distributions and a wide array of versions, all from a single CWPP agent. Say goodbye to 60 pages of user documentation devoted to “this agent version” mapped to “that Linux distribution.” Our eBPF agent abstracts away that complexity.
- Our agent has no kernel modules, so DevOps don’t have to worry about kernel panics.
Summary
The advantages of the eBPF framework make it the preferred choice for cloud workload protection. Superior system performance translates to lower operational costs than alternatives relying on kernel modules. Operational stability aspects provide for better business continuity.
Refined over 3 years in a global installed base, Singularity Cloud Workload Security delivers market-leading performance, flexibility, and scalability. If you are searching for a CWPP product which uses the eBPF framework, is preferred by titans of industry and mid-market commercial alike, and which regularly shines in benchmark testing such as MITRE ATT&CK, we hope you consider SentinelOne. Customer case studies and testimonials can be found both on our webpage and on independent peer review platforms such as Gartner Peer Insights. When you are ready to speak with a cloud security expert, our team would be happy to connect with you. Let us show you why thousands of SentinelOne customers worldwide trust us to protect their business.