BPF – What is the Berkeley Packet Filter?
The Berkeley Packet Filter (BPF), or Berkeley Filter, is available on most Unix-like operating systems, such as Linux. The special-purpose virtual machine, developed in 1992, filters network data packets inside the kernel. The BPF provides a raw interface to the data link layer, so programs can send and receive link-layer packets directly. The data link layer is responsible for transmitting data frames reliably between directly connected hosts and for regulating access to the transmission medium.
When a host receives a data packet, the BPF can read the data link layer information from the packet and, for example, check it for errors, so that the recipient can react to them. It can also compare the packet with filter definitions and accept or discard it depending on whether it is classified as relevant. Because discarded packets are never copied to user space, this can save a lot of computing capacity.
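The accept-or-discard decision described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the kernel implements it: the function names `accept_packet` and `make_frame`, the choice of "keep only IPv4/TCP" as the filter definition, and the zeroed test frames are all assumptions made for this example.

```python
import struct

ETH_HEADER_LEN = 14        # destination MAC (6) + source MAC (6) + EtherType (2)
ETHERTYPE_IPV4 = 0x0800
IPPROTO_TCP = 6


def accept_packet(frame: bytes) -> bool:
    """Return True for IPv4/TCP Ethernet frames, False for everything else."""
    # The frame must be long enough to contain the IPv4 protocol byte.
    if len(frame) < ETH_HEADER_LEN + 10:
        return False
    # The EtherType is a big-endian 16-bit value at offset 12.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    # The protocol field is the tenth byte of the IPv4 header.
    return frame[ETH_HEADER_LEN + 9] == IPPROTO_TCP


def make_frame(ethertype: int, protocol: int) -> bytes:
    """Build a minimal, hypothetical test frame with zeroed addresses."""
    eth = bytes(12) + struct.pack("!H", ethertype)
    ip = bytearray(20)
    ip[0] = 0x45           # IPv4, header length of 5 words
    ip[9] = protocol
    return eth + bytes(ip)


frames = [
    make_frame(0x0800, 6),    # IPv4 / TCP  -> relevant
    make_frame(0x0800, 17),   # IPv4 / UDP  -> discarded
    make_frame(0x0806, 0),    # ARP         -> discarded
]
kept = [f for f in frames if accept_packet(f)]
print(len(kept), "of", len(frames), "frames accepted")
```

Only the accepted frame would have to be handed on to the application; the other two are dropped as early as possible.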
How does the Berkeley Packet Filter work?
To perform its functions, the Berkeley Packet Filter is implemented as an interpreter for the machine language of a simple virtual machine. The BPF therefore executes programs written in a predefined instruction format. In its role as interpreter, the Berkeley Filter reads a filter program, checks it and runs it instruction by instruction. Where a just-in-time (JIT) compiler is available, the instructions are instead translated into native machine code, which the processor can execute directly.
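To make the "instruction by instruction" idea concrete, here is a toy interpreter for a tiny subset of the classic BPF instruction set. The four opcode values match the classic BPF encoding, but everything else (the function name `run_filter`, the tuple layout, the restriction to four opcodes) is a simplification assumed for this sketch; the real virtual machine has more registers, instructions and checks.

```python
import struct

# Opcode values from the classic BPF encoding (only a small subset).
LDH_ABS = 0x28   # load 16-bit value at absolute offset k into the accumulator
LDB_ABS = 0x30   # load 8-bit value at absolute offset k into the accumulator
JEQ_K = 0x15     # if accumulator == k skip jt instructions, else skip jf
RET_K = 0x06     # stop and return k (0 means "discard the packet")


def run_filter(program, packet: bytes) -> int:
    """Interpret `program`, a list of (code, jt, jf, k) tuples, on `packet`."""
    acc = 0      # accumulator register
    pc = 0       # program counter
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code == LDH_ABS:
            if k + 2 > len(packet):
                return 0                     # out-of-bounds load: discard
            (acc,) = struct.unpack_from("!H", packet, k)
        elif code == LDB_ABS:
            if k + 1 > len(packet):
                return 0
            acc = packet[k]
        elif code == JEQ_K:
            pc += jt if acc == k else jf     # offsets are relative to the next instruction
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    return 0                                 # fell off the end: discard


# "Accept IPv4 frames, discard everything else."
ipv4_only = [
    (LDH_ABS, 0, 0, 12),        # A = EtherType
    (JEQ_K, 0, 1, 0x0800),      # IPv4? continue : skip one instruction
    (RET_K, 0, 0, 262144),      # accept (keep up to 262144 bytes)
    (RET_K, 0, 0, 0),           # discard
]

ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)
arp_frame = bytes(12) + b"\x08\x06" + bytes(28)
print(run_filter(ipv4_only, ipv4_frame), run_filter(ipv4_only, arp_frame))
```

A non-zero return value tells the caller how many bytes of the packet to keep, while zero means the packet is dropped.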
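User-space programs send requests to the kernel through system calls (SysCalls), i.e. by calling special operating system functions; this is also how a program hands a filter to the kernel. The kernel checks the access rights before confirming or denying the request. Linux offers roughly 330 SysCalls (the exact number depends on the architecture and kernel version), including the following: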
- read – allows a file to be read
- write – allows a file to be written
- open – opens files or devices
- close – closes files or devices
- stat – requests the status of a file
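The five calls listed above can be tried out from Python, whose `os` module exposes thin wrappers around the corresponding operating system functions. This is a minimal sketch; the file name `demo.txt` and the temporary directory are arbitrary choices for the example.

```python
import os
import tempfile

# Work in a fresh temporary directory so nothing else is touched.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# open + write + close: create the file and store a few bytes in it.
fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
written = os.write(fd, b"hello kernel\n")
os.close(fd)

# open + read + close: read the bytes back.
fd = os.open(path, os.O_RDONLY)
data = os.read(fd, 64)
os.close(fd)

# stat: ask the kernel for the file's metadata.
info = os.stat(path)
print(written, data, info.st_size)
```

Each of these calls crosses from user mode into the kernel, which checks permissions before doing the work; on Linux, running the script under a tool such as `strace` shows the underlying system calls.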
Thanks to ongoing development, BPF now operates as a universal virtual machine directly in the kernel, where the entire organization of processes and data occurs. With its many new features, the filter is known as Extended BPF, or eBPF for short. It can securely run byte code (an intermediate language) that is supplied at runtime directly in the kernel, with a just-in-time compiler translating it into native code where available. The Extended BPF runs within an isolated environment in the kernel and is therefore executed under protection. This environment model, known as a sandbox, helps to reduce the risk that a loaded program has an adverse effect on the kernel logic.
The Berkeley Filter can run both in kernel mode (maximum access to the computer’s resources) and in user mode (restricted access to computing resources).
Advantages of the Berkeley Filter
Using the eBPF, you can filter data packets early and prevent irrelevant data from slowing down your system. Unusable or erroneous packets can be rejected straight away. Moreover, the Extended BPF can be attached to SysCalls and other kernel events, which lets you trace processes, measure performance and enforce security policies with little overhead.
The BPF implementation in FreeBSD was expanded with zero-copy buffer extensions in 2007. Thanks to these extensions, packets captured in the device driver can be written directly into the memory of the user process, without first having to copy the data.
Programming filters with BPF
In user mode, you can define individual filters for the Berkeley Filter interface at any time. The relevant code was previously written by hand as BPF instructions and assembled into BPF byte code. Nowadays, the LLVM Clang compiler can translate programs written in a restricted subset of C directly into eBPF byte code.
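As an illustration of the manual route, the sketch below assembles a small classic BPF filter into byte code by hand. It assumes the 8-byte instruction layout used by classic BPF on Linux (a 16-bit opcode, two 8-bit jump offsets and a 32-bit operand); the helper names `assemble` and `disassemble` are made up for this example, and nothing is loaded into the kernel.

```python
import struct

# One classic BPF instruction: code (u16), jt (u8), jf (u8), k (u32),
# in native byte order without padding, 8 bytes in total.
SOCK_FILTER = struct.Struct("=HBBI")


def assemble(instructions) -> bytes:
    """Pack a list of (code, jt, jf, k) tuples into raw byte code."""
    return b"".join(SOCK_FILTER.pack(*ins) for ins in instructions)


def disassemble(bytecode: bytes):
    """Unpack raw byte code back into (code, jt, jf, k) tuples."""
    return list(SOCK_FILTER.iter_unpack(bytecode))


# "Accept IPv4 frames, discard everything else", written by hand.
program = [
    (0x28, 0, 0, 12),        # ldh [12]        load the EtherType
    (0x15, 0, 1, 0x0800),    # jeq #0x800      IPv4?
    (0x06, 0, 0, 262144),    # ret #262144     accept
    (0x06, 0, 0, 0),         # ret #0          discard
]

bytecode = assemble(program)
print(len(bytecode), "bytes:", bytecode.hex())
```

Writing filters this way quickly becomes tedious and error-prone, which is why compilers that generate the byte code are the usual choice today.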
Example programs are also included in the kernel source tree, which simplify the process of writing eBPF programs. Various helper functions make working with filters easier.
The eBPF verifier for security
Executing externally supplied code in the kernel is always associated with certain security and stability risks. Before an eBPF program is loaded, it has to go through a series of checks:
- First, it's checked that the program terminates and doesn't contain any loops that could run forever. Such a loop could otherwise cause the kernel to lock up. During this process, the control flow graph (CFG) of the program is searched in order to detect unreachable instructions; a program that contains any is not loaded.
- Before and after each instruction is executed in a simulated run, the state of the eBPF program is checked. This is to ensure that the Extended BPF only acts in permitted areas and does not access data outside the sandbox. However, not every path needs to be examined individually: a path can be skipped when its state is a subset of a state that has already been checked.
- Finally, the program type is taken into account. This step is important to restrict which kernel functions can be called from the program and which data structures can be accessed. Only some program types may directly access network packet data, for instance.
The program type generally determines four things: where the program can be attached, which kernel helper functions can be called, whether network packet data can be accessed directly or only indirectly, and which type of object is passed as the first argument to the program.
The eBPF program types supported by the kernel include the following:
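The first check in the list above can be sketched as a depth-first search over the control flow graph. The following toy verifier is an assumption-laden illustration, not the kernel's algorithm: the three instruction kinds, the signed relative jump offsets and the function names are invented for this example, and the later checks (state tracking, type restrictions) are not modelled at all.

```python
# Toy instruction kinds; each instruction is a (kind, jt, jf) tuple.
RET, JMP, OTHER = "ret", "jmp", "other"


def successors(program, pc):
    """Return the instruction indexes that can follow instruction `pc`."""
    kind, jt, jf = program[pc]
    if kind == RET:
        return []                          # the program ends here
    if kind == JMP:
        return [pc + 1 + jt, pc + 1 + jf]  # offsets relative to the next instruction
    return [pc + 1]                        # plain fall-through


def verify(program):
    """Return None if the program passes, otherwise a reason for rejection."""
    if not program:
        return "empty program"
    WHITE, GREY, BLACK = 0, 1, 2           # unvisited, on current path, finished
    state = [WHITE] * len(program)
    state[0] = GREY
    stack = [(0, iter(successors(program, 0)))]
    while stack:
        pc, pending = stack[-1]
        nxt = next(pending, None)
        if nxt is None:                    # all successors of pc explored
            state[pc] = BLACK
            stack.pop()
            continue
        if not 0 <= nxt < len(program):
            return f"control flow leaves the program at instruction {pc}"
        if state[nxt] == GREY:             # edge back into the current path
            return f"loop detected: {pc} -> {nxt}"
        if state[nxt] == WHITE:
            state[nxt] = GREY
            stack.append((nxt, iter(successors(program, nxt))))
    for pc, colour in enumerate(state):
        if colour == WHITE:
            return f"unreachable instruction at {pc}"
    return None


good = [(OTHER, 0, 0), (JMP, 0, 1), (RET, 0, 0), (RET, 0, 0)]
looping = [(OTHER, 0, 0), (JMP, -2, 0), (RET, 0, 0)]
dead_code = [(RET, 0, 0), (RET, 0, 0)]

for name, prog in [("good", good), ("looping", looping), ("dead_code", dead_code)]:
    print(name, "->", verify(prog) or "accepted")
```

A program is only accepted if every instruction is reachable, every path ends in a return, and no path ever jumps back to an instruction it is still executing.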
- BPF_PROG_TYPE_SOCKET_FILTER
- BPF_PROG_TYPE_KPROBE
- BPF_PROG_TYPE_SCHED_CLS
- BPF_PROG_TYPE_SCHED_ACT
- BPF_PROG_TYPE_TRACEPOINT
- BPF_PROG_TYPE_XDP
- BPF_PROG_TYPE_PERF_EVENT
- BPF_PROG_TYPE_CGROUP_SKB
- BPF_PROG_TYPE_CGROUP_SOCK
- BPF_PROG_TYPE_LWT_*
- BPF_PROG_TYPE_SOCK_OPS
- BPF_PROG_TYPE_SK_SKB
- BPF_PROG_TYPE_CGROUP_DEVICE