
In a recent blog post, we compared three different technical approaches to filtering egress traffic on Linux: iptables, IP sets, and BPF. While that provided some interesting baseline benchmarks of the core Linux technologies, we wanted to go further and look at how one would implement such filters in practice, using off-the-shelf cloud native network policy solutions.

In the cloud native world, it is not far-fetched to imagine a Kubernetes cluster needing egress filtering to control traffic (from hosts or pods) attempting to leave the network for potentially wild and dangerous endpoints on the internet. Indeed, this is a common way to prevent exfiltration of data by malicious workloads.

One could of course build a custom egress filtering framework to suit the use case based on the existing technologies in the Linux networking pipeline. Or one could take advantage of the Kubernetes CNI plugins that already offer similar functionality.

Our friends at SAP asked us to perform a benchmark of the two most widely used Kubernetes CNIs, Calico and Cilium, for this task. This blog post presents the methodology and results from benchmarking Calico and Cilium deployed on a Lokomotive cluster.

Before getting into the details, we would like to make clear that the results of this benchmark are highly specific to a single use case of egress filtering of large numbers (up to millions) of records, and do not reflect the general performance of either Cilium or Calico for Kubernetes networking and policy enforcement. Further, we make use of Cilium capabilities that are still in beta (as described in more detail below).

Goals

We had the following goals going into this study:

  • Provide a reproducible benchmark framework that anyone can download and use.
  • Compare the scalability and potential performance overhead introduced by Kubernetes CNI plugins such as Calico and Cilium against using the underlying Linux filtering mechanisms directly (IP sets and eBPF, respectively).

About Calico

Calico is the most popular open source CNI plugin for Kubernetes, according to the recent Datadog container survey. Calico not only provides networking but also offers policy isolation for securing the Kubernetes cluster using advanced ingress and egress policies.

Calico provides a choice of dataplane including a standard Linux networking dataplane (default), a pure Linux eBPF dataplane and a Windows HNS dataplane.

About Cilium

Cilium, an increasingly popular open source Kubernetes CNI plugin, leverages eBPF to address the networking challenges of container workloads such as scalability, security and visibility. Cilium capabilities include identity-aware security, multi-cluster routing, transparent encryption, API-aware visibility/filtering, and service-mesh acceleration.

Cilium only recently added support for both deny and host policies, and they are still considered beta features (expected to be generally available in Cilium 1.10). As such, they are not performance tuned, and in fact the Cilium team suggests that to get good performance, the preferred long-term solution would be to use prefilter, as they already do for ingress filtering. Unfortunately, egress filtering using prefilter is not currently supported (see issue #14374).

Network Policies

Kubernetes network policies are defined using the Kubernetes NetworkPolicy resource. However, Kubernetes itself does not enforce network policies, and instead delegates their enforcement to the network plugins, in our case Calico or Cilium.

The Kubernetes NetworkPolicy resource is an application-centric construct, i.e. it allows you to specify how a pod is allowed to communicate with other Pods, Services, and external ingress or egress traffic. However, it cannot be used to enforce rules at the node or cluster level. Hence, for egress filtering, we create network policies using these CNI plugins' custom APIs.

Calico provides its own NetworkPolicy, GlobalNetworkPolicy and GlobalNetworkSet API objects, which add features such as policy ordering and namespace-scoped or cluster-wide enforcement of policies.

For Calico, we used the GlobalNetworkSet API, passing a list of CIDRs to which we want to deny egress, and then referenced the GlobalNetworkSet resource in a GlobalNetworkPolicy via label selectors.

Under the hood, this setup is similar to using IP sets for filtering egress traffic. Calico uses the GlobalNetworkSet to create IP sets and the GlobalNetworkPolicy to update the iptables rules that match against the IP set.
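To make the relationship between the two Calico resources concrete, here is a minimal Python sketch that builds a GlobalNetworkSet and a GlobalNetworkPolicy referencing it by label. This is not the manifest used in the benchmark: the resource names, the label key `egress-filter`, the `all()` selector and the trailing allow rule are all assumptions chosen for illustration.

```python
import json


def calico_egress_deny_manifests(cidrs, set_name="egress-deny-set", label_value="deny"):
    """Build a GlobalNetworkSet and a GlobalNetworkPolicy that denies egress to it.

    All names and labels here are hypothetical; only the overall shape
    (set referenced from policy via a label selector) follows the text.
    """
    label_key = "egress-filter"  # hypothetical label key
    network_set = {
        "apiVersion": "projectcalico.org/v3",
        "kind": "GlobalNetworkSet",
        "metadata": {"name": set_name, "labels": {label_key: label_value}},
        "spec": {"nets": list(cidrs)},
    }
    policy = {
        "apiVersion": "projectcalico.org/v3",
        "kind": "GlobalNetworkPolicy",
        "metadata": {"name": "deny-egress-to-set"},  # hypothetical name
        "spec": {
            "selector": "all()",  # assumption: apply to every endpoint
            "types": ["Egress"],
            "egress": [
                # Deny traffic whose destination is in a set carrying the label.
                {
                    "action": "Deny",
                    "destination": {"selector": f"{label_key} == '{label_value}'"},
                },
                # Assumption: everything else is allowed.
                {"action": "Allow"},
            ],
        },
    }
    return network_set, policy


if __name__ == "__main__":
    net_set, pol = calico_egress_deny_manifests(["203.0.113.0/24", "198.51.100.7/32"])
    print(json.dumps(net_set, indent=2))
    print(json.dumps(pol, indent=2))
```

Keeping the CIDR list in the set and only a selector in the policy means the list can be replaced without touching the policy itself.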

Cilium, on the other hand, uses eBPF as the underlying technology to enforce network policies. A Cilium agent running on each host translates the network policy definitions to eBPF programs and eBPF maps and attaches them to different eBPF hooks in the system. Network policy definitions are created with the help of CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy custom resources for namespace scoped and cluster-wide network traffic respectively.

For Cilium, we used the CiliumClusterwideNetworkPolicy API, passing a list of CIDRs to which egress is denied, and matched the policy using label selectors for both application workloads and network traffic on the host.
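A comparable sketch for Cilium might look like the following. Again, this is an illustration rather than the benchmark's own manifest: the policy name and the `role: client` label are hypothetical, and whether an accompanying allow rule is needed depends on the Cilium version and enforcement mode, which is not shown here.

```python
import json


def cilium_egress_deny_policy(cidrs, name="egress-deny", host=False):
    """Build a CiliumClusterwideNetworkPolicy that denies egress to the given CIDRs.

    host=False selects pods via endpointSelector; host=True selects nodes via
    nodeSelector (host policy). Name and labels are hypothetical.
    """
    selector = {"matchLabels": {"role": "client"}}  # hypothetical label
    spec = {"egressDeny": [{"toCIDR": list(cidrs)}]}
    if host:
        spec["nodeSelector"] = selector
    else:
        spec["endpointSelector"] = selector
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumClusterwideNetworkPolicy",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    print(json.dumps(cilium_egress_deny_policy(["203.0.113.0/24"]), indent=2))
    print(json.dumps(cilium_egress_deny_policy(["203.0.113.0/24"], host=True), indent=2))
```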

Metrics

Filtering network traffic can be a costly operation, especially if the number of rules to check is very high, such as in the millions (as is the case with a real-world scenario we are working on). Throughput, CPU usage and latency must all be measured to draw a meaningful conclusion about which technology to use.

We have used the following metrics to measure the performance of the filters:

  • Throughput
  • CPU usage
  • Latency

For a Kubernetes CNI, an equally important metric is set-up time. Since Calico and Cilium delegate the actual filtering to the underlying technologies, we want to capture the time taken by the CNI plugin to process the created API objects and enforce the network policies for egress filtering.

Scenario

The scenario for this test is, as in our previous egress filtering benchmark, composed of a client and a server computer that communicate through an IP network. The egress filtering is performed on the client machine and there is no filtering performed on the server side.

We test five possible mechanisms for doing the filtering:

  • iptables, “raw” IP sets and tc-eBPF test application, as in the previous benchmark (retesting to ensure consistency)
  • Calico and Cilium, in both cases running the client application in a pod in a Lokomotive Kubernetes cluster, with the solution under test running as the Kubernetes CNI plug-in.

In addition, for a baseline reference, we also ran the test without any filtering mechanism in place.

This is shown in the following diagram:

Benchmark Set-up

As mentioned earlier, our set-up builds on the work of the existing framework. Hence the software and hardware profiles used to benchmark IP sets and tc-eBPF largely remain the same, except for the following changes:

  • Updated software versions for Flatcar, ipset, iperf and the Linux kernel.
  • Two Lokomotive clusters with one worker node to run the benchmarks, one each for Calico and Cilium.

Hardware

To perform the test we used the following bare metal servers running on Equinix Metal:

  • 2 machines as client and server machines for IP set and tc-eBPF filters.
  • 2 machines for the Lokomotive cluster (1 controller, 1 worker) for benchmarking Calico.
  • 2 machines for the Lokomotive cluster (1 controller, 1 worker) for benchmarking Cilium.

The specifications of all the machines used were:

c2.medium.x86
1x AMD EPYC 7401P 24-Core Processor @ 2.0GHz
2x 120GB SSD
2x 480GB SSD
64GB RAM
2x 10Gbps

Software

Kubernetes is deployed using Lokomotive. We used two separate clusters for this benchmark, so that the Calico and Cilium tests could not interfere with each other.

Calico

Calico offers a choice of dataplane options, including standard Linux networking (its default) and eBPF. However, Calico’s eBPF dataplane doesn’t support host endpoints, which means that node-level egress filtering is not possible with it. Hence, we use the default standard Linux networking dataplane for our tests.

Cilium

Cilium with its eBPF based dataplane is installed on Lokomotive using a modified default configuration to support our test scenario. The changes are as follows:

  • Increase the number of entries in the endpoint policy map to the maximum limit allowed; i.e. 65536.
  • Enable host firewall for enforcing host network policies.
  • Set the name of the network interface to which the host firewall applies.
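The three changes above can be summarised as a small set of configuration overrides. The sketch below is an assumption on our part about how they map to Cilium agent option names (`bpf-policy-map-max`, `enable-host-firewall`, `devices`); how Lokomotive actually passes them to Cilium is not shown, and the interface name is a placeholder.

```python
POLICY_MAP_MAX_LIMIT = 65536  # maximum quoted in the text


def cilium_overrides(interface, policy_map_entries=POLICY_MAP_MAX_LIMIT):
    """Return the three overrides as string key/value pairs.

    Key names are assumed Cilium agent option names; verify them against
    the Cilium version you deploy.
    """
    if not 1 <= policy_map_entries <= POLICY_MAP_MAX_LIMIT:
        raise ValueError(f"policy map size must be between 1 and {POLICY_MAP_MAX_LIMIT}")
    if not interface:
        raise ValueError("an interface name is required for the host firewall")
    return {
        "bpf-policy-map-max": str(policy_map_entries),
        "enable-host-firewall": "true",
        "devices": interface,
    }


if __name__ == "__main__":
    # "eth0" is a placeholder interface name, not the one used in the benchmark.
    for key, value in cilium_overrides("eth0").items():
        print(f"{key}: {value}")
```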

Software Versions

The exact versions of all the tools we used are:

  • Flatcar Container Linux by Kinvolk Alpha (2705.0.0)
  • Linux kernel 5.9.11
  • iperf 3.6 (in a Docker container with the host network)
  • iptables v1.6.2
  • ipset v7.6, protocol version: 7
  • Lokomotive v0.5.0 for Calico; Cilium feature branch for installing Lokomotive with Cilium
  • Kubernetes v1.19.4
  • Calico v3.16.4
  • Cilium v1.9.0

A minimal working configuration for deploying Lokomotive on Equinix Metal can be found here and the instructions are mentioned in the README.md.

Tests

We used the following parameters for each of the tests:

  • Throughput: The goal of this test is to maximize throughput, ignoring CPU consumption (i.e. CPU will typically be saturated). Therefore, iperf3 was used with the bandwidth set to 10Gbps (equal to the network interface adapter speed) and the UDP packet size set to 1470. Throughput is measured in Gbps; we have not measured throughput in packets per second (pps), but that could be added to the benchmark framework (see issue #16).

  • CPU usage: In this test we want to see the variation in CPU usage for a given throughput. Therefore, we again use iperf3 but reduce the bandwidth to 1Gbps. The UDP packet size remains at 1470.

  • Latency: To test latency, we bombard the server with ICMP packets using the ping utility at a rate of 1000 pings per second.

  • Set-up time: As we discussed in our first egress filtering benchmark, set-up time is an implementation detail specific to the benchmarking application and can most certainly be improved upon. Set-up time for Calico and Cilium is calculated using 'ping' (ICMP) on a polling basis, checking the enforcement of policies on each poll.
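As an illustration of the throughput and CPU usage parameters above, the following sketch builds an iperf3 client command line. It is not taken from the benchmark framework: the server address and the test duration are assumptions, and only the UDP mode, target bandwidth and packet size come from the text.

```python
def iperf3_client_args(server, bandwidth, packet_size=1470, seconds=10):
    """Return an iperf3 client argument list for a UDP test.

    -c: server to connect to, -u: UDP, -b: target bandwidth,
    -l: datagram payload length, -t: duration in seconds (assumed value).
    """
    return [
        "iperf3",
        "-c", server,
        "-u",
        "-b", bandwidth,
        "-l", str(packet_size),
        "-t", str(seconds),
    ]


if __name__ == "__main__":
    server = "192.0.2.10"  # placeholder address
    print(" ".join(iperf3_client_args(server, "10G")))  # throughput test
    print(" ".join(iperf3_client_args(server, "1G")))   # CPU usage test
```

The list form can be passed directly to `subprocess.run` on a machine where iperf3 is installed.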

Reproducibility

All the tools and instructions to reproduce these tests are provided in the GitHub repository github.com/kinvolk/egress-filtering-benchmark

Constraints

  • To avoid the error "etcdserver: Request entity too large", multiple network policy manifests (for Cilium and Calico) are created, each containing a maximum of 50000 CIDR entries. As the number of rules increases, more manifests are created and sent to the Kubernetes API server.

  • One possible source of error in testing is the Kubernetes API server and etcd themselves becoming a bottleneck for requests, especially when a lot of resources are created in a short span of time, since both Calico and Cilium use them extensively. Therefore, controller nodes were chosen such that they could handle the load generated by a very high number of rules and, during our testing, we closely monitored CPU utilization to ensure there was no such bottleneck.
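The 50000-entry limit per manifest described in the first constraint amounts to splitting the rule list into fixed-size chunks. A minimal sketch of that idea follows; the helper names and the synthetic /32 addresses are ours, not the benchmark framework's.

```python
import ipaddress

MAX_CIDRS_PER_MANIFEST = 50_000  # limit quoted in the text


def chunk_cidrs(cidrs, max_per_manifest=MAX_CIDRS_PER_MANIFEST):
    """Split a list of CIDRs into consecutive chunks, one per manifest."""
    if max_per_manifest < 1:
        raise ValueError("max_per_manifest must be at least 1")
    cidrs = list(cidrs)
    return [
        cidrs[i:i + max_per_manifest]
        for i in range(0, len(cidrs), max_per_manifest)
    ]


def synthetic_cidrs(count, first="10.0.0.0"):
    """Generate `count` consecutive /32 entries starting at `first` (test data only)."""
    base = int(ipaddress.IPv4Address(first))
    return [f"{ipaddress.IPv4Address(base + i)}/32" for i in range(count)]


if __name__ == "__main__":
    chunks = chunk_cidrs(synthetic_cidrs(120_000))
    for index, chunk in enumerate(chunks):
        print(f"manifest {index}: {len(chunk)} entries")
```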

Results

Before we get into the details, we should note that Cilium failed our tests (#14377) when the number of rules increased to 100,000 and beyond, hence the disappearance of the orange bar from the results towards the far end of the horizontal axis.

Test #1 - Throughput

The results of the throughput test are shown in the following chart:

As can be seen, the overhead of Calico on top of IP sets is small. Calico managed to keep its throughput between 2.5 and 2.7 Gbps despite the increasing number of rules. Given the increased CPU usage (test #2) and latency (test #3), it is not a surprise that Calico manages a slightly lower throughput than the base IP sets scenario.

Cilium, on the other hand, maintained a consistent throughput of around 2.3 Gbps, which is slightly lower than the other filters.

Given that this test is for maximum throughput, ignoring CPU usage, we would expect the CPU to be saturated, and that is indeed what we see with CPU consumption hovering between 85% and 98%.

Test #2 - CPU Usage

The results of the CPU usage test are shown in the following chart:

CPU usage for ipset and tc-eBPF is very similar, and does not increase with the increase in the number of rules.

Cilium has the highest CPU usage of all filters, except for iptables, at a consistent 40 percent. Analyzing the stack trace using BCC's profile tool, we found that an equal amount of time for Cilium was spent doing iptables rule processing. We suspect this is because the default Cilium installation uses kubeProxyReplacement=probe, meaning Cilium works in conjunction with kube-proxy. Cilium also supports a kube-proxy free mode, which we believe would improve performance, as it would not spend time in iptables rule processing (see issue #18).

Calico reports a marginally higher CPU usage compared to IP sets. We have not profiled the cause of this, but there are several possibilities to explore. One of them is that Lokomotive ships by default with a few pre-configured Calico network policies to secure workloads and host endpoints, so for each packet sent or received by iperf, the kernel needs to iterate through those rules. Note that, because these are Calico-specific policies, they are not applied in the Cilium test case.

Test #3 - Latency

As expected from our previous results, raw iptables performs very badly at scale, so we have to show the graph with a logarithmic scale. However, this doesn’t provide a clear difference between other filters, so we also show the results with a linear scale, excluding the iptables results.

The second graph omits iptables so that the latency differences between the other filters become visible.

Even though Calico and Cilium reported a slightly increased latency compared with other filters, the main point is that the increase in the number of rules did not adversely affect the latency for either of them (which is what we would expect from our prior results with IP sets and eBPF).

Latency can also be measured with netperf TCP_RR instead of ping. With more time to investigate, this is something we would like to explore if we rerun these tests in future (see issue #17).

Test #4 - Set-up Time

The set-up time is acceptable for all the filters even though it does increase as the number of rules grows.

The increase in rules does not seem to affect Calico greatly and as the number of rules increases, the difference in set-up time for Calico against other filters gets smaller.

As there is no synchronous confirmation of when rules have been applied, we check on a polling basis, which necessarily introduces a level of granularity to this test, making the results an approximation of the exact set-up time. Specifically, to calculate the set-up time needed by Calico/Cilium, the benchmark application introduces a 1 millisecond sleep before checking again for the readiness of the egress filters. For this reason, the graph should not be used as an exact benchmark comparison. Instead, the values on the graph should be thought of as the upper limit, incrementing in milliseconds.
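The polling approach described above can be sketched as follows. This is a simplified stand-in for the benchmark application, not its actual code: `is_enforced` is a hypothetical callable (in the benchmark, the check is a ping that is expected to be blocked once the policy is active), and the timeout is an assumption.

```python
import time


def measure_setup_time(is_enforced, poll_interval=0.001, timeout=60.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll `is_enforced` until it returns True and return the elapsed seconds.

    The result is an upper bound: enforcement may have started at any point
    during the last sleep, so the granularity is `poll_interval`.
    """
    start = clock()
    while True:
        if is_enforced():
            return clock() - start
        if clock() - start >= timeout:
            raise TimeoutError("egress filter was not enforced within the timeout")
        sleep(poll_interval)


if __name__ == "__main__":
    # Hypothetical check that reports enforcement after roughly 50 ms.
    deadline = time.monotonic() + 0.05
    elapsed = measure_setup_time(lambda: time.monotonic() >= deadline)
    print(f"set-up time (upper bound): {elapsed:.3f}s")
```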

Test #5 - IP sets and Calico for Rules More than 1 Million

Since we found Calico scaling easily to our maximum test scenario of 1 Million rules, we decided, for fun, to push the boundaries to the breaking point and see what happens.

By default, when Calico creates the IP set, the maximum limit on the ipset is 1048576 as per the Felix configuration resource specification.

This means that if the number of rules is more than the maximum value, Calico throws an error:

Hash is full, cannot add more elements.

The Calico pod goes into the CrashLoopBackOff state until the GlobalNetworkSets created are deleted with kubectl.

To mitigate the issue, the maximum ipset size needs to be configured. Thankfully, we can do that by editing the default FelixConfiguration as follows.

calicoctl patch felixconfiguration default --patch='{"spec": {"maxIpsetSize": 100000000}}'
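The relationship between the rule count and the Felix limit can be expressed as a small check that only produces a patch when the default is exceeded. This is a sketch for illustration: sizing the new limit to exactly the rule count is our assumption (the command above simply uses a large fixed value), and the function name is hypothetical.

```python
import json

DEFAULT_MAX_IPSET_SIZE = 1_048_576  # default limit quoted in the text


def felix_patch_for(rule_count, current_max=DEFAULT_MAX_IPSET_SIZE):
    """Return a JSON patch body raising maxIpsetSize, or None if no change is needed."""
    if rule_count <= current_max:
        return None
    # Assumption: raise the limit to exactly the number of rules.
    return json.dumps({"spec": {"maxIpsetSize": rule_count}})


if __name__ == "__main__":
    for rules in (1_000_000, 10_000_000):
        patch = felix_patch_for(rules)
        if patch is None:
            print(f"{rules} rules fit within the default limit")
        else:
            print(f"calicoctl patch felixconfiguration default --patch='{patch}'")
```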

Re-running the tests resulted in Calico managing to update 10 Million rules. However, the health of the Felix service fluctuated between Live and Not live. For 50 Million and 100 Million records, it is not entirely clear whether the limitation was the benchmarking framework or Calico (issue #4069), but clearly we are going beyond the limits that anyone is likely to need for production.

Conclusion

Our results show that directly programming IP sets rules delivers the best overall performance for egress traffic filtering. However, if you are running Kubernetes anyway, you will likely already be deploying a CNI provider like Calico or Cilium, which already support declarative policies for egress filtering, and our tests show that you should consider using these tools.

Our benchmark tests show that Calico didn't introduce any scalability issues, as shown by similar IP set and Calico curves across all metrics. Calico compensates for a minimal performance overhead (compared with raw IP sets) by offering a reliable and flexible way to update the list of egress filtering rules, so that the entire operation is smooth, quick and hassle-free. If you are looking to implement node-level egress IP filtering in your cluster, based on our tests we would recommend Calico as a viable, performant solution.

If you are already using Cilium for Kubernetes networking, then it might make sense to use it for egress filtering, provided the number of rules is not too large. We found Cilium showed scalability issues, in particular failing when approaching 100,000 rules in our test scenarios. Of course, our scenarios are limited and it would not be fair to assume that Cilium would fare the same in other use cases. We would argue the case on behalf of Cilium on two fronts: (a) the network policy features such as deny policies and host policies are beta features and will certainly get optimizations as they mature to stable; and (b) the main use case of Cilium lies in building on top of eBPF and providing observability at a granularity that was not possible before, thanks to the emergence of eBPF.

We are bullish on eBPF as a technology, and look forward to Cilium’s continued leadership and progress in this area. We also look forward to Calico completing its eBPF dataplane implementation, to enable a choice of eBPF solutions for this use case.

Since we are comparing Kubernetes CNIs against other filters, it is also important to note that the benchmark set-up consisted of only one use case (egress filtering), while in a production environment a Kubernetes cluster would be hosting other applications, with all the network policies being handled by the CNI. You should take this benchmark as a narrow measure of one aspect of the performance of the solutions tested, and validate actual system-wide performance in your own environment.

Update 2021-01-20: The introduction was updated to emphasize that this benchmark is use case specific, should not be taken as a general performance comparison, and that the Cilium team does not recommend its use in production for egress filtering of large numbers of IP addresses.
