During our visit to KubeCon in Paris, we attended a presentation by Nico Vibert & Dan Finneran of Isovalent about Cilium. I was personally very motivated to learn more about this network management solution, which seems to be gaining traction!
Indeed, Kubernetes wasn't designed with an easy, "implicit" approach to network management. This is where Cilium comes in, and the presentation opens with a few questions:
Who manages the K8s network?
Often, the network infrastructure is unaware that "pod networks" even exist; what do we do about that?
What tools are available for troubleshooting the K8s network?
How to check for bottlenecks and performance?
How to manage traffic encryption?
How to handle load balancing?
Other requirements: we need to secure our applications, we need to know which egress IP our pods use in order to manage IP filtering, etc.
Cilium is here for that
Natively in Kubernetes, all network rules are managed by a component called kube-proxy. This component controls iptables across all nodes of the cluster.
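To get a feel for what kube-proxy programs, you can inspect the rules it generates directly on a node. A minimal sketch, assuming kube-proxy runs in its default iptables mode and you have shell access to a node:

# rough count of the iptables lines created by kube-proxy (its chains are prefixed with KUBE-)
sudo iptables-save | grep -c KUBE
# look at the NAT chain kube-proxy uses to route Service traffic
sudo iptables -t nat -L KUBE-SERVICES | head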
However, even though iptables is widely used in the GNU/Linux environment, it's not always ideal.
Indeed, when iptables needs to update a rule, it must recreate and update all rules in a single transaction. This is a real problem when a cluster has a significant number of nodes and Pods running on it.
Moreover, this tool can only filter based on IP addresses or ports, not on URL paths or HTTP methods. This is rather inconvenient in a Kubernetes context where many applications are APIs.
Finally, iptables generally consumes a significant amount of CPU when running with Kubernetes.
Most of iptables' shortcomings are addressed by Cilium. It can filter at Layer 7 (the application layer) of the OSI model and solves the scalability issues iptables struggled with.
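For instance, here is a minimal sketch of what a Layer 7 rule can look like with Cilium. The labels, port, and path are hypothetical, not taken from the presentation:

kubectl apply -f - <<EOF
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-healthz
spec:
  endpointSelector:
    matchLabels:
      app: my-api
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/healthz"
EOF

With a policy like this, only GET requests to /healthz coming from pods labeled app=frontend are accepted on port 8080; other HTTP requests to the selected pods are rejected, something plain iptables simply cannot express.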
How does it work?
eBPF is the main answer. Indeed, eBPF allows direct communication with the kernel. In fact, a slide states "what JavaScript is to the browser, eBPF is to the Linux kernel."
eBPF is used to control traffic, load balancing, network policies, service mesh, ingress, etc.
eBPF programs run at a low level, inside the kernel itself, which is why they can talk to the kernel and manage traffic so quickly.
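Once Cilium is running, you can actually look at the eBPF state it programs. A quick sketch, assuming Cilium is deployed in kube-system; note that in recent releases the CLI inside the agent pod is named cilium-dbg rather than cilium:

# eBPF load-balancing entries programmed by the Cilium agent
kubectl -n kube-system exec ds/cilium -- cilium bpf lb list
# eBPF programs currently loaded in the kernel (run directly on a node)
sudo bpftool prog show | head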
Note: eBPF is used massively by Facebook, Google, and Netflix to handle their traffic. Think about that the next time you're loading content from them ;)
Here's a performance comparison between eBPF, IPVS, and iptables.
The end of the presentation mainly revolves around use cases, showing manifest files after installation of Cilium.
You can find the video of the presentation here.
Addition: how to install Cilium?
The installation can be done in different ways: the first is with a Helm chart, the second with cilium-cli (doc available here).
Example with Helm:
helm repo add cilium https://helm.cilium.io/

helm upgrade --install cilium cilium/cilium \
  --version 1.xx.x-rcx \
  --namespace kube-system \
  --set sctp.enabled=true \
  --set hubble.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,icmp,http}" \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.ui.service.type=NodePort \
  --set hubble.relay.service.type=NodePort
Once it's done, we can run the "cilium status" command to make sure the Cilium CNI was installed correctly.
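For reference, the cilium-cli route mentioned above looks roughly like this (a sketch; replace the version placeholder with the release you want):

# install the Cilium CNI with cilium-cli instead of Helm
cilium install --version 1.xx.x
# wait for the agents to be ready and print the overall status
cilium status --wait
# optionally enable Hubble and its UI
cilium hubble enable --ui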
Known limitations
AKS supports Cilium but has some limitations (no L7 rules, no Hubble). More details here:
https://learn.microsoft.com/en-us/azure/aks/azure-cni-powered-by-cilium#limitations
EKS: many advanced features of Cilium are not yet enabled as part of EKS Anywhere, including Hubble observability, DNS-aware and HTTP-aware Network Policy, Multi-cluster Routing, Transparent Encryption, and Advanced Load-balancing. More details here: