Description
I was planning to upgrade the Cilium images I use to run hubble observe, and I noticed that newer images use significantly more CPU to do the same work.
I started from quay.io/cilium/cilium:v1.15.5; switching to 1.15.16 or newer (1.17.x, 1.18.x) increases CPU usage by roughly 5x in my setup.
I have 10 pods organized as a DaemonSet, each running 3 containers of hubble observe, each fetching a subset of logs. All these processes together produce ~30MB/min of logs for the whole system.
# log for business traffic
# 20 exclusions total
hubble observe --follow --print-node-name --time-format RFC3339Milli \
--not --namespace kube-system \
--not --namespace A \
--not --namespace B \
--not --namespace C
...
# log for technical traffic
# 20 inclusions total
hubble observe --follow --print-node-name --time-format RFC3339Milli \
--namespace kube-system \
--namespace A \
--namespace B \
--namespace C
...
# log for dropped traffic
hubble observe --follow --print-node-name --time-format RFC3339Milli \
--type drop --type l7 --verdict DROPPED --not --to-ip ff02::/16
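For reference, one way to measure the CPU cost of a single filter set in isolation is to run it for a fixed window and read the user/system time afterwards. A minimal sketch, assuming GNU time is available in the container; the 300s window is arbitrary and output is discarded so that writing logs doesn't skew the numbers:
# run one filter set for 5 minutes, then report CPU time consumed
/usr/bin/time -v timeout 300 hubble observe --follow --print-node-name \
  --time-format RFC3339Milli --not --namespace kube-system > /dev/null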
I tested several Cilium versions:
- 1.15.5, 1.15.6, 1.15.12, 1.15.14, 1.15.15 --> these behave normally; Grafana shows the whole DaemonSet using < 1 CPU in total
- 1.15.16, 1.15.19, 1.16.16, 1.17.6, 1.17.9, 1.18.3 --> for these I see increased CPU usage of almost 5 CPUs in total
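As a quick cross-check outside Grafana, the per-container numbers can also be spot-checked directly (a sketch, assuming metrics-server is installed and the DaemonSet runs in the logs namespace):
# per-container CPU/memory for the hubble observe pods
kubectl top pod -n logs --containers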
Here is a screenshot from Grafana after I downgraded back to 1.15.15.
This is from the standard Grafana dashboard Kubernetes / Compute Resources / Namespace (Workloads), with the plotted metric being more or less:
sum(
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace="logs"}
* on(namespace,pod)
group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace="logs"}
) by (workload, workload_type)
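A simpler per-container breakdown that shows the same jump can be plotted from the raw cAdvisor metric (assuming the standard container_cpu_usage_seconds_total series is scraped):
sum(rate(container_cpu_usage_seconds_total{namespace="logs", container!=""}[5m])) by (pod, container)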
Here is also the hubble version output from inside the containers for the two closest image tags:
v1.15.15 > hubble version
hubble v1.17.1@HEAD-0d65c11 compiled with go1.23.6 on linux/amd64
v1.15.16 > hubble version
hubble v1.17.2@HEAD-aba36c0 compiled with go1.23.7 on linux/amd64
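To check whether the regression is in the hubble CLI itself rather than the agent, the two CLI versions could be run side by side against the same relay. A sketch, assuming the default hubble-relay service on port 80, a plaintext connection (add the TLS flags if relay TLS is enabled), and the usual release tarball names:
# fetch both CLI releases (extract between downloads; the tarballs share a name)
curl -LO https://github.com/cilium/hubble/releases/download/v1.17.1/hubble-linux-amd64.tar.gz
tar -xzf hubble-linux-amd64.tar.gz && mv hubble hubble-v1.17.1
curl -LO https://github.com/cilium/hubble/releases/download/v1.17.2/hubble-linux-amd64.tar.gz
tar -xzf hubble-linux-amd64.tar.gz && mv hubble hubble-v1.17.2
# point both at the same relay and compare CPU time for the same flow count
kubectl -n kube-system port-forward svc/hubble-relay 4245:80 &
/usr/bin/time -v ./hubble-v1.17.1 observe --server localhost:4245 --last 10000 > /dev/null
/usr/bin/time -v ./hubble-v1.17.2 observe --server localhost:4245 --last 10000 > /dev/null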
Is this an intended effect, or a bug?
Is the new hubble version doing something more that requires more CPU?