
Increased "hubble observe" CPU usage after upgrade from 1.15.15 to 1.15.16 or newer #1727


Description

@tporeba

I was planning to upgrade the cilium images I use to run hubble observe, and I noticed that newer images use significantly more CPU to do the same work.

I started from quay.io/cilium/cilium:v1.15.5, and switching to 1.15.16 or newer (1.17.x, 1.18.x) causes a ~5x increase in CPU usage in my setup.

I have 10 pods organized as a DaemonSet, each running 3 hubble observe containers, each fetching a subset of the logs. Together these processes produce ~30 MB/min of logs for the whole system.

# log for business traffic
# 20 exclusions total
hubble observe --follow --print-node-name --time-format RFC3339Milli  \
  --not --namespace kube-system \
  --not --namespace A \
  --not --namespace B \
  --not --namespace C 
...

# log for technical traffic
# 20 inclusions total
hubble observe --follow --print-node-name --time-format RFC3339Milli \
     --namespace kube-system \
     --namespace A \
     --namespace B \
     --namespace C
...

# log for dropped traffic
hubble observe --follow --print-node-name --time-format RFC3339Milli \
          --type drop --type l7 --verdict DROPPED --not --to-ip ff02::/16 

I tested several cilium versions:

  • 1.15.5, 1.15.6, 1.15.12, 1.15.14, 1.15.15 --> these behave normally; my Grafana shows that the whole DaemonSet uses < 1 CPU in total
  • 1.15.16, 1.15.19, 1.16.16, 1.17.6, 1.17.9, 1.18.3 --> for these I see increased CPU usage of almost 5 CPUs in total.
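
For a quick cross-check outside Grafana, the per-container usage can also be read directly with kubectl top; the namespace and label selector below are only placeholders for my setup:

# per-container CPU/memory for the log-collector pods (label is illustrative)
kubectl -n logs top pods -l app=hubble-logs --containers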

Here is a screenshot from Grafana after I downgraded back to 1.15.15:

[Grafana screenshot: CPU usage drop after downgrade]

This is from the standard Grafana dashboard Kubernetes / Compute Resources / Namespace (Workloads), and the plotted metric is more or less:

sum(
  node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{namespace="logs"}
* on(namespace,pod)
  group_left(workload, workload_type) namespace_workload_pod:kube_pod_owner:relabel{namespace="logs"}
) by (workload, workload_type)
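
To isolate just the hubble observe containers rather than the whole namespace, a plainer query along these lines works as a sanity check; the container name regex is only an example and depends on how the containers are named in the DaemonSet:

# raw per-container CPU rate from cAdvisor, restricted to the log-collector containers
sum by (pod, container) (
  irate(container_cpu_usage_seconds_total{namespace="logs", container=~"hubble.*"}[5m])
)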

Here is also the hubble version from inside the containers for the two closest image tags:

v1.15.15 >  hubble version
hubble v1.17.1@HEAD-0d65c11 compiled with go1.23.6 on linux/amd64

v1.15.16 > root@os-workernode06:/home/cilium# hubble version
hubble v1.17.2@HEAD-aba36c0 compiled with go1.23.7 on linux/amd64
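
As an additional sanity check, the CPU time accumulated by the hubble processes themselves can be compared inside a pod before and after the upgrade; the pod/container names below are placeholders, and this assumes a procps-style ps is available in the image:

# cumulative CPU time (TIME column) per hubble process
kubectl -n logs exec <pod> -c <container> -- ps -C hubble -o pid,etime,time,%cpu,args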

Is this an intended effect or a bug?
Is the new hubble version doing something more that requires more CPU?
