[kubernetes] adds beta support for rotated logs (including GZIP) #15441

AndersonQ · 2025-09-23T16:56:32Z

DO NOT MERGE BEFORE 9.2 IS RELEASED

Proposed commit message

[kubernetes] adds beta support for rotated logs

Enables the collection of rotated log files, including GZIP-compressed files, for Kubernetes container logs.

This enhancement introduces new configuration options to control the ingestion of rotated logs, including specifying paths for the rotated logs and enabling GZIP decompression.

Checklist

I have reviewed tips for building integrations and this pull request is aligned with them.
I have verified that all data streams collect metrics or logs.
I have added an entry to my package's changelog.yml file.
~~[ ] I have verified that Kibana version constraints are current according to guidelines.~~
~~[ ] I have verified that any added dashboard complies with Kibana's Dashboard good practices~~

Author's Checklist

How to test this PR locally

install this integration to an ES stack

cd packages/kubernetes
elastic-package install -v

start a kind cluster with the follwing config

node-cfg.yaml

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    kubeadmConfigPatches:
      - |
        kind: KubeletConfiguration
        containerLogMaxSize: "10Mi"
        containerLogMaxFiles: 10

kind create -v cluster --config ./node-cfg.yaml

run a flog pod, wait the logs to be generated.

flog.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: flog-log-generator
spec:
  template:
    spec:
      containers:
        - name: flog
          image: mingrammer/flog
#          too small "-d" won't give kubelet time to rotate the files
          args: ["-t", "stdout", "-f", "json", "-n", "185000", "-d", "1ms"]
      restartPolicy: Never
  backoffLimit: 4

kubectl apply -f flog.yaml

add the kubernets integration enabling to ingest rotated logs. Ideally only enable container logs to avoid errors as kind is really minimal and not all metrics are available
install a 9.2.0-SNAPSHOT elastic-agent using the instructions for deploying it on k8s
ensure all 185000 were ingested and the ingested files include the rotated plain text and gzip files

Related issues

Closes [kubernetes]: Add GZIP support for logs ingested by filestream #14238

Screenshots

elastic-vault-github-plugin-prod · 2025-09-23T17:45:25Z

🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

elastic-sonarqube · 2025-09-29T08:47:11Z

Quality Gate failed

Failed conditions
2.3% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube

elasticmachine · 2025-09-29T08:49:02Z

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

packages/kubernetes/_dev/build/docs/container-logs.md

packages/kubernetes/data_stream/container_logs/manifest.yml

packages/kubernetes/_dev/build/docs/container-logs.md

packages/kubernetes/docs/container-logs.md

rdner · 2025-09-30T08:26:43Z

packages/kubernetes/data_stream/container_logs/manifest.yml

+        title: Kubernetes container rotated log path
+        multi: true
+        default:
+          - /var/log/pods/${kubernetes.namespace}_${kubernetes.pod.name}_${kubernetes.pod.uid}/${kubernetes.container.name}/*.log.*


I find it suspicious that for the paths our default is /var/log/containers/*${kubernetes.container.id}.lo but for the rotated_logs_paths it's a completely different /var/log/pods/${kubernetes.namespace}_${kubernetes.pod.name}_${kubernetes.pod.uid}/${kubernetes.container.name}/*.log.*.

I have a couple of questions here:

Why do we even need the new rotated_logs_paths parameters? Why can't we just add a second line to the paths default?

How did we determine that the rotated container logs are stored in /var/log/pods/${kubernetes.namespace}_${kubernetes.pod.name}_${kubernetes.pod.uid}/${kubernetes.container.name}/*.log.*? Can we put a documentation reference as a comment near this default?

Why do we even need the new rotated_logs_paths parameters?

I added some context on the issue, but the tl;dr is that the paths serve different purposes. The /var/log/containers/ is deprecated and is just a collection of symlinks pointing to the active log file. To get the rotated logs, we have to target the source directory, /var/log/pods/, directly.

As for why this is a new parameter, it was the cleanest and safest approach for a few reasons:

Isolating a beta feature: Since support for rotated/gzipped logs is new, it's safer to keep it separate.

Preventing data re-ingestion: Changing the existing default paths could cause users to re-ingest logs, which we must avoid.

Simplicity/limitations: With a different path, it can be added or not depending on rotated_logs toggle, without interfering with the default to active log files.

How did we determine that the rotated container logs are stored in /var/log/pods/${kubernetes.namespace}${kubernetes.pod.name}${kubernetes.pod.uid}/${kubernetes.container.name}/.log.?

A mix of docs and reading k8s code. Unfortunately I did not find authoritative and detailed docs on this path, just the logs are in /var/log/pods/. I also explain on the issue and link the docs/source code from where I got this path. I'll add some comment here

@AndersonQ thanks for the clarification!

My concern is our future plan for this feature. How are we going to take it out of beta? Would this extra parameter rotated_logs_paths stay?

While it's still in Beta it might be totally fine to hard-code /var/log/pods/${kubernetes.namespace}${kubernetes.pod.name}${kubernetes.pod.uid}/${kubernetes.container.name}/.log as a part of paths in the template if rotated_logs is switched to true without adding a new rotated_logs_paths parameter. How common is it to need to configure rotated_logs_paths?

If we add this new parameter rotated_logs_paths, we commit to having it in the future too.

I'd like to see @cmacknz's take on it.

My concern is our future plan for this feature. How are we going to take it out of beta? Would this extra parameter rotated_logs_paths stay?

No, we would make this the default, we just have to do it without data duplication in this case.

I think my overall preference would be to have a feature toggle for something like "ingest rotated gzip logs (beta)" that just makes all the changes necessary without exposing the individual pieces that are changed.

I think the toggles now are technically correct but way to technical for a lot of users to potentially understand. The integration is supposed to abstract away complexity and this just putting the underlying Kubernetes log structure in everyones view directly.

For reference this is what filelog does:

filelog: include: - /var/log/pods/*/*/*.log exclude: # Exclude logs from all containers named otel-collector - /var/log/pods/*/otel-collector/*.log start_at: end include_file_path: true include_file_name: false operators: # parse container logs - type: container id: container-parser

You will note they do not template an input per pod UID and they have an obvious way to exclude their own logs.

This something we should also do we are blocked by requiring data re-ingestion to make changes like this. We need some way to correct our logging configurations and change them at will without re-ingesting every log in a cluster. Several improvements are stuck behind this problem.

I'm happy to use a glob for /var/log/pod/. Also because k8s docs don't really document either pattern, the one for /var/logs/container/ and /var/log/pod/.

Because the input is templated using the container ID in the input ID, for each container we have one input, which currently works well with the paths matching a single file. However, if we add a glob in the paths, then for each container we will have one input instance ingesting files from all containers, that's gonna generate massive data duplication.

For reference this is what filelog does:

It's interesting that they set start_at: end, their default is to consider every existing file as fully ingested and only ingest new data. That helps a lot to avoid data duplication, but do not ingest old data like we do.

This something we should also do we are blocked by requiring data re-ingestion to make changes like this. We need some way to correct our logging configurations and change them at will without re-ingesting every log in a cluster. Several improvements are stuck behind this problem.

We already can migrate state from one Filestream input to another. The current implementation is very conservative and requires us to set the ID from the old input we're getting the state from, however we can loose this requirement and just have input look up any state in Filestream's registry, regardless of the input it belongs to.

The more we discuss about it, the more I believe we need to re-think how we store and scope file states in Filestream.

The more we discuss about it, the more I believe we need to re-think how we store and scope file states in Filestream.

Oh yeah

I'm happy to use a glob for /var/log/pod/. Also because k8s docs don't really document either pattern, the one for /var/logs/container/ and /var/log/pod/.

Because the input is templated using the container ID in the input ID, for each container we have one input, which currently works well with the paths matching a single file. However, if we add a glob in the paths, then for each container we will have one input instance ingesting files from all containers, that's gonna generate massive data duplication.

Oh, that's a good pint. Lat me think about it

It's fine. The variables are substituted all together. In a simplified way:

the variables of an input are only substituted if all variables are returned by the provider
the k8s provider returns all the variables in a set with a single value for each key. In other words, for a pod with 2 containers there would be 2 sets like:

namespace: default pod: uid: 1 name: pod-1 container: id: cont-1 name cont-name-1 namespace: default pod: uid: 1 name: pod-1 container: id: cont-2 name cont-name-2

that way, there is no risk of creating a combination that mixes 2 pods or containers from the same pod as each set of values are all valid and belonging to the same hierarchy

packages/kubernetes/_dev/build/docs/container-logs.md

packages/kubernetes/data_stream/container_logs/manifest.yml

AndersonQ · 2025-09-30T13:13:58Z

@rdner, @belimawr I also updated the screenshot in the PR description. please re-review at your earliest convenience

packages/kubernetes/data_stream/container_logs/manifest.yml

cmacknz · 2025-10-06T20:14:21Z

packages/kubernetes/data_stream/container_logs/manifest.yml

+      - name: rotated_logs
        type: bool
-        title: Use Symlinks
+        title: Include rotated log files


This needs to have GZIP in the name to explain why you'd want to do this.

packages/kubernetes/data_stream/container_logs/manifest.yml

AndersonQ · 2025-10-21T12:14:38Z

@cmacknz, @rdner is it ok now?

rdner · 2025-10-21T12:47:35Z

@AndersonQ I don't think this comment from @cmacknz is addressed, is it?

My initial question remains unanswered: why do we need the new configuration parameter rotated_logs_paths when we can just add values in paths based on one feature flag/toggle rotated_logs?

If I understood correctly, @cmacknz and I are aligned on this. We should minimize the amount of new configuration fields we introduce, especially when we know this parameter won't be necessary in the future. I believe it should be just one toggle.

AndersonQ · 2025-10-21T13:01:38Z

@AndersonQ I don't think this comment from @cmacknz is addressed, is it?

@rdner, I thought my explanation here addressed it.

My initial question remains unanswered: why do we need the new configuration parameter rotated_logs_paths when we can just add values in paths based on one feature flag/toggle rotated_logs?

If I understood correctly, @cmacknz and I are aligned on this. We should minimize the amount of new configuration fields we introduce, especially when we know this parameter won't be necessary in the future. I believe it should be just one toggle.

it's about allowing users to configure it if they have another path for it. I can definitely go with just a toggle, but then it won't be configurable.
I don't know how often it's needed, I'd guess pretty much as much as one needs to change the default path for logs. That minus the amount of users which change it to collect directly from /var/log/pod

Also, if for some reason the users have the logs in a custom location, other than /var/log/pod, I doubt the rotated logs would be there, thus, they need to configure it.

It's a sane default, just as we have for the active log. I don't see why we should not allow it to be configurable as well.

AndersonQ · 2025-10-21T13:02:29Z

I'm happy to remove it if you 2 want it removed. However I'd not be surprised if we get requests to make it configurable.

rdner · 2025-10-21T16:21:14Z

@AndersonQ It's not about what we two want, it's about what makes sense and how we're going to present this beta feature now and take it out of beta in the future.

I see what we do here in this PR as temporary – to let our customers try it out. When we figure out how to migrate from one path to another without data duplication, we just make the path we toggle now by the rotated_logs: true a new default.

What are you going to do with the rotated_logs_paths then?

AndersonQ · 2025-10-22T14:12:51Z

how we're going to present this beta feature now and take it out of beta in the future.

I see what we do here in this PR as temporary – to let our customers try it out. When we figure out how to migrate from one path to another without data duplication, we just make the path we toggle now by the rotated_logs: true a new default.

What are you going to do with the rotated_logs_paths then?

If, once we can add new paths and we migrate the default log path from the integration to /var/log/pods, it might not be necessary and we can remove it if we don't want to allow users to modify it anymore.

There is at least the windows case, the path for windows is different, so we might always need it just as we allow to modify the active logs path.
Or if we move to even removing the toggle to ingest rotated logs, again, we remove the toggle and the rotated_logs_paths.

For now, I believe we should let it be configurable. It's possible to change it later, so I don't see it as a problem.

@rdner, @cmacknz, what do you think?

rdner · 2025-10-22T15:33:58Z

@AndersonQ the problem is that in this PR we allow to configure the same Filebeat configuration parameter paths with 2 integration configuration parameters: paths and rotated_logs_paths. Furthermore, we now allow to configure rotated_logs_paths, our users will start to rely on it and then we remove it. This is a breaking change.

I'd rather avoid introducing something that is unnecessary and would lead to a breaking change later.

If we don't support the Windows case in Beta and document it – it's fine. The current objective is to give early access to the GZip ingestion, not to solve the entire use-case.

I'm going to leave the final decision to @cmacknz since it looks like my argumentation alone is not sufficient here.

cmacknz · 2025-10-22T20:34:50Z

When I first thought of this toggle approach I was only considering the gzip_experimental: true part which we expect to eventually become the default, so we could remove the toggle with no consequences. The additional need to change the paths complicates this quite a bit.

I also see that we can solve having people enable this purely via a documentation, for example if a user adds gzip_experimental: true in the {{custom}} block and changes the paths appropriately then we don't have to make any changes here. So my preference would be to do that.

We can change approach from this PR into instead making a PR to the beats+elastic agent documentation (because the same change works for standalone not just this integration) and include the same instructions in the relevant changelogs.

Then we don't have to worry about trying to remove these temporary settings later.

AndersonQ · 2025-10-23T07:02:44Z

We can change approach from this PR into instead making a PR to the beats+elastic agent documentation (because the same change works for standalone not just this integration) and include the same instructions in the relevant changelogs.

Then we don't have to worry about trying to remove these temporary settings later.

ok then.

@cmacknz, do you have a particular agent docs in mind or you meant the k8s integration docs? On top of my head I don't know which agent docs you're referring to.

Also I believe we can update the integrations doc to either mention it or to point to the agent's doc. Thus, so far I believe the following docs can be updated to mention ingesting GZIP files:

k8s integration docs
filebeat "run on k8s#log-rotation"
filebeat run on k8s#autodiscover snippet - I think it should not be under OpenShift. I don't know if it's intentional or a mistake.
filebeat deploy on k8s
filebeat Log rotation results in lost or duplicate events (this is unrelated to k8s, but can profit from at least us mentioning we can read gzip files)

do you have anything else in mind?

cmacknz · 2025-10-24T13:20:12Z

Also I believe we can update the integrations doc to either mention it or to point to the agent's doc.

Yes if we can document this in one single place, perhaps the filestream docs, and then just reference the new section everywhere that would be best.

The only places I see in addition we would want to mention this are:

The Beats and Agent changelog, which can also reference the new docs section
The Elastic Agent Helm chart in the readme probably, we need to confirm there is a way to configure this there.https://github.com/elastic/elastic-agent/tree/main/deploy/helm/elastic-agent

elasticmachine · 2025-10-28T16:52:12Z

💚 Build Succeeded

Buildkite Build
Commit: c29f083

History

💔 Build #33291 failed a5488fd
💚 Build #32549 succeeded 5935247
💚 Build #32121 succeeded 13b00fd
💚 Build #32118 succeeded 86f98b0
💚 Build #32081 succeeded 2132f29

cc @AndersonQ

AndersonQ · 2025-10-29T07:52:39Z

as per #15441 (comment) it'll be just a doc change. So to better track it and keep the discussions focused, I'll open another PR and use #15792 to track the task

AndersonQ self-assigned this Sep 23, 2025

AndersonQ added Integration:kubernetes Kubernetes Team:Elastic-Agent-Data-Plane Agent Data Plane team [elastic/elastic-agent-data-plane] labels Sep 23, 2025

AndersonQ force-pushed the 14238-kubernetes-gzip-support branch from 736eb8a to 2cbfd60 Compare September 23, 2025 16:57

AndersonQ force-pushed the 14238-kubernetes-gzip-support branch from 2cbfd60 to d64bdda Compare September 26, 2025 15:04

andrewkroh added the documentation Improvements or additions to documentation. Applied to PRs that modify *.md files. label Sep 26, 2025

AndersonQ force-pushed the 14238-kubernetes-gzip-support branch 3 times, most recently from 8bc127f to 2132f29 Compare September 29, 2025 08:00

AndersonQ changed the title ~~[kubernetes] adds beta support for rotated logs~~ [kubernetes] adds beta support for rotated logs (including GZIP) Sep 29, 2025

AndersonQ marked this pull request as ready for review September 29, 2025 08:48

AndersonQ requested a review from a team as a code owner September 29, 2025 08:49

andrewkroh added the Team:obs-ds-hosted-services Observability Hosted Services team [elastic/obs-ds-hosted-services] label Sep 29, 2025

pierrehilbert requested review from belimawr and rdner September 30, 2025 06:58

rdner reviewed Sep 30, 2025

View reviewed changes

belimawr reviewed Sep 30, 2025

View reviewed changes

AndersonQ requested review from belimawr and rdner September 30, 2025 13:14

belimawr approved these changes Sep 30, 2025

View reviewed changes

rdner mentioned this pull request Oct 2, 2025

filestream: track state for deleted files elastic/beats#46834

Open

cmacknz reviewed Oct 6, 2025

View reviewed changes

packages/kubernetes/data_stream/container_logs/manifest.yml Outdated Show resolved Hide resolved

cmacknz reviewed Oct 6, 2025

View reviewed changes

packages/kubernetes/data_stream/container_logs/manifest.yml Outdated Show resolved Hide resolved

AndersonQ requested review from a team as code owners October 9, 2025 08:41

AndersonQ added 2 commits October 9, 2025 10:50

improve docs and PR changes

8a55dbc

run build

37b750d

AndersonQ force-pushed the 14238-kubernetes-gzip-support branch from 8a0d219 to 37b750d Compare October 9, 2025 08:50

AndersonQ removed request for a team October 9, 2025 08:51

change toggle title

5935247

AndersonQ marked this pull request as draft October 23, 2025 07:22

AndersonQ added 3 commits October 28, 2025 14:52

remove integration changes

a86f1b7

update container log docs

a5488fd

add built docs

c29f083

AndersonQ mentioned this pull request Oct 29, 2025

[kubernetes] update docs to explain how to ingest rotated log files (including GZIP) #15792

Open

AndersonQ closed this Oct 29, 2025

[kubernetes] adds beta support for rotated logs (including GZIP) #15441

[kubernetes] adds beta support for rotated logs (including GZIP) #15441

Uh oh!

Conversation

AndersonQ commented Sep 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

DO NOT MERGE BEFORE 9.2 IS RELEASED

Proposed commit message

Checklist

Author's Checklist

How to test this PR locally

Related issues

Screenshots

Uh oh!

elastic-vault-github-plugin-prod bot commented Sep 23, 2025

🚀 Benchmarks report

Uh oh!

elastic-sonarqube bot commented Sep 29, 2025

Quality Gate failed

Uh oh!

elasticmachine commented Sep 29, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

AndersonQ commented Sep 30, 2025

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

AndersonQ commented Oct 21, 2025

Uh oh!

rdner commented Oct 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AndersonQ commented Oct 21, 2025

Uh oh!

AndersonQ commented Oct 21, 2025

Uh oh!

rdner commented Oct 21, 2025

Uh oh!

AndersonQ commented Oct 22, 2025

Uh oh!

rdner commented Oct 22, 2025

Uh oh!

cmacknz commented Oct 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AndersonQ commented Oct 23, 2025

Uh oh!

AndersonQ commented Sep 23, 2025 •

edited

Loading

rdner commented Oct 21, 2025 •

edited

Loading

cmacknz commented Oct 22, 2025 •

edited

Loading