-
Notifications
You must be signed in to change notification settings - Fork 513
[kubernetes] adds beta support for rotated logs (including GZIP) #15441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
736eb8a to
2cbfd60
Compare
🚀 Benchmarks reportTo see the full report comment with |
2cbfd60 to
d64bdda
Compare
8bc127f to
2132f29
Compare
|
|
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
| title: Kubernetes container rotated log path | ||
| multi: true | ||
| default: | ||
| - /var/log/pods/${kubernetes.namespace}_${kubernetes.pod.name}_${kubernetes.pod.uid}/${kubernetes.container.name}/*.log.* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it suspicious that for the paths our default is /var/log/containers/*${kubernetes.container.id}.lo but for the rotated_logs_paths it's a completely different /var/log/pods/${kubernetes.namespace}_${kubernetes.pod.name}_${kubernetes.pod.uid}/${kubernetes.container.name}/*.log.*.
I have a couple of questions here:
- Why do we even need the new
rotated_logs_pathsparameters? Why can't we just add a second line to thepathsdefault? - How did we determine that the rotated container logs are stored in
/var/log/pods/${kubernetes.namespace}_${kubernetes.pod.name}_${kubernetes.pod.uid}/${kubernetes.container.name}/*.log.*? Can we put a documentation reference as a comment near this default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we even need the new rotated_logs_paths parameters?
I added some context on the issue, but the tl;dr is that the paths serve different purposes. The /var/log/containers/ is deprecated and is just a collection of symlinks pointing to the active log file. To get the rotated logs, we have to target the source directory, /var/log/pods/, directly.
As for why this is a new parameter, it was the cleanest and safest approach for a few reasons:
Isolating a beta feature: Since support for rotated/gzipped logs is new, it's safer to keep it separate.
Preventing data re-ingestion: Changing the existing default paths could cause users to re-ingest logs, which we must avoid.
Simplicity/limitations: With a different path, it can be added or not depending on rotated_logs toggle, without interfering with the default to active log files.
How did we determine that the rotated container logs are stored in /var/log/pods/${kubernetes.namespace}${kubernetes.pod.name}${kubernetes.pod.uid}/${kubernetes.container.name}/.log.?
A mix of docs and reading k8s code. Unfortunately I did not find authoritative and detailed docs on this path, just the logs are in /var/log/pods/. I also explain on the issue and link the docs/source code from where I got this path. I'll add some comment here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AndersonQ thanks for the clarification!
My concern is our future plan for this feature. How are we going to take it out of beta? Would this extra parameter rotated_logs_paths stay?
While it's still in Beta it might be totally fine to hard-code /var/log/pods/${kubernetes.namespace}${kubernetes.pod.name}${kubernetes.pod.uid}/${kubernetes.container.name}/.log as a part of paths in the template if rotated_logs is switched to true without adding a new rotated_logs_paths parameter. How common is it to need to configure rotated_logs_paths?
If we add this new parameter rotated_logs_paths, we commit to having it in the future too.
I'd like to see @cmacknz's take on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is our future plan for this feature. How are we going to take it out of beta? Would this extra parameter rotated_logs_paths stay?
No, we would make this the default, we just have to do it without data duplication in this case.
I think my overall preference would be to have a feature toggle for something like "ingest rotated gzip logs (beta)" that just makes all the changes necessary without exposing the individual pieces that are changed.
I think the toggles now are technically correct but way to technical for a lot of users to potentially understand. The integration is supposed to abstract away complexity and this just putting the underlying Kubernetes log structure in everyones view directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference this is what filelog does:
filelog:
include:
- /var/log/pods/*/*/*.log
exclude:
# Exclude logs from all containers named otel-collector
- /var/log/pods/*/otel-collector/*.log
start_at: end
include_file_path: true
include_file_name: false
operators:
# parse container logs
- type: container
id: container-parserYou will note they do not template an input per pod UID and they have an obvious way to exclude their own logs.
This something we should also do we are blocked by requiring data re-ingestion to make changes like this. We need some way to correct our logging configurations and change them at will without re-ingesting every log in a cluster. Several improvements are stuck behind this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to use a glob for /var/log/pod/. Also because k8s docs don't really document either pattern, the one for /var/logs/container/ and /var/log/pod/.
Because the input is templated using the container ID in the input ID, for each container we have one input, which currently works well with the paths matching a single file. However, if we add a glob in the paths, then for each container we will have one input instance ingesting files from all containers, that's gonna generate massive data duplication.
For reference this is what filelog does:
It's interesting that they set start_at: end, their default is to consider every existing file as fully ingested and only ingest new data. That helps a lot to avoid data duplication, but do not ingest old data like we do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This something we should also do we are blocked by requiring data re-ingestion to make changes like this. We need some way to correct our logging configurations and change them at will without re-ingesting every log in a cluster. Several improvements are stuck behind this problem.
We already can migrate state from one Filestream input to another. The current implementation is very conservative and requires us to set the ID from the old input we're getting the state from, however we can loose this requirement and just have input look up any state in Filestream's registry, regardless of the input it belongs to.
The more we discuss about it, the more I believe we need to re-think how we store and scope file states in Filestream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more we discuss about it, the more I believe we need to re-think how we store and scope file states in Filestream.
Oh yeah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to use a glob for /var/log/pod/. Also because k8s docs don't really document either pattern, the one for /var/logs/container/ and /var/log/pod/.
Because the input is templated using the container ID in the input ID, for each container we have one input, which currently works well with the paths matching a single file. However, if we add a glob in the paths, then for each container we will have one input instance ingesting files from all containers, that's gonna generate massive data duplication.
Oh, that's a good pint. Lat me think about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine. The variables are substituted all together. In a simplified way:
the variables of an input are only substituted if all variables are returned by the provider
the k8s provider returns all the variables in a set with a single value for each key. In other words, for a pod with 2 containers there would be 2 sets like:
namespace: default
pod:
uid: 1
name: pod-1
container:
id: cont-1
name cont-name-1
namespace: default
pod:
uid: 1
name: pod-1
container:
id: cont-2
name cont-name-2
that way, there is no risk of creating a combination that mixes 2 pods or containers from the same pod as each set of values are all valid and belonging to the same hierarchy
| - name: rotated_logs | ||
| type: bool | ||
| title: Use Symlinks | ||
| title: Include rotated log files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to have GZIP in the name to explain why you'd want to do this.
8a0d219 to
37b750d
Compare
|
@AndersonQ I don't think this comment from @cmacknz is addressed, is it? My initial question remains unanswered: why do we need the new configuration parameter If I understood correctly, @cmacknz and I are aligned on this. We should minimize the amount of new configuration fields we introduce, especially when we know this parameter won't be necessary in the future. I believe it should be just one toggle. |
@rdner, I thought my explanation here addressed it.
it's about allowing users to configure it if they have another path for it. I can definitely go with just a toggle, but then it won't be configurable. Also, if for some reason the users have the logs in a custom location, other than It's a sane default, just as we have for the active log. I don't see why we should not allow it to be configurable as well. |
|
I'm happy to remove it if you 2 want it removed. However I'd not be surprised if we get requests to make it configurable. |
|
@AndersonQ It's not about what we two want, it's about what makes sense and how we're going to present this beta feature now and take it out of beta in the future. I see what we do here in this PR as temporary – to let our customers try it out. When we figure out how to migrate from one path to another without data duplication, we just make the path we toggle now by the What are you going to do with the |
If, once we can add new paths and we migrate the default log path from the integration to There is at least the windows case, the path for windows is different, so we might always need it just as we allow to modify the active logs path. For now, I believe we should let it be configurable. It's possible to change it later, so I don't see it as a problem. |
|
@AndersonQ the problem is that in this PR we allow to configure the same Filebeat configuration parameter I'd rather avoid introducing something that is unnecessary and would lead to a breaking change later. If we don't support the Windows case in Beta and document it – it's fine. The current objective is to give early access to the GZip ingestion, not to solve the entire use-case. I'm going to leave the final decision to @cmacknz since it looks like my argumentation alone is not sufficient here. |
|
When I first thought of this toggle approach I was only considering the I also see that we can solve having people enable this purely via a documentation, for example if a user adds We can change approach from this PR into instead making a PR to the beats+elastic agent documentation (because the same change works for standalone not just this integration) and include the same instructions in the relevant changelogs. Then we don't have to worry about trying to remove these temporary settings later. |
ok then. @cmacknz, do you have a particular agent docs in mind or you meant the k8s integration docs? On top of my head I don't know which agent docs you're referring to. Also I believe we can update the integrations doc to either mention it or to point to the agent's doc. Thus, so far I believe the following docs can be updated to mention ingesting GZIP files:
do you have anything else in mind? |
Yes if we can document this in one single place, perhaps the filestream docs, and then just reference the new section everywhere that would be best. The only places I see in addition we would want to mention this are:
|
💚 Build Succeeded
History
cc @AndersonQ |
|
as per #15441 (comment) it'll be just a doc change. So to better track it and keep the discussions focused, I'll open another PR and use #15792 to track the task |


DO NOT MERGE BEFORE 9.2 IS RELEASED
Proposed commit message
Checklist
changelog.ymlfile.[ ] I have verified that Kibana version constraints are current according to guidelines.[ ] I have verified that any added dashboard complies with Kibana's Dashboard good practicesAuthor's Checklist
How to test this PR locally
node-cfg.yaml
flog.yaml
185000were ingested and the ingested files include the rotated plain text and gzip filesRelated issues
Screenshots