@joegallo joegallo commented Sep 10, 2025

Proposed commit message

Prefer the copy_from option of the set processor for certain high volume integrations.

When the field being copied is already a string, so the set processor's mustache template isn't needed for its side effect of converting the value to a string, copy_from is quite a bit faster than value (with mustache templating).

For example, in a large cluster that I was looking at a few minutes ago, the most expensive single set processor is this one:

{
  "set": {
    "field": "cloud.account.id",
    "if": "ctx.aws?.vpcflow?.account_id != null",
    "value": "{{aws.vpcflow.account_id}}"
  }
}

It takes 2.8 microseconds per doc, compared with an average of only 0.6 microseconds per doc across all set processor invocations in the same pipeline. The cluster in question processes billions of documents per hour, though, so microseconds add up (this particularly expensive set processor is the eighth-most expensive processor in the entire pipeline).
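As a minimal sketch, the copy_from form of the set processor shown above might look like the following, assuming aws.vpcflow.account_id is already a string. It is illustrative and not necessarily the exact change made in this PR; the description text is added here only to label the example.

```json
{
  "set": {
    "description": "Illustrative sketch (not taken from the PR diff): copy the field directly instead of rendering a mustache template",
    "field": "cloud.account.id",
    "if": "ctx.aws?.vpcflow?.account_id != null",
    "copy_from": "aws.vpcflow.account_id"
  }
}
```

Note that copy_from and value cannot be set on the same set processor. A mustache template always renders a string, whereas copy_from copies the source value with its original type, so the two forms are only interchangeable when the source field is already a string.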

Originally this PR also had some copy_from changes for the panw integration, but I've moved those changes into #15800.

because copy_from is faster than value (when the value is a mustache
template that merely does field access).
elastic-vault-github-plugin-prod bot commented Sep 10, 2025

🚀 Benchmarks report

Package gcp 👍(3) 💚(2) 💔(1)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| vpcflow | 4545.45 | 3048.78 | -1496.67 (-32.93%) | 💔 |

To see the full report, comment with /test benchmark fullreport

@andrewkroh andrewkroh added Integration:panw Palo Alto Next-Gen Firewall Integration:aws AWS Integration:gcp Google Cloud Platform labels Sep 10, 2025
@botelastic botelastic bot added the Stalled label Oct 11, 2025
@botelastic botelastic bot removed the Stalled label Oct 13, 2025
@joegallo

/test benchmark fullreport

@joegallo joegallo removed the Integration:panw Palo Alto Next-Gen Firewall label Oct 30, 2025
@elasticmachine

💚 Build Succeeded

@joegallo joegallo marked this pull request as ready for review October 30, 2025 20:42
@joegallo joegallo requested review from a team as code owners October 30, 2025 20:42

@efd6 efd6 left a comment

Thanks

@andrewkroh andrewkroh added the Team:Security-Service Integrations Security Service Integrations team [elastic/security-service-integrations] label Oct 31, 2025
@elasticmachine

Pinging @elastic/security-service-integrations (Team:Security-Service Integrations)
