@jrasell jrasell commented Nov 26, 2025

This change introduces new optional client fingerprinter configuration fields that control how the env fingerprinters perform retries and whether errors halt agent startup.

The retry wrapper is used by the env_aws, env_azure, env_gce, and env_digitalocean fingerprinters and handles the retry and error logic around the main fingerprinter. The change is backwards compatible: running it without any of the new config options results in the same behaviour as before.

  • retry_interval: Specifies the time to wait between fingerprint attempts. Defaults to 2 seconds.
  • retry_attempts: Specifies the maximum number of fingerprint retries. Defaults to 0; set it to -1 for infinite retries.
  • exit_on_failure: Determines how the agent handles a failure to perform the fingerprint.

The change helps alleviate problems on cloud providers where a machine starts before the metadata service and its endpoint are available. In this situation, Nomad times out the fingerprinter quickly and marks it as skipped, thus assuming it is not running within that environment. Operators can use the new configuration options to handle this race condition and wait for the metadata service to become available and respond.

Links

Jira: https://hashicorp.atlassian.net/browse/NMD-1061

Contributor Checklist

  • Changelog Entry If this PR changes user-facing behavior, please generate and add a
    changelog entry using the make cl command.
  • Testing Please add tests to cover any new functionality or to demonstrate bug fixes and
    ensure regressions will be caught.
  • Documentation If the change impacts user-facing functionality such as the CLI, API, UI,
    and job configuration, please update the Nomad product documentation, which is stored in the
    web-unified-docs repo. Refer to the web-unified-docs contributor guide for docs guidelines.
    Please also consider whether the change requires notes within the upgrade guide.
    If you would like help with the docs, tag the nomad-docs team in this PR.

Reviewer Checklist

  • Backport Labels Please add the correct backport labels as described by the internal
    backporting document.
  • Commit Type Ensure the correct merge method is selected which should be "squash and merge"
    in the majority of situations. The main exceptions are long-lived feature branches or merges where
    history should be preserved.
  • Enterprise PRs If this is an enterprise only PR, please add any required changelog entry
    within the public repository.

pkazmierczak previously approved these changes Nov 28, 2025
@pkazmierczak pkazmierczak left a comment

LGTM! Left some minor typo-related comments, but nothing blocking.

Co-authored-by: Piotr Kazmierczak <[email protected]>

Labels

backport/1.11.x backport to 1.11.x release line
