feat(auth): auto retry on jwks key not found #17410
Merged
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Currently, we maintain a cache of the JWKS keys and use it to validate JWT authentication.
However, the keys served by the JWKS endpoint may rotate at any time. If a client presents a JWT signed with a newly rotated key, authentication fails with a key-not-found 401 error until the next JWKS refresh (every 30s by default).
This error recovers on its own after the next refresh, but we want to go a step further and add a retry so that users do not notice it.
This PR adds a retry that refreshes the keys from the JWKS endpoint after a key-not-found error. To avoid flooding the JWKS endpoint, retries are rate-limited by a retry_interval.
There is also a risk that the server rotates the JWKS first while the client still uses an older JWT. To mitigate this, this PR also keeps the earlier JWKS keys buffered in memory.
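The behavior described above can be illustrated with a minimal sketch. This is not the code in this PR: `JwksCache`, `get_key`, the `fetch` closure (standing in for an HTTP request to the JWKS endpoint), and the use of plain strings as key material are all assumptions made for illustration. The periodic background refresh is left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical key cache; names and types are illustrative only.
struct JwksCache<F: FnMut() -> HashMap<String, String>> {
    /// Stands in for fetching the key set from the JWKS endpoint.
    fetch: F,
    /// kid -> key material (simplified to a String for this sketch).
    keys: HashMap<String, String>,
    /// When the endpoint was last queried because of a missing key.
    last_retry: Option<Instant>,
    /// Minimum time between two miss-triggered refreshes.
    retry_interval: Duration,
}

impl<F: FnMut() -> HashMap<String, String>> JwksCache<F> {
    fn new(fetch: F, retry_interval: Duration) -> Self {
        Self { fetch, keys: HashMap::new(), last_retry: None, retry_interval }
    }

    /// Look up a key by id. On a miss, refresh from the endpoint,
    /// but at most once per `retry_interval`.
    fn get_key(&mut self, kid: &str, now: Instant) -> Option<String> {
        if let Some(key) = self.keys.get(kid) {
            return Some(key.clone());
        }
        let allowed = match self.last_retry {
            Some(last) => now.duration_since(last) >= self.retry_interval,
            None => true,
        };
        if !allowed {
            // Rate-limited: report the miss without hitting the endpoint.
            return None;
        }
        self.last_retry = Some(now);
        let fresh = (self.fetch)();
        // Merge instead of replace, so keys from earlier fetches stay
        // available for clients that still hold an older JWT.
        // A real implementation would bound how many old keys are kept.
        self.keys.extend(fresh);
        self.keys.get(kid).cloned()
    }
}
```

In this sketch, a lookup for an unknown `kid` triggers at most one fetch per `retry_interval`, and a key that has disappeared from the endpoint's response remains usable from the in-memory buffer.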
Tests
Type of change