Skip to content

Conversation

@adutra
Copy link
Contributor

@adutra adutra commented Oct 27, 2025

This PR enacts the action items decided in the following ML thread:

https://lists.apache.org/thread/4gmlcn84o7n8d12qnpl7grfbm4zypk7b

The action items are:

  1. Restore the original functionality for request IDs, change the default header name back to x-request-id
  2. Remove RequestIdGenerator and related functionality.
  3. Update PolarisEvent, expose request ID if available, expose OTel context if available.
  4. Update events table SQL schema: insert request ID if available, insert OTel context if available.

Furthermore, this PR also fixes #2913 and adds a small integration test for the JDBC events sink: InMemoryBufferEventListenerIntegrationTest.

+ "-"
+ span.getSpanContext().getSpanId()
+ "-"
+ span.getSpanContext().getTraceFlags().asHex();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we embed some spec ID into this string? E.g. traceparent:00-0af7651916cd43dd8448eb211c80319c-b9c7c989f97918e1-01

if (requestId != null) { // can be null if request ID generation fails
response.getHeaders().add(loggingConfiguration.requestIdHeaderName(), requestId);
if (requestId != null) {
response.getHeaders().add(tracingConfiguration.requestId().headerName(), requestId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any specific use cases / requests for response headers? My impression from the dev ML discussion was that this feature was not in demand 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some archelogy prior to working on this, and my conclusion is that we probably want a response header.

First, Dropwizard has a request ID fitler:

https://github.com/dropwizard/dropwizard/blob/3833f0e1d9fb8cae256fe09379733b8d651f8b87/dropwizard-jersey/src/main/java/io/dropwizard/jersey/filter/RequestIdFilter.java#L42-L49

Note how this filter creates random request IDs (!!!) and sets a response header.

I don't know for sure if this filter is enabled by default. I think it is, because dropwizard-core depends on dropwizard-jersey. It exists since 2017.

OTOH, Polaris-Dropwizard only had this small line alluding to request IDs:

MDC.putCloseable("request_id", httpRequest.getHeader("request_id"))) {

But you'll notice that the header is now request_id 🤷 . I wonder what the intent was here.

Furthermore the only other usage of request IDs in Polaris-Dropwizard seems to be in the log pattern:

logFormat: "%-5p [%d{ISO8601} - %-6r] [%t] [%X{aid}%X{sid}%X{tid}%X{wid}%X{oid}%X{srv}%X{job}%X{rid}] %c{30}: %m %kvp%n%ex"

Notice the mysterious {rid} MDC context, that seems to mean "request ID". This is surprising. I would expect request_id instead, since that's the MDC variable name.

All of this to say:

  1. I have no idea how this could be working 🤣
  2. I think it doesn't hurt to echo the request ID in the response header.

WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said... I keep going back and forth with this, so I wouldn't mind removing the response header. What do others think?

Copy link
Contributor

@dimas-b dimas-b Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my POV it is easy to keep the response header, but it will be hard to remove it later, especially if the usage expectations are not clear 🤔 I'd prefer removing it now and re-adding if there's user demand.

@collado-mike : WDYT?

@adutra adutra force-pushed the restore-request-id branch 2 times, most recently from 8a176ed to 3c28e6b Compare October 28, 2025 11:23
principal_name TEXT,
resource_type TEXT NOT NULL,
resource_identifier TEXT NOT NULL,
otel_context TEXT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is backfill needed for existing data?

Copy link
Contributor

@dimas-b dimas-b Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I do not think so. We mentioned this in 1.2.0 release notes: The persistence schema is subject to change in future releases, and previously stored event data MAY become unreadable (i.e., dropped) after an upgrade.

This data was not persisted before 1.2.0, AFAIK.

https://polaris.apache.org/downloads/#120

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's nullable.

CHANGELOG.md Outdated

### Breaking changes

- The default request ID header name has changed from `Polaris-Request-Id` to `x-request-id`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also change it in the doc page https://polaris.apache.org/in-dev/unreleased/telemetry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Will do.

This PR enacts the action items decided in the following ML thread:

https://lists.apache.org/thread/4gmlcn84o7n8d12qnpl7grfbm4zypk7b

The action items are:

1. Restore the original functionality for request IDs, change the
default header name back to x-request-id
2. Remove RequestIdGenerator and related functionality.
3. Update PolarisEvent, expose request ID if available, expose OTel
context if available.
4. Update events table SQL schema: insert request ID if available,
insert OTel context if available.

Furthermore, this PR also fixes apache#2913 and adds a small integration test for the JDBC events sink: `InMemoryBufferEventListenerIntegrationTest`.

Finally, it modifies the Helm chart to reflect the configuration changes.
@adutra adutra force-pushed the restore-request-id branch from fca059e to c206696 Compare October 29, 2025 11:58
.requestId(rs.getString("request_id"))
.eventType(rs.getString("event_type"))
.timestampMs(rs.getLong("timestamp_ms"))
.principalName(rs.getString("actor"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change effectively fixes #2913.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

JDBC persistence: storage/retrieval of principal name from/to events is broken

3 participants