@litianningdatadog litianningdatadog commented Dec 12, 2025

An attempt to address data loss caused by a fundamental mismatch between synchronous cancellation signals and their asynchronous effects.

How It Works:

The new shutdown sequence ensures messages are not lost by:

  1. Cancelling TelemetryListener and awaiting HTTP server shutdown
  2. Dropping TelemetryListener to close logs_tx channel
  3. Cancelling LogsAgent and awaiting its completion (ensuring all logs are drained)
  4. Sending tombstone event only after all messages are processed
  5. Dropping event_bus_tx and waiting for tombstone to be processed
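
The sequence above can be sketched with standard-library threads and channels. This is only an analogy under stated assumptions, not the code in this PR: the `Event` enum, the `run_shutdown_sequence` function, and the thread bodies are hypothetical stand-ins, and joining a thread plays the role of awaiting an async task. The names `logs_tx` and `event_bus_tx` are reused from the description.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

// Hypothetical event type; only the tombstone matters for this sketch.
#[derive(Debug, PartialEq)]
enum Event {
    Log(String),
    Tombstone,
}

/// Runs the five-step shutdown and returns every event the bus received, in order.
fn run_shutdown_sequence(message_count: usize) -> Vec<Event> {
    let (logs_tx, logs_rx) = mpsc::channel::<String>();
    let (event_bus_tx, event_bus_rx) = mpsc::channel::<Event>();
    let cancel_listener = Arc::new(AtomicBool::new(false));

    // Stand-in for TelemetryListener: owns logs_tx, emits messages, then
    // idles until cancelled. It hands logs_tx back on exit so the caller
    // can drop it explicitly (step 2).
    let listener = {
        let cancel = Arc::clone(&cancel_listener);
        thread::spawn(move || {
            for i in 0..message_count {
                logs_tx.send(format!("log-{}", i)).expect("logs receiver alive");
            }
            while !cancel.load(Ordering::Acquire) {
                thread::yield_now();
            }
            logs_tx
        })
    };

    // Stand-in for LogsAgent: forwards logs until every logs_tx is dropped.
    let logs_agent = {
        let bus_tx = event_bus_tx.clone();
        thread::spawn(move || {
            for line in logs_rx {
                bus_tx.send(Event::Log(line)).expect("event bus alive");
            }
        })
    };

    // Stand-in for the event bus consumer: stops once it sees the tombstone.
    let bus = thread::spawn(move || {
        let mut seen = Vec::new();
        for event in event_bus_rx {
            let is_tombstone = event == Event::Tombstone;
            seen.push(event);
            if is_tombstone {
                break;
            }
        }
        seen
    });

    // 1. Cancel the listener and wait until it has actually stopped.
    cancel_listener.store(true, Ordering::Release);
    let listener_tx = listener.join().expect("listener panicked");
    // 2. Drop the listener's sender: this closes the logs channel.
    drop(listener_tx);
    // 3. Wait for the agent to drain everything that was already queued.
    logs_agent.join().expect("logs agent panicked");
    // 4. Only now send the tombstone.
    event_bus_tx.send(Event::Tombstone).expect("event bus alive");
    // 5. Drop the last sender and wait for the bus to process the tombstone.
    drop(event_bus_tx);
    bus.join().expect("event bus panicked")
}
```

Calling `run_shutdown_sequence(3)` in this sketch yields the three log events followed by the tombstone. The tombstone cannot overtake a log because it is sent only after the agent has been joined.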

====
Justification: instead of hoping that async work completes in time, we now enforce the following order:

  1. Cancel TelemetryListener → await HTTP shutdown (guarantees no new messages)
  2. Drop TelemetryListener → closes logs_tx (signals end of input)
  3. Cancel LogsAgent → await draining completion (guarantees all messages forwarded)
  4. Send tombstone → now we know all messages are processed
  5. Drop event_bus_tx → safe because no more senders

Each await transforms an asynchronous cancellation into a synchronous completion guarantee.
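
The gap between signalling cancellation and observing its effect can be illustrated with a small thread-based sketch. It is a simplified analogy, not code from this PR: `cancel_and_wait`, the atomic flag, and the sleep that simulates flushing are all hypothetical, and `join` stands in for awaiting a task.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Signals cancellation, waits for the worker to finish, and returns how
/// many of the `pending` items had been flushed when the wait returned.
fn cancel_and_wait(pending: usize) -> usize {
    let cancelled = Arc::new(AtomicBool::new(false));
    let flushed = Arc::new(AtomicUsize::new(0));

    let worker = {
        let cancelled = Arc::clone(&cancelled);
        let flushed = Arc::clone(&flushed);
        thread::spawn(move || {
            // Idle until the cancellation signal arrives.
            while !cancelled.load(Ordering::Acquire) {
                thread::yield_now();
            }
            // The effect of cancellation (flushing) takes time after the signal.
            for _ in 0..pending {
                thread::sleep(Duration::from_millis(1));
                flushed.fetch_add(1, Ordering::Release);
            }
        })
    };

    // Synchronous signal: returns immediately, before any flushing is done.
    cancelled.store(true, Ordering::Release);
    // Reading `flushed` here could still observe fewer than `pending` items.

    // Waiting for completion is what provides the guarantee.
    worker.join().expect("worker panicked");
    flushed.load(Ordering::Acquire)
}
```

In this sketch `cancel_and_wait(5)` returns 5 because the count is read only after `join`. Reading it right after the `store` would race with the worker, which is the kind of loss the enforced ordering is meant to rule out.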

@litianningdatadog litianningdatadog marked this pull request as ready for review December 12, 2025 17:20
@litianningdatadog litianningdatadog requested a review from a team as a code owner December 12, 2025 17:20
@litianningdatadog litianningdatadog force-pushed the tianning.li/async-cancellation-data-loss branch 2 times, most recently from 072c2d7 to 38dc5c1 Compare December 12, 2025 19:09
@litianningdatadog litianningdatadog force-pushed the tianning.li/async-cancellation-data-loss branch from 38dc5c1 to 6618299 Compare December 12, 2025 20:30