Skip to content

Conversation

@p-datadog
Copy link
Member

What does this PR do?

Moves the call to start probe notifier worker (i.e. the background thread) to tracing Rack middleware, which also starts remote config worker.

Motivation:

On forking web servers like puma the probe notifier worker is currently started in the parent (supervisor) process rather than in the worker processes, causing DI to not send any events to the agent.

Change log entry
Yes: fix dynamic instrumentation event sending on forking web servers

Additional Notes:
A proper solution would, in my opinion, not invoke DI (or remote config, which is a prerequisite for DI) from tracing, however in practice all DI customers should have tracing enabled therefore the solution in this PR should work for them.

How to test the change?
I manually verified it against debugger-demo-ruby.
Current CI does not appear to have a configuration that reproduces debugger-demo-ruby environment (since without this PR everything works in CI but does not work in debugger-demo-ruby).

@p-datadog p-datadog requested review from a team as code owners February 10, 2025 14:23
@github-actions github-actions bot added integrations Involves tracing integrations tracing labels Feb 10, 2025
@p-datadog p-datadog force-pushed the di-worker-middleware branch from 566ffc3 to 2bf7740 Compare February 10, 2025 14:27
@datadog-datadog-prod-us1
Copy link
Contributor

datadog-datadog-prod-us1 bot commented Feb 10, 2025

Datadog Report

Branch report: di-worker-middleware
Commit report: 1293418
Test service: dd-trace-rb

✅ 0 Failed, 22035 Passed, 1477 Skipped, 5m 53.53s Total Time

@p-datadog p-datadog force-pushed the di-worker-middleware branch from 8ed2a67 to 63aea57 Compare February 10, 2025 14:39
@p-datadog p-datadog force-pushed the di-worker-middleware branch from 06540b7 to 95ad25b Compare February 10, 2025 14:47
@pr-commenter
Copy link

pr-commenter bot commented Feb 10, 2025

Benchmarks

Benchmark execution time: 2025-02-10 15:15:53

Comparing candidate commit 1293418 in PR branch di-worker-middleware with baseline commit fba70fd in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 31 metrics, 2 unstable metrics.

@codecov-commenter
Copy link

codecov-commenter commented Feb 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.74%. Comparing base (fba70fd) to head (1293418).
Report is 1406 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4358   +/-   ##
=======================================
  Coverage   97.73%   97.74%           
=======================================
  Files        1347     1347           
  Lines       82388    82390    +2     
  Branches     4193     4195    +2     
=======================================
+ Hits        80522    80528    +6     
+ Misses       1866     1862    -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@p-datadog
Copy link
Member Author

The red configuration is either refusing to rerun or keeps failing, it's confusing.

@probe_notifier_worker = ProbeNotifierWorker.new(settings, transport, logger, telemetry: telemetry)
@probe_notification_builder = ProbeNotificationBuilder.new(settings, serializer)
@probe_manager = ProbeManager.new(settings, instrumenter, probe_notification_builder, probe_notifier_worker, logger, telemetry: telemetry)
probe_notifier_worker.start
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason the approach we use in tracing doesn't work for DI?
In tracing, we always check if the process has forked when submitting data, and if it has, we restart the worker:

start if @worker.nil? || @pid != Process.pid

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can certainly do that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in #4363

if defined?(Datadog::DI.current_component)
# TODO: when would Datadog::DI be defined but
# Datadog::DI.current_component not be defined?
Datadog::DI.current_component&.probe_notifier_worker&.start
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a question similar to TODO's question: Is the current_component&.probe_notifier_worker needs to have safe-navigation? Could it be not defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire DI component should be missing on Ruby 2.5 and JRuby. I am not sure (per the comment, and I did not investigate) why the DI module is defined but current_component is not defined sometimes. If current_component is defined and not nil, the rest of DI infrastructure should be loaded.

@p-datadog
Copy link
Member Author

This PR was replaced by #4363.

@p-datadog p-datadog closed this Feb 12, 2025
@p-datadog p-datadog deleted the di-worker-middleware branch February 12, 2025 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

integrations Involves tracing integrations tracing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants