Skip to content

Conversation

@TonyCTHsu
Copy link
Contributor

This is an auto-generated PR to update documentation from here. Please merge (with a merge commit) when ready.

y9v and others added 30 commits February 6, 2025 17:00
**What does this PR do?**

This PR removes the `datadog_profiling_loader` native extension, as
it's no longer needed.

The profiling native loader was added in
#2003 .

Quoting from that PR's description:

> When Ruby loads a native extension (using `require`), it uses
> `dlopen(..., RTLD_LAZY | RTLD_GLOBAL)`
> (https://github.com/ruby/ruby/blob/67950a4c0a884bdb78d9beb4405ebf7459229b21/dln.c#L362).
>
> This means that every symbol exposed directly or indirectly by that
> native extension becomes visible to every other extension in the
> Ruby process. This can cause issues, see
> rubyjs/mini_racer#179 .
>
> Ruby's extension loading mechanism is not configurable -- there's
> no way to tell it to use different flags when calling `dlopen`.
> To get around this, this commit introduces introduces another
> extension (profiling loader) which has only a single responsibility:
> mimic Ruby's extension loading mechanism, but when calling
> `dlopen` use a different set of flags.
>
> This idea was shamelessly stolen from @lloeki's work in
> rubyjs/mini_racer#179, big thanks!

...and importantly ends with:

> Note that, that we know of, the profiling native extension only
> exposes one potentially problematic symbol:
> `rust_eh_personality` (coming from libddprof/libdatadog).
>
> Future versions of Rust have been patched not to expose this
> (see rust-lang/rust#95604 (comment))
> so we may want to revisit the need for this loader in the future,
> and perhaps delete it if we no longer require its services :)

And we have reached the situation predicted in that description:

1. Nowadays libdatadog no longer exposes `rust_eh_personality`

2. We have a test that validates that only expected symbols are
   exported by the libdatadog library
   (see DataDog/libdatadog#573 ).

   Any new symbols that show up would break shipping new
   libdatadog versions to rubygems.org until we review them.

3. The `libdatadog_api` extension, which we've been shipping for
   customers since release 2.3.0 back in July 2024 has always been
   loaded directly without a loader without issues.

Thus, I think it's the right time to get rid of the loader.

**Motivation:**

Having the loader around is not zero cost; we've run into/caused
a few issues ( #2250
and #3582 come to mind).

It also adds overhead to development: every time we need to rebuild
the extensions, it's an extra extension that needs to be prepared,
rebuilt, copied, etc.

I've been meaning to get rid of the loader for some time now,
and this came up again this week so I decided it was time to
do it.

**Additional Notes:**

In the future, we can always ressurrect this approach if we
figure out we need it again.

We've also discussed internally about proposing upstream
to the Ruby VM to maybe add an API to do what the loader was
doing, so that we wouldn't need the weird workaround.

We haven't yet decided if we're going to do that (help welcome!).

**How to test the change?**

The loader was responsible for loading the rest of profiling.

Thus, any test that uses profiling was also validating the loader
and now will validate that we're doing fine without it.

TL;DR green CI is good.
Remove reactive engine from AppSec rack instrumentation
This was an interesting one... Once we removed the profiling loader
extension, our tests using Ractors started failing with:

> Ractor::UnsafeError: ractor unsafe method called from not main ractor

When I started investigating with gdb, I discovered that because
we were initializing our extension without going through Ruby,
we were skipping this part:

https://github.com/ruby/ruby/blob/7178593558080ca529abb61ef27038236ab2687d/load.c#L1301-L1302 :

```c
ext_config_push(th, &prev_ext_config);
handle = rb_vm_call_cfunc(rb_vm_top_self(), load_ext,
                          path, VM_BLOCK_HANDLER_NONE, path);
```

that is, the `ext_config_push` is what's used by Ruby to validate
if a gem is safe to use from Ractors or not. (If you're curious,
that value then affects function definition, which controls
wether Ruby will check or not for being called from a Ractor).

Because we were skipping this entire mechanism, we implicitly
were getting the same result as `rb_ext_ractor_safe(true)`.

Once we removed the loader, this started failing. And I missed it
before opening my PR since for reasons documented in the profiler
ractor tests in detail, we don't run ractor-related tests by
default.

So this issue is one more reason why having the loader may create
its own set of issues and why I'm happy to get rid of it.
…ling-loader

[PROF-11311] Remove datadog_profiling_loader
…endencies

[🤖] Update Latest Dependency
**What does this PR do?**

This PR updates the crasktracker C code to build with the latest
libdatadog changes (in main) that will become part of libdatadog 15.

**Motivation:**

I'm working on a libdatadog branch and had to do this to unblock
my work, so I decided I'll create a PR with it so nobody needs to
repeat this work.

**Additional Notes:**

I'm opening this PR as draft as we shouldn't merge this until
libdatadog 15 is out.

**How to test the change?**

Existing test coverage is enough to validate this.
**What does this PR do?**

This PR upgrades the datadog gem to use libdatadog 16.0.1.

It includes a few changes to match breaking API updates in
crashtracking.

**Motivation:**

Libdatadog 16 is needed to unblock
#4331 .

This version also brings a few crashtracking improvements.

**Change log entry**

Yes. Upgrade libdatadog dependency to 16.0.1

**Additional Notes:**

As usual, I'm opening this PR as a draft as libdatadog 16.0.1 is not
yet available on rubygems.org, and I'll come back to re-trigger CI
and mark this as non-draft once it is.

**How to test the change?**

Our existing test coverage includes libdatadog testing, so a green C
is good here :)
Fixes these bogus warnings (that get turned into errors in our CI):

```
../../../../ext/datadog_profiling_native_extension/collectors_stack.c: In function ‘sample_thread’:
../../../../ext/datadog_profiling_native_extension/collectors_stack.c:303:7: error: missing initializer for field ‘build_id_id’ of ‘struct ddog_prof_Mapping’ [-Werror=missing-field-initializers]
  303 |       .mapping = {.filename = DDOG_CHARSLICE_C(""), .build_id = DDOG_CHARSLICE_C("")},
      |       ^
In file included from /usr/local/bundle/gems/libdatadog-16.0.1.1.0-x86_64-linux/vendor/libdatadog-16.0.1/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../include/datadog/profiling.h:13,
                 from ../../../../ext/datadog_profiling_native_extension/libdatadog_helpers.h:3,
                 from ../../../../ext/datadog_profiling_native_extension/collectors_stack.c:5:
/usr/local/bundle/gems/libdatadog-16.0.1.1.0-x86_64-linux/vendor/libdatadog-16.0.1/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../include/datadog/common.h:520:36: note: ‘build_id_id’ declared here
  520 |   struct ddog_prof_ManagedStringId build_id_id;
      |                                    ^~~~~~~~~~~
../../../../ext/datadog_profiling_native_extension/collectors_stack.c: In function ‘maybe_add_placeholder_frames_omitted’:
../../../../ext/datadog_profiling_native_extension/collectors_stack.c:382:5: error: missing initializer for field ‘build_id_id’ of ‘struct ddog_prof_Mapping’ [-Werror=missing-field-initializers]
  382 |     .mapping = {.filename = DDOG_CHARSLICE_C(""), .build_id = DDOG_CHARSLICE_C("")},
      |     ^
In file included from /usr/local/bundle/gems/libdatadog-16.0.1.1.0-x86_64-linux/vendor/libdatadog-16.0.1/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../include/datadog/profiling.h:13,
                 from ../../../../ext/datadog_profiling_native_extension/libdatadog_helpers.h:3,
                 from ../../../../ext/datadog_profiling_native_extension/collectors_stack.c:5:
/usr/local/bundle/gems/libdatadog-16.0.1.1.0-x86_64-linux/vendor/libdatadog-16.0.1/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../include/datadog/common.h:520:36: note: ‘build_id_id’ declared here
  520 |   struct ddog_prof_ManagedStringId build_id_id;
      |                                    ^~~~~~~~~~~
../../../../ext/datadog_profiling_native_extension/collectors_stack.c: In function ‘record_placeholder_stack’:
../../../../ext/datadog_profiling_native_extension/collectors_stack.c:429:5: error: missing initializer for field ‘build_id_id’ of ‘struct ddog_prof_Mapping’ [-Werror=missing-field-initializers]
  429 |     .mapping = {.filename = DDOG_CHARSLICE_C(""), .build_id = DDOG_CHARSLICE_C("")},
      |     ^
In file included from /usr/local/bundle/gems/libdatadog-16.0.1.1.0-x86_64-linux/vendor/libdatadog-16.0.1/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../include/datadog/profiling.h:13,
                 from ../../../../ext/datadog_profiling_native_extension/libdatadog_helpers.h:3,
                 from ../../../../ext/datadog_profiling_native_extension/collectors_stack.c:5:
/usr/local/bundle/gems/libdatadog-16.0.1.1.0-x86_64-linux/vendor/libdatadog-16.0.1/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/pkgconfig/../../include/datadog/common.h:520:36: note: ‘build_id_id’ declared here
  520 |   struct ddog_prof_ManagedStringId build_id_id;
      |                                    ^~~~~~~~~~~
cc1: all warnings being treated as errors
```
* Add devise mock user support class
…_0_1-upgrade

[PROF-11306] Upgrade libdatadog dependency to 16.0.1
**What does this PR do?**

This PR fixes the following flaky profiler spec
(seen in
https://app.circleci.com/pipelines/github/DataDog/dd-trace-rb/19126/workflows/00361a70-d216-4fa9-9ce9-0f87e5f033f1/jobs/669239):

```
Failures:

  1) Datadog::Profiling::Collectors::CpuAndWallTimeWorker._native_allocation_count when CpuAndWallTimeWorker has been started when allocation profiling and allocation counting is enabled returns different numbers of allocations for different threads
     Failure/Error: expect(after_t1 - before_t1).to be 100

       expected #<Integer:201> => 100
            got #<Integer:217> => 108

       Compared using equal?, which compares object identity,
       but expected and actual are not the same object. Use
       `expect(actual).to eq(expected)` if you don't care about
       object identity in this example.
     # ./spec/datadog/profiling/collectors/cpu_and_wall_time_worker_spec.rb:1377:in `block (5 levels) in <top (required)>'
     # ./spec/spec_helper.rb:238:in `block (2 levels) in <top (required)>'
     # ./spec/spec_helper.rb:123:in `block (2 levels) in <top (required)>'
     # /usr/local/bundle/gems/webmock-3.23.0/lib/webmock/rspec.rb:39:in `block (2 levels) in <top (required)>'
     # /usr/local/bundle/gems/rspec-wait-0.0.9/lib/rspec/wait.rb:46:in `block (2 levels) in <top (required)>'
```

We had previously debugged the same issue on another spec in
#3554 (and left a nice
comment explaining what the issue is).

I just propagated the same fix (disabling GC for the duration
of the test) to this spec as well.

**Motivation:**

Zero known flaky profiler tests!

**Additional Notes:**

In practice what happens is that we're counting how many
Ruby objects get created during the spec, but if GC gets
triggered at the exact wrong time, it may trigger other
Ruby code running, and thus more objects than expected
being allocated.

By disabling GC for this test, hopefully we're able
to keep the Ruby VM from not allocating any incidental
things in the background.

This GC counting feature is also deprecated, so in the
future we'll remove it which is even nicer :)

**How to test the change?**

Validate that CI is still green.
…flaky_test

Address sucker_punch flaky test
ivoanjo and others added 15 commits February 21, 2025 10:06
This will be used to access the OTEL context after
open-telemetry/opentelemetry-ruby#1807 .
**What does this PR do?**

This PR adds support for correlating profiling wih otel-api 1.5+.

Context storage was moved in
open-telemetry/opentelemetry-ruby#1807
and we needed to update the profiler to account for this.

**Motivation:**

Keep profiling + otel correlation working.

**Additional Notes:**

N/A

**How to test the change?**

In #4421 I had already bootstrapped the new appraisal groups needed
for testing this. This PR enables them, and our existing test
coverage will cover the new code path when used with otel-api 1.5+.
As visible in repo `Insights` `Actions Usage Metrics` the
`arm-4core-linux-ubuntu24.04` variant is a `hosted-larger` which may not
be free.

Use `ubuntu-24.04-arm` instead.

See: https://github.blog/changelog/2025-01-16-linux-arm64-hosted-runners-now-available-for-free-in-public-repositories-public-preview/
Bump minimum libddwaf version to include mem-leak fix
…text-extraction

[PROF-11412] Support correlating profiling with otel-api 1.5+
@TonyCTHsu TonyCTHsu added the docs Involves documentation label Feb 24, 2025
@TonyCTHsu TonyCTHsu requested review from a team as code owners February 24, 2025 10:59
@github-actions
Copy link

👋 Hey @DataDog/ruby-guild, please fill "Change log entry" section in the pull request description.

If changes need to be present in CHANGELOG.md you can state it this way

**Change log entry**

Yes. A brief summary to be placed into the CHANGELOG.md

(possible answers Yes/Yep/Yeah)

Or you can opt out like that

**Change log entry**

None.

(possible answers No/Nope/None)

Visited at: 2025-02-24 10:59:59 UTC

@github-actions github-actions bot added core Involves Datadog core libraries integrations Involves tracing integrations profiling Involves Datadog profiling appsec Application Security monitoring product tracing labels Feb 24, 2025
@TonyCTHsu TonyCTHsu closed this Feb 24, 2025
@datadog-datadog-prod-us1
Copy link
Contributor

Datadog Report

Branch report: master
Commit report: 2d817ff
Test service: dd-trace-rb

✅ 0 Failed, 22852 Passed, 1412 Skipped, 3m 17.36s Total Time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

appsec Application Security monitoring product core Involves Datadog core libraries docs Involves documentation integrations Involves tracing integrations profiling Involves Datadog profiling tracing

Projects

None yet

Development

Successfully merging this pull request may close these issues.