Skip to content

Commit ae61732

Browse files
jbachorikclaude
andauthored
Fix calltracestorage concurrent and reentrant issues (#280)
--------- Co-authored-by: Claude <[email protected]>
1 parent 82bf608 commit ae61732

37 files changed

+3933
-341
lines changed
Lines changed: 75 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,75 @@
1+
#!/usr/bin/env bash
2+
set -euo pipefail
3+
4+
mkdir -p build/logs build/reports/claude .claude/out
5+
STAMP="$(date +%Y%m%d-%H%M%S)"
6+
7+
# Args (default to 'build')
8+
ARGS=("$@")
9+
if [ "${#ARGS[@]}" -eq 0 ]; then
10+
ARGS=(build)
11+
fi
12+
13+
# Label for the log file from the first arg
14+
LABEL="$(printf '%s' "${ARGS[0]}" | tr '/:' '__')"
15+
LOG="build/logs/${STAMP}-${LABEL}.log"
16+
17+
# Ensure we clean the tail on exit
18+
tail_pid=""
19+
cleanup() { [ -n "${tail_pid:-}" ] && kill "$tail_pid" 2>/dev/null || true; }
20+
trap cleanup EXIT INT TERM
21+
22+
echo "▶ Logging full Gradle output to: $LOG"
23+
echo "▶ Running: ./gradlew ${ARGS[*]} -i --console=plain"
24+
echo " (Console output here is minimized; the full log is in the file.)"
25+
echo
26+
27+
# Start Gradle fully redirected to the log (no stdout/stderr to this session)
28+
# Use stdbuf to make the output line-buffered in the log for timely tailing.
29+
( stdbuf -oL -eL ./gradlew "${ARGS[@]}" -i --console=plain ) >"$LOG" 2>&1 &
30+
gradle_pid=$!
31+
32+
# Minimal live progress: follow the log and print only interesting lines
33+
# - Task starts
34+
# - Final build status
35+
# - Test summary lines
36+
tail -n0 -F "$LOG" | awk '
37+
/^> Task / { print; fflush(); next }
38+
/^BUILD (SUCCESSFUL|FAILED)/ { print; fflush(); next }
39+
/[0-9]+ tests? (successful|failed|skipped)/ { print; fflush(); next }
40+
' &
41+
tail_pid=$!
42+
43+
# Wait for Gradle to finish
44+
wait "$gradle_pid"
45+
status=$?
46+
47+
# Stop the tail and print a compact summary from the log
48+
kill "$tail_pid" 2>/dev/null || true
49+
tail_pid=""
50+
51+
echo
52+
echo "=== Summary ==="
53+
# Grab the last BUILD line and nearest test summary lines
54+
awk '
55+
/^BUILD (SUCCESSFUL|FAILED)/ { lastbuild=$0 }
56+
/[0-9]+ tests? (successful|failed|skipped)/ { tests=$0 }
57+
END {
58+
if (lastbuild) print lastbuild;
59+
if (tests) print tests;
60+
}
61+
' "$LOG" || true
62+
63+
echo
64+
if [ $status -eq 0 ]; then
65+
echo "✔ Gradle completed. Full log at: $LOG"
66+
else
67+
echo "✖ Gradle failed with status $status. Full log at: $LOG"
68+
fi
69+
70+
# Hand over to your logs analyst agent — keep the main session output tiny.
71+
echo
72+
echo "Delegating to gradle-logs-analyst agent…"
73+
# If your CLI supports non-streaming, set it here to avoid verbose output.
74+
# Example (uncomment if supported): export CLAUDE_NO_STREAM=1
75+
claude "Act as the gradle-logs-analyst agent to parse the build log at: $LOG. Generate the required Gradle summary artifacts as specified in the gradle-logs-analyst agent definition."
Lines changed: 4 additions & 30 deletions
Original file line numberDiff line numberDiff line change
@@ -1,33 +1,7 @@
1-
---
2-
description: Run a Gradle task, capture console to a timestamped log, then delegate parsing to the sub-agent and reply briefly.
3-
usage: "/build-and-summarize <gradle-args...>"
4-
---
1+
# build-and-summarize
52

6-
**Task:** Build with Gradle (plain console, info level), capture output to `build/logs/`, then have `gradle-log-analyst` parse the log and write:
7-
- `build/reports/claude/gradle-summary.md`
8-
- `build/reports/claude/gradle-summary.json`
9-
10-
Make sure to use the JAVA_HOME environment variable is set appropriately.
3+
Runs `./gradlew` with full output captured to a timestamped log, shows minimal live progress (task starts + final build/test summary), then asks the `gradle-logs-analyst` agent to produce structured artifacts from the log.
114

5+
## Usage
126
```bash
13-
set -euo pipefail
14-
mkdir -p build/logs build/reports/claude
15-
STAMP="$(date +%Y%m%d-%H%M%S)"
16-
17-
# Default to 'build' if no args were given
18-
ARGS=("$@")
19-
if [ "${#ARGS[@]}" -eq 0 ]; then
20-
ARGS=(build)
21-
fi
22-
23-
# Make a filename-friendly label (first arg only)
24-
LABEL="$(echo "${ARGS[0]}" | tr '/:' '__')"
25-
LOG="build/logs/${STAMP}-${LABEL}.log"
26-
27-
echo "Running: ./gradlew ${ARGS[*]} -i --console=plain"
28-
# Capture both stdout and stderr to the log while streaming to terminal
29-
(./gradlew "${ARGS[@]}" -i --console=plain 2>&1 | tee "$LOG") || true
30-
31-
# Delegate parsing to the sub-agent
32-
echo "Delegating to gradle-logs-analyst agent..."
33-
claude "Act as the gradle-logs-analyst agent to parse the build log at: $LOG. Generate the required gradle summary artifacts as specified in the gradle-logs-analyst agent definition."
7+
./.claude/commands/build-and-summarize [<gradle-args>...]

.claude/settings.local.json

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,8 @@
1313
"Bash(grep:*)",
1414
"WebFetch(domain:github.com)",
1515
"WebFetch(domain:raw.githubusercontent.com)",
16-
"WebFetch(domain:raw.githubusercontent.com)"
16+
"WebFetch(domain:raw.githubusercontent.com)",
17+
"Bash(./.claude/commands/build-and-summarize:*)"
1718
],
1819
"deny": [],
1920
"ask": []

CLAUDE.md

Lines changed: 59 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -41,7 +41,7 @@ You are the **Main Orchestrator** for this repository.
4141
“Use `gradle-log-analyst` to parse LOG_PATH; write the two reports; reply with only a 3–6 line status and the two relative file paths.”
4242

4343
### Shortcuts I Expect
44-
- `/build-and-summarize <gradle-task...>` to do everything in one step.
44+
- `./gradlew <gradle-task...>` to do everything in one step.
4545
- If I just say “build assembleDebugJar”, interpret that as the shortcut above.
4646

4747
## Build Commands
@@ -50,74 +50,74 @@ Never use 'gradle' or 'gradlew' directly. Instead, use the '/build-and-summarize
5050
### Main Build Tasks
5151
```bash
5252
# Build release version (primary artifact)
53-
/build-and-summarize buildRelease
53+
./gradlew buildRelease
5454

5555
# Build all configurations
56-
/build-and-summarize assembleAll
56+
./gradlew assembleAll
5757

5858
# Clean build
59-
/build-and-summarize clean
59+
./gradlew clean
6060
```
6161

6262
### Development Builds
6363
```bash
6464
# Debug build with symbols
65-
/build-and-summarize buildDebug
65+
./gradlew buildDebug
6666

6767
# ASan build (if available)
68-
/build-and-summarize buildAsan
68+
./gradlew buildAsan
6969

7070
# TSan build (if available)
71-
/build-and-summarize buildTsan
71+
./gradlew buildTsan
7272
```
7373

7474
### Testing
7575
```bash
7676
# Run specific test configurations
77-
/build-and-summarize testRelease
78-
/build-and-summarize testDebug
79-
/build-and-summarize testAsan
80-
/build-and-summarize testTsan
77+
./gradlew testRelease
78+
./gradlew testDebug
79+
./gradlew testAsan
80+
./gradlew testTsan
8181

8282
# Run C++ unit tests only
83-
/build-and-summarize gtestDebug
84-
/build-and-summarize gtestRelease
83+
./gradlew gtestDebug
84+
./gradlew gtestRelease
8585

8686
# Cross-JDK testing
87-
JAVA_TEST_HOME=/path/to/test/jdk /build-and-summarize testDebug
87+
JAVA_TEST_HOME=/path/to/test/jdk ./gradlew testDebug
8888
```
8989

9090
### Build Options
9191
```bash
9292
# Skip native compilation
93-
/build-and-summarize buildDebug -Pskip-native
93+
./gradlew buildDebug -Pskip-native
9494

9595
# Skip all tests
96-
/build-and-summarize buildDebug -Pskip-tests
96+
./gradlew buildDebug -Pskip-tests
9797

9898
# Skip C++ tests
99-
/build-and-summarize buildDebug -Pskip-gtest
99+
./gradlew buildDebug -Pskip-gtest
100100

101101
# Keep JFR recordings after tests
102-
/build-and-summarize testDebug -PkeepJFRs
102+
./gradlew testDebug -PkeepJFRs
103103

104104
# Skip debug symbol extraction
105-
/build-and-summarize buildRelease -Pskip-debug-extraction=true
105+
./gradlew buildRelease -Pskip-debug-extraction=true
106106
```
107107

108108
### Code Quality
109109
```bash
110110
# Format code
111-
/build-and-summarize spotlessApply
111+
./gradlew spotlessApply
112112

113113
# Static analysis
114-
/build-and-summarize scanBuild
114+
./gradlew scanBuild
115115

116116
# Run stress tests
117-
/build-and-summarize :ddprof-stresstest:runStressTests
117+
./gradlew :ddprof-stresstest:runStressTests
118118

119119
# Run benchmarks
120-
/build-and-summarize runBenchmarks
120+
./gradlew runBenchmarks
121121
```
122122

123123
## Architecture
@@ -338,3 +338,39 @@ With separate debug symbol packages for production debugging support.
338338

339339
- Run tests with 'testdebug' gradle task
340340
- Use at most Java 21 to build and run tests
341+
342+
## Agentic Work
343+
344+
- Never run `./gradlew` directly.
345+
- Always invoke the wrapper command: `./.claude/commands/build-and-summarize`.
346+
- Pass through all arguments exactly as you would to `./gradlew`.
347+
- Examples:
348+
- Instead of:
349+
```bash
350+
./gradlew build
351+
```
352+
use:
353+
```bash
354+
./.claude/commands/build-and-summarize build
355+
```
356+
- Instead of:
357+
```bash
358+
./gradlew :prof-utils:test --tests "UpscaledMethodSampleEventSinkTest"
359+
```
360+
use:
361+
```bash
362+
./.claude/commands/build-and-summarize :prof-utils:test --tests "UpscaledMethodSampleEventSinkTest"
363+
```
364+
365+
- This ensures the full build log is captured to a file and only a summary is shown in the main session.
366+
367+
## Ground rules
368+
- Never replace the code you work on with stubs
369+
- Never 'fix' the tests by testing constants against constants
370+
- Never claim success until all affected tests are passing
371+
- Always provide javadoc for public classes and methods
372+
- Provide javadoc for non-trivial private and package private code
373+
- Always provide comprehensive tests for new functionality
374+
- Always provide tests for bug fixes - test fails before the fix, passes after the fix
375+
- All code needs to strive to be lean in terms of resources consumption and easy to follow -
376+
do not shy away from factoring out self containing code to shorter functions with explicit name

README.md

Lines changed: 54 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -348,6 +348,60 @@ The project includes JMH-based stress tests:
348348
- ASan: `libasan`
349349
- TSan: `libtsan`
350350

351+
## Architectural Tidbits
352+
353+
This section documents important architectural decisions and enhancements made to the profiler core.
354+
355+
### Critical Section Management (2025)
356+
357+
Introduced race-free critical section management using atomic compare-and-swap operations instead of expensive signal blocking syscalls:
358+
359+
- **`CriticalSection` class**: Thread-local atomic flag-based protection against signal handler reentrancy
360+
- **Lock-free design**: Uses `compare_exchange_strong` for atomic claiming of critical sections
361+
- **Signal handler safety**: Eliminates race conditions between signal handlers and normal code execution
362+
- **Performance improvement**: Avoids costly `sigprocmask`/`pthread_sigmask` syscalls in hot paths
363+
364+
**Key files**: `criticalSection.h`, `criticalSection.cpp`
365+
366+
### Triple-Buffered Call Trace Storage (2025)
367+
368+
Enhanced the call trace storage system from double-buffered to triple-buffered architecture with hazard pointer-based memory reclamation:
369+
370+
- **Triple buffering**: Active, standby, and cleanup storage rotation for smoother transitions
371+
- **Hazard pointer system**: Per-instance thread-safe memory reclamation without global locks
372+
- **ABA protection**: Generation counter prevents race conditions during table swaps
373+
- **Instance-based trace IDs**: 64-bit IDs combining instance ID and slot for collision-free trace management
374+
- **Lock-free hot paths**: Atomic operations minimize contention during profiling events
375+
376+
**Key changes**:
377+
- Replaced `SpinLock` with atomic pointers and hazard pointer system
378+
- Added generation counter for safe table swapping
379+
- Enhanced liveness preservation across storage rotations
380+
- Improved thread safety for high-frequency profiling scenarios
381+
382+
**Key files**: `callTraceStorage.h`, `callTraceStorage.cpp`, `callTraceHashTable.h`, `callTraceHashTable.cpp`
383+
384+
### Enhanced Testing Infrastructure (2025)
385+
386+
Comprehensive testing improvements for better debugging and stress testing:
387+
388+
- **GTest crash handler**: Detailed crash reporting with backtraces and register state for native code failures
389+
- **Stress testing framework**: Multi-threaded stress tests for call trace storage under high contention
390+
- **Platform-specific debugging**: macOS and Linux register state capture in crash handlers
391+
- **Async-signal-safe reporting**: Crash handlers use only signal-safe functions for reliable diagnostics
392+
393+
**Key files**: `gtest_crash_handler.h`, `stress_callTraceStorage.cpp`
394+
395+
### TLS Priming Enhancements (2025)
396+
397+
Improved thread-local storage initialization to prevent race conditions:
398+
399+
- **Solid TLS priming**: Enhanced thread-local variable initialization timing
400+
- **Signal handler compatibility**: Ensures TLS is fully initialized before signal handler access
401+
- **Cross-platform consistency**: Unified TLS handling across Linux and macOS platforms
402+
403+
These architectural improvements focus on eliminating race conditions, improving performance in high-throughput scenarios, and providing better debugging capabilities for the native profiling engine.
404+
351405
## Contributing
352406
1. Fork the repository
353407
2. Create a feature branch

0 commit comments

Comments
 (0)