-
Notifications
You must be signed in to change notification settings - Fork 46
Fix process callbacks, PID capture, and getLogs race #267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🦋 Changeset detectedLatest commit: bbc5345 The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Add documentation for process lifecycle callbacks (onStart, onOutput, onExit, onError) in startProcess() that enable real-time process monitoring without manual log streaming. Clarify that getProcessLogs() reliably captures output from fast-completing commands following race condition fix. Related to cloudflare/sandbox-sdk#267
commit: |
🐳 Docker Images PublishedDefault (no Python): FROM cloudflare/sandbox:0.0.0-pr-267-6f24c60With Python: FROM cloudflare/sandbox:0.0.0-pr-267-6f24c60-pythonVersion: Use the |
For background execution, the exit code file was written before output labelers finished writing to the log. Added a marker file that signals when labelers complete, ensuring all output is captured before getLogs reads the log file.
30d4a9d to
e892093
Compare
For fast commands like 'echo', getLogs() needs complete output without artificial delays. For long-running processes like servers, it should return current output without blocking. The subprocess PID provides a deterministic way to distinguish these cases: dead PID means command finished and we should wait for streaming callbacks to complete. Live PID means command is still running and we return immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Claude Code Review
Verdict: Ready to merge. The implementation correctly fixes all three stated problems with architecturally sound solutions.
What This PR Fixes
- PID capture:
session.ts:347-363now waits for PID file before yielding start event, andprocess-service.ts:153-154stores it immediately - Listener preservation:
process-store.ts:64-65mutates in-place withObject.assign()instead of creating new objects, preserving ProcessService's listener references - Output synchronization: Shell script creates
labelers.donemarker file,session.ts:419-443waits for it, andgetProcess()awaitsstreamingCompletebefore returning logs
Architecture Alignment
Excellent. All changes follow existing patterns:
- StreamingComplete promise for async coordination
- Marker files for IPC (following pid/exit file pattern)
- Reference preservation for event routing
- Synchronous first-event processing in SessionManager
Edge Cases Handled Well
- PID file timeout: logs warning, continues with undefined (process still works)
- Labelers timeout: logs warning, proceeds (5s is generous)
- Process read from disk: correctly skips streamingComplete await
- Early SDK streaming: safe because PID captured before return
Minor Note: PID Reuse
process-service.ts:289-298 uses process.kill(pid, 0) to detect dead processes. This has theoretical PID reuse risk (OS reuses PID for new process between check and await). However, the impact is minimal because:
- Terminal status check is the primary gate
- PID check is only a fallback for fast commands
- Time window is tiny (microseconds)
labelers.doneprovides deterministic sync
This is a pragmatic tradeoff and unlikely to cause real issues.
Suggestions (Non-blocking)
- Add comment explaining PID reuse limitation at
process-service.ts:286 - Consider unit test explicitly validating callback firing (currently covered implicitly by E2E tests)
Testing
E2E test properly validates:
- PID availability (line 244-246)
- Fast command log capture (line 258-259)
Changeset correctly formatted and scoped to @cloudflare/sandbox only.
Summary
Fixes #262
Problem
Three related issues with process execution:
Solution