Commit 75ba7c6

Create OPENAI_AGENTS.md
1 parent 9019e76 commit 75ba7c6

1 file changed, +323 -0 lines changed

docs/OPENAI_AGENTS.md

@@ -0,0 +1,323 @@
# OpenAI Agents Adapter

The Cloudflare Sandbox SDK provides adapters that integrate with the [OpenAI Agents SDK](https://github.com/openai/openai-agents-js/), enabling AI agents to execute shell commands and perform file operations inside sandboxed environments.

## Overview

The OpenAI Agents adapter consists of two main components:

- **`Shell`**: Implements the OpenAI Agents `Shell` interface, allowing agents to execute shell commands in the sandbox
- **`Editor`**: Implements the OpenAI Agents `Editor` interface, enabling agents to create, update, and delete files using patch operations

Both adapters automatically collect the results of their operations, making it easy to track which commands were executed and which files were modified during an agent session.

## Installation

The adapters are part of the `@cloudflare/sandbox` package and are imported from its `openai` entry point. The `Agent`, `run`, and tool helpers come from `@openai/agents`:

```typescript
import { getSandbox } from '@cloudflare/sandbox';
import { Shell, Editor } from '@cloudflare/sandbox/openai';
import { Agent, applyPatchTool, run, shellTool } from '@openai/agents';
```

## Basic Usage

### Setting Up an Agent

```typescript
import { getSandbox } from '@cloudflare/sandbox';
import { Shell, Editor } from '@cloudflare/sandbox/openai';
import { Agent, applyPatchTool, run, shellTool } from '@openai/agents';

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Get a sandbox instance
    const sandbox = getSandbox(env.Sandbox, 'workspace-session');

    // Create shell adapter (executes commands in /workspace by default)
    const shell = new Shell(sandbox);

    // Create editor adapter (operates on /workspace by default)
    const editor = new Editor(sandbox, '/workspace');

    // Create an agent with both tools
    const agent = new Agent({
      name: 'Sandbox Assistant',
      model: 'gpt-4',
      instructions:
        'You can execute shell commands and edit files in the workspace.',
      tools: [
        shellTool({ shell, needsApproval: false }),
        applyPatchTool({ editor, needsApproval: false })
      ]
    });

    // Run the agent with user input.
    // The request body is assumed to be JSON of the form { "input": "..." }.
    const { input } = (await request.json()) as { input: string };
    const result = await run(agent, input);

    // Access collected results
    const commandResults = shell.results;
    const fileOperations = editor.results;

    return new Response(
      JSON.stringify({
        naturalResponse: result.finalOutput,
        commandResults,
        fileOperations
      }),
      {
        headers: { 'Content-Type': 'application/json' }
      }
    );
  }
};
```

## Shell Adapter

The `Shell` class adapts Cloudflare Sandbox `exec` calls to the OpenAI Agents `Shell` contract.

### Features

- Executes commands sequentially in the sandbox
- Preserves working directory (`/workspace` by default)
- Handles timeouts and errors gracefully
- Collects results with timestamps for each command
- Separates stdout and stderr output

### Command Results

Each executed command is automatically collected in `shell.results`:

```typescript
interface CommandResult {
  command: string; // The command that was executed
  stdout: string; // Standard output
  stderr: string; // Standard error
  exitCode: number | null; // Exit code (null for timeouts)
  timestamp: number; // Unix timestamp in milliseconds
}
```

### Example: Inspecting Workspace

```typescript
const shell = new Shell(sandbox);

// Agent can execute commands like:
// - ls -la
// - cat package.json
// - git status
// - npm install

// After agent execution, access results:
shell.results.forEach((result) => {
  console.log(`Command: ${result.command}`);
  console.log(`Exit code: ${result.exitCode}`);
  console.log(`Output: ${result.stdout}`);
});
```

### Error Handling

The Shell adapter handles various error scenarios:

- **Command failures**: Non-zero exit codes are captured in `exitCode`
- **Timeouts**: Commands that exceed the timeout are recorded with `exitCode: null`, and the outcome returned to the agent has `outcome.type: 'timeout'`
- **Network errors**: HTTP/network errors are caught and logged

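The error scenarios above can be told apart from the collected results alone. The following is a minimal sketch, not part of the adapter: `classifyCommand` and `problemCommands` are hypothetical helper names, the `CommandResult` interface is copied locally from the section above, and the sketch assumes that a `null` exit code always means a timeout, as the interface comment suggests.

```typescript
// Local copy of the CommandResult shape documented above.
interface CommandResult {
  command: string;
  stdout: string;
  stderr: string;
  exitCode: number | null;
  timestamp: number;
}

type CommandStatus = 'ok' | 'failed' | 'timed-out';

// Hypothetical helper: maps a collected result to a coarse status.
// Assumption: exitCode === null is only produced by timeouts.
function classifyCommand(result: CommandResult): CommandStatus {
  if (result.exitCode === null) return 'timed-out';
  return result.exitCode === 0 ? 'ok' : 'failed';
}

// Hypothetical helper: keeps only the results that need attention.
function problemCommands(results: CommandResult[]): CommandResult[] {
  return results.filter((result) => classifyCommand(result) !== 'ok');
}
```

After an agent run, something like `problemCommands(shell.results)` would give the subset of commands worth surfacing to a user or a log.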
## Editor Adapter

The `Editor` class implements file operations using the OpenAI Agents patch-based editing system.

### Features

- Creates files with initial content using diffs
- Updates existing files by applying diffs
- Deletes files
- Automatically creates parent directories when needed
- Validates paths to prevent operations outside the workspace
- Collects results with timestamps for each operation

### File Operation Results

Each file operation is automatically collected in `editor.results`:

```typescript
interface FileOperationResult {
  operation: 'create' | 'update' | 'delete';
  path: string; // Relative path from workspace root
  status: 'completed' | 'failed';
  output: string; // Human-readable status message
  error?: string; // Error message if status is 'failed'
  timestamp: number; // Unix timestamp in milliseconds
}
```

### Path Resolution

The Editor enforces security by:

- Resolving relative paths within the workspace root (`/workspace` by default)
- Preventing path traversal attacks (e.g., `../../../etc/passwd`)
- Normalizing path separators and removing redundant segments
- Throwing errors for operations outside the workspace

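To make the rules above concrete, here is a minimal sketch of how such a check could work. It is an illustration only, not the adapter's actual implementation: `resolveWithinRoot` is a hypothetical name, and the handling of absolute paths and backslashes is an assumption rather than documented behavior.

```typescript
// Hypothetical helper: resolves `requested` against `root` and rejects
// anything that ends up outside the root. Not the adapter's real code.
function resolveWithinRoot(root: string, requested: string): string {
  const rootParts = root.split('/').filter((s) => s !== '' && s !== '.');

  // Assumption: backslashes are treated as separators, and absolute
  // paths are checked as-is instead of being joined onto the root.
  const normalized = requested.replace(/\\/g, '/');
  const parts: string[] = normalized.startsWith('/') ? [] : [...rootParts];

  for (const segment of normalized.split('/')) {
    if (segment === '' || segment === '.') continue; // redundant segment
    if (segment === '..') {
      parts.pop(); // step up one level (no-op at the filesystem root)
      continue;
    }
    parts.push(segment);
  }

  // The resolved path must still start with every segment of the root.
  const insideRoot = rootParts.every((part, i) => parts[i] === part);
  if (!insideRoot) {
    throw new Error(`Path escapes workspace root: ${requested}`);
  }
  return '/' + parts.join('/');
}
```

Under these assumptions, `resolveWithinRoot('/workspace', 'src/../index.ts')` yields `/workspace/index.ts`, while `../../../etc/passwd` throws. Comparing whole segments rather than string prefixes also rejects look-alike directories such as `/workspace-evil`.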
### Example: Creating and Editing Files

```typescript
const editor = new Editor(sandbox, '/workspace');

// Agent can use apply_patch tool to:
// - Create new files with content
// - Update existing files with diffs
// - Delete files

// After agent execution, access results:
editor.results.forEach((result) => {
  console.log(`${result.operation}: ${result.path}`);
  console.log(`Status: ${result.status}`);
  if (result.error) {
    console.log(`Error: ${result.error}`);
  }
});
```

### Custom Workspace Root

You can specify a custom workspace root:

```typescript
// Use a different root directory
const editor = new Editor(sandbox, '/custom/workspace');
```

## Complete Example

Here's a complete example showing how to integrate the adapters in a Cloudflare Worker:

```typescript
import { getSandbox } from '@cloudflare/sandbox';
import { Shell, Editor } from '@cloudflare/sandbox/openai';
import { Agent, applyPatchTool, run, shellTool } from '@openai/agents';

async function handleRunRequest(request: Request, env: Env): Promise<Response> {
  try {
    // Treat the parsed body as untrusted until `input` is validated below
    const { input } = (await request.json()) as { input?: unknown };

    if (!input || typeof input !== 'string') {
      return new Response(
        JSON.stringify({ error: 'Missing or invalid input field' }),
        { status: 400, headers: { 'Content-Type': 'application/json' } }
      );
    }

    // Get sandbox instance (reused for both shell and editor)
    const sandbox = getSandbox(env.Sandbox, 'workspace-session');

    // Create adapters
    const shell = new Shell(sandbox);
    const editor = new Editor(sandbox, '/workspace');

    // Create agent with tools
    const agent = new Agent({
      name: 'Sandbox Studio',
      model: 'gpt-4',
      instructions: `
        You can execute shell commands and edit files in the workspace.
        Use shell commands to inspect the repository and the apply_patch tool
        to create, update, or delete files. Keep responses concise and include
        command output when helpful.
      `,
      tools: [
        shellTool({ shell, needsApproval: false }),
        applyPatchTool({ editor, needsApproval: false })
      ]
    });

    // Run the agent
    const result = await run(agent, input);

    // Format response with sorted results.
    // Copy before sorting so the adapters' own arrays are not reordered in place.
    const response = {
      naturalResponse: result.finalOutput || null,
      commandResults: [...shell.results].sort((a, b) => a.timestamp - b.timestamp),
      fileOperations: [...editor.results].sort((a, b) => a.timestamp - b.timestamp)
    };

    return new Response(JSON.stringify(response), {
      headers: { 'Content-Type': 'application/json' }
    });
  } catch (error) {
    return new Response(
      JSON.stringify({
        error: error instanceof Error ? error.message : 'Internal server error',
        naturalResponse: 'An error occurred while processing your request.',
        commandResults: [],
        fileOperations: []
      }),
      {
        status: 500,
        headers: { 'Content-Type': 'application/json' }
      }
    );
  }
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    if (url.pathname === '/run' && request.method === 'POST') {
      return handleRunRequest(request, env);
    }

    return new Response('Not found', { status: 404 });
  }
};
```

## Result Tracking

Both adapters automatically track all operations with timestamps. This makes it easy to:

- **Audit operations**: See exactly which commands were run and which files were modified
- **Debug issues**: Identify which operation failed and when
- **Build UIs**: Display a timeline of agent actions
- **Export logs**: Save the operation history for later analysis

### Combining Results

You can combine and sort results from both adapters:

```typescript
const allResults = [
  ...shell.results.map((r) => ({ type: 'command' as const, ...r })),
  ...editor.results.map((r) => ({ type: 'file' as const, ...r }))
].sort((a, b) => a.timestamp - b.timestamp);

// allResults is now a chronological list of all operations
```

## Best Practices

1. **Reuse sandbox instances**: Create one sandbox instance and share it between Shell and Editor
2. **Set appropriate timeouts**: Configure command timeouts based on expected operation duration
3. **Handle errors gracefully**: Check the `status` field of each file operation result and handle `failed` operations
4. **Respect workspace boundaries**: The Editor already validates paths and rejects operations outside the workspace root
5. **Monitor resource usage**: Large command outputs or file operations may impact performance

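Practices 3 and 5 can be combined when shaping the response that a Worker returns. The sketch below is one possible approach under stated assumptions: `truncateOutput` and `buildCompactReport` are hypothetical helpers, the 2000-character default is an arbitrary illustrative limit, and the two interfaces are local copies of the shapes documented earlier.

```typescript
// Local copies of the result shapes documented earlier.
interface CommandResult {
  command: string;
  stdout: string;
  stderr: string;
  exitCode: number | null;
  timestamp: number;
}

interface FileOperationResult {
  operation: 'create' | 'update' | 'delete';
  path: string;
  status: 'completed' | 'failed';
  output: string;
  error?: string;
  timestamp: number;
}

// Hypothetical helper: trims long text and notes how much was dropped.
function truncateOutput(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  return `${text.slice(0, maxChars)}... [${text.length - maxChars} more characters]`;
}

// Hypothetical helper: surfaces failed file operations and caps the
// size of command output. The default limit is an arbitrary assumption.
function buildCompactReport(
  commands: CommandResult[],
  fileOperations: FileOperationResult[],
  maxChars = 2000
) {
  return {
    failedFileOperations: fileOperations.filter((op) => op.status === 'failed'),
    commands: commands.map((c) => ({
      ...c,
      stdout: truncateOutput(c.stdout, maxChars),
      stderr: truncateOutput(c.stderr, maxChars)
    }))
  };
}
```

A caller might pass `shell.results` and `editor.results` to `buildCompactReport` before serializing the response, keeping the full results available for logging elsewhere.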
## Limitations

- **Working directory**: Shell operations always execute in `/workspace` (or the configured root)
- **Path restrictions**: File operations are restricted to the workspace root
- **Sequential execution**: Commands execute sequentially, not in parallel
- **Timeout handling**: Timeouts stop further command execution in a batch

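The last two limitations interact: because commands run one after another, a timeout leaves the remaining commands in the batch unexecuted. The following sketch illustrates that behavior with a stand-in executor. It is not the adapter's code: `runBatch`, `BatchExecutor`, and `BatchOutcome` are hypothetical names introduced only for this example.

```typescript
// Hypothetical outcome shape used only in this sketch.
type BatchOutcome = { type: 'exit'; exitCode: number } | { type: 'timeout' };

// Hypothetical executor signature: runs one command and reports its outcome.
type BatchExecutor = (command: string) => Promise<BatchOutcome>;

interface BatchEntry {
  command: string;
  outcome: BatchOutcome;
}

// Runs commands strictly in order and stops after the first timeout,
// so later commands in the batch never start.
async function runBatch(
  commands: string[],
  execute: BatchExecutor
): Promise<BatchEntry[]> {
  const entries: BatchEntry[] = [];
  for (const command of commands) {
    const outcome = await execute(command); // one command at a time
    entries.push({ command, outcome });
    if (outcome.type === 'timeout') break; // skip the rest of the batch
  }
  return entries;
}
```

In practice this means an agent that batches a slow command ahead of quick ones may see no results for the quick ones, so long-running commands are safer in their own batch.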
## See Also

- [OpenAI Agents SDK Documentation](https://github.com/openai/openai-agents-js/)
- [Session Execution Architecture](./SESSION_EXECUTION.md) - Understanding how commands execute in sandboxes
- [Example Implementation](../examples/openai-agents/src/index.ts) - Full working example
