README.md (34 additions, 0 deletions)
@@ -92,6 +92,7 @@ python main.py --query "Go to Google and type 'Hello World' into the search bar"
- `cloud-run`: Connects to a deployed Cloud Run service (default).
- `playwright`: Runs the browser locally using Playwright.
- `browserbase`: Connects to a Browserbase instance.
- `hud`: Integrates with hud's browser environment.
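
As an illustration of how the four `--env` values listed above could be exposed on a command line, here is a minimal `argparse` sketch. It is not taken from `main.py`: the `build_parser` name and the parser layout are assumptions, and only the option names and the `cloud-run` default come from the list above.

```python
import argparse

# Environment names and the default come from the README list above;
# everything else in this sketch is hypothetical.
ENV_CHOICES = ("cloud-run", "playwright", "browserbase", "hud")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts --query and a restricted --env value."""
    parser = argparse.ArgumentParser(description="Browser agent CLI (sketch)")
    parser.add_argument("--query", required=True, help="Task for the agent")
    parser.add_argument(
        "--env",
        choices=ENV_CHOICES,
        default="cloud-run",
        help="Browser backend to use",
    )
    return parser


if __name__ == "__main__":
    # Parse a fixed argument list so the sketch runs without real CLI input.
    args = build_parser().parse_args(["--query", "Go to Google", "--env", "hud"])
    print(args.env)
```

Restricting `--env` with `choices` means an unknown backend name is rejected at parse time, before any browser session is started.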

**Local Playwright**

@@ -115,6 +116,14 @@ Runs the agent using Browserbase as the browser backend. Ensure the proper Brows
```
python main.py --query="Go to Google and type 'Hello World' into the search bar" --env="browserbase"
```

**hud**

Runs the agent using hud's browser environment. This is the same environment used by `hud_eval.py` but can be run directly with `main.py` for individual tasks. Ensure the `HUD_API_KEY` environment variable is set.

```bash
python main.py --query="Go to Google and type 'Hello World' into the search bar" --env="hud"
```
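
Since the hud environment needs `HUD_API_KEY` to be set, a small pre-flight check can report a missing variable before the agent is launched. The sketch below is a hypothetical helper and not part of the repository: `REQUIRED_ENV_VARS` and `missing_env_vars` are made-up names, and the variable lists are copied from this README's environment variable table.

```python
import os
from typing import List, Mapping

# Variables each --env value requires, per the README's environment table.
# This mapping is a hypothetical helper, not code from main.py.
# cloud-run is left out because API_SERVER_KEY is only conditionally required.
REQUIRED_ENV_VARS = {
    "browserbase": ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"),
    "hud": ("HUD_API_KEY",),
}


def missing_env_vars(env: str, environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty for `env`."""
    required = REQUIRED_ENV_VARS.get(env, ())
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    # Report what is missing for the hud environment in the current shell.
    missing = missing_env_vars("hud")
    print("missing:", ", ".join(missing) or "none")
```

Passing the environment as a mapping keeps the check easy to exercise with a plain dictionary instead of the real process environment.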

**Cloud Run**

Connects to an [API Server](./apiserver/) deployed on Cloud Run for computer use.
@@ -157,6 +166,31 @@ The `main.py` script is the command-line interface (CLI) for running the browser
| API_SERVER_KEY | The API key for your deployed Cloud Run API server, if it's configured to require one. Can also be provided via the `--api_server_key` argument. | Conditionally (if API server requires it and not passed via CLI) |
| BROWSERBASE_API_KEY | Your API key for Browserbase. | Yes (when using the browserbase environment) |
| BROWSERBASE_PROJECT_ID | Your Project ID for Browserbase. | Yes (when using the browserbase environment) |
| HUD_API_KEY | Your API key for hud. Required for running evaluations with `hud_eval.py`. | Yes (when using the hud environment or running `hud_eval.py`) |

## Evaluations

The `hud_eval.py` script allows you to run automated evaluations against hud tasksets: