Commit 7a2fd76

Author: Yue Deng
remove redundant code
1 parent 9a9c2e5 commit 7a2fd76

1 file changed, 4 additions and 4 deletions

docs/mkdocs/docs/hle.md

Lines changed: 4 additions & 4 deletions
@@ -56,14 +56,14 @@ OPENAI_BASE_URL="https://api.openai.com/v1"
 ### Step 3: Run the Evaluation

 ```bash title="Run HLE Evaluation"
-uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet benchmark=hle output_dir="logs/hle/$(date +"%Y%m%d_%H%M")"
+uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet output_dir="logs/hle/$(date +"%Y%m%d_%H%M")"
 ```

 !!! tip "Resume Interrupted Evaluation"
 Specify the same output directory to continue from where you left off:

 ```bash
-uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet benchmark=hle output_dir="logs/hle/20251014_1504"
+uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet output_dir="logs/hle/20251014_1504"
 ```

 ### Step 4: Review Results
@@ -83,13 +83,13 @@ cat logs/hle/*/benchmark_results.jsonl
 ### Test with Limited Tasks

 ```bash
-uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet benchmark=hle benchmark.execution.max_tasks=10 output_dir="logs/hle/$(date +"%Y%m%d_%H%M")"
+uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet benchmark.execution.max_tasks=10 output_dir="logs/hle/$(date +"%Y%m%d_%H%M")"
 ```

 ### Adjust Concurrency

 ```bash
-uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet benchmark=hle benchmark.execution.max_concurrent=5 output_dir="logs/hle/$(date +"%Y%m%d_%H%M")"
+uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet benchmark.execution.max_concurrent=5 output_dir="logs/hle/$(date +"%Y%m%d_%H%M")"
 ```

 ---
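
The resume tip in this file depends on passing exactly the same `output_dir` that the first run used, while the other commands generate a fresh timestamp on every invocation. A minimal sketch of one way to keep the two consistent is shown below. The helper names `hle_output_dir` and `hle_command` are hypothetical and not part of the repository; the command string is copied from the updated docs, and the sketch only prints it rather than running it.

```bash
#!/usr/bin/env bash
# Sketch only: hle_output_dir and hle_command are hypothetical helpers,
# not functions provided by the project.

# Print the output directory to use.
# With a non-empty argument (resume), reuse that path unchanged.
# Without one, build a fresh timestamped path as the docs do.
hle_output_dir() {
  if [ "$#" -ge 1 ] && [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf 'logs/hle/%s\n' "$(date +"%Y%m%d_%H%M")"
  fi
}

# Print (but do not execute) the evaluation command from the docs,
# with output_dir filled in from hle_output_dir.
hle_command() {
  local out_dir
  out_dir="$(hle_output_dir "${1:-}")"
  printf 'uv run main.py common-benchmark --config_file_name=agent_hle_claude37sonnet output_dir="%s"\n' "$out_dir"
}
```

Calling `hle_command` with no argument prints a command for a new run, and `hle_command "logs/hle/20251014_1504"` prints the command that resumes the run stored in that directory.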
