Evaluator
guanxinyi edited this page Jun 5, 2025 · 1 revision
├─ apps
│ ├─ eval
│ │ ├─ report
│ │ ├─ src
├─ common
├─ libraries
├─ projects
├─ tools
│ ├─ bench-agent
│ ├─ evaluator
│ ├─ http-agent
│ ├─ types
├─ ...others
- `apps/eval`: Main entry of `rush eval`. Reports are generated in `apps/eval/report`.
- `common`: Rush configuration for the repository.
- `libraries`: Scripts for projects.
- `projects`: Projects.
- `tools`: Tools used by `rush eval`:
  - `bench-agent`: Web-Agent.
  - `evaluator`: Evaluator.
  - `http-agent`: HTTP agent.
  - `types`: Types.
- For the configuration of `rush eval`, see Config Parameters.
- For the `rush eval` workflow, see Evaluator-Workflow.
The following environment variables are injected at evaluation runtime:
- `EVAL_PROJECT_ROOT`: Task test workspace.
- `EVAL_PROJECT_PORT`: Task test port.
- `EVAL`: `rush eval` flag.
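
As a rough illustration of how task code might consume the injected variables listed above, the sketch below reads them from a plain key/value record. The `EvalEnv` interface and `readEvalEnv` function are hypothetical names that are not part of the repository, and the value formats (a non-empty `EVAL` meaning "running under `rush eval`", a decimal port string) are assumptions.

```typescript
// Hypothetical shape for the injected variables; field names are illustrative only.
interface EvalEnv {
  projectRoot: string; // EVAL_PROJECT_ROOT: task test workspace
  projectPort: number; // EVAL_PROJECT_PORT: task test port
  isEval: boolean;     // EVAL: flag set by `rush eval`
}

// Reads the variables from a key/value record.
// Returns undefined when the EVAL flag is absent, i.e. outside an evaluation run.
function readEvalEnv(env: Record<string, string | undefined>): EvalEnv | undefined {
  // Assumption: any non-empty EVAL value means "running under rush eval".
  if (!env["EVAL"]) return undefined;

  const projectRoot = env["EVAL_PROJECT_ROOT"];
  if (!projectRoot) throw new Error("EVAL_PROJECT_ROOT is not set");

  // Assumption: the port is passed as a decimal integer string.
  const projectPort = Number.parseInt(env["EVAL_PROJECT_PORT"] ?? "", 10);
  if (Number.isNaN(projectPort)) throw new Error("EVAL_PROJECT_PORT is not a valid port");

  return { projectRoot, projectPort, isEval: true };
}
```

In a Node.js process this could be called as `readEvalEnv(process.env)`. Taking the record as a parameter keeps the function easy to test without touching the real environment.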
- Reports are written to `apps/eval/report`, with a hierarchy like:
├─ report
│ ├─ eval-202411012-194041
│ │ ├─ eval.report.md
│ │ ├─ proj1
│ │ │ ├─ proj1.report.md
│ │ │ ├─ proj1-model1-202411012-194041
│ │ │ │ ├─ proj1-model1.report.md
│ │ │ │ ├─ dev.log
│ │ │ │ ├─ ...others
│ │ │ ├─ proj1-model2-202411012-194041
│ │ │ │ ├─ proj1-model2.report.md
│ │ │ │ ├─ dev.log
│ │ │ │ ├─ ...others
│ │ ├─ proj2
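
To make the naming pattern in the report tree above easier to follow, here is a minimal sketch that assembles the same paths from their parts. The `reportPaths` helper and `ReportPaths` interface are hypothetical and are not part of the evaluator. The pattern is inferred only from the example tree, and it assumes each per-model report is named `<project>-<model>.report.md`.

```typescript
// Hypothetical helper: builds report paths that mirror the example hierarchy.
interface ReportPaths {
  evalReport: string;    // summary for the whole evaluation run
  projectReport: string; // summary for one project
  modelReport: string;   // report for one project/model pair
  devLog: string;        // log file stored next to the model report
}

function reportPaths(root: string, stamp: string, project: string, model: string): ReportPaths {
  const evalDir = `${root}/eval-${stamp}`;
  const projectDir = `${evalDir}/${project}`;
  // Assumption: the model directory reuses the timestamp of the run.
  const modelDir = `${projectDir}/${project}-${model}-${stamp}`;
  return {
    evalReport: `${evalDir}/eval.report.md`,
    projectReport: `${projectDir}/${project}.report.md`,
    modelReport: `${modelDir}/${project}-${model}.report.md`,
    devLog: `${modelDir}/dev.log`,
  };
}

// Example using the names from the tree above.
const examplePaths = reportPaths("report", "202411012-194041", "proj1", "model1");
console.log(examplePaths.modelReport);
```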
- Generated code is written to `projects/xxxx/eval`, with a source hierarchy like:
├─ eval
│ ├─ eval-202411012-194041
│ │ ├─ model-name-1
│ │ │ ├─ init-1 // taskid-times
│ │ │ ├─ task-1-1
│ │ │ ├─ task-2-1
│ │ │ ├─ task-2-2
│ │ │ ├─ ...others
│ │ ├─ model-name-2
│ │ │ ├─ init-1
│ │ │ ├─ task-1-1
│ │ │ ├─ ...others
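
The `taskid-times` comment in the tree above suggests that each run directory name ends with a numeric counter. The sketch below splits such a name at its last hyphen. `parseRunDir` and `RunDir` are hypothetical names, and reading `times` as a run or attempt counter is an assumption based only on that comment.

```typescript
// Hypothetical parser for run directory names such as "init-1" or "task-2-1".
interface RunDir {
  taskId: string; // part before the last hyphen, e.g. "init" or "task-2"
  times: number;  // numeric suffix after the last hyphen, e.g. 1
}

function parseRunDir(name: string): RunDir | undefined {
  // Split at the last hyphen so task ids that contain hyphens stay intact.
  const cut = name.lastIndexOf("-");
  if (cut <= 0) return undefined;

  const tail = name.slice(cut + 1);
  // Names without a purely numeric suffix are not treated as run directories.
  if (!/^\d+$/.test(tail)) return undefined;

  return { taskId: name.slice(0, cut), times: Number.parseInt(tail, 10) };
}
```

Under these assumptions `task-2-1` and `task-2-2` share the task id `task-2` and differ only in the counter, while names with no numeric suffix yield `undefined`.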