
Conversation

@JubSteven
Contributor

Describe this PR

Summary

Adds comprehensive support for the FinSearchComp benchmark, enabling financial search and analysis evaluation with dynamic judge prompts and regional analysis.

Key Changes

  • New Files: FinSearchComp agent config, benchmark config, documentation, evaluation scripts, and a progress checker.
  • Core Updates: Enhanced pipeline with metadata support for evaluation by LLM judges using multiple prompt templates.
  • Evaluation Features: Task type handling (T1/T2/T3), progress monitoring, and multiple-run support.
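The dynamic judge-prompt behavior described above can be sketched as a simple dispatch on task type. This is a hypothetical illustration only: the template texts, the `build_judge_prompt` function, and its parameters are illustrative and are not the actual miroflow API.

```python
# Hypothetical sketch: selecting an LLM-judge prompt template by
# FinSearchComp task type (T1/T2/T3). Template wording and function
# names are illustrative assumptions, not code from this PR.

JUDGE_PROMPT_TEMPLATES = {
    "T1": "Compare the model's answer to the ground truth: {answer} vs {truth}.",
    "T2": "Judge whether the retrieved financial facts support the answer: {answer}.",
    "T3": "Assess the quality of this financial analysis: {answer}.",
}

def build_judge_prompt(task_type: str, answer: str, truth: str = "") -> str:
    """Pick the judge prompt template matching the benchmark task type."""
    try:
        template = JUDGE_PROMPT_TEMPLATES[task_type]
    except KeyError:
        raise ValueError(f"Unknown task type: {task_type!r}")
    return template.format(answer=answer, truth=truth)

# Example: a T1 (ground-truth comparison) task.
print(build_judge_prompt("T1", answer="42.5", truth="42.3"))
```

Keeping the templates in a dict keyed by task type makes it cheap to add new task categories or swap prompt wording without touching the evaluation loop.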

Checklist for PR

Must Do

  • Write a good PR title and description, e.g. feat(agent): add pdf tool via mcp, perf: make llm client async, or fix(utils): load custom config via importlib. The CI job check-pr-title enforces the Angular commit message format on the PR title.
  • Run make precommit locally. The CI job lint enforces ruff's default format/lint rules on all new code.
  • Run make pytest. Check the test summary (report.html) and the coverage report (htmlcov/index.html) for new code.

Nice To Have

  • (Optional) Write/update tests under /tests for feat and test PRs.
  • (Optional) Write/update docs under /docs for docs and ci PRs.

@JubSteven JubSteven changed the title Explorations feat(finsearchcomp): add evaluation support for finsearchcomp Sep 25, 2025
@JubSteven JubSteven changed the title feat(finsearchcomp): add evaluation support for finsearchcomp feat(benchmark): add evaluation support for finsearchcomp Sep 25, 2025
@BinWang28 BinWang28 merged commit e276581 into MiroMindAI:miroflow-v0.3 Sep 25, 2025
0 of 2 checks passed
