feat(hints, summary, docs): Refactor O3 parameters to generic names and restructure GAIA validation docs #58
Pull Request Overview
This pull request refactors O3-specific parameter names to generic model-agnostic names and restructures GAIA validation documentation. The changes make the codebase more flexible and maintainable by removing model-specific naming conventions.
Key changes include:
- Renamed parameters: o3_hint → hint_generation, o3_final_answer → final_answer_extraction
- Updated function names and references throughout the codebase
- Extracted common GAIA validation content into separate documentation files
- Added support for configurable base URLs for hint generation and final answer extraction
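
Because the renames change configuration keys, older configs need their keys updated. The following is a minimal sketch of such a migration, not code from this PR: the helper `migrate_legacy_keys` is hypothetical, and it assumes the parameters are top-level keys of a plain mapping, which the actual YAML layout may not match.

```python
# Hypothetical helper, not part of the PR: maps the legacy O3-specific keys
# named in the overview to their generic replacements.
# Assumption: the parameters live as top-level keys of a plain dict.
LEGACY_KEY_MAP = {
    "o3_hint": "hint_generation",
    "o3_final_answer": "final_answer_extraction",
}


def migrate_legacy_keys(config: dict) -> dict:
    """Return a copy of config with legacy keys renamed.

    If both the legacy and the new key are present, the new key's value wins.
    """
    # Start from every key that is not a legacy name.
    migrated = {k: v for k, v in config.items() if k not in LEGACY_KEY_MAP}
    for old_key, new_key in LEGACY_KEY_MAP.items():
        if old_key in config and new_key not in migrated:
            migrated[new_key] = config[old_key]
    return migrated
```

For example, `migrate_legacy_keys({"o3_hint": True})` yields a dict whose only key is `hint_generation`.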
Reviewed Changes
Copilot reviewed 35 out of 36 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| utils/util_llm_simple_voting.py | Updated string replacement to use generic "LLM" instead of "O3" |
| utils/util_llm_parallel_thinking.py | Updated string replacement and improved function parameter formatting |
| src/utils/summary_utils.py | Renamed functions to generic names and added base_url parameters |
| src/core/orchestrator.py | Updated function calls and variable names to use generic terminology |
| config/*.yaml | Updated all configuration files to use new parameter names |
| docs/mkdocs/docs/*.md | Restructured GAIA validation docs and updated references |
| scripts/*.sh | Updated script to use new config file names |

    yield task
    return

Copilot AI, Oct 1, 2025

Remove the unnecessary return statement at the end of the generator function. Generator functions don't need explicit empty return statements.

Suggested removal:

    return

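
To illustrate the review comment above: a generator that falls off the end of its body stops iteration exactly as one that ends with a bare `return`. The two functions below are hypothetical stand-ins, since the actual generator under review is not shown in full.

```python
# Hypothetical stand-ins for the generator under review.
def tasks_with_return(items):
    for task in items:
        yield task
    return  # redundant: reaching the end of the body has the same effect


def tasks_without_return(items):
    for task in items:
        yield task


# Both generators produce the same sequence and then stop.
same = list(tasks_with_return(["a", "b"])) == list(tasks_without_return(["a", "b"]))
```

A bare `return` is only useful in a generator when it exits early, for example inside a conditional in the middle of the body.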
src/utils/summary_utils.py
Outdated

        reasoning_effort="high",
    )

    # response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

Copilot AI, Oct 1, 2025

Remove the commented-out debug code. This appears to be leftover debugging code that should be cleaned up.

Suggested removal:

    # response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

src/utils/summary_utils.py
Outdated

    model="o3",
    messages=[{"role": "user", "content": f"[{message_id}] {full_prompt}"}],
    reasoning_effort="medium",
    # reasoning_effort="medium",

Copilot AI, Oct 1, 2025

Remove the commented-out parameter. If this parameter is no longer needed, the commented line should be removed to keep the code clean.

Suggested removal:

    # reasoning_effort="medium",

Pull Request Overview
Copilot reviewed 35 out of 36 changed files in this pull request and generated 3 comments.
src/utils/summary_utils.py
Outdated

        reasoning_effort="high",
    )

    # response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

Copilot AI, Oct 1, 2025

This commented-out line appears to be debug/test code that should be removed from production code.

Suggested removal:

    # response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

src/utils/summary_utils.py
Outdated

    async def get_gaia_answer_type(
        task_description: str, api_key: str, base_url: str = "https://api.openai.com/v1"
    ) -> str:
        # client = AsyncOpenAI(api_key=api_key, timeout=600)

Copilot AI, Oct 1, 2025

This commented-out line should be removed as it's redundant with the active line below it.

Suggested removal:

    # client = AsyncOpenAI(api_key=api_key, timeout=600)


    message_id = _generate_message_id()
    response = await client.chat.completions.create(
        model="o3",
        messages=[{"role": "user", "content": f"[{message_id}] {full_prompt}"}],

Copilot AI, Oct 1, 2025

This commented-out parameter should either be removed or properly documented if it's intentionally disabled for generic LLM compatibility.

Suggested change:

        messages=[{"role": "user", "content": f"[{message_id}] {full_prompt}"}],
        # The 'reasoning_effort' parameter is intentionally disabled for compatibility with generic LLM APIs.
        # Uncomment and set as needed if using an LLM that supports this parameter, e.g., OpenAI's proprietary models.

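
An alternative to leaving the parameter commented out is to pass it only when the caller asks for it. The sketch below shows one way that might look; `build_completion_kwargs` and its signature are hypothetical and not part of this PR, and the `[message_id]` prefix simply mirrors the message format shown in the diff.

```python
from typing import Any, Dict, Optional


def build_completion_kwargs(
    model: str,
    prompt: str,
    message_id: str,
    reasoning_effort: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a chat completion request.

    Hypothetical helper: 'reasoning_effort' is included only when set, so
    backends that do not accept the parameter never receive it.
    """
    kwargs: Dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": f"[{message_id}] {prompt}"}],
    }
    if reasoning_effort is not None:
        kwargs["reasoning_effort"] = reasoning_effort
    return kwargs
```

The resulting dict could then be unpacked into the existing call, as in `await client.chat.completions.create(**kwargs)`, with the effort level read from configuration rather than hard-coded.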
Summary
- gaia_validation_prerequisites.md
- agent_gaia-validation_mirothinker.yaml, agent_gaia-validation_claude37sonnet.yaml
Changes
- o3_hint → hint_generation, o3_final_answer → final_answer_extraction, etc.
- orchestrator.py and summary_utils.py