@BinWang28
Member

Summary

  • Renamed O3-specific parameters to generic model-agnostic names across all configs and code
  • Extracted common GAIA validation content into gaia_validation_prerequisites.md
  • Renamed config files for consistency: agent_gaia-validation_mirothinker.yaml, agent_gaia-validation_claude37sonnet.yaml
  • Added multiple runs section for MiroThinker with performance variance explanation
  • Moved API keys configuration to model-specific documentation

Changes

  • Parameters: o3_hint → hint_generation, o3_final_answer → final_answer_extraction, etc.
  • Code: Updated function names and references in orchestrator.py and summary_utils.py
  • Configs: Updated all YAML files with new parameter names
  • Docs: Restructured GAIA validation documentation with model-specific files
  • Scripts: Updated script references to use new config file names
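
For anyone carrying older configs across this rename, a minimal migration sketch follows. It assumes the two renames named in the Parameters bullet are plain key renames in a flat mapping. The `RENAMED_KEYS` table and the `migrate_config` helper are hypothetical and not part of this PR, and the renames covered by "etc." are deliberately left out because they are not listed here.

```python
from typing import Any, Dict

# Old -> new names as given in the PR description.
# Assumption: these are simple key renames; renames hidden behind "etc." are omitted.
RENAMED_KEYS: Dict[str, str] = {
    "o3_hint": "hint_generation",
    "o3_final_answer": "final_answer_extraction",
}


def migrate_config(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of cfg with legacy O3-specific keys renamed (hypothetical helper)."""
    # Keep every key that is not a legacy name, including already-migrated ones.
    migrated = {key: value for key, value in cfg.items() if key not in RENAMED_KEYS}
    for old_key, new_key in RENAMED_KEYS.items():
        if old_key in cfg:
            # If both spellings are present, the new name wins.
            migrated.setdefault(new_key, cfg[old_key])
    return migrated
```

When a config contains both spellings, this sketch keeps the value stored under the new name, on the assumption that it was written more recently.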

@BinWang28 BinWang28 requested a review from Copilot October 1, 2025 03:28
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request refactors O3-specific parameter names to generic model-agnostic names and restructures GAIA validation documentation. The changes make the codebase more flexible and maintainable by removing model-specific naming conventions.

Key changes include:

  • Renamed parameters: o3_hint → hint_generation, o3_final_answer → final_answer_extraction
  • Updated function names and references throughout the codebase
  • Extracted common GAIA validation content into separate documentation files
  • Added support for configurable base URLs for hint generation and final answer extraction
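
As a rough illustration of the configurable base URLs, the sketch below bundles the connection settings for one stage (hint generation or final answer extraction). The default URL and the 600-second timeout are the values visible in the `get_gaia_answer_type` snippet in this review; the `StageEndpoint` class and `client_kwargs` helper are hypothetical names, not code from this PR.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Default taken from the get_gaia_answer_type signature shown in this review.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


@dataclass(frozen=True)
class StageEndpoint:
    """Hypothetical container for one LLM stage's connection settings."""

    api_key: str
    base_url: str = DEFAULT_BASE_URL


def client_kwargs(endpoint: StageEndpoint, timeout: float = 600) -> Dict[str, Any]:
    """Build keyword arguments for an OpenAI-compatible client constructor."""
    return {
        "api_key": endpoint.api_key,
        "base_url": endpoint.base_url,
        "timeout": timeout,
    }
```

The result could be passed as `AsyncOpenAI(**client_kwargs(endpoint))`, so pointing a stage at another OpenAI-compatible server only requires a different `base_url`.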

Reviewed Changes

Copilot reviewed 35 out of 36 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| utils/util_llm_simple_voting.py | Updated string replacement to use generic "LLM" instead of "O3" |
| utils/util_llm_parallel_thinking.py | Updated string replacement and improved function parameter formatting |
| src/utils/summary_utils.py | Renamed functions to generic names and added base_url parameters |
| src/core/orchestrator.py | Updated function calls and variable names to use generic terminology |
| config/*.yaml | Updated all configuration files to use new parameter names |
| docs/mkdocs/docs/*.md | Restructured GAIA validation docs and updated references |
| scripts/*.sh | Updated scripts to use new config file names |



yield task
return

Copilot AI Oct 1, 2025

Remove the unnecessary return statement at the end of the generator function. Generator functions don't need explicit empty return statements.

Suggested change
return
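
To back up the point about the trailing `return`, here is a small self-contained check. The two generator functions are illustrative stand-ins, not the function from this PR; they differ only in the final bare `return` and produce the same sequence.

```python
from typing import Iterator, List


def with_trailing_return(tasks: List[str]) -> Iterator[str]:
    for task in tasks:
        yield task
    return  # redundant: the generator finishes here either way


def without_trailing_return(tasks: List[str]) -> Iterator[str]:
    for task in tasks:
        yield task


sample = ["task-1", "task-2"]
# Both generators yield the same items and then stop.
assert list(with_trailing_return(sample)) == list(without_trailing_return(sample))
```

A bare `return` is still useful inside a generator for an early exit; it is only the one on the last line that adds nothing.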

reasoning_effort="high",
)

# response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

Copilot AI Oct 1, 2025

Remove the commented-out debug code. This appears to be leftover debugging code that should be cleaned up.

Suggested change
# response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

model="o3",
messages=[{"role": "user", "content": f"[{message_id}] {full_prompt}"}],
reasoning_effort="medium",
# reasoning_effort="medium",

Copilot AI Oct 1, 2025

Remove the commented-out parameter. If this parameter is no longer needed, the commented line should be removed to keep the code clean.

Suggested change
# reasoning_effort="medium",

@BinWang28 BinWang28 requested a review from Copilot October 1, 2025 03:32
Contributor

Copilot AI left a comment

Pull Request Overview

Copilot reviewed 35 out of 36 changed files in this pull request and generated 3 comments.



reasoning_effort="high",
)

# response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

Copilot AI Oct 1, 2025

This commented-out line appears to be debug/test code that should be removed from production code.

Suggested change
# response = await client.chat.completions.create(messages = [{"role": "user", "content": content}], model="dummy")

async def get_gaia_answer_type(
task_description: str, api_key: str, base_url: str = "https://api.openai.com/v1"
) -> str:
# client = AsyncOpenAI(api_key=api_key, timeout=600)

Copilot AI Oct 1, 2025

This commented-out line should be removed as it's redundant with the active line below it.

Suggested change
# client = AsyncOpenAI(api_key=api_key, timeout=600)

message_id = _generate_message_id()
response = await client.chat.completions.create(
model="o3",
messages=[{"role": "user", "content": f"[{message_id}] {full_prompt}"}],

Copilot AI Oct 1, 2025

This commented-out parameter should either be removed or properly documented if it's intentionally disabled for generic LLM compatibility.

Suggested change
messages=[{"role": "user", "content": f"[{message_id}] {full_prompt}"}],
messages=[{"role": "user", "content": f"[{message_id}] {full_prompt}"}],
# The 'reasoning_effort' parameter is intentionally disabled for compatibility with generic LLM APIs.
# Uncomment and set as needed if using an LLM that supports this parameter, e.g., OpenAI's proprietary models.
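
An alternative to leaving the parameter commented out would be to make it optional at the call site. The sketch below is only an illustration of that idea: `build_chat_request` is a hypothetical helper, not code from this PR, and it assumes the request is assembled as keyword arguments before being sent.

```python
from typing import Any, Dict, Optional


def build_chat_request(
    model: str, prompt: str, reasoning_effort: Optional[str] = None
) -> Dict[str, Any]:
    """Assemble chat-completion keyword arguments (hypothetical helper)."""
    request: Dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Only send reasoning_effort when the caller asks for it, so backends
    # that do not accept the parameter never see it.
    if reasoning_effort is not None:
        request["reasoning_effort"] = reasoning_effort
    return request
```

A call such as `await client.chat.completions.create(**build_chat_request("o3", full_prompt, "medium"))` would then keep the current behaviour for models that accept the parameter, while other backends simply omit the argument.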

@BinWang28 BinWang28 merged commit 6ec4972 into miroflow-v0.3 Oct 1, 2025
1 of 2 checks passed
@BinWang28 BinWang28 deleted the mirothinker branch October 1, 2025 03:52