feat(tool): incorporate open-source tools from MiroThinker #60
- Resolved formatting conflicts in utils/extract_futurex_results.py
- Resolved formatting conflicts in utils/prepare_benchmark/gen_futurex.py
- Resolved formatting conflicts in utils/progress_check/check_futurex_progress.py

All conflicts were due to code formatting differences (whitespace, line breaks, trailing commas). Functionality remains identical between branches.
…ress file to exclude T1.
… greater china respectively.
Pull Request Overview
This PR adapts and incorporates open-source tools from MiroThinker, adding three new MCP servers that provide vision, reasoning, and audio processing capabilities using open-source models.
- Added three new open-source MCP servers (vision, reasoning, and audio) with robust error handling
- Created comprehensive documentation for deploying and using the open-source models
- Added YAML configuration files to integrate the new tools into the existing tool system
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/tool/mcp_servers/vision_mcp_server_os.py | New vision MCP server for VQA using open-source models like Qwen2.5-VL |
| src/tool/mcp_servers/reasoning_mcp_server_os.py | New reasoning MCP server with retry logic for complex problem solving |
| src/tool/mcp_servers/audio_mcp_server_os.py | New audio transcription server using open-source Whisper models |
| docs/mkdocs/mkdocs.yml | Updated navigation to include documentation for new open-source tools |
| docs/mkdocs/docs/tool_vqa_os.md | Documentation for open-source vision tool deployment and usage |
| docs/mkdocs/docs/tool_reasoning_os.md | Documentation for open-source reasoning tool deployment and usage |
| docs/mkdocs/docs/tool_audio_os.md | Documentation for open-source audio tool deployment and usage |
| config/tool/tool-reasoning-os.yaml | Configuration file for reasoning tool integration |
| config/tool/tool-image-video-os.yaml | Configuration file for vision tool integration |
| config/tool/tool-audio-os.yaml | Configuration file for audio tool integration |
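
The summary mentions retry logic in the reasoning server, but the excerpt here does not show how it is implemented. As a rough illustration only, the sketch below shows one common shape for such logic: retrying an async call with exponential backoff. The name `call_with_retry` and its parameters are hypothetical and are not taken from the PR.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def call_with_retry(
    make_call: Callable[[], Awaitable[T]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Hypothetical helper: await make_call(), retrying on any exception.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    """
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except Exception as exc:  # broad on purpose for this sketch
            last_error = exc
            if attempt < max_attempts - 1:
                await asyncio.sleep(base_delay * 2**attempt)
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```

A real server would probably narrow the caught exception types to transient failures (timeouts, connection errors) so that programming errors surface immediately.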
    payload = {"model": VISION_MODEL_NAME, "messages": messages_for_llm}

    response = requests.post(VISION_BASE_URL, json=payload, headers=headers)
Copilot AI · Oct 2, 2025
Using synchronous requests.post in an async function can block the event loop. Consider using aiohttp.ClientSession().post() instead since you're already importing and using aiohttp elsewhere in the function.
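
To make the concern concrete, here is a small stdlib-only sketch of why a synchronous call inside a coroutine stalls other tasks. It does not use aiohttp as the reviewer suggests; instead it swaps in `asyncio.to_thread` (Python 3.9+) as an alternative way to keep the loop responsive. `blocking_post` is a hypothetical stand-in for `requests.post`, and the other helper names are also illustrative.

```python
import asyncio
import time


def blocking_post(payload: dict) -> dict:
    """Hypothetical stand-in for a synchronous requests.post call."""
    time.sleep(0.2)  # simulates network latency
    return {"echo": payload}


async def heartbeat(ticks: list) -> None:
    """Appends a timestamp every 50 ms whenever the event loop is free."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def call_blocking_directly(payload: dict) -> dict:
    # The event loop cannot run anything else until this returns.
    return blocking_post(payload)


async def call_in_thread(payload: dict) -> dict:
    # The blocking call runs in a worker thread; the loop stays responsive.
    return await asyncio.to_thread(blocking_post, payload)


async def ticks_during(call) -> int:
    """Counts heartbeat ticks recorded before `call` finishes."""
    ticks: list = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # give the heartbeat task a chance to start
    await call({"q": "hi"})
    done = len(ticks)
    await hb  # let the heartbeat finish before returning
    return done
```

Running `asyncio.run(ticks_during(call_blocking_directly))` records no ticks while the call is in flight, whereas the threaded variant lets the heartbeat advance. Switching to an async HTTP client, as the comment proposes, avoids the worker thread altogether.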
            if duration > 0:
                return duration
        except Exception as e:
            return f"[ERROR]: Failed to get audio duration: {e}"
Copilot AI · Oct 2, 2025
The function _get_audio_duration should return a float according to its type hint and usage context, but this exception handler returns a string. This could cause type errors when the returned value is used in calculations.
    - return f"[ERROR]: Failed to get audio duration: {e}"
    + return 0.0
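
For illustration, a duration helper that always honours a `float` return type might look like the sketch below. The full body of `_get_audio_duration` is not shown in this excerpt, so this is not the PR's implementation: the function name `get_audio_duration_seconds` is hypothetical, and reading the duration with the stdlib `wave` module is an assumption made only to keep the example self-contained.

```python
import logging
import wave

logger = logging.getLogger(__name__)


def get_audio_duration_seconds(path: str) -> float:
    """Hypothetical helper: return the WAV duration in seconds, or 0.0 on failure."""
    try:
        with wave.open(path, "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
            if rate > 0:
                return frames / rate
    except Exception as e:
        # Log the problem instead of returning an error string,
        # so callers can always treat the result as a number.
        logger.warning("Failed to get audio duration: %s", e)
    return 0.0
```

Callers can then treat `0.0` as "unknown duration" and keep doing arithmetic on the result, and the error detail still reaches the logs.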
    @mcp.tool()
    async def reasoning(question: str) -> str:
        """You can use this tool use solve hard math problem, puzzle, riddle and IQ test question that requires a lot of chain of thought efforts.
Copilot AI · Oct 2, 2025
Grammar error: 'use solve' should be 'to solve'. The sentence should read: 'You can use this tool to solve hard math problem...'
    - """You can use this tool use solve hard math problem, puzzle, riddle and IQ test question that requires a lot of chain of thought efforts.
    + """You can use this tool to solve hard math problem, puzzle, riddle and IQ test question that requires a lot of chain of thought efforts.
docs/mkdocs/docs/tool_audio_os.md
Outdated
    ---

    !!! info "Documentation Info"
        **Last Updated:** January 2025 · **Doc Contributor:** Team @ MiroMind AI
should be "October 2025"
Describe this PR
Adapted open-source tools from MiroThinker and added relevant docs on deploying open-source models.
Checklist for PR
Must Do
- Example PR titles: `feat(agent): add pdf tool via mcp`, `perf: make llm client async` and `fix(utils): load custom config via importlib` etc. CI job `check-pr-title` enforces Angular commit message format on the PR title.
- Run `make precommit` locally. CI job `lint` enforces ruff default format/lint rules on all new code.
- Run `make pytest`. Check the test summary (located at `report.html`) and the coverage report (located at `htmlcov/index.html`) on new code.

Nice To Have
- `/tests` for `feat` and `test` PRs.
- `/docs` for `docs` and `ci` PRs.