Merged

20 commits
56b235d
upd: add futurex evaluation support.
JubSteven Sep 18, 2025
287a7bc
upd: support multiple eval for futurex and add relevant doc.
JubSteven Sep 18, 2025
bf43b37
upd: fix bugs with doc for futurex.
JubSteven Sep 18, 2025
d1e1637
debug: fix wrong calling path.
JubSteven Sep 18, 2025
eb6f302
add preparation for finsearchcomp.
JubSteven Sep 24, 2025
4dabaee
update a premature version of finsearchcomp benchmark.
JubSteven Sep 24, 2025
5ea9b61
Resolve merge conflicts in FutureX utility files
JubSteven Sep 24, 2025
c086e41
clean redundant code in merging.
JubSteven Sep 24, 2025
d6a8715
upd: modify yaml to use Mirothinker as the main agent, add check prog…
JubSteven Sep 25, 2025
e7163d3
upd: check_progress function for finsearchcomp now consider globe and…
JubSteven Sep 25, 2025
b0e494f
Merge remote-tracking branch 'upstream/miroflow-v0.3' into explorations
JubSteven Sep 25, 2025
256ba2c
upd: add docs and shell script for multiple runs.
JubSteven Sep 25, 2025
835e590
fix: check_finsearchcomp_progress not displaying results from greater…
JubSteven Sep 25, 2025
5ffc269
Merge remote-tracking branch 'upstream/miroflow-v0.3' into explorations
JubSteven Sep 25, 2025
4918ee2
Merge branch 'miroflow-v0.3' into explorations
JubSteven Sep 25, 2025
72e9bb6
fix: catch ContextLimitError in more observed cases.
JubSteven Sep 25, 2025
e589468
initialize open source tools for audio, vision and reasoning.
JubSteven Oct 1, 2025
948d856
Merge remote-tracking branch 'upstream/miroflow-v0.3' into explorations
JubSteven Oct 1, 2025
15a7ef9
upd: docs for open-source tools.
JubSteven Oct 2, 2025
bf786ca
fix wrong date.
JubSteven Oct 2, 2025
9 changes: 9 additions & 0 deletions config/tool/tool-audio-os.yaml
@@ -0,0 +1,9 @@
name: "tool-audio-os"
tool_command: "python"
args:
- "-m"
- "src.tool.mcp_servers.audio_mcp_server_os"
env:
WHISPER_API_KEY: "${oc.env:WHISPER_API_KEY}"
WHISPER_BASE_URL: "${oc.env:WHISPER_BASE_URL}"
WHISPER_MODEL_NAME: "${oc.env:WHISPER_MODEL_NAME}"
9 changes: 9 additions & 0 deletions config/tool/tool-image-video-os.yaml
@@ -0,0 +1,9 @@
name: "tool-image-video-os"
tool_command: "python"
args:
- "-m"
- "src.tool.mcp_servers.vision_mcp_server_os"
env:
VISION_API_KEY: "${oc.env:VISION_API_KEY}"
VISION_BASE_URL: "${oc.env:VISION_BASE_URL}"
VISION_MODEL_NAME: "${oc.env:VISION_MODEL_NAME}"
9 changes: 9 additions & 0 deletions config/tool/tool-reasoning-os.yaml
@@ -0,0 +1,9 @@
name: "tool-reasoning-os"
tool_command: "python"
args:
- "-m"
- "src.tool.mcp_servers.reasoning_mcp_server_os"
env:
REASONING_API_KEY: "${oc.env:REASONING_API_KEY}"
REASONING_BASE_URL: "${oc.env:REASONING_BASE_URL}"
REASONING_MODEL_NAME: "${oc.env:REASONING_MODEL_NAME}"
149 changes: 149 additions & 0 deletions docs/mkdocs/docs/tool_audio_os.md
@@ -0,0 +1,149 @@
# Audio Tools - Open Source (`audio_mcp_server_os.py`)

The Audio MCP Server (Open Source) enables audio transcription using open-source Whisper models. It converts speech to text across multiple audio formats and accepts both local files and web URLs.

!!! info "Available Functions"
This MCP server provides the following functions that agents can call:

- **Audio Transcription**: High-quality speech-to-text conversion
- **Multi-Format Support**: MP3, WAV, M4A, AAC, OGG, FLAC, WMA formats
- **Flexible Input**: Local file paths and web URLs
- **Open-Source Model Support**: Whisper-Large-v3-Turbo with automatic processing

---

## Environment Variables

!!! warning "Configuration Location"
The `audio_mcp_server_os.py` server reads environment variables passed through the `tool-audio-os.yaml` configuration file, not directly from the `.env` file.

**Open-Source Model Configuration:**

- `WHISPER_API_KEY`: Required API key for the open-source Whisper service
- `WHISPER_BASE_URL`: Base URL for the Whisper service API endpoint
- `WHISPER_MODEL_NAME`: Model name (default: `openai/whisper-large-v3-turbo`)

**Example Configuration:**
```bash
# API for Open-Source Audio Transcription Tool (for benchmark testing)
WHISPER_MODEL_NAME="openai/whisper-large-v3-turbo"
WHISPER_API_KEY=your_whisper_key
WHISPER_BASE_URL="https://your_whisper_base_url/v1"
```

---

## Local Deployment

### Using vLLM Server

For optimal performance with the Whisper-Large-v3-Turbo model, deploy using vLLM:

```bash
pip install vllm==0.10.0
pip install vllm[audio]
vllm serve /path/to/whisper \
    --served-model-name whisper-large-v3-turbo \
    --task transcription
```
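
Once the server is running, a short request can confirm the endpoint responds. This is a minimal sketch, assuming vLLM's OpenAI-compatible `/v1/audio/transcriptions` route, the port used above, and a local `sample.wav` file:

```python
# Minimal smoke test for the local Whisper deployment (illustrative only).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy_key")
with open("sample.wav", "rb") as audio_file:  # any supported audio file
    result = client.audio.transcriptions.create(
        model="whisper-large-v3-turbo",  # must match --served-model-name
        file=audio_file,
    )
print(result.text)
```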

### Configuration for Local Deployment

When using local deployment, configure your environment variables:

```bash
WHISPER_MODEL_NAME="whisper-large-v3-turbo"  # must match the --served-model-name used above
WHISPER_API_KEY="dummy_key" # Not required for local deployment
WHISPER_BASE_URL="http://localhost:8000/v1"
```

---

## Function Reference

The following function is provided by the `audio_mcp_server_os.py` MCP tool and can be called by agents:

### `audio_transcription(audio_path_or_url: str)`

Transcribe audio files to text using open-source Whisper models. Supports both local files and web URLs with automatic format detection and processing.

**Parameters:**

- `audio_path_or_url`: Local file path (accessible to the server) or web URL

**Returns:**

- `str`: The transcription of the audio file

**Supported Audio Formats:**
- MP3 (.mp3)
- WAV (.wav)
- M4A (.m4a)
- AAC (.aac)
- OGG (.ogg)
- FLAC (.flac)
- WMA (.wma)

## Usage Examples

### Local File Transcription
```python
# Local file transcription
result = audio_transcription(
audio_path_or_url="/path/to/audio.mp3"
)
```

### URL-based Transcription
```python
# URL transcription
result = audio_transcription(
audio_path_or_url="https://example.com/audio.wav"
)
```

### Meeting Recording Transcription
```python
result = audio_transcription(
audio_path_or_url="meeting_recording.m4a"
)
```

### Podcast Transcription
```python
result = audio_transcription(
audio_path_or_url="podcast_episode.mp3"
)
```

---

## Technical Implementation

### Audio Processing Pipeline

1. **Input Validation**: Checks if input is local file or URL
2. **Format Detection**: Determines audio format from extension or content type
3. **File Handling**: Downloads URL files to temporary storage with proper extensions
4. **API Request**: Sends audio file to Whisper model for transcription
5. **Cleanup**: Removes temporary files after processing
6. **Response Processing**: Returns transcription text
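
The sketch below mirrors these steps. It is illustrative only, assuming the OpenAI-compatible transcription endpoint configured above; the function and helper names are hypothetical rather than the server's actual internals:

```python
# Simplified sketch of the documented pipeline (not the server's real code).
import os
import tempfile

import requests
from openai import OpenAI

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac", ".wma"}


def transcribe(audio_path_or_url: str) -> str:
    client = OpenAI(
        api_key=os.environ["WHISPER_API_KEY"],
        base_url=os.environ["WHISPER_BASE_URL"],
    )
    temp_path = None
    try:
        # 1-3. Input validation, format detection, and file handling.
        if audio_path_or_url.startswith(("http://", "https://")):
            ext = os.path.splitext(audio_path_or_url)[1] or ".mp3"
            response = requests.get(audio_path_or_url, timeout=60)
            response.raise_for_status()
            with tempfile.NamedTemporaryFile(suffix=ext, delete=False) as tmp:
                tmp.write(response.content)
                temp_path = tmp.name
            local_path = temp_path
        else:
            local_path = audio_path_or_url
        if os.path.splitext(local_path)[1].lower() not in SUPPORTED_EXTENSIONS:
            return f"Unsupported audio format: {local_path}"

        # 4. API request: send the audio to the Whisper model.
        with open(local_path, "rb") as audio_file:
            result = client.audio.transcriptions.create(
                model=os.environ["WHISPER_MODEL_NAME"],
                file=audio_file,
            )
        # 6. Response processing: return the transcription text.
        return result.text
    finally:
        # 5. Cleanup: remove any temporary download.
        if temp_path and os.path.exists(temp_path):
            os.remove(temp_path)
```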

### Error Handling

- **File Access Errors**: Graceful handling of inaccessible local files
- **Network Errors**: Robust URL fetching with retry logic (up to 3 attempts)
- **Format Errors**: Automatic format detection and validation
- **API Errors**: Clear error reporting for service issues
- **Sandbox Restrictions**: Prevents access to sandbox files with clear error messages

### Retry Logic

- **Maximum Retries**: 3 attempts for failed requests
- **Exponential Backoff**: 5, 10, 20 second delays between retries
- **Network Resilience**: Handles temporary network issues and service unavailability
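
An illustrative sketch of this retry schedule (3 attempts with 5/10/20-second delays); the helper name is hypothetical, not the server's actual code:

```python
# Retry a download up to 3 times with 5, 10, and 20 second delays.
import time

import requests


def fetch_audio_with_retries(url: str, max_retries: int = 3) -> bytes:
    delays = [5, 10, 20]
    last_error = None
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=60)
            response.raise_for_status()
            return response.content
        except requests.RequestException as exc:
            last_error = exc
            if attempt < max_retries - 1:
                time.sleep(delays[attempt])
    raise RuntimeError(f"Download failed after {max_retries} attempts: {last_error}")
```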

---

!!! info "Documentation Info"
**Last Updated:** October 2025 · **Doc Contributor:** Team @ MiroMind AI
Review comment (Member): should be "October 2025"
135 changes: 135 additions & 0 deletions docs/mkdocs/docs/tool_reasoning_os.md
@@ -0,0 +1,135 @@
# Reasoning Tools - Open Source (`reasoning_mcp_server_os.py`)

The Reasoning MCP Server (Open Source) provides a **pure text-based reasoning engine** using open-source models. It supports logical analysis, problem solving, and planning, with robust retry mechanisms and exponential backoff for reliability.

!!! info "Available Functions"
This MCP server provides the following functions that agents can call:

- **Pure Text Reasoning**: Logical analysis and problem solving using open-source LLM backends
- **Step-by-Step Analysis**: Structured reasoning with detailed explanations
- **Open-Source Model Support**: Qwen3-235B-A22B-Thinking-2507 with automatic fallback
- **Robust Error Handling**: Exponential backoff retry logic (up to 10 attempts)

---

## Environment Variables

!!! warning "Configuration Location"
The `reasoning_mcp_server_os.py` server reads environment variables passed through the `tool-reasoning-os.yaml` configuration file, not directly from the `.env` file.

**Open-Source Model Configuration:**

- `REASONING_API_KEY`: Required API key for the open-source reasoning service
- `REASONING_BASE_URL`: Base URL for the reasoning service API endpoint
- `REASONING_MODEL_NAME`: Model name (default: `Qwen/Qwen3-235B-A22B-Thinking-2507`)

**Example Configuration:**
```bash
# API for Open-Source Reasoning Tool (for benchmark testing)
REASONING_MODEL_NAME="Qwen/Qwen3-235B-A22B-Thinking-2507"
REASONING_API_KEY=your_reasoning_key
REASONING_BASE_URL="https://your_reasoning_base_url/v1/chat/completions"
```

---

## Local Deployment

### Using SGLang Server

For optimal performance with the Qwen3-235B-A22B-Thinking model, deploy using SGLang:

```bash
python3 -m sglang.launch_server \
    --model-path /path/to/Qwen3-235B-A22B-Thinking-2507 \
    --tp 8 --host 0.0.0.0 --port 1234 \
    --trust-remote-code --enable-metrics \
    --log-level debug --log-level-http debug \
    --log-requests --log-requests-level 2 \
    --show-time-cost --context-length 131072
```
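
Once the server is up, a short request can confirm it responds. This is a minimal sketch, assuming SGLang's OpenAI-compatible `/v1/chat/completions` route and the port chosen above:

```python
# Minimal smoke test for the local reasoning deployment (illustrative only).
import requests

response = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "Qwen/Qwen3-235B-A22B-Thinking-2507",
        "messages": [{"role": "user", "content": "What is 17 * 23?"}],
    },
    timeout=600,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```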

### Configuration for Local Deployment

When using local deployment, configure your environment variables:

```bash
REASONING_MODEL_NAME="Qwen/Qwen3-235B-A22B-Thinking-2507"
REASONING_API_KEY="dummy_key" # Not required for local deployment
REASONING_BASE_URL="http://localhost:1234/v1/chat/completions"
```

---

## Function Reference

The following function is provided by the `reasoning_mcp_server_os.py` MCP tool and can be called by agents:

### `reasoning(question: str)`

Perform step-by-step reasoning, analysis, and planning over a **text-only input**. This tool is specialized for **complex thinking tasks** that require deep analytical reasoning.

!!! note "Text-Only Processing"
This tool processes only the provided text input and will not fetch external data or context. Ensure all necessary information is included in the question.

**Parameters:**

- `question`: A detailed, complex question or problem statement that includes all necessary information

**Returns:**

- `str`: A structured, step-by-step reasoned answer

**Features:**

- **Open-Source Model**: Uses Qwen3-235B-A22B-Thinking-2507 for advanced reasoning
- **Robust Retry Logic**: Exponential backoff retry mechanism (up to 10 attempts)
- **Thinking Mode Support**: Automatically extracts reasoning content from thinking blocks
- **Error Handling**: Graceful fallback with informative error messages
- **Timeout Protection**: 600-second timeout for long-running reasoning tasks
- **Jittered Backoff**: Prevents thundering herd problems with randomized retry delays
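
For illustration, thinking-mode extraction might look like the sketch below. It assumes the model wraps its reasoning in `<think>` tags; the deployed server's actual parsing may differ:

```python
# Hypothetical sketch of separating thinking content from the final answer.
import re


def split_thinking(raw_text: str):
    """Return (reasoning, answer) from a response that may contain <think> tags."""
    match = re.search(r"<think>(.*?)</think>", raw_text, flags=re.DOTALL)
    if match:
        return match.group(1).strip(), raw_text[match.end():].strip()
    return "", raw_text.strip()
```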

**Retry Configuration:**
- Maximum retries: 10 attempts
- Initial backoff: 1.0 seconds
- Maximum backoff: 30.0 seconds
- Exponential backoff with jitter (0.8-1.2x multiplier)
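
An illustrative sketch of this backoff policy (the helper name is hypothetical, not the server's internals):

```python
# Retry with exponential backoff, capped at 30 s, jittered by 0.8-1.2x.
import random
import time


def call_with_backoff(make_request, max_retries=10,
                      initial_backoff=1.0, max_backoff=30.0):
    backoff = initial_backoff
    for attempt in range(1, max_retries + 1):
        try:
            return make_request()
        except Exception as exc:
            if attempt == max_retries:
                raise RuntimeError(
                    f"Reasoning request failed after {max_retries} attempts"
                ) from exc
            # Jitter spreads retries out so clients do not retry in lockstep.
            time.sleep(backoff * random.uniform(0.8, 1.2))
            backoff = min(backoff * 2, max_backoff)
```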

---

## Usage Examples

### Complex Mathematical Problems
```python
question = """
Solve this complex optimization problem:
A company wants to minimize costs while maximizing production.
Given constraints: 2x + 3y ≤ 100, x + y ≤ 50, x ≥ 0, y ≥ 0
Cost function: C = 5x + 8y
Production function: P = 3x + 4y
Find the optimal values of x and y.
"""
```

### Logical Puzzles
```python
question = """
Three people are in a room: Alice, Bob, and Charlie.
- Alice says: "Bob is lying"
- Bob says: "Charlie is lying"
- Charlie says: "Alice is lying"
If exactly one person is telling the truth, who is it?
"""
```

### Strategic Planning
```python
question = """
Design a strategy for a startup to enter a competitive market
with limited resources. Consider market analysis, competitive
positioning, resource allocation, and risk mitigation.
"""
```

!!! info "Documentation Info"
**Last Updated:** October 2025 · **Doc Contributor:** Team @ MiroMind AI