Commit 6cd8689

feat(literature): add literatures (#63)
* update literature
* to pass lint
1 parent db41377 commit 6cd8689

6 files changed: +192 -63 lines changed

docs/mkdocs/docs/all_about_agents.md

Lines changed: 180 additions & 54 deletions
@@ -7,104 +7,230 @@ Welcome to our comprehensive resource collection for AI agents. This page curate
 ## Table of Contents

 !!! abstract "Resource Categories"
-1. [Agent Frameworks](#agent-frameworks)
-2. [Agent Memory](#agent-memory)
-3. [Papers](#papers)
-4. [Evaluation](#evaluation)
+1. [Agent Papers](#papers)
+2. [Agent Frameworks](#agent-frameworks)
+3. [Evaluation](#evaluation)
+4. [Agent Memory](#agent-memory)
+5. [Blogs](#blogs)

 ---

-## Agent Frameworks
-
-!!! info "Popular Agent Development Frameworks"
-Comprehensive frameworks for building and deploying AI agents across different domains.
-
-- **MiroFlow**: Build, manage, and scale your AI agents with ease
-- [:material-github: GitHub](https://github.com/MiroMindAI/MiroFlow)
-
-- **Youtu-Agent**: A simple yet powerful agent framework that delivers with open-source models
-- [:material-github: GitHub](https://github.com/TencentCloudADP/youtu-agent)
-
-- **OpenManus**: No fortress, purely open ground. OpenManus is Coming
-- [:material-github: GitHub](https://github.com/FoundationAgents/OpenManus)
-
-- **OpenBB Platform**: Financial data platform for analysts, quants and AI agents
-- [:material-link: Project](https://github.com/OpenBB-finance/OpenBB)
-
----
-
-## Agent Memory
-
-!!! tip "Memory Systems for Persistent Agent Intelligence"
-Advanced memory solutions for building agents with long-term context and learning capabilities.
-
-- **Mem0**: Building Production- Ready AI Agents with Scalable Long-Term Memory
-- [:material-github: GitHub](https://github.com/mem0ai/mem0)
-
-- **memobase**: Profile-Based Long-Term Memory for AI Applications
-- [:material-github: GitHub](https://github.com/memodb-io/memobase)
-
-- **Memento**: Fine-tuning LLM Agents without Fine-tuning LLMs
-- [:material-file-document: Paper](https://arxiv.org/abs/2508.16153) · [:material-github: GitHub](https://github.com/Agent-on-the-Fly/Memento)
-
----
-
-
-## Papers
+### Agent Papers

 !!! note "Research Papers & Publications"
 Latest research in agent systems, methodologies, and theoretical foundations.

+- **WebThinker**: Empowering Large Reasoning Models with Deep Research Capability
+- [:material-file-document: Paper](https://arxiv.org/abs/2504.21776) · [:material-github: GitHub](https://github.com/sunnynexus/WebThinker)
 - **Profile-Aware Maneuvering**: A Dynamic Multi-Agent System for Robust GAIA Problem Solving by AWorld
 - [:material-file-document: Paper](https://arxiv.org/abs/2508.09889)
-
 - **AFlow**: Automating Agentic Workflow Generation
 - [:material-file-document: Paper](https://arxiv.org/abs/2410.10762)
-
 - **AgentFly**: Fine-tuning LLM Agents without Fine-tuning LLMs
 - [:material-file-document: Paper](https://arxiv.org/abs/2508.16153v2)
-
 - **Throttling Web Agents Using Reasoning Gates**
 - [:material-file-document: Paper](https://arxiv.org/abs/2509.01619)
-
 - **The Landscape of Agentic Reinforcement Learning for LLMs**: A Survey
 - [:material-file-document: Paper](https://arxiv.org/abs/2509.02547)
+- **BrowseMaster**: Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair
+- [:material-file-document: Paper](https://arxiv.org/abs/2508.09129) · [:material-github: GitHub](https://github.com/sjtu-sai-agents/Browse-Master)
+- **Long Term Memory**: The Foundation of AI Self-Evolution
+- [:material-file-document: Paper](https://arxiv.org/abs/2410.15665)
+- **DeepResearcher**: Scaling Deep Research via Reinforcement Learning in Real-world Environments
+- [:material-file-document: Paper](https://arxiv.org/abs/2504.03160) · [:material-github: GitHub](https://github.com/GAIR-NLP/DeepResearcher)
+- **Web-Shepherd**: Advancing PRMs for Reinforcing Web Agents
+- [:material-file-document: Paper](https://arxiv.org/abs/2505.15277) · [:material-github: GitHub](https://github.com/kyle8581/Web-Shepherd)
+- **SimpleDeepSearcher**: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
+- [:material-file-document: Paper](https://arxiv.org/abs/2505.16834) · [:material-github: GitHub](https://github.com/RUCAIBox/SimpleDeepSearcher)
+- **Alita**: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution
+- [:material-file-document: Paper](https://arxiv.org/abs/2505.20286) · [:material-github: GitHub](https://github.com/CharlesQ9/Alita)
+- **MCP-Zero**: Active Tool Discovery for Autonomous LLM Agents
+- [:material-file-document: Paper](https://arxiv.org/abs/2506.01056) · [:material-github: GitHub](https://github.com/xfey/MCP-Zero)
+- **AgentOrchestra**: Orchestrating Hierarchical Multi-Agent Intelligence with the Tool-Environment-Agent(TEA) Protocol
+- [:material-file-document: Paper](https://arxiv.org/abs/2506.12508) · [:material-github: GitHub](https://github.com/SkyworkAI/DeepResearchAgent)
+- **Deep Research Agents**: A Systematic Examination And Roadmap
+- [:material-file-document: Paper](https://arxiv.org/abs/2506.18096) · [:material-github: GitHub](https://github.com/ai-agents-2030/awesome-deep-research-agent)
+- **SciMaster**: Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam?
+- [:material-file-document: Paper](https://arxiv.org/abs/2507.05241) · [:material-github: GitHub](https://github.com/sjtu-sai-agents/X-Master)
+- **Deep Researcher with Test-Time Diffusion**: Enhancing research capabilities through diffusion-based test-time adaptation
+- [:material-file-document: Paper](https://arxiv.org/abs/2507.16075)
+- **Multi-Agent Tool-Integrated Policy Optimization**: Enhancing multi-agent systems through integrated tool usage and policy optimization
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.04678)
+- **WALT**: Web Agents that Learn Tools
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.01524)
+- **Learning to Route**: A Rule-Driven Agent Framework for Hybrid-Source Retrieval-Augmented Generation
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.02388)
+- **SurveyBench**: Can LLM(-Agents) Write Academic Surveys that Align with Reader Needs?
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.03120)
+- **FocusAgent**: Simple Yet Effective Ways of Trimming the Large Context of Web Agents
+- [:material-file-document: Paper](https://arxiv.org/pdf/2510.03204)
+- **LLM-Based Data Science Agents**: A Survey of Capabilities, Challenges, and Future Directions
+- [:material-file-document: Paper](https://arxiv.org/pdf/2510.04023)
+- **Agentic Context Engineering**: Evolving Contexts for Self-Improving Language Models
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.04618)
+- **MARS**: Optimizing Dual-System Deep Research via Multi-Agent Reinforcement Learning
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.04935)
+- **QuantAgents**: Towards Multi-agent Financial System via Simulated Trading
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.04643) · [:material-link: Project](https://quantagents.github.io/)
+- **Small Language Models for Agentic Systems**: A Survey of Architectures, Capabilities, and Deployment Trade-offs
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.03847)
+- **Open Agent Specification (Agent Spec)**: Technical Report
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.04173v1)
+- **AudioToolAgent**: An Agentic Framework for Audio-Language Models
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.02995)
+- **ThinkBrake**: Mitigating Overthinking in Tool Reasoning
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.00546)
+- **TOUCAN**: Synthesizing 1.5M Tool-Agentic Trajectories from Real Environments
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.01179)
+- **ToolTweak**: An Attack on Tool Selection in LLM-Based Agents
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.02554)
+- **ToolBrain**: A Flexible RL Framework for Agentic Tools
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.00023)
+- **TUMIX**: Multi-Agent Test-Time Scaling with Tool-Use Mixture
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.01279)
+

 ---

+### Agent Frameworks
+
+!!! info "Popular Agent Development Frameworks"
+Comprehensive frameworks for building and deploying AI agents across different domains.
+
+- **MiroFlow**: Build, manage, and scale your AI agents with ease
+- [:material-github: GitHub](https://github.com/MiroMindAI/MiroFlow)
+- **Youtu-Agent**: A simple yet powerful agent framework that delivers with open-source models
+- [:material-github: GitHub](https://github.com/TencentCloudADP/youtu-agent)
+- **OpenManus**: No fortress, purely open ground. OpenManus is Coming
+- [:material-github: GitHub](https://github.com/FoundationAgents/OpenManus)
+- **OpenBB Platform**: Financial data platform for analysts, quants and AI agents
+- [:material-link: Project](https://github.com/OpenBB-finance/OpenBB)
+- **TradingAgents**: Multi-Agents LLM Financial Trading Framework
+- [:material-file-document: Paper](https://arxiv.org/abs/2412.20138) · [:material-github: GitHub](https://github.com/TauricResearch/TradingAgents)
+- **JoyAgent-JDGenie**: Technical Report on the GAIA
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.00510) · [:material-github: GitHub](https://github.com/jd-opensource/joyagent-jdgenie)
+


-## Evaluation
+---
+
+### Evaluation

 !!! success "Benchmarks & Evaluation Frameworks"
 Comprehensive evaluation tools and benchmarks for measuring agent performance across various tasks.

 - **LiveMCP-101**: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
 - [:material-file-document: Paper](https://arxiv.org/abs/2508.15760)
-
 - **BrowseComp-Plus**: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent
 - [:material-file-document: Paper](https://arxiv.org/abs/2508.06600)
-
 - **HotpotQA**: A Dataset for Diverse, Explainable Multi-hop Question Answering
 - [:material-file-document: Paper](https://arxiv.org/abs/1809.09600)
-
 - **GAIA**: a benchmark for General AI Assistants
 - [:material-file-document: Paper](https://arxiv.org/abs/2311.12983) · [:material-trophy: Leaderboard](https://huggingface.co/spaces/gaia-benchmark/leaderboard)
-
 - **xbench**: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations
 - [:material-file-document: Paper](https://arxiv.org/abs/2506.13651)
-
 - **MCP-Universe**: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
 - [:material-file-document: Paper](https://arxiv.org/abs/2508.14704)
-
 - **FutureX**: An Advanced Live Benchmark for LLM Agents in Future Prediction
 - [:material-file-document: Paper](https://arxiv.org/abs/2508.11987)
-
 - **Terminal-Bench**: the benchmark for testing AI agents in real terminal environments
 - [:material-github: GitHub](https://github.com/laude-institute/terminal-bench)
-
 - **Gaia2 and ARE**: Empowering the Community to Evaluate Agents
 - [:material-file-document: Blog Post](https://huggingface.co/blog/gaia2)
+- **GPQA**: A Graduate-Level Google-Proof Q&A Benchmark
+- [:material-file-document: Paper](https://arxiv.org/abs/2311.12022) · [:material-github: GitHub](https://github.com/idavidrein/gpqa/)
+- **WebWalkerQA**: WebWalker: Benchmarking LLMs in Web Traversal
+- [:material-file-document: Paper](https://arxiv.org/abs/2501.07572) · [:material-github: GitHub](https://github.com/Alibaba-NLP/DeepResearch) · [:material-trophy: Leaderboard](https://huggingface.co/spaces/callanwu/WebWalkerQALeaderboard)
+- **HLE**: Humanity's Last Exam
+- [:material-file-document: Paper](https://arxiv.org/abs/2501.14249) · [:material-link: Website](https://lastexam.ai/)
+- **BFCL**: Berkeley Function Calling Leaderboard
+- [:material-github: GitHub](https://github.com/ShishirPatil/gorilla) · [:material-trophy: Leaderboard](https://gorilla.cs.berkeley.edu/leaderboard.html)
+- **When2Call**: When (not) to Call Tools
+- [:material-file-document: Paper](https://arxiv.org/abs/2504.18851) · [:material-github: GitHub](https://github.com/NVIDIA/When2Call)
+- **ToolSandbox**: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities
+- [:material-file-document: Paper](https://arxiv.org/abs/2408.04682) · [:material-github: GitHub](https://github.com/apple/ToolSandbox)
+- **ToolBeHonest**: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models
+- [:material-file-document: Paper](https://arxiv.org/abs/2406.20015) · [:material-github: GitHub](https://github.com/ToolBeHonest/ToolBeHonest)
+- **SuperGPQA**: Scaling LLM Evaluation across 285 Graduate Disciplines
+- [:material-file-document: Paper](https://arxiv.org/abs/2502.14739) · [:material-link: Website](https://supergpqa.github.io/)
+- **Terminal-Bench**: A benchmark for testing AI agents in terminal environments
+- [:material-trophy: Leaderboard](https://www.tbench.ai/leaderboard) · [:material-link: Website](https://www.tbench.ai/)
+- **τ-bench**: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
+- [:material-file-document: Paper](https://arxiv.org/abs/2406.12045) · [:material-github: GitHub](https://github.com/sierra-research/tau-bench)
+- **τ2-Bench**: Evaluating Conversational Agents in a Dual-Control Environment
+- [:material-file-document: Paper](https://arxiv.org/abs/2506.07982) · [:material-github: GitHub](https://github.com/sierra-research/tau2-bench)
+- **Deep Research Bench**: Evaluating AI Web Research Agents
+- [:material-file-document: Paper](https://arxiv.org/abs/2506.06287) · [:material-link: Website](https://drb.futuresearch.ai/)
+- **Beyond the Final Answer**: Evaluating the Reasoning Trajectories of Tool-Augmented Agents
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.02837)
+- **TRAJECT-Bench**: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.04550)
+- **ARC-AGI**: The General Intelligence Benchmark
+- [:material-link: Website](https://arcprize.org/arc-agi)
+
+
+---
+
+### Agent Memory
+
+!!! tip "Memory Systems for Persistent Agent Intelligence"
+Advanced memory solutions for building agents with long-term context and learning capabilities.
+
+- **Mem0**: Building Production- Ready AI Agents with Scalable Long-Term Memory
+- [:material-github: GitHub](https://github.com/mem0ai/mem0)
+- **memobase**: Profile-Based Long-Term Memory for AI Applications
+- [:material-github: GitHub](https://github.com/memodb-io/memobase)
+- **Memento**: Fine-tuning LLM Agents without Fine-tuning LLMs
+- [:material-file-document: Paper](https://arxiv.org/abs/2508.16153) · [:material-github: GitHub](https://github.com/Agent-on-the-Fly/Memento)
+- **MEMTRACK**: Evaluating Long-Term Memory and State Tracking in Multi-Platform Dynamic Agent Environments
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.01353)
+- **A-MEM**: Agentic Memory for LLM Agents
+- [:material-file-document: Paper](https://arxiv.org/abs/2502.12110) · [:material-github: GitHub](https://github.com/WujiangXu/A-mem)
+- **MemoryOS**: Memory OS of AI Agent
+- [:material-file-document: Paper](https://arxiv.org/abs/2506.06326) · [:material-github: GitHub](https://github.com/BAI-LAB/MemoryOS)
+- **Memory-R1**: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
+- [:material-file-document: Paper](https://arxiv.org/abs/2508.19828)
+- **HippoRAG**: Neurobiologically Inspired Long-Term Memory for Large Language Models
+- [:material-file-document: Paper](https://arxiv.org/abs/2405.14831) · [:material-github: GitHub](https://github.com/OSU-NLP-Group/HippoRAG)
+- **MaxKB**: Open-source platform for building enterprise-grade agents
+- [:material-github: GitHub](https://github.com/1Panel-dev/MaxKB)
+- **MemAgent**: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
+- [:material-file-document: Paper](https://arxiv.org/abs/2507.02259) · [:material-link: Website](https://memagent-sialab.github.io/)
+- **LEGOMem**: Modular Procedural Memory for Multi-agent LLM Systems for Workflow Automation
+- [:material-file-document: Paper](https://arxiv.org/abs/2510.04851)
+- **Memp**: Exploring Agent Procedural Memory
+- [:material-file-document: Paper](https://www.arxiv.org/abs/2508.06433)
+- **MIRIX**: Multi-Agent Memory System for LLM-Based Agents
+- [:material-file-document: Paper](https://arxiv.org/abs/2507.07957) · [:material-link: Website](https://mirix.io/)
+- **A-MemGuard**: A Proactive Defense Framework for LLM-Based Agent Memory
+- [:material-file-document: Paper](https://www.arxiv.org/abs/2510.02373)
+
+
+---
+
+## Blogs
+
+!!! info "Blog Posts & Tutorials"
+Curated collection of blog posts, tutorials, and articles about AI agents from various sources and languages.
+
+#### General Blogs
+
+- **ChatGPT Agent**: Introducing ChatGPT Agent
+- [:material-link: Blog Post](https://openai.com/index/introducing-chatgpt-agent/) · OpenAI's latest agent capabilities and features
+
+- **Tongyi DeepResearch**: Deep Research Agent for Complex Tasks
+- [:material-link: Blog Post](https://tongyi-agent.github.io/blog/introducing-tongyi-deep-research/) · Alibaba's advanced research agent system
+
+#### Chinese Blogs
+
+!!! quote "中文博客与资源"
+精选的中文AI智能体相关博客文章、教程和资源,帮助中文用户更好地理解和应用智能体技术。
+
+- **17个主流 Agent 框架快速对比**
+- [:material-link: 博客链接](https://zhuanlan.zhihu.com/p/1957319210951746186) · 知乎专栏文章,对比分析主流智能体框架
+
+- **通义 DeepResearch**
+- [:material-link: Blog Post](https://tongyi-agent.github.io/zh/blog/introducing-tongyi-deep-research/) · 阿里巴巴通义智能体深度研究系统介绍

 ---

docs/mkdocs/docs/openai-gpt.md

Lines changed: 6 additions & 6 deletions
@@ -6,14 +6,14 @@ OpenAI's latest models including GPT-5, GPT-4o and advanced reasoning models wit

 `GPT5OpenAIClient`

-## Environment Setup
+### Environment Setup

 ```bash title="Environment Variables"
 export OPENAI_API_KEY="your-openai-key"
 export OPENAI_BASE_URL="https://api.openai.com/v1" # optional
 ```

-## Configuration
+### Configuration

 ```yaml title="Agent Configuration"
 main_agent:
@@ -31,7 +31,7 @@ main_agent:
 openai_base_url: "${oc.env:OPENAI_BASE_URL,https://api.openai.com/v1}"
 ```
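The `${oc.env:OPENAI_BASE_URL,https://api.openai.com/v1}` value in the config above reads like an OmegaConf-style environment interpolation with a fallback; that is an assumption, since this diff does not show how the config is loaded. A minimal plain-Python sketch of the equivalent lookup, using a hypothetical helper `resolve_openai_base_url` that is not part of the repository:

```python
import os

# Fallback copied from the config line above; the constant name is hypothetical.
DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"


def resolve_openai_base_url(environ=None):
    """Return OPENAI_BASE_URL when it is set, otherwise the default URL.

    `environ` is an optional mapping, so the lookup can be exercised
    without touching the real process environment.
    """
    env = os.environ if environ is None else environ
    return env.get("OPENAI_BASE_URL", DEFAULT_OPENAI_BASE_URL)


if __name__ == "__main__":
    # Prints whichever endpoint the current shell environment selects.
    print(resolve_openai_base_url())
```

Under this reading, the `export OPENAI_BASE_URL=...` line is optional because the default endpoint applies whenever the variable is unset.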

-## Usage
+### Usage

 ```bash title="Example Command"
 # Create custom OpenAI config
@@ -43,14 +43,14 @@ uv run main.py trace --config_file_name=your_config_file \

 `GPTOpenAIClient`

-## Environment Setup
+### Environment Setup

 ```bash title="Environment Variables"
 export OPENAI_API_KEY="your-openai-key"
 export OPENAI_BASE_URL="https://api.openai.com/v1" # optional
 ```

-## Configuration
+### Configuration

 ```yaml title="Agent Configuration"
 main_agent:
@@ -61,7 +61,7 @@ main_agent:
 openai_base_url: "${oc.env:OPENAI_BASE_URL,https://api.openai.com/v1}"
 ```

-## Usage
+### Usage

 ```bash title="Example Command"
 # Create custom OpenAI config

src/llm/providers/claude_openrouter_client.py

Lines changed: 1 addition & 1 deletion
@@ -411,4 +411,4 @@ def _apply_cache_control(self, messages):
             else:
                 # Other messages add directly
                 cached_messages.append(turn)
-        return list(reversed(cached_messages))
+        return list(reversed(cached_messages))

src/llm/providers/gpt5_openai_client.py

Lines changed: 1 addition & 1 deletion
@@ -132,7 +132,7 @@ async def _create_message(
             extra_body["min_p"] = self.min_p
         if self.repetition_penalty != 1.0:
             extra_body["repetition_penalty"] = self.repetition_penalty
-
+
         assert self.model_name in ["gpt-5-2025-08-07", "gpt-5"]
         params = {
             "model": self.model_name,
