@@ -7,104 +7,230 @@ Welcome to our comprehensive resource collection for AI agents. This page curate
77## Table of Contents
88
99!!! abstract "Resource Categories"
10- 1. [ Agent Frameworks] ( #agent-frameworks )
11- 2. [ Agent Memory] ( #agent-memory )
12- 3. [ Papers] ( #papers )
13- 4. [ Evaluation] ( #evaluation )
10+ 1. [ Agent Papers] ( #agent-papers )
11+ 2. [ Agent Frameworks] ( #agent-frameworks )
12+ 3. [ Evaluation] ( #evaluation )
13+ 4. [ Agent Memory] ( #agent-memory )
14+ 5. [ Blogs] ( #blogs )
1415
1516---
1617
17- ## Agent Frameworks
18-
19- !!! info "Popular Agent Development Frameworks"
20- Comprehensive frameworks for building and deploying AI agents across different domains.
21-
22- - ** MiroFlow** : Build, manage, and scale your AI agents with ease
23- - [ :material-github: GitHub] ( https://github.com/MiroMindAI/MiroFlow )
24-
25- - ** Youtu-Agent** : A simple yet powerful agent framework that delivers with open-source models
26- - [ :material-github: GitHub] ( https://github.com/TencentCloudADP/youtu-agent )
27-
28- - ** OpenManus** : No fortress, purely open ground. OpenManus is Coming
29- - [ :material-github: GitHub] ( https://github.com/FoundationAgents/OpenManus )
30-
31- - ** OpenBB Platform** : Financial data platform for analysts, quants and AI agents
32- - [ :material-link: Project] ( https://github.com/OpenBB-finance/OpenBB )
33-
34- ---
35-
36- ## Agent Memory
37-
38- !!! tip "Memory Systems for Persistent Agent Intelligence"
39- Advanced memory solutions for building agents with long-term context and learning capabilities.
40-
41- - ** Mem0** : Building Production-Ready AI Agents with Scalable Long-Term Memory
42- - [ :material-github: GitHub] ( https://github.com/mem0ai/mem0 )
43-
44- - ** memobase** : Profile-Based Long-Term Memory for AI Applications
45- - [ :material-github: GitHub] ( https://github.com/memodb-io/memobase )
46-
47- - ** Memento** : Fine-tuning LLM Agents without Fine-tuning LLMs
48- - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.16153 ) · [ :material-github: GitHub] ( https://github.com/Agent-on-the-Fly/Memento )
49-
50- ---
51-
52-
53- ## Papers
18+ ### Agent Papers
5419
5520!!! note "Research Papers & Publications"
5621 Latest research in agent systems, methodologies, and theoretical foundations.
5722
23+ - ** WebThinker** : Empowering Large Reasoning Models with Deep Research Capability
24+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2504.21776 ) · [ :material-github: GitHub] ( https://github.com/sunnynexus/WebThinker )
5825- ** Profile-Aware Maneuvering** : A Dynamic Multi-Agent System for Robust GAIA Problem Solving by AWorld
5926 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.09889 )
60-
6127- ** AFlow** : Automating Agentic Workflow Generation
6228 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2410.10762 )
63-
6429- ** AgentFly** : Fine-tuning LLM Agents without Fine-tuning LLMs
6530 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.16153v2 )
66-
6731- ** Throttling Web Agents Using Reasoning Gates**
6832 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2509.01619 )
69-
7033- ** The Landscape of Agentic Reinforcement Learning for LLMs** : A Survey
7134 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2509.02547 )
35+ - ** BrowseMaster** : Towards Scalable Web Browsing via Tool-Augmented Programmatic Agent Pair
36+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.09129 ) · [ :material-github: GitHub] ( https://github.com/sjtu-sai-agents/Browse-Master )
37+ - ** Long Term Memory** : The Foundation of AI Self-Evolution
38+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2410.15665 )
39+ - ** DeepResearcher** : Scaling Deep Research via Reinforcement Learning in Real-world Environments
40+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2504.03160 ) · [ :material-github: GitHub] ( https://github.com/GAIR-NLP/DeepResearcher )
41+ - ** Web-Shepherd** : Advancing PRMs for Reinforcing Web Agents
42+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2505.15277 ) · [ :material-github: GitHub] ( https://github.com/kyle8581/Web-Shepherd )
43+ - ** SimpleDeepSearcher** : Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis
44+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2505.16834 ) · [ :material-github: GitHub] ( https://github.com/RUCAIBox/SimpleDeepSearcher )
45+ - ** Alita** : Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution
46+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2505.20286 ) · [ :material-github: GitHub] ( https://github.com/CharlesQ9/Alita )
47+ - ** MCP-Zero** : Active Tool Discovery for Autonomous LLM Agents
48+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2506.01056 ) · [ :material-github: GitHub] ( https://github.com/xfey/MCP-Zero )
49+ - ** AgentOrchestra** : Orchestrating Hierarchical Multi-Agent Intelligence with the Tool-Environment-Agent(TEA) Protocol
50+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2506.12508 ) · [ :material-github: GitHub] ( https://github.com/SkyworkAI/DeepResearchAgent )
51+ - ** Deep Research Agents** : A Systematic Examination And Roadmap
52+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2506.18096 ) · [ :material-github: GitHub] ( https://github.com/ai-agents-2030/awesome-deep-research-agent )
53+ - ** SciMaster** : Towards General-Purpose Scientific AI Agents, Part I. X-Master as Foundation: Can We Lead on Humanity's Last Exam?
54+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2507.05241 ) · [ :material-github: GitHub] ( https://github.com/sjtu-sai-agents/X-Master )
55+ - ** Deep Researcher with Test-Time Diffusion** : Enhancing research capabilities through diffusion-based test-time adaptation
56+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2507.16075 )
57+ - ** Multi-Agent Tool-Integrated Policy Optimization** : Enhancing multi-agent systems through integrated tool usage and policy optimization
58+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04678 )
59+ - ** WALT** : Web Agents that Learn Tools
60+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.01524 )
61+ - ** Learning to Route** : A Rule-Driven Agent Framework for Hybrid-Source Retrieval-Augmented Generation
62+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.02388 )
63+ - ** SurveyBench** : Can LLM(-Agents) Write Academic Surveys that Align with Reader Needs?
64+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.03120 )
65+ - ** FocusAgent** : Simple Yet Effective Ways of Trimming the Large Context of Web Agents
66+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.03204 )
67+ - ** LLM-Based Data Science Agents** : A Survey of Capabilities, Challenges, and Future Directions
68+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04023 )
69+ - ** Agentic Context Engineering** : Evolving Contexts for Self-Improving Language Models
70+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04618 )
71+ - ** MARS** : Optimizing Dual-System Deep Research via Multi-Agent Reinforcement Learning
72+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04935 )
73+ - ** QuantAgents** : Towards Multi-agent Financial System via Simulated Trading
74+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04643 ) · [ :material-link: Project] ( https://quantagents.github.io/ )
75+ - ** Small Language Models for Agentic Systems** : A Survey of Architectures, Capabilities, and Deployment Trade-offs
76+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.03847 )
77+ - ** Open Agent Specification (Agent Spec)** : Technical Report
78+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04173v1 )
79+ - ** AudioToolAgent** : An Agentic Framework for Audio-Language Models
80+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.02995 )
81+ - ** ThinkBrake** : Mitigating Overthinking in Tool Reasoning
82+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.00546 )
83+ - ** TOUCAN** : Synthesizing 1.5M Tool-Agentic Trajectories from Real Environments
84+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.01179 )
85+ - ** ToolTweak** : An Attack on Tool Selection in LLM-Based Agents
86+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.02554 )
87+ - ** ToolBrain** : A Flexible RL Framework for Agentic Tools
88+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.00023 )
89+ - ** TUMIX** : Multi-Agent Test-Time Scaling with Tool-Use Mixture
90+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.01279 )
91+
7292
7393---
7494
95+ ### Agent Frameworks
96+
97+ !!! info "Popular Agent Development Frameworks"
98+ Comprehensive frameworks for building and deploying AI agents across different domains.
99+
100+ - ** MiroFlow** : Build, manage, and scale your AI agents with ease
101+ - [ :material-github: GitHub] ( https://github.com/MiroMindAI/MiroFlow )
102+ - ** Youtu-Agent** : A simple yet powerful agent framework that delivers with open-source models
103+ - [ :material-github: GitHub] ( https://github.com/TencentCloudADP/youtu-agent )
104+ - ** OpenManus** : No fortress, purely open ground. OpenManus is Coming
105+ - [ :material-github: GitHub] ( https://github.com/FoundationAgents/OpenManus )
106+ - ** OpenBB Platform** : Financial data platform for analysts, quants and AI agents
107+ - [ :material-link: Project] ( https://github.com/OpenBB-finance/OpenBB )
108+ - ** TradingAgents** : Multi-Agents LLM Financial Trading Framework
109+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2412.20138 ) · [ :material-github: GitHub] ( https://github.com/TauricResearch/TradingAgents )
110+ - ** JoyAgent-JDGenie** : Technical Report on the GAIA
111+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.00510 ) · [ :material-github: GitHub] ( https://github.com/jd-opensource/joyagent-jdgenie )
112+
75113
76114
77- ## Evaluation
115+ ---
116+
117+ ### Evaluation
78118
79119!!! success "Benchmarks & Evaluation Frameworks"
80120 Comprehensive evaluation tools and benchmarks for measuring agent performance across various tasks.
81121
82122- ** LiveMCP-101** : Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
83123 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.15760 )
84-
85124- ** BrowseComp-Plus** : A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent
86125 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.06600 )
87-
88126- ** HotpotQA** : A Dataset for Diverse, Explainable Multi-hop Question Answering
89127 - [ :material-file-document: Paper] ( https://arxiv.org/abs/1809.09600 )
90-
91128- ** GAIA** : a benchmark for General AI Assistants
92129 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2311.12983 ) · [ :material-trophy: Leaderboard] ( https://huggingface.co/spaces/gaia-benchmark/leaderboard )
93-
94130- ** xbench** : Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations
95131 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2506.13651 )
96-
97132- ** MCP-Universe** : Benchmarking Large Language Models with Real-World Model Context Protocol Servers
98133 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.14704 )
99-
100134- ** FutureX** : An Advanced Live Benchmark for LLM Agents in Future Prediction
101135 - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.11987 )
102-
103136- ** Terminal-Bench** : the benchmark for testing AI agents in real terminal environments
104137 - [ :material-github: GitHub] ( https://github.com/laude-institute/terminal-bench )
105-
106138- ** Gaia2 and ARE** : Empowering the Community to Evaluate Agents
107139 - [ :material-file-document: Blog Post] ( https://huggingface.co/blog/gaia2 )
140+ - ** GPQA** : A Graduate-Level Google-Proof Q&A Benchmark
141+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2311.12022 ) · [ :material-github: GitHub] ( https://github.com/idavidrein/gpqa/ )
142+ - ** WebWalkerQA** : WebWalker: Benchmarking LLMs in Web Traversal
143+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2501.07572 ) · [ :material-github: GitHub] ( https://github.com/Alibaba-NLP/DeepResearch ) · [ :material-trophy: Leaderboard] ( https://huggingface.co/spaces/callanwu/WebWalkerQALeaderboard )
144+ - ** HLE** : Humanity's Last Exam
145+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2501.14249 ) · [ :material-link: Website] ( https://lastexam.ai/ )
146+ - ** BFCL** : Berkeley Function Calling Leaderboard
147+ - [ :material-github: GitHub] ( https://github.com/ShishirPatil/gorilla ) · [ :material-trophy: Leaderboard] ( https://gorilla.cs.berkeley.edu/leaderboard.html )
148+ - ** When2Call** : When (not) to Call Tools
149+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2504.18851 ) · [ :material-github: GitHub] ( https://github.com/NVIDIA/When2Call )
150+ - ** ToolSandbox** : A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities
151+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2408.04682 ) · [ :material-github: GitHub] ( https://github.com/apple/ToolSandbox )
152+ - ** ToolBeHonest** : A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models
153+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2406.20015 ) · [ :material-github: GitHub] ( https://github.com/ToolBeHonest/ToolBeHonest )
154+ - ** SuperGPQA** : Scaling LLM Evaluation across 285 Graduate Disciplines
155+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2502.14739 ) · [ :material-link: Website] ( https://supergpqa.github.io/ )
156+ - ** Terminal-Bench** : A benchmark for testing AI agents in terminal environments
157+ - [ :material-trophy: Leaderboard] ( https://www.tbench.ai/leaderboard ) · [ :material-link: Website] ( https://www.tbench.ai/ )
158+ - ** τ-bench** : A Benchmark for Tool-Agent-User Interaction in Real-World Domains
159+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2406.12045 ) · [ :material-github: GitHub] ( https://github.com/sierra-research/tau-bench )
160+ - ** τ2-Bench** : Evaluating Conversational Agents in a Dual-Control Environment
161+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2506.07982 ) · [ :material-github: GitHub] ( https://github.com/sierra-research/tau2-bench )
162+ - ** Deep Research Bench** : Evaluating AI Web Research Agents
163+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2506.06287 ) · [ :material-link: Website] ( https://drb.futuresearch.ai/ )
164+ - ** Beyond the Final Answer** : Evaluating the Reasoning Trajectories of Tool-Augmented Agents
165+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.02837 )
166+ - ** TRAJECT-Bench** : A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use
167+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04550 )
168+ - ** ARC-AGI** : The General Intelligence Benchmark
169+ - [ :material-link: Website] ( https://arcprize.org/arc-agi )
170+
171+
172+ ---
173+
174+ ### Agent Memory
175+
176+ !!! tip "Memory Systems for Persistent Agent Intelligence"
177+ Advanced memory solutions for building agents with long-term context and learning capabilities.
178+
179+ - ** Mem0** : Building Production-Ready AI Agents with Scalable Long-Term Memory
180+ - [ :material-github: GitHub] ( https://github.com/mem0ai/mem0 )
181+ - ** memobase** : Profile-Based Long-Term Memory for AI Applications
182+ - [ :material-github: GitHub] ( https://github.com/memodb-io/memobase )
183+ - ** Memento** : Fine-tuning LLM Agents without Fine-tuning LLMs
184+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.16153 ) · [ :material-github: GitHub] ( https://github.com/Agent-on-the-Fly/Memento )
185+ - ** MEMTRACK** : Evaluating Long-Term Memory and State Tracking in Multi-Platform Dynamic Agent Environments
186+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.01353 )
187+ - ** A-MEM** : Agentic Memory for LLM Agents
188+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2502.12110 ) · [ :material-github: GitHub] ( https://github.com/WujiangXu/A-mem )
189+ - ** MemoryOS** : Memory OS of AI Agent
190+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2506.06326 ) · [ :material-github: GitHub] ( https://github.com/BAI-LAB/MemoryOS )
191+ - ** Memory-R1** : Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning
192+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2508.19828 )
193+ - ** HippoRAG** : Neurobiologically Inspired Long-Term Memory for Large Language Models
194+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2405.14831 ) · [ :material-github: GitHub] ( https://github.com/OSU-NLP-Group/HippoRAG )
195+ - ** MaxKB** : Open-source platform for building enterprise-grade agents
196+ - [ :material-github: GitHub] ( https://github.com/1Panel-dev/MaxKB )
197+ - ** MemAgent** : Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent
198+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2507.02259 ) · [ :material-link: Website] ( https://memagent-sialab.github.io/ )
199+ - ** LEGOMem** : Modular Procedural Memory for Multi-agent LLM Systems for Workflow Automation
200+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2510.04851 )
201+ - ** Memp** : Exploring Agent Procedural Memory
202+ - [ :material-file-document: Paper] ( https://www.arxiv.org/abs/2508.06433 )
203+ - ** MIRIX** : Multi-Agent Memory System for LLM-Based Agents
204+ - [ :material-file-document: Paper] ( https://arxiv.org/abs/2507.07957 ) · [ :material-link: Website] ( https://mirix.io/ )
205+ - ** A-MemGuard** : A Proactive Defense Framework for LLM-Based Agent Memory
206+ - [ :material-file-document: Paper] ( https://www.arxiv.org/abs/2510.02373 )
207+
208+
209+ ---
210+
211+ ### Blogs
212+
213+ !!! info "Blog Posts & Tutorials"
214+ Curated collection of blog posts, tutorials, and articles about AI agents from various sources and languages.
215+
216+ #### General Blogs
217+
218+ - ** ChatGPT Agent** : Introducing ChatGPT Agent
219+ - [ :material-link: Blog Post] ( https://openai.com/index/introducing-chatgpt-agent/ ) · OpenAI's latest agent capabilities and features
220+
221+ - ** Tongyi DeepResearch** : Deep Research Agent for Complex Tasks
222+ - [ :material-link: Blog Post] ( https://tongyi-agent.github.io/blog/introducing-tongyi-deep-research/ ) · Alibaba's advanced research agent system
223+
224+ #### Chinese Blogs
225+
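226+ !!! quote "Chinese Blogs & Resources"
227+ A curated selection of Chinese-language blog posts, tutorials, and resources on AI agents, to help Chinese-speaking users better understand and apply agent technology.
228+
229+ - ** Quick Comparison of 17 Mainstream Agent Frameworks**
230+ - [ :material-link: Blog Post] ( https://zhuanlan.zhihu.com/p/1957319210951746186 ) · Zhihu column article comparing mainstream agent frameworks
231+
232+ - ** Tongyi DeepResearch**
233+ - [ :material-link: Blog Post] ( https://tongyi-agent.github.io/zh/blog/introducing-tongyi-deep-research/ ) · Introduction to Alibaba's Tongyi DeepResearch agent system (Chinese-language version)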
108234
109235---
110236