
Commit a8f8eaf

Merge branch 'main' into feat-anthropic-cache-all
2 parents 7f317f0 + 5f4e078

25 files changed: +1337 −154 lines

.github/workflows/ci.yml

Lines changed: 4 additions & 0 deletions
@@ -157,6 +157,9 @@ jobs:
       env:
         CI: true
         COVERAGE_PROCESS_START: ./pyproject.toml
+        # We only run the llama_cpp tests on the latest Python as they have been regularly failing in CI with `Fatal Python error: Illegal instruction`:
+        # https://github.com/pydantic/pydantic-ai/actions/runs/19547773220/job/55970947389
+        RUN_LLAMA_CPP_TESTS: ${{ matrix.python-version == '3.13' && matrix.install.name == 'all-extras' }}
       steps:
         - uses: actions/checkout@v4

@@ -207,6 +210,7 @@ jobs:
       env:
         CI: true
         COVERAGE_PROCESS_START: ./pyproject.toml
+        RUN_LLAMA_CPP_TESTS: false
       steps:
         - uses: actions/checkout@v4
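The workflow changes only set the `RUN_LLAMA_CPP_TESTS` environment variable; the test-side gating is not part of this diff. A minimal sketch of how a pytest suite might honor the variable, assuming the standard `skipif` pattern (the test name is invented):

```python
# Hypothetical sketch: gating llama_cpp tests on RUN_LLAMA_CPP_TESTS.
# The actual gating logic in pydantic-ai's test suite is not shown in this diff.
import os

import pytest

# GitHub Actions renders the `${{ ... }}` expression to the string 'true' or
# 'false', so the comparison is against the string rather than a boolean.
pytestmark = pytest.mark.skipif(
    os.environ.get('RUN_LLAMA_CPP_TESTS', 'true') != 'true',
    reason='llama_cpp tests are disabled for this CI matrix entry',
)


def test_llama_cpp_smoke():
    """Invented placeholder for a real llama_cpp test."""
```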

docs/mcp/client.md

Lines changed: 61 additions & 0 deletions
@@ -365,6 +365,67 @@ async def main():
 
 MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when [filtering tools][pydantic_ai.toolsets.FilteredToolset]. The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] object that's passed to filter functions.
 
+## Resources
+
+MCP servers can provide [resources](https://modelcontextprotocol.io/docs/concepts/resources) - files, data, or content that can be accessed by the client. Resources in MCP are application-driven, with host applications determining how to incorporate context manually, based on their needs. This means they will _not_ be exposed to the LLM automatically (unless a tool returns a `ResourceLink` or `EmbeddedResource`).
+
+Pydantic AI provides methods to discover and read resources from MCP servers:
+
+- [`list_resources()`][pydantic_ai.mcp.MCPServer.list_resources] - List all available resources on the server
+- [`list_resource_templates()`][pydantic_ai.mcp.MCPServer.list_resource_templates] - List resource templates with parameter placeholders
+- [`read_resource(uri)`][pydantic_ai.mcp.MCPServer.read_resource] - Read the contents of a specific resource by URI
+
+Resources are automatically converted: text content is returned as `str`, and binary content is returned as [`BinaryContent`][pydantic_ai.messages.BinaryContent].
+
+Before consuming resources, we need to run a server that exposes some:
+
+```python {title="mcp_resource_server.py"}
+from mcp.server.fastmcp import FastMCP
+
+mcp = FastMCP('Pydantic AI MCP Server')
+log_level = 'unset'
+
+
+@mcp.resource('resource://user_name.txt', mime_type='text/plain')
+async def user_name_resource() -> str:
+    return 'Alice'
+
+
+if __name__ == '__main__':
+    mcp.run()
+```
+
+Then we can create the client:
+
+```python {title="mcp_resources.py", requires="mcp_resource_server.py"}
+import asyncio
+
+from pydantic_ai.mcp import MCPServerStdio
+
+
+async def main():
+    server = MCPServerStdio('python', args=['-m', 'mcp_resource_server'])
+
+    async with server:
+        # List all available resources
+        resources = await server.list_resources()
+        for resource in resources:
+            print(f' - {resource.name}: {resource.uri} ({resource.mime_type})')
+            #> - user_name_resource: resource://user_name.txt (text/plain)
+
+        # Read a text resource
+        user_name = await server.read_resource('resource://user_name.txt')
+        print(f'Text content: {user_name}')
+        #> Text content: Alice
+
+
+if __name__ == '__main__':
+    asyncio.run(main())
+```
+
+_(This example is complete, it can be run "as is")_
+
+
 ## Custom TLS / SSL configuration
 
 In some environments you need to tweak how HTTPS connections are established –
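The new docs note that binary resources come back as `BinaryContent`. A hypothetical sketch of reading one (the `resource://logo.png` URI and the server exposing it are invented; `read_resource` and `BinaryContent` are from the docs above):

```python
# Hypothetical sketch: reading a binary resource; the URI is invented.
from pydantic_ai.mcp import MCPServerStdio
from pydantic_ai.messages import BinaryContent


async def read_logo(server: MCPServerStdio) -> bytes:
    content = await server.read_resource('resource://logo.png')
    if isinstance(content, BinaryContent):
        # Binary resources keep their media type alongside the raw bytes.
        print(content.media_type)
        return content.data
    raise TypeError(f'expected binary content, got {type(content)}')
```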

docs/ui/ag-ui.md

Lines changed: 2 additions & 0 deletions
@@ -178,6 +178,8 @@ validate state contained in [`RunAgentInput.state`](https://docs.ag-ui.com/sdk/j
 
 If the `state` field's type is a Pydantic `BaseModel` subclass, the raw state dictionary on the request is automatically validated. If not, you can validate the raw value yourself in your dependencies dataclass's `__post_init__` method.
 
+If AG-UI state is provided but your dependencies do not implement [`StateHandler`][pydantic_ai.ag_ui.StateHandler], Pydantic AI will emit a warning and ignore the state. Use [`StateDeps`][pydantic_ai.ag_ui.StateDeps] or a custom [`StateHandler`][pydantic_ai.ag_ui.StateHandler] implementation to receive and validate the incoming state.
+
 
 ```python {title="ag_ui_state.py"}
 from pydantic import BaseModel
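For context on the added paragraph: a hypothetical sketch of the two options it describes. `MyState`, `MyDeps`, and their fields are invented; `StateDeps` and the `__post_init__` approach come from the docs.

```python
# Hypothetical sketch of the two ways to receive AG-UI state described above.
from dataclasses import dataclass
from typing import Any

from pydantic import BaseModel

from pydantic_ai.ag_ui import StateDeps


class MyState(BaseModel):
    counter: int = 0


# Option 1: StateDeps validates the raw AG-UI state dict against MyState.
deps = StateDeps(MyState())


# Option 2: a plain dependencies dataclass that validates the raw value itself.
@dataclass
class MyDeps:
    state: dict[str, Any]

    def __post_init__(self):
        # Raise early if the incoming raw state doesn't match the schema.
        MyState.model_validate(self.state)
```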

pydantic_ai_slim/pydantic_ai/_agent_graph.py

Lines changed: 24 additions & 15 deletions
@@ -113,9 +113,9 @@ def increment_retries(
         try:
             tool_call.args_as_dict()
         except Exception:
-            max_tokens = (model_settings or {}).get('max_tokens') if model_settings else None
+            max_tokens = model_settings.get('max_tokens') if model_settings else None
             raise exceptions.IncompleteToolCall(
-                f'Model token limit ({max_tokens if max_tokens is not None else "provider default"}) exceeded while emitting a tool call, resulting in incomplete arguments. Increase max tokens or simplify tool call arguments to fit within limit.'
+                f'Model token limit ({max_tokens or "provider default"}) exceeded while generating a tool call, resulting in incomplete arguments. Increase the `max_tokens` model setting, or simplify the prompt to result in a shorter response that will fit within the limit.'
             )
         message = f'Exceeded maximum retries ({max_result_retries}) for output validation'
         if error:
@@ -216,17 +216,14 @@ async def run( # noqa: C901
         ctx.state.message_history = messages
         ctx.deps.new_message_index = len(messages)
 
-        # Validate that message history starts with a user message
-        if messages and isinstance(messages[0], _messages.ModelResponse):
-            raise exceptions.UserError(
-                'Message history cannot start with a `ModelResponse`. Conversations must begin with a user message.'
-            )
-
         if self.deferred_tool_results is not None:
             return await self._handle_deferred_tool_results(self.deferred_tool_results, messages, ctx)
 
         next_message: _messages.ModelRequest | None = None
 
+        run_context: RunContext[DepsT] | None = None
+        instructions: str | None = None
+
         if messages and (last_message := messages[-1]):
             if isinstance(last_message, _messages.ModelRequest) and self.user_prompt is None:
                 # Drop last message from history and reuse its parts
@@ -248,15 +245,19 @@ async def run( # noqa: C901
                     ctx.deps.prompt = combined_content
             elif isinstance(last_message, _messages.ModelResponse):
                 if self.user_prompt is None:
-                    # Skip ModelRequestNode and go directly to CallToolsNode
-                    return CallToolsNode[DepsT, NodeRunEndT](last_message)
+                    run_context = build_run_context(ctx)
+                    instructions = await ctx.deps.get_instructions(run_context)
+                    if not instructions:
+                        # If there's no new prompt or instructions, skip ModelRequestNode and go directly to CallToolsNode
+                        return CallToolsNode[DepsT, NodeRunEndT](last_message)
                 elif last_message.tool_calls:
                     raise exceptions.UserError(
                         'Cannot provide a new user prompt when the message history contains unprocessed tool calls.'
                     )
 
-        # Build the run context after `ctx.deps.prompt` has been updated
-        run_context = build_run_context(ctx)
+        if not run_context:
+            run_context = build_run_context(ctx)
+            instructions = await ctx.deps.get_instructions(run_context)
 
         if messages:
             await self._reevaluate_dynamic_prompts(messages, run_context)
@@ -273,7 +274,7 @@ async def run( # noqa: C901
 
             next_message = _messages.ModelRequest(parts=parts)
 
-            next_message.instructions = await ctx.deps.get_instructions(run_context)
+            next_message.instructions = instructions
 
         if not messages and not next_message.parts and not next_message.instructions:
             raise exceptions.UserError('No message history, user prompt, or instructions provided')
@@ -578,6 +579,14 @@ async def _run_stream( # noqa: C901
 
         async def _run_stream() -> AsyncIterator[_messages.HandleResponseEvent]:  # noqa: C901
             if not self.model_response.parts:
+                # Don't retry if the model returned an empty response because the token limit was exceeded, possibly during thinking.
+                if self.model_response.finish_reason == 'length':
+                    model_settings = ctx.deps.model_settings
+                    max_tokens = model_settings.get('max_tokens') if model_settings else None
+                    raise exceptions.UnexpectedModelBehavior(
+                        f'Model token limit ({max_tokens or "provider default"}) exceeded before any response was generated. Increase the `max_tokens` model setting, or simplify the prompt to result in a shorter response that will fit within the limit.'
+                    )
+
                 # we got an empty response.
                 # this sometimes happens with anthropic (and perhaps other models)
                 # when the model has already returned text along side tool calls
@@ -598,8 +607,8 @@ async def _run_stream() -> AsyncIterator[_messages.HandleResponseEvent]:  # noqa
                 try:
                     self._next_node = await self._handle_text_response(ctx, text, text_processor)
                     return
-                except ToolRetryError:
-                    # If the text from the preview response was invalid, ignore it.
+                except ToolRetryError:  # pragma: no cover
+                    # If the text from the previous response was invalid, ignore it.
                     pass
 
                 # Go back to the model request node with an empty request, which means we'll essentially
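Taken together, `IncompleteToolCall` and the new `UnexpectedModelBehavior` raise give callers a concrete signal when a token limit truncates a run. A hypothetical sketch of caller-side handling (the model name and the doubling retry policy are invented; the exception types are the ones raised above):

```python
# Hypothetical sketch: retrying once with a higher token ceiling when the
# limit-related errors raised in _agent_graph.py surface to the caller.
from pydantic_ai import Agent
from pydantic_ai.exceptions import IncompleteToolCall, UnexpectedModelBehavior

agent = Agent('anthropic:claude-sonnet-4-0')  # model name is illustrative


async def run_with_retry(prompt: str, max_tokens: int = 1024):
    try:
        return await agent.run(prompt, model_settings={'max_tokens': max_tokens})
    except (IncompleteToolCall, UnexpectedModelBehavior):
        # Both errors point at the token limit; retry once with a larger budget.
        return await agent.run(prompt, model_settings={'max_tokens': max_tokens * 2})
```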

pydantic_ai_slim/pydantic_ai/_mcp.py

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+from __future__ import annotations
+
 import base64
 from collections.abc import Sequence
 from typing import Literal
