
Commit a8f8eaf

Merge branch 'main' into feat-anthropic-cache-all
2 parents 7f317f0 + 5f4e078

25 files changed: +1337 −154 lines

.github/workflows/ci.yml

Lines changed: 4 additions & 0 deletions
@@ -157,6 +157,9 @@ jobs:
       env:
         CI: true
         COVERAGE_PROCESS_START: ./pyproject.toml
+        # We only run the llama_cpp tests on the latest Python as they have been regularly failing in CI with `Fatal Python error: Illegal instruction`:
+        # https://github.com/pydantic/pydantic-ai/actions/runs/19547773220/job/55970947389
+        RUN_LLAMA_CPP_TESTS: ${{ matrix.python-version == '3.13' && matrix.install.name == 'all-extras' }}
       steps:
         - uses: actions/checkout@v4

@@ -207,6 +210,7 @@ jobs:
       env:
         CI: true
         COVERAGE_PROCESS_START: ./pyproject.toml
+        RUN_LLAMA_CPP_TESTS: false
       steps:
         - uses: actions/checkout@v4
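The workflow changes only set the `RUN_LLAMA_CPP_TESTS` environment variable; the test-side gating is not part of this diff. A minimal sketch of how a pytest suite might honor the variable, assuming the standard `skipif` pattern (the test name is invented):

```python
# Hypothetical sketch: gating llama_cpp tests on RUN_LLAMA_CPP_TESTS.
# The actual gating logic in pydantic-ai's test suite is not shown in this diff.
import os

import pytest

# GitHub Actions renders the `${{ ... }}` expression to the string 'true' or
# 'false', so the comparison is against the string rather than a boolean.
pytestmark = pytest.mark.skipif(
    os.environ.get('RUN_LLAMA_CPP_TESTS', 'true') != 'true',
    reason='llama_cpp tests are disabled for this CI matrix entry',
)


def test_llama_cpp_smoke():
    """Invented placeholder for a real llama_cpp test."""
```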

docs/mcp/client.md

Lines changed: 61 additions & 0 deletions
@@ -365,6 +365,67 @@ async def main():
 
 MCP tools can include metadata that provides additional information about the tool's characteristics, which can be useful when [filtering tools][pydantic_ai.toolsets.FilteredToolset]. The `meta`, `annotations`, and `output_schema` fields can be found on the `metadata` dict on the [`ToolDefinition`][pydantic_ai.tools.ToolDefinition] object that's passed to filter functions.
 
+## Resources
+
+MCP servers can provide [resources](https://modelcontextprotocol.io/docs/concepts/resources) - files, data, or content that can be accessed by the client. Resources in MCP are application-driven, with host applications determining how to incorporate context manually, based on their needs. This means they will _not_ be exposed to the LLM automatically (unless a tool returns a `ResourceLink` or `EmbeddedResource`).
+
+Pydantic AI provides methods to discover and read resources from MCP servers:
+
+- [`list_resources()`][pydantic_ai.mcp.MCPServer.list_resources] - List all available resources on the server
+- [`list_resource_templates()`][pydantic_ai.mcp.MCPServer.list_resource_templates] - List resource templates with parameter placeholders
+- [`read_resource(uri)`][pydantic_ai.mcp.MCPServer.read_resource] - Read the contents of a specific resource by URI
+
+Resources are automatically converted: text content is returned as `str`, and binary content is returned as [`BinaryContent`][pydantic_ai.messages.BinaryContent].
+
+Before consuming resources, we need to run a server that exposes some:
+
+```python {title="mcp_resource_server.py"}
+from mcp.server.fastmcp import FastMCP
+
+mcp = FastMCP('Pydantic AI MCP Server')
+log_level = 'unset'
+
+
+@mcp.resource('resource://user_name.txt', mime_type='text/plain')
+async def user_name_resource() -> str:
+    return 'Alice'
+
+
+if __name__ == '__main__':
+    mcp.run()
+```
+
+Then we can create the client:
+
+```python {title="mcp_resources.py", requires="mcp_resource_server.py"}
+import asyncio
+
+from pydantic_ai.mcp import MCPServerStdio
+
+
+async def main():
+    server = MCPServerStdio('python', args=['-m', 'mcp_resource_server'])
+
+    async with server:
+        # List all available resources
+        resources = await server.list_resources()
+        for resource in resources:
+            print(f' - {resource.name}: {resource.uri} ({resource.mime_type})')
+            #> - user_name_resource: resource://user_name.txt (text/plain)
+
+        # Read a text resource
+        user_name = await server.read_resource('resource://user_name.txt')
+        print(f'Text content: {user_name}')
+        #> Text content: Alice
+
+
+if __name__ == '__main__':
+    asyncio.run(main())
+```
+
+_(This example is complete, it can be run "as is")_
+
+
 ## Custom TLS / SSL configuration
 
 In some environments you need to tweak how HTTPS connections are established –
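The new docs note that binary resources come back as `BinaryContent`. A hypothetical sketch of reading one (the `resource://logo.png` URI and the server exposing it are invented; `read_resource` and `BinaryContent` are from the docs above):

```python
# Hypothetical sketch: reading a binary resource; the URI is invented.
from pydantic_ai.mcp import MCPServerStdio
from pydantic_ai.messages import BinaryContent


async def read_logo(server: MCPServerStdio) -> bytes:
    content = await server.read_resource('resource://logo.png')
    if isinstance(content, BinaryContent):
        # Binary resources keep their media type alongside the raw bytes.
        print(content.media_type)
        return content.data
    raise TypeError(f'expected binary content, got {type(content)}')
```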

docs/ui/ag-ui.md

Lines changed: 2 additions & 0 deletions
@@ -178,6 +178,8 @@ validate state contained in [`RunAgentInput.state`](https://docs.ag-ui.com/sdk/j
 
 If the `state` field's type is a Pydantic `BaseModel` subclass, the raw state dictionary on the request is automatically validated. If not, you can validate the raw value yourself in your dependencies dataclass's `__post_init__` method.
 
+If AG-UI state is provided but your dependencies do not implement [`StateHandler`][pydantic_ai.ag_ui.StateHandler], Pydantic AI will emit a warning and ignore the state. Use [`StateDeps`][pydantic_ai.ag_ui.StateDeps] or a custom [`StateHandler`][pydantic_ai.ag_ui.StateHandler] implementation to receive and validate the incoming state.
+
 
 ```python {title="ag_ui_state.py"}
 from pydantic import BaseModel
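For context on the added paragraph: a hypothetical sketch of the two options it describes. `MyState`, `MyDeps`, and their fields are invented; `StateDeps` and the `__post_init__` approach come from the docs.

```python
# Hypothetical sketch of the two ways to receive AG-UI state described above.
from dataclasses import dataclass
from typing import Any

from pydantic import BaseModel

from pydantic_ai.ag_ui import StateDeps


class MyState(BaseModel):
    counter: int = 0


# Option 1: StateDeps validates the raw AG-UI state dict against MyState.
deps = StateDeps(MyState())


# Option 2: a plain dependencies dataclass that validates the raw value itself.
@dataclass
class MyDeps:
    state: dict[str, Any]

    def __post_init__(self):
        # Raise early if the incoming raw state doesn't match the schema.
        MyState.model_validate(self.state)
```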

pydantic_ai_slim/pydantic_ai/_agent_graph.py

Lines changed: 24 additions & 15 deletions
@@ -113,9 +113,9 @@ def increment_retries(
         try:
             tool_call.args_as_dict()
         except Exception:
-            max_tokens = (model_settings or {}).get('max_tokens') if model_settings else None
+            max_tokens = model_settings.get('max_tokens') if model_settings else None
             raise exceptions.IncompleteToolCall(
-                f'Model token limit ({max_tokens if max_tokens is not None else "provider default"}) exceeded while emitting a tool call, resulting in incomplete arguments. Increase max tokens or simplify tool call arguments to fit within limit.'
+                f'Model token limit ({max_tokens or "provider default"}) exceeded while generating a tool call, resulting in incomplete arguments. Increase the `max_tokens` model setting, or simplify the prompt to result in a shorter response that will fit within the limit.'
             )
         message = f'Exceeded maximum retries ({max_result_retries}) for output validation'
         if error:
@@ -216,17 +216,14 @@ async def run( # noqa: C901
         ctx.state.message_history = messages
         ctx.deps.new_message_index = len(messages)
 
-        # Validate that message history starts with a user message
-        if messages and isinstance(messages[0], _messages.ModelResponse):
-            raise exceptions.UserError(
-                'Message history cannot start with a `ModelResponse`. Conversations must begin with a user message.'
-            )
-
         if self.deferred_tool_results is not None:
             return await self._handle_deferred_tool_results(self.deferred_tool_results, messages, ctx)
 
         next_message: _messages.ModelRequest | None = None
 
+        run_context: RunContext[DepsT] | None = None
+        instructions: str | None = None
+
         if messages and (last_message := messages[-1]):
             if isinstance(last_message, _messages.ModelRequest) and self.user_prompt is None:
                 # Drop last message from history and reuse its parts
@@ -248,15 +245,19 @@ async def run( # noqa: C901
                     ctx.deps.prompt = combined_content
             elif isinstance(last_message, _messages.ModelResponse):
                 if self.user_prompt is None:
-                    # Skip ModelRequestNode and go directly to CallToolsNode
-                    return CallToolsNode[DepsT, NodeRunEndT](last_message)
+                    run_context = build_run_context(ctx)
+                    instructions = await ctx.deps.get_instructions(run_context)
+                    if not instructions:
+                        # If there's no new prompt or instructions, skip ModelRequestNode and go directly to CallToolsNode
+                        return CallToolsNode[DepsT, NodeRunEndT](last_message)
                 elif last_message.tool_calls:
                     raise exceptions.UserError(
                         'Cannot provide a new user prompt when the message history contains unprocessed tool calls.'
                     )
 
-        # Build the run context after `ctx.deps.prompt` has been updated
-        run_context = build_run_context(ctx)
+        if not run_context:
+            run_context = build_run_context(ctx)
+            instructions = await ctx.deps.get_instructions(run_context)
 
         if messages:
             await self._reevaluate_dynamic_prompts(messages, run_context)
@@ -273,7 +274,7 @@ async def run( # noqa: C901
 
             next_message = _messages.ModelRequest(parts=parts)
 
-            next_message.instructions = await ctx.deps.get_instructions(run_context)
+            next_message.instructions = instructions
 
         if not messages and not next_message.parts and not next_message.instructions:
             raise exceptions.UserError('No message history, user prompt, or instructions provided')
@@ -578,6 +579,14 @@ async def _run_stream( # noqa: C901
 
         async def _run_stream() -> AsyncIterator[_messages.HandleResponseEvent]:  # noqa: C901
             if not self.model_response.parts:
+                # Don't retry if the model returned an empty response because the token limit was exceeded, possibly during thinking.
+                if self.model_response.finish_reason == 'length':
+                    model_settings = ctx.deps.model_settings
+                    max_tokens = model_settings.get('max_tokens') if model_settings else None
+                    raise exceptions.UnexpectedModelBehavior(
+                        f'Model token limit ({max_tokens or "provider default"}) exceeded before any response was generated. Increase the `max_tokens` model setting, or simplify the prompt to result in a shorter response that will fit within the limit.'
+                    )
+
                 # we got an empty response.
                 # this sometimes happens with anthropic (and perhaps other models)
                 # when the model has already returned text along side tool calls
@@ -598,8 +607,8 @@ async def _run_stream() -> AsyncIterator[_messages.HandleResponseEvent]:  # noqa
                 try:
                     self._next_node = await self._handle_text_response(ctx, text, text_processor)
                     return
-                except ToolRetryError:
-                    # If the text from the preview response was invalid, ignore it.
+                except ToolRetryError:  # pragma: no cover
+                    # If the text from the previous response was invalid, ignore it.
                     pass
 
                 # Go back to the model request node with an empty request, which means we'll essentially
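Taken together, `IncompleteToolCall` and the new `UnexpectedModelBehavior` raise give callers a concrete signal when a token limit truncates a run. A hypothetical sketch of caller-side handling (the model name and the doubling retry policy are invented; the exception types are the ones raised above):

```python
# Hypothetical sketch: retrying once with a higher token ceiling when the
# limit-related errors raised in _agent_graph.py surface to the caller.
from pydantic_ai import Agent
from pydantic_ai.exceptions import IncompleteToolCall, UnexpectedModelBehavior

agent = Agent('anthropic:claude-sonnet-4-0')  # model name is illustrative


async def run_with_retry(prompt: str, max_tokens: int = 1024):
    try:
        return await agent.run(prompt, model_settings={'max_tokens': max_tokens})
    except (IncompleteToolCall, UnexpectedModelBehavior):
        # Both errors point at the token limit; retry once with a larger budget.
        return await agent.run(prompt, model_settings={'max_tokens': max_tokens * 2})
```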

pydantic_ai_slim/pydantic_ai/_mcp.py

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+from __future__ import annotations
+
 import base64
 from collections.abc import Sequence
 from typing import Literal
