
How to properly handle guardrails #45

@htulipe

Hello,

I'm trying to plug an agent with an input guardrail into my custom ChatKit implementation, and I stumbled upon a behavior that makes me wonder whether I'm doing things correctly.

Let's say I have a simple agent with an input guardrail:

import asyncio
from pydantic import BaseModel
from agents import (
    Agent,
    GuardrailFunctionOutput,
    InputGuardrailTripwireTriggered,
    RunContextWrapper,
    Runner,
    TResponseInputItem,
    input_guardrail,
)


class MathHomeworkOutput(BaseModel):
    is_math_homework: bool
    reasoning: str


guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking you to do their math homework.",
    output_type=MathHomeworkOutput,
)


@input_guardrail
async def math_guardrail(
    ctx: RunContextWrapper[None], agent: Agent, input: str | list[TResponseInputItem]
) -> GuardrailFunctionOutput:
    result = await Runner.run(guardrail_agent, input, context=ctx.context)

    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.is_math_homework,
    )


math_agent = Agent(
    name="Customer support agent",
    instructions="You are a customer support agent. You help customers with their questions.",
    input_guardrails=[math_guardrail],
)

(taken from the Agents SDK docs)

And a custom ChatKit server:

from .agent import math_agent

class MyChatKitServer(ChatKitServer):
    def __init__(
        self, data_store: Store, attachment_store: AttachmentStore | None = None
    ):
        super().__init__(data_store, attachment_store)

    assistant_agent = math_agent

    async def respond(
        self,
        thread: ThreadMetadata,
        input: UserMessageItem | None,
        context: Any,
    ) -> AsyncIterator[ThreadStreamEvent]:
        context = AgentContext(
            thread=thread,
            store=self.store,
            request_context=context,
        )
        result = Runner.run_streamed(
            self.assistant_agent,
            await simple_to_agent_input(input) if input else [],
            context=context,
        )
        async for event in stream_agent_response(
            context,
            result,
        ):
            yield event

    # ...

(taken from the ChatKit docs)

When I interact with this ChatKit server and ask it something that triggers the guardrail ("Hello, can you help me solve for x: 2x + 3 = 11?"), the server still streams a response that it should not have sent, and then raises an error:

(screen recording: output.mov)

I can see that the exception is raised here, so I guess this is intended behavior. I also understand that guardrails run in parallel (per the docs), which might be relevant.
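To make the suspected cause concrete, here is a minimal, standalone asyncio sketch of a check running in parallel with a streamed answer. It does not use the Agents SDK or ChatKit, and every name in it (`TripwireTriggered`, `slow_guardrail`, `stream_answer`, `run_streamed`, `collect`, `TOKENS`) is a hypothetical stand-in, so it only illustrates the pattern and may not match what the library does internally.

```python
import asyncio

TOKENS = ["Sure", ",", " let", " us", " solve", " it", ":", " x = 4"]


class TripwireTriggered(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""


async def slow_guardrail(user_input: str) -> None:
    # Pretend the check is slow (for example, because it calls a model).
    await asyncio.sleep(0.12)
    if "solve for x" in user_input.lower():
        raise TripwireTriggered(user_input)


async def stream_answer(user_input: str):
    # Pretend the main agent starts streaming tokens right away.
    for token in TOKENS:
        await asyncio.sleep(0.05)
        yield token


async def run_streamed(user_input: str):
    # Start the guardrail concurrently instead of waiting for its verdict.
    guardrail = asyncio.create_task(slow_guardrail(user_input))
    try:
        async for token in stream_answer(user_input):
            if guardrail.done():
                guardrail.result()  # re-raises TripwireTriggered if it tripped
            yield token
        await guardrail  # the stream ended first: wait for the verdict
    finally:
        guardrail.cancel()  # no-op if the task has already finished


async def collect(user_input: str):
    seen = []
    try:
        async for token in run_streamed(user_input):
            seen.append(token)
    except TripwireTriggered:
        return seen, True
    return seen, False


if __name__ == "__main__":
    seen, tripped = asyncio.run(collect("Can you help me solve for x: 2x + 3 = 11?"))
    print(f"tokens already sent: {seen!r}, tripwire raised: {tripped}")
```

In this sketch the first tokens reach the client before the check finishes, and the exception only surfaces afterwards, which looks like what I observe in the recording.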

Am I doing something wrong? My end goal is for my agent to respond with a message saying that it can't answer the user's question because it is out of scope.
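For illustration, the behavior I'm after could look roughly like the sketch below: wait for the check before streaming anything, and send a fixed refusal if it trips. This is plain asyncio with hypothetical placeholders (`TripwireTriggered`, `check_input`, `agent_stream`, `respond`, `collect`, `REFUSAL`); it is not the Agents SDK or ChatKit API, and I don't know whether the libraries offer an equivalent.

```python
import asyncio
from typing import AsyncIterator

REFUSAL = "Sorry, I can't help with that: it is outside my scope."


class TripwireTriggered(Exception):
    """Hypothetical stand-in for InputGuardrailTripwireTriggered."""


async def check_input(user_input: str) -> None:
    # Hypothetical guardrail: a keyword test in place of a model call.
    await asyncio.sleep(0)
    if "solve for x" in user_input.lower():
        raise TripwireTriggered(user_input)


async def agent_stream(user_input: str) -> AsyncIterator[str]:
    # Hypothetical agent output, streamed chunk by chunk.
    for chunk in ["Happy", " to", " help", "!"]:
        await asyncio.sleep(0)
        yield chunk


async def respond(user_input: str) -> AsyncIterator[str]:
    try:
        # Wait for the guardrail verdict before streaming anything.
        await check_input(user_input)
    except TripwireTriggered:
        # Replace the answer with a refusal instead of surfacing an error.
        yield REFUSAL
        return
    async for chunk in agent_stream(user_input):
        yield chunk


async def collect(user_input: str) -> list[str]:
    return [chunk async for chunk in respond(user_input)]


if __name__ == "__main__":
    print(asyncio.run(collect("Can you help me solve for x: 2x + 3 = 11?")))
    print(asyncio.run(collect("Where is my order?")))
```

The obvious trade-off is latency: nothing is streamed until the check has finished, so the benefit of running the guardrail in parallel is lost.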

Thanks for your help.
