[Feature]: Add Masking Support and MCP Call Support to Pillar Security Guardrail #17958

@eagle-p

The Feature

Add three key enhancements to the Pillar Security guardrail:

  1. Masking Support: Automatically sanitize sensitive content (PII, secrets, etc.) using masked_session_messages from the Pillar API response, allowing requests to proceed safely without exposing sensitive data to the LLM.

  2. MCP Call Support: Enable Pillar guardrails to work with MCP (Model Context Protocol) tool calls by adding support for pre_mcp_call and during_mcp_call event hooks.

  3. Conditional Exception Details: Control what details are included in exception responses based on include_scanners and include_evidence configuration, reducing payload size when detailed exception information isn't needed.

Proposed Changes

Masking Mode:

  • Add "mask" as a new option for on_flagged_action configuration
  • When on_flagged_action is "mask", replace the request messages with masked_session_messages from the Pillar API response
  • Request proceeds with sanitized content instead of being blocked
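A minimal sketch of how the three actions could be dispatched, assuming the Pillar response is a dict with a boolean `flagged` field (an assumed name) alongside `masked_session_messages`. The function name `apply_flagged_action` and the `FlaggedContentError` class are hypothetical and are not taken from the LiteLLM codebase. Falling back to the unmodified request when no masked messages are returned is also an assumption.

```python
from typing import Any, Dict


class FlaggedContentError(Exception):
    """Hypothetical stand-in for the exception raised when blocking."""


def apply_flagged_action(
    data: Dict[str, Any],
    pillar_response: Dict[str, Any],
    on_flagged_action: str,
) -> Dict[str, Any]:
    """Return the request data that should be forwarded to the LLM."""
    # "flagged" is an assumed field name; adjust to the real response schema.
    if not pillar_response.get("flagged", False):
        return data

    if on_flagged_action == "block":
        raise FlaggedContentError("Request blocked by guardrail")

    if on_flagged_action == "mask":
        masked = pillar_response.get("masked_session_messages")
        if masked:
            # Build a new dict so the caller's original request is not mutated.
            return {**data, "messages": masked}

    # "monitor", or "mask" with nothing to substitute: pass through unchanged.
    return data


request = {
    "model": "example-model",
    "messages": [{"role": "user", "content": "My email is jane@example.com"}],
}
response = {
    "flagged": True,
    "masked_session_messages": [
        {"role": "user", "content": "My email is [MASKED_EMAIL]"}
    ],
}
sanitized = apply_flagged_action(request, response, "mask")
print(sanitized["messages"][0]["content"])
```

Returning a copy rather than editing `data` in place keeps the original request available for logging, though an in-place update would work equally well if the hook is expected to mutate its input.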

MCP Support:

  • Add pre_mcp_call and during_mcp_call to supported event hooks
  • Enable Pillar scanning for MCP tool calls (the call_type Literal already includes MCP calls)
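The hook change could look roughly like the sketch below, which extends a set of accepted modes with the two MCP hooks and checks whether a guardrail should run for a given event. The set of existing hooks, the constant names, and the helpers `normalize_modes` and `should_scan` are all illustrative assumptions, not the actual LiteLLM implementation.

```python
from typing import List, Sequence, Union

# Assumed set of hooks accepted before this change.
EXISTING_EVENT_HOOKS = {"pre_call", "during_call", "post_call"}
# Hooks this proposal adds.
NEW_MCP_EVENT_HOOKS = {"pre_mcp_call", "during_mcp_call"}
SUPPORTED_EVENT_HOOKS = EXISTING_EVENT_HOOKS | NEW_MCP_EVENT_HOOKS


def normalize_modes(mode: Union[str, Sequence[str]]) -> List[str]:
    """Accept a single mode or a sequence of modes and validate each one."""
    modes = [mode] if isinstance(mode, str) else list(mode)
    unsupported = [m for m in modes if m not in SUPPORTED_EVENT_HOOKS]
    if unsupported:
        raise ValueError(f"Unsupported event hook(s): {unsupported}")
    return modes


def should_scan(mode: Union[str, Sequence[str]], event_hook: str) -> bool:
    """Return True if the guardrail is configured to run on this event."""
    return event_hook in normalize_modes(mode)


print(should_scan("pre_mcp_call", "pre_mcp_call"))
print(should_scan("pre_call", "pre_mcp_call"))
```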

Exception Control:

  • Include scanners in the exception only if include_scanners: true
  • Include evidence in the exception only if include_evidence: true
  • Always include session_id for traceability
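The rules above can be sketched as a small helper that builds the exception payload. The function name `build_exception_detail` and the response keys `scanners` and `evidence` are assumptions for illustration; the real Pillar response schema may differ.

```python
from typing import Any, Dict


def build_exception_detail(
    pillar_response: Dict[str, Any],
    include_scanners: bool = True,
    include_evidence: bool = True,
) -> Dict[str, Any]:
    """Assemble the detail payload attached to a blocking exception."""
    # session_id is always present so a blocked request can be traced.
    detail: Dict[str, Any] = {"session_id": pillar_response.get("session_id")}
    if include_scanners:
        detail["scanners"] = pillar_response.get("scanners", {})
    if include_evidence:
        detail["evidence"] = pillar_response.get("evidence", [])
    return detail


example_response = {
    "session_id": "abc-123",
    "scanners": {"pii": True},
    "evidence": [{"category": "pii", "type": "email"}],
}
minimal = build_exception_detail(
    example_response, include_scanners=False, include_evidence=False
)
print(minimal)
```

Both flags default to `True` here to match the defaults stated under Backward Compatibility, so callers that do not set them keep receiving the full payload.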

Configuration Example

guardrails:
  - guardrail_name: "pillar-security"
    litellm_params:
      guardrail: "pillar"
      mode: "pre_call"  # or "pre_mcp_call", "during_mcp_call"
      default_on: true
      on_flagged_action: "mask"  # Options: "block", "monitor", "mask"
      include_scanners: true      # Include scanners in exception when blocking
      include_evidence: true      # Include evidence in exception when blocking
      api_key: "${PILLAR_API_KEY}"

Example Usage

Masking Mode:

# Request with sensitive data
messages = [{"role": "user", "content": "My email is [email protected]"}]

# With on_flagged_action: "mask", messages are replaced with masked versions
# masked_messages = [{"role": "user", "content": "My email is [MASKED_EMAIL]"}]
# Request proceeds to LLM with masked content

MCP Call Support:

guardrails:
  - guardrail_name: "pillar-mcp"
    litellm_params:
      guardrail: "pillar"
      mode: "pre_mcp_call"  # Scans MCP tool call inputs
      on_flagged_action: "block"

Minimal Exception (when blocking):

include_scanners: false
include_evidence: false
# Exception only includes session_id

Motivation, pitch

The Problem:

  1. No Masking Option: Currently, the Pillar guardrail can only block or monitor flagged content. There is no way to automatically sanitize sensitive data and let the request proceed safely. Users must either block legitimate requests that contain PII or build custom guardrails to handle masking.

  2. No MCP Support: Pillar guardrails don't work with MCP (Model Context Protocol) tool calls, limiting security coverage for agent workflows. This creates a security gap where MCP tool calls bypass scanning entirely.

  3. Large Exception Payloads: When blocking, exception responses always include full scanner breakdowns and evidence, even when not needed. This increases payload size, latency, and costs for users who only need to know if content was flagged.

The Solution:

  • Masking: Allows automatic PII/secrets sanitization, enabling requests to proceed safely without manual intervention or custom guardrails. This improves UX by allowing workflows to continue with sanitized content instead of blocking.

  • MCP Support: Extends security coverage to MCP tool calls, ensuring comprehensive protection across all LLM endpoints including agent workflows.

  • Exception Control: Reduces payload size and improves performance when detailed exception information isn't required. Users can choose minimal exceptions (session_id only) or full details based on their needs.

Benefits:

  1. Better UX: Users can continue workflows with sanitized content instead of being blocked, reducing the impact of false positives
  2. Comprehensive Coverage: Security scanning works across all LLM endpoints including MCP calls, closing security gaps
  3. Performance: Smaller exception payloads when detailed info isn't needed, reducing latency and costs
  4. Flexibility: Users can choose the right balance of detail vs. performance based on their use case

This follows existing LiteLLM patterns (similar to how other guardrails handle masking) and maintains full backward compatibility. All changes are additive and don't affect existing functionality.

Implementation Details

Files Modified

  • litellm/proxy/guardrails/guardrail_hooks/pillar/pillar.py - Core implementation
  • tests/test_litellm/proxy/guardrails/test_pillar_guardrails.py - Test coverage
  • docs/my-website/docs/proxy/guardrails/pillar_security.md - Documentation

Backward Compatibility

✅ All changes maintain backward compatibility:

  • Default values remain unchanged (include_scanners=True, include_evidence=True)
  • Existing block and monitor actions continue to work as before
  • New mask action is additive and doesn't affect existing functionality

Testing

  • Added tests for masking functionality
  • Added tests for conditional exception details
  • Added tests for MCP call support
  • All existing tests pass
  • No linting errors
