[Feature]: Add Masking Support and MCP Call Support to Pillar Security Guardrail
The Feature
Add three key enhancements to the Pillar Security guardrail:
- Masking Support: Automatically sanitize sensitive content (PII, secrets, etc.) using `masked_session_messages` from the Pillar API response, allowing requests to proceed safely without exposing sensitive data to the LLM.
- MCP Call Support: Enable Pillar guardrails to work with MCP (Model Context Protocol) tool calls by adding support for `pre_mcp_call` and `during_mcp_call` event hooks.
- Conditional Exception Details: Control what details are included in exception responses based on `include_scanners` and `include_evidence` configuration, reducing payload size when detailed exception information isn't needed.
Proposed Changes
Masking Mode:
- Add `"mask"` as a new option for the `on_flagged_action` configuration
- When `on_flagged_action: "mask"`, replace request messages with `masked_session_messages` from the Pillar API response
- Request proceeds with sanitized content instead of being blocked
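The masking flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual LiteLLM implementation: `resolve_messages` and `FlaggedContentError` are hypothetical names, the `flagged` response field is assumed, and failing closed when no masked messages are returned is a design choice made here rather than something this issue specifies.

```python
from copy import deepcopy
from typing import Any, Dict, List

Message = Dict[str, Any]


class FlaggedContentError(Exception):
    """Hypothetical error raised when flagged content must not proceed."""


def resolve_messages(
    messages: List[Message],
    pillar_response: Dict[str, Any],
    on_flagged_action: str,
) -> List[Message]:
    """Return the messages to forward to the LLM (illustrative only)."""
    # "flagged" is an assumed field name on the Pillar response.
    if not pillar_response.get("flagged", False):
        return messages
    if on_flagged_action == "monitor":
        # Assumed here: monitor mode forwards the request unchanged.
        return messages
    if on_flagged_action == "mask":
        masked = pillar_response.get("masked_session_messages")
        if masked:
            # Copy so later mutation cannot touch the response payload.
            return deepcopy(masked)
        # Assumption: with nothing to substitute, fail closed.
        raise FlaggedContentError("flagged, but no masked messages returned")
    if on_flagged_action == "block":
        raise FlaggedContentError("request blocked by guardrail")
    raise ValueError(f"unknown on_flagged_action: {on_flagged_action!r}")


# Usage with a made-up response payload.
original = [{"role": "user", "content": "My email is alice@example.com"}]
response = {
    "flagged": True,
    "masked_session_messages": [
        {"role": "user", "content": "My email is [MASKED_EMAIL]"}
    ],
}
forwarded = resolve_messages(original, response, "mask")
```

In this sketch the original request list is never mutated, so the unmasked content remains available to the caller for logging or auditing if that is desired.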
MCP Support:
- Add `pre_mcp_call` and `during_mcp_call` to supported event hooks
- Enable Pillar scanning for MCP tool calls (already supported in call_type Literal)
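As a rough sketch of what extending the supported hooks could look like, the snippet below validates a configured `mode` against a set of hook names. The names `HOOKS_NAMED_IN_THIS_ISSUE`, `NEW_MCP_HOOKS`, and `validate_mode` are hypothetical, and the set contains only the hooks mentioned in this issue; the real guardrail may support others.

```python
from typing import FrozenSet

# Only hook names that appear in this issue; the real list may be longer.
HOOKS_NAMED_IN_THIS_ISSUE: FrozenSet[str] = frozenset({"pre_call"})
NEW_MCP_HOOKS: FrozenSet[str] = frozenset({"pre_mcp_call", "during_mcp_call"})
SUPPORTED_HOOKS: FrozenSet[str] = HOOKS_NAMED_IN_THIS_ISSUE | NEW_MCP_HOOKS


def validate_mode(mode: str) -> str:
    """Return `mode` if it is a supported hook, otherwise raise ValueError."""
    if mode not in SUPPORTED_HOOKS:
        supported = ", ".join(sorted(SUPPORTED_HOOKS))
        raise ValueError(f"unsupported mode {mode!r}; expected one of: {supported}")
    return mode


# Usage: the MCP hooks are accepted once they are part of the supported set.
mcp_mode = validate_mode("pre_mcp_call")
```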
Exception Control:
- Conditionally include `scanners` in exception only if `include_scanners: true`
- Conditionally include `evidence` in exception only if `include_evidence: true`
- Always include `session_id` for traceability
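The three rules above amount to a small conditional builder. The following is a sketch only: `build_exception_detail` is a hypothetical helper, and the shape of the `scanners` and `evidence` values in the sample response is invented for illustration.

```python
from typing import Any, Dict


def build_exception_detail(
    pillar_response: Dict[str, Any],
    include_scanners: bool = True,
    include_evidence: bool = True,
) -> Dict[str, Any]:
    """Assemble the exception payload for a blocked request (illustrative)."""
    # session_id is always present so the event can be traced.
    detail: Dict[str, Any] = {"session_id": pillar_response.get("session_id")}
    if include_scanners:
        detail["scanners"] = pillar_response.get("scanners", {})
    if include_evidence:
        detail["evidence"] = pillar_response.get("evidence", [])
    return detail


# Usage with a made-up response; field contents are placeholders.
sample_response = {
    "session_id": "abc-123",
    "scanners": {"pii": True},
    "evidence": [{"category": "pii", "type": "email"}],
}
minimal = build_exception_detail(
    sample_response, include_scanners=False, include_evidence=False
)
full = build_exception_detail(sample_response)
```

With both flags disabled, `minimal` carries only `session_id`; with the defaults, `full` carries all three keys, which matches the backward-compatible behavior described in this issue.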
Configuration Example
guardrails:
  - guardrail_name: "pillar-security"
    litellm_params:
      guardrail: "pillar"
      mode: "pre_call" # or "pre_mcp_call", "during_mcp_call"
      default_on: true
      on_flagged_action: "mask" # Options: "block", "monitor", "mask"
      include_scanners: true # Include scanners in exception when blocking
      include_evidence: true # Include evidence in exception when blocking
      api_key: "${PILLAR_API_KEY}"
Example Usage
Masking Mode:
# Request with sensitive data
messages = [{"role": "user", "content": "My email is [email protected]"}]
# With on_flagged_action: "mask", messages are replaced with masked versions
# masked_messages = [{"role": "user", "content": "My email is [MASKED_EMAIL]"}]
# Request proceeds to LLM with masked content
MCP Call Support:
guardrails:
  - guardrail_name: "pillar-mcp"
    litellm_params:
      guardrail: "pillar"
      mode: "pre_mcp_call" # Scans MCP tool call inputs
      on_flagged_action: "block"
Minimal Exception (when blocking):
include_scanners: false
include_evidence: false
# Exception only includes session_id
Motivation, pitch
The Problem:
- No Masking Option: Currently, Pillar guardrail can only `block` or `monitor` flagged content. There's no way to automatically sanitize sensitive data and allow requests to proceed safely. Users must either block legitimate requests containing PII or build custom guardrails to handle masking.
- No MCP Support: Pillar guardrails don't work with MCP (Model Context Protocol) tool calls, limiting security coverage for agent workflows. This creates a security gap where MCP tool calls bypass scanning entirely.
- Large Exception Payloads: When blocking, exception responses always include full scanner breakdowns and evidence, even when not needed. This increases payload size, latency, and costs for users who only need to know if content was flagged.
The Solution:
- Masking: Allows automatic PII/secrets sanitization, enabling requests to proceed safely without manual intervention or custom guardrails. This improves UX by allowing workflows to continue with sanitized content instead of blocking.
- MCP Support: Extends security coverage to MCP tool calls, ensuring comprehensive protection across all LLM endpoints including agent workflows.
- Exception Control: Reduces payload size and improves performance when detailed exception information isn't required. Users can choose minimal exceptions (`session_id` only) or full details based on their needs.
Benefits:
- Better UX: Users can continue workflows with sanitized content instead of blocking, reducing false positive impact
- Comprehensive Coverage: Security scanning works across all LLM endpoints including MCP calls, closing security gaps
- Performance: Smaller exception payloads when detailed info isn't needed, reducing latency and costs
- Flexibility: Users can choose the right balance of detail vs. performance based on their use case
This follows existing LiteLLM patterns (similar to how other guardrails handle masking) and maintains full backward compatibility. All changes are additive and don't affect existing functionality.
Implementation Details
Files Modified
- `litellm/proxy/guardrails/guardrail_hooks/pillar/pillar.py` - Core implementation
- `tests/test_litellm/proxy/guardrails/test_pillar_guardrails.py` - Test coverage
- `docs/my-website/docs/proxy/guardrails/pillar_security.md` - Documentation
Backward Compatibility
✅ All changes maintain backward compatibility:
- Default values remain unchanged (`include_scanners=True`, `include_evidence=True`)
- Existing `block` and `monitor` actions continue to work as before
- New `mask` action is additive and doesn't affect existing functionality
Testing
- Added tests for masking functionality
- Added tests for conditional exception details
- Added tests for MCP call support
- All existing tests pass
- No linting errors
Related Issues/PRs
- Related to previous PR #17206, "feat(pillar): add masking support for automatic content sanitization" (feat/pillar-masking), which was closed
- PR #17812, "[feat]: Add opt-in evidence results for Pillar Security guardrail during monitoring", was merged before this work and may include some overlapping changes
- Similar to issue #17809, "[Feature]: Return Pillar guardrail detection data in response headers (monitor mode)", which requested response headers for monitor mode