@@ -24,19 +24,20 @@ This chapter covers how to structure guardrail configuration files, write custom
 it into your config file and replace the client and server names.
 ```yaml
 <client-name>: # your client's shorthand (e.g., cursor, claude, windsurf)
-  <server-name>: # your server's name according to the mcp config (e.g., whatsapp-mcp)
-    guardrails:
-      secrets: block # block calls/results with secrets
-
-    custom_guardrails:
-      # define a rule using Invariant Guardrails, https://explorer.invariantlabs.ai/docs/guardrails/
-      - name: "Filter tool results with 'error'"
-        id: "error_filter_guardrail"
-        action: block # or 'log'
-        content: |
-          raise "An error was found." if:
-            (msg: ToolOutput)
-            "error" in msg.content
+  servers:
+    <server-name>: # your server's name according to the mcp config (e.g., whatsapp-mcp)
+      guardrails:
+        secrets: block # block calls/results with secrets
+
+      custom_guardrails:
+        # define a rule using Invariant Guardrails, https://explorer.invariantlabs.ai/docs/guardrails/
+        - name: "Filter tool results with 'error'"
+          id: "error_filter_guardrail"
+          action: block # or 'log'
+          content: |
+            raise "An error was found." if:
+              (msg: ToolOutput)
+              "error" in msg.content
 ```
 
 ## File structure
@@ -47,26 +48,26 @@ The configuration file defines guardrailing behavior hierarchically, scoped by *
 <client-name>:
   custom_guardrails:
     ...
-
-  <server-name>:
-    guardrails:
-      <default-guardrail-name>: <guardrail-action>
-      ...
-
-    custom_guardrails:
-      - name: <guardrail-name>
-        id: <guardrail-id>
-        action: <guardrail-action>
-        content: |
-          <guardrail-content>
-      ...
-
-    tools:
-      <tool-name>:
+  servers:
+    <server-name>:
+      guardrails:
         <default-guardrail-name>: <guardrail-action>
-        ...
-        enabled: <boolean>
-      ...
+        ...
+
+      custom_guardrails:
+        - name: <guardrail-name>
+          id: <guardrail-id>
+          action: <guardrail-action>
+          content: |
+            <guardrail-content>
+        ...
+
+      tools:
+        <tool-name>:
+          <default-guardrail-name>: <guardrail-action>
+          ...
+          enabled: <boolean>
+        ...
 ...
 ```
 
@@ -102,10 +103,11 @@ Default guardrails are pre-configured and run by default with the `log` action.
 **Example:** Overriding a default guardrail.
 ```yaml
 cursor:
-  email-mcp-server:
-    guardrails:
-      pii: block
-      secrets: paused
+  servers:
+    email-mcp-server:
+      guardrails:
+        pii: block
+        secrets: paused
 ```
 
 ## Custom guardrails
@@ -210,14 +212,15 @@ To see how this hierarchy of precedence works, consider the following example co
 
 ```yaml
 client:
-  server:
-    guardrails:
-      pii: block
-      secrets: paused
-
-    tools:
-      tool:
-        secrets: block
+  servers:
+    server:
+      guardrails:
+        pii: block
+        secrets: paused
+
+      tools:
+        tool:
+          secrets: block
 ```
 
 The resulting behavior of this configuration is:
@@ -239,57 +242,58 @@ It demonstrates how to define default and custom guardrails for specific clients
 
 ```yaml
 cursor:
-  email-mcp-server:
-
-    # Customize the guardrailing for this specific server
-    guardrails:
-      pii: block
-      moderated: paused
-
-    # Define multiple custom guardrails
-    custom_guardrails:
-      - name: "Trusted Recipient Email"
-        id: "untrustsed_email_gr_1"
-        action: block
-
-        # Guardrail to ensure that we know all recipients
-        content: |
-          raise "Untrusted email recipient" if:
-            (call: ToolCall)
-            call is tool:send_email
-            not match(".*@company.com", call.function.arguments.recipient)
-
-
-      # Guardrail to ensure an email is not sent after
-      # a prompt injection is detected in the inbox
-      - name: "PII Email"
-        id: "untrustsed_email_gr_2"
-        action: log
-        content: |
-          from invariant.detectors import prompt_injection
-
-          raise "Suspicious email before send" if:
-            (inbox: ToolOutput) -> (call: ToolCall)
-            inbox is tool:get_inbox
-            call is tool:send_email
-            prompt_injection(inbox.content)
-
-    # Specify the behavior of individual tools
-    tools:
-      send_message:
-        enabled: false
-
-      read_messages:
-        secrets: block
-
-  weather:
-    guardrails:
-      moderated: paused
-
-# Separate configurations on a per client/server basis
-claude:
-  git-mcp-server:
-    tools:
-      commit-tool:
-        links: paused
+  servers:
+    email-mcp-server:
+
+      # Customize the guardrailing for this specific server
+      guardrails:
+        pii: block
+        moderated: paused
+
+      # Define multiple custom guardrails
+      custom_guardrails:
+        - name: "Trusted Recipient Email"
+          id: "untrustsed_email_gr_1"
+          action: block
+
+          # Guardrail to ensure that we know all recipients
+          content: |
+            raise "Untrusted email recipient" if:
+              (call: ToolCall)
+              call is tool:send_email
+              not match(".*@company.com", call.function.arguments.recipient)
+
+
+        # Guardrail to ensure an email is not sent after
+        # a prompt injection is detected in the inbox
+        - name: "PII Email"
+          id: "untrustsed_email_gr_2"
+          action: log
+          content: |
+            from invariant.detectors import prompt_injection
+
+            raise "Suspicious email before send" if:
+              (inbox: ToolOutput) -> (call: ToolCall)
+              inbox is tool:get_inbox
+              call is tool:send_email
+              prompt_injection(inbox.content)
+
+      # Specify the behavior of individual tools
+      tools:
+        send_message:
+          enabled: false
+
+        read_messages:
+          secrets: block
+
+    weather:
+      guardrails:
+        moderated: paused
+
+  # Separate configurations on a per client/server basis
+  claude:
+    git-mcp-server:
+      tools:
+        commit-tool:
+          links: paused
 ```
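
Every hunk in this diff makes the same structural change: server entries move from directly under the client key to a nested `servers:` key. For readers with existing config files, the following is a minimal sketch of that migration on an already-parsed config (a plain `dict`, however it was loaded). The names `migrate_config`, `migrate_client`, and `CLIENT_LEVEL_KEYS` are hypothetical and not part of any Invariant tooling, and the sketch assumes `custom_guardrails` and `servers` are the only keys that belong directly under a client, as the "File structure" hunk suggests.

```python
from copy import deepcopy

# Assumption: these are the only keys that sit directly under a client in the
# new layout. Every other key under a client is treated as a server name.
CLIENT_LEVEL_KEYS = {"custom_guardrails", "servers"}


def migrate_client(client_cfg: dict) -> dict:
    """Return a copy of one client's config with servers nested under `servers`."""
    # Start from any servers that are already in the new layout.
    servers = deepcopy(client_cfg.get("servers") or {})
    migrated = {}
    for key, value in client_cfg.items():
        if key == "servers":
            continue  # already collected above
        if key in CLIENT_LEVEL_KEYS:
            migrated[key] = deepcopy(value)
        else:
            # Old layout: a server keyed directly under the client.
            # An entry already present under `servers` takes priority.
            servers.setdefault(key, deepcopy(value))
    if servers:
        migrated["servers"] = servers
    return migrated


def migrate_config(config: dict) -> dict:
    """Apply the migration to every client in a parsed config."""
    return {client: migrate_client(cfg or {}) for client, cfg in config.items()}


# The old-layout example from the precedence hunk, as a parsed dict.
old = {
    "client": {
        "server": {
            "guardrails": {"pii": "block", "secrets": "paused"},
            "tools": {"tool": {"secrets": "block"}},
        }
    }
}
new = migrate_config(old)
print(new)
```

Running the function on a config that is already in the new layout returns an equal config, so it should be safe to apply more than once. It does not touch anything below the server level, which matches the hunks above, where only the nesting depth changes.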