Commit 4ca04d0

chore: fix guardrails server format (#11)
1 parent 071747c commit 4ca04d0
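
The diff in this commit moves each server entry under a new `servers:` key inside its client. For readers with a config in the earlier layout, the same move can be sketched in Python on an already-parsed config (a plain `dict`). This is only an illustration, not part of the commit: `nest_servers` and `CLIENT_LEVEL_KEYS` are hypothetical names, and the sketch assumes that `custom_guardrails` and `servers` are the only non-server keys directly under a client, as the "File structure" hunk suggests.

```python
from copy import deepcopy

# Assumption: the only keys directly under a client that are NOT server names.
CLIENT_LEVEL_KEYS = {"custom_guardrails", "servers"}


def nest_servers(config: dict) -> dict:
    """Return a copy of `config` with every server moved under `servers:`."""
    migrated = {}
    for client, body in config.items():
        if not isinstance(body, dict):
            migrated[client] = body  # leave unexpected shapes untouched
            continue
        body = deepcopy(body)  # never mutate the caller's config
        servers = dict(body.get("servers") or {})  # keep already-nested entries
        new_body = {}
        for key, value in body.items():
            if key == "servers":
                continue  # merged into `servers` above
            if key in CLIENT_LEVEL_KEYS:
                new_body[key] = value  # client-level setting stays in place
            else:
                servers[key] = value  # old layout: server directly under client
        new_body["servers"] = servers
        migrated[client] = new_body
    return migrated


old_config = {
    "cursor": {
        "email-mcp-server": {"guardrails": {"pii": "block", "secrets": "paused"}},
    }
}
print(nest_servers(old_config))
```

Running the function on a config that already uses `servers:` returns an equal config, so it is safe to apply twice. Loading and dumping the YAML file itself is left out and would need a YAML library.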

1 file changed: +101 -97 lines changed
docs/mcp-scan/guardrails.md

Lines changed: 101 additions & 97 deletions
@@ -24,19 +24,20 @@ This chapter covers how to structure guardrail configuration files, write custom
 it into your config file and replace the client and server names.
 ```yaml
 <client-name>: # your client's shorthand (e.g., cursor, claude, windsurf)
-  <server-name>: # your server's name according to the mcp config (e.g., whatsapp-mcp)
-    guardrails:
-      secrets: block # block calls/results with secrets
-
-    custom_guardrails:
-      # define a rule using Invariant Guardrails, https://explorer.invariantlabs.ai/docs/guardrails/
-      - name: "Filter tool results with 'error'"
-        id: "error_filter_guardrail"
-        action: block # or 'log'
-        content: |
-          raise "An error was found." if:
-            (msg: ToolOutput)
-            "error" in msg.content
+  servers:
+    <server-name>: # your server's name according to the mcp config (e.g., whatsapp-mcp)
+      guardrails:
+        secrets: block # block calls/results with secrets
+
+      custom_guardrails:
+        # define a rule using Invariant Guardrails, https://explorer.invariantlabs.ai/docs/guardrails/
+        - name: "Filter tool results with 'error'"
+          id: "error_filter_guardrail"
+          action: block # or 'log'
+          content: |
+            raise "An error was found." if:
+              (msg: ToolOutput)
+              "error" in msg.content
 ```

 ## File structure
@@ -47,26 +48,26 @@ The configuration file defines guardrailing behavior hierarchically, scoped by *
 <client-name>:
   custom_guardrails:
     ...
-
-  <server-name>:
-    guardrails:
-      <default-guardrail-name>: <guardrail-action>
-      ...
-
-    custom_guardrails:
-      - name: <guardrail-name>
-        id: <guardrail-id>
-        action: <guardrail-action>
-        content: |
-          <guardrail-content>
-      ...
-
-    tools:
-      <tool-name>:
+  servers:
+    <server-name>:
+      guardrails:
         <default-guardrail-name>: <guardrail-action>
-        ...
-        enabled: <boolean>
-      ...
+        ...
+
+      custom_guardrails:
+        - name: <guardrail-name>
+          id: <guardrail-id>
+          action: <guardrail-action>
+          content: |
+            <guardrail-content>
+        ...
+
+      tools:
+        <tool-name>:
+          <default-guardrail-name>: <guardrail-action>
+          ...
+          enabled: <boolean>
+        ...
   ...
 ```

@@ -102,10 +103,11 @@ Default guardrails are pre-configured and run by default with the `log` action.
 **Example:** Overriding a default guardrail.
 ```yaml
 cursor:
-  email-mcp-server:
-    guardrails:
-      pii: block
-      secrets: paused
+  servers:
+    email-mcp-server:
+      guardrails:
+        pii: block
+        secrets: paused
 ```

 ## Custom guardrails
@@ -210,14 +212,15 @@ To see how this hierarchy of precedence works, consider the following example co

 ```yaml
 client:
-  server:
-    guardrails:
-      pii: block
-      secrets: paused
-
-    tools:
-      tool:
-        secrets: block
+  servers:
+    server:
+      guardrails:
+        pii: block
+        secrets: paused
+
+      tools:
+        tool:
+          secrets: block
 ```

 The resulting behavior of this configuration is:
@@ -239,57 +242,58 @@ It demonstrates how to define default and custom guardrails for specific clients

 ```yaml
 cursor:
-  email-mcp-server:
-
-    # Customize the guardrailing for this specific server
-    guardrails:
-      pii: block
-      moderated: paused
-
-    # Define multiple custom guardrails
-    custom_guardrails:
-      - name: "Trusted Recipient Email"
-        id: "untrustsed_email_gr_1"
-        action: block
-
-        # Guardrail to ensure that we know all recipients
-        content: |
-          raise "Untrusted email recipient" if:
-            (call: ToolCall)
-            call is tool:send_email
-            not match(".*@company.com", call.function.arguments.recipient)
-
-
-      # Guardrail to ensure an email is not sent after
-      # a prompt injection is detected in the inbox
-      - name: "PII Email"
-        id: "untrustsed_email_gr_2"
-        action: log
-        content: |
-          from invariant.detectors import prompt_injection
-
-          raise "Suspicious email before send" if:
-            (inbox: ToolOutput) -> (call: ToolCall)
-            inbox is tool:get_inbox
-            call is tool:send_email
-            prompt_injection(inbox.content)
-
-    # Specify the behavior of individual tools
-    tools:
-      send_message:
-        enabled: false
-
-      read_messages:
-        secrets: block
-
-      weather:
-        guardrails:
-          moderated: paused
-
-# Separate configurations on a per client/server basis
-claude:
-  git-mcp-server:
-    tools:
-      commit-tool:
-        links: paused
+  servers:
+    email-mcp-server:
+
+      # Customize the guardrailing for this specific server
+      guardrails:
+        pii: block
+        moderated: paused
+
+      # Define multiple custom guardrails
+      custom_guardrails:
+        - name: "Trusted Recipient Email"
+          id: "untrustsed_email_gr_1"
+          action: block
+
+          # Guardrail to ensure that we know all recipients
+          content: |
+            raise "Untrusted email recipient" if:
+              (call: ToolCall)
+              call is tool:send_email
+              not match(".*@company.com", call.function.arguments.recipient)
+
+
+        # Guardrail to ensure an email is not sent after
+        # a prompt injection is detected in the inbox
+        - name: "PII Email"
+          id: "untrustsed_email_gr_2"
+          action: log
+          content: |
+            from invariant.detectors import prompt_injection
+
+            raise "Suspicious email before send" if:
+              (inbox: ToolOutput) -> (call: ToolCall)
+              inbox is tool:get_inbox
+              call is tool:send_email
+              prompt_injection(inbox.content)
+
+      # Specify the behavior of individual tools
+      tools:
+        send_message:
+          enabled: false
+
+        read_messages:
+          secrets: block
+
+        weather:
+          guardrails:
+            moderated: paused
+
+# Separate configurations on a per client/server basis
+claude:
+  git-mcp-server:
+    tools:
+      commit-tool:
+        links: paused
 ```