docs: add guide for bot reasoning guardrails #1479
Documentation preview
Codecov Report: ✅ All modified and coverable lines are covered by tests.
0501e6a to
aca68bd
update update simplify cleanup
aca68bd to
8033392
Add a note specifying that bot reasoning guardrails are supported only in Colang 1.0. Update example file references for improved clarity.
Greptile Overview
Greptile Summary
This PR adds documentation for a new feature that allows developers to access and guardrail bot reasoning traces (exposed via the bot_thinking variable). The documentation covers three access methods - Colang flows, Python actions, and prompt templates - with examples ranging from simple pattern matching to complete self-check output implementations. The guide fits naturally into the advanced user guides section alongside other specialized guardrail features like bot-message-instructions and tools-integration, following the established pattern of documenting complex features with progressive examples and reference implementations.
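As a rough illustration of the Python-action access method mentioned above, the following sketch reads `bot_thinking` from the action context and checks it against a few patterns. The pattern list is hypothetical, and the function is written as plain Python so it runs without the library; in NeMo Guardrails it would typically be registered with the `@action` decorator from `nemoguardrails.actions`. It is not the implementation from the PR.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; a real deployment would choose its own.
FORBIDDEN_PATTERNS = [r"\bconfidential\b", r"\binternal use only\b"]


def check_reasoning_quality(context: Optional[dict] = None) -> bool:
    """Return True when the reasoning trace looks safe, False when it should be blocked.

    Assumption: the reasoning trace is available under the "bot_thinking" key
    of the context dictionary, as the summary above describes.
    """
    bot_thinking = (context or {}).get("bot_thinking")
    if not bot_thinking:
        # No reasoning trace was captured, so there is nothing to block.
        return True
    return not any(
        re.search(pattern, bot_thinking, flags=re.IGNORECASE)
        for pattern in FORBIDDEN_PATTERNS
    )
```

Returning a boolean keeps the decision to refuse in the calling flow, which matches the True/False contract shown in the sequence diagram.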
PR Description Notes:
- The description, related issues, and checklist are all empty/unchecked. Consider adding a brief explanation of what bot thinking/reasoning guardrails are and why this documentation was added.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| docs/index.md | 1/5 | Adds new bot-thinking-guardrails page to TOC but contains critical typo: 'advaced' instead of 'advanced' in file path |
| docs/user-guides/advanced/bot-thinking-guardrails.md | 4/5 | Comprehensive new documentation guide explaining bot_thinking variable access patterns with examples and reference implementations |
Confidence score: 1/5
- This PR contains a critical typo that will break the documentation build and must be fixed before merging.
- Score reflects a single-character typo in the TOC file path ('advaced' vs 'advanced') that will prevent Sphinx/MkDocs from locating the new documentation file, causing build failures.
- The docs/index.md file requires immediate attention to correct line 71 from user-guides/advaced/bot-thinking-guardrails to user-guides/advanced/bot-thinking-guardrails to match the actual file location.
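A path typo like this can be caught before the build with a small check. The sketch below is one possible approach, not part of the PR: the helper name `find_missing_toc_entries` is hypothetical, and it assumes TOC entries are written without an extension and resolve to `.md` files under the docs root.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_toc_entries(docs_root: Path, entries: Iterable[str]) -> List[str]:
    """Return the TOC entries that have no matching Markdown file under docs_root."""
    missing = []
    for entry in entries:
        # Assumption: entries omit the file extension and map to .md files.
        if not (docs_root / f"{entry}.md").is_file():
            missing.append(entry)
    return missing
```

Run against the entry on line 71, a check of this kind would report the 'advaced' path as missing while accepting the corrected 'advanced' path.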
Sequence Diagram
sequenceDiagram
participant User
participant LLMRails
participant ReasoningLLM as Reasoning LLM<br/>(Main Model)
participant OutputRails as Output Rails
participant ColangFlow as Colang Flow<br/>(check_reasoning)
participant CustomAction as Custom Action<br/>(check_reasoning_quality)
participant SelfCheckLLM as Self-Check LLM<br/>(Moderation)
participant PromptTemplate as Prompt Template
User->>LLMRails: Send user message
LLMRails->>ReasoningLLM: Generate response with reasoning
ReasoningLLM-->>LLMRails: Return response + reasoning trace
LLMRails->>LLMRails: Extract reasoning to $bot_thinking variable
alt Output Rails with Colang Flow
LLMRails->>OutputRails: Trigger output rails
OutputRails->>ColangFlow: Execute flow with $bot_thinking
ColangFlow->>ColangFlow: Check if "confidential" in $bot_thinking
alt Contains sensitive content
ColangFlow-->>OutputRails: Block response
OutputRails-->>LLMRails: bot refuse to respond
else Safe content
ColangFlow-->>OutputRails: Allow response
end
end
alt Output Rails with Custom Action
LLMRails->>OutputRails: Trigger output rails
OutputRails->>CustomAction: execute check_reasoning_quality(context)
CustomAction->>CustomAction: Get context.get("bot_thinking")
CustomAction->>CustomAction: Check forbidden patterns
alt Forbidden pattern found
CustomAction-->>OutputRails: Return False
OutputRails-->>LLMRails: bot refuse to respond
else Safe reasoning
CustomAction-->>OutputRails: Return True
end
end
alt Self-Check Output with Reasoning
LLMRails->>OutputRails: Trigger self check output
OutputRails->>PromptTemplate: Render prompt with {{ bot_thinking }}
PromptTemplate-->>OutputRails: Prompt includes bot_response and bot_thinking
OutputRails->>SelfCheckLLM: Send prompt for moderation
SelfCheckLLM->>SelfCheckLLM: Evaluate response and reasoning
alt Should block
SelfCheckLLM-->>OutputRails: "Yes"
OutputRails-->>LLMRails: Block response
else Should allow
SelfCheckLLM-->>OutputRails: "No"
OutputRails-->>LLMRails: Allow response
end
end
LLMRails-->>User: Return final response or refusal
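The self-check branch of the diagram can be sketched in plain Python as follows. This is an illustration under stated assumptions: the prompt text is hypothetical, `str.format` stands in for the real template rendering, and `moderation_llm` is any callable that takes a prompt and returns the model's text.

```python
from typing import Callable

# Hypothetical prompt text; the real template lives in the rails configuration.
SELF_CHECK_TEMPLATE = (
    "Bot response: {bot_response}\n"
    "Bot reasoning: {bot_thinking}\n"
    "Should the response be blocked? Answer Yes or No."
)


def self_check_output(
    bot_response: str,
    bot_thinking: str,
    moderation_llm: Callable[[str], str],
) -> bool:
    """Return True when the response is allowed, False when it should be blocked."""
    prompt = SELF_CHECK_TEMPLATE.format(
        bot_response=bot_response, bot_thinking=bot_thinking
    )
    verdict = moderation_llm(prompt).strip().lower()
    # "Yes" means "block", as in the diagram; any other answer is treated as allow here.
    return not verdict.startswith("yes")
```

Treating every non-"Yes" answer as "allow" is a design choice of this sketch; a stricter variant could block on any answer that is not an explicit "No".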
2 files reviewed, 3 comments
Greptile Overview
Greptile Summary
This review covers only the changes made since the last review, not the entire PR. A critical syntax error was introduced on line 10 of docs/index.md where an HTML comment closing tag is malformed (<!-- end-documentation-reuse -- missing the closing >). This will almost certainly break the documentation build process. The previously identified typo on line 71 (path correction from 'advaced' to 'advanced') appears to remain unfixed. No other files were modified in this iteration.
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| docs/index.md | 1/5 | Introduced a malformed HTML comment tag on line 10 that will likely break documentation builds; typo fix on line 71 may still be pending |
Confidence score: 0/5
- This PR will almost certainly cause immediate documentation build failures and should not be merged in its current state.
- Score reflects a critical syntax error (malformed HTML comment on line 10) that will break the Sphinx/MkDocs build pipeline, making the documentation generation fail completely.
- Pay immediate attention to docs/index.md line 10: the comment tag must be properly closed as <!-- end-documentation-reuse --> before this can be merged.
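A malformed comment of this kind can be detected mechanically. The following is a minimal sketch, assuming a plain text scan is sufficient (it does not skip code blocks or handle other Markdown nuances); the function name is hypothetical.

```python
from typing import List


def find_unclosed_html_comments(text: str) -> List[int]:
    """Return 1-based line numbers where an HTML comment opens but never closes."""
    unclosed = []
    position = 0
    while True:
        start = text.find("<!--", position)
        if start == -1:
            break
        end = text.find("-->", start + 4)
        next_open = text.find("<!--", start + 4)
        # Treat the comment as unclosed if no "-->" follows, or if another
        # comment opens before the closing marker appears.
        if end == -1 or (next_open != -1 and next_open < end):
            unclosed.append(text.count("\n", 0, start) + 1)
            position = start + 4
        else:
            position = end + 3
    return unclosed
```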
Additional Comments (1)
- docs/index.md, line 10: syntax: incomplete closing comment tag (missing the closing >)
1 file reviewed, 1 comment
Greptile Overview
Greptile Summary
This review covers only the changes made since the last review, not the entire PR. The developer has addressed previous feedback by updating GitHub repository links from commit-specific URLs to branch-based URLs (develop). Specifically, the links on lines 146 and 200 were changed to match the branch-based format already used on line 194, ensuring consistency and maintainability across all three reference links in the documentation. This change ensures that all documentation links point to the current codebase rather than historical snapshots.
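For illustration, the switch from commit-specific to branch-based links could be automated roughly as below. This is a sketch, not what the PR did: the regular expression assumes that a 7 to 40 character hexadecimal path segment after `blob/` or `tree/` is a commit SHA, so a branch whose name consists only of hex digits would be rewritten too.

```python
import re

# Matches GitHub blob/tree URLs pinned to a 7-40 character hexadecimal commit SHA.
_PINNED_URL = re.compile(
    r"(https://github\.com/[^/\s]+/[^/\s]+/(?:blob|tree)/)[0-9a-f]{7,40}(?=/)"
)


def pin_to_branch(text: str, branch: str = "develop") -> str:
    """Rewrite commit-pinned GitHub links in text to point at a branch instead."""
    # A function replacement avoids backslash-escape handling in the branch name.
    return _PINNED_URL.sub(lambda match: match.group(1) + branch, text)
```

Branch-based links track the current codebase, at the cost of breaking if a referenced file is later moved or renamed.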
Important Files Changed
| Filename | Score | Overview |
|---|---|---|
| docs/user-guides/advanced/bot-thinking-guardrails.md | 5/5 | Updated two GitHub links from commit-specific to branch-based URLs for consistency |
Confidence score: 5/5
- This PR is safe to merge with minimal risk
- All previous style and consistency issues have been resolved; the changes are purely documentation link updates with no code impact
- No files require special attention
1 file reviewed, 1 comment
Doc edit PR: #1484
Greptile Overview
Greptile Summary
Adds comprehensive documentation for bot reasoning guardrails feature, explaining how to access and control LLM reasoning traces through the bot_thinking variable.
Key additions:
- Introduction to reasoning-capable LLMs and the bot_thinking variable
- Three access patterns: Colang flows, custom actions, and prompt templates
- Three guardrailing approaches: output rails with pattern matching, custom actions, and self-check output
- Complete working examples aligned with existing self_check_thinking configuration
- Proper cross-references to related documentation and examples
Quality indicators:
- All code examples are syntactically correct and match existing patterns in the codebase
- All referenced files and examples exist and are accurate
- Clear disclaimers about toy examples vs production code
- Consistent formatting and proper use of admonitions (note, important)
- Previous feedback addressed (typos fixed in docs/index.md)
Confidence Score: 5/5
- This documentation PR is safe to merge with no issues found
- This is a documentation-only PR that adds a well-structured guide with accurate code examples, proper cross-references to existing files, and clear disclaimers. All referenced paths exist, syntax is correct, and previous feedback has been addressed.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| docs/user-guides/advanced/bot-thinking-guardrails.md | 5/5 | New comprehensive documentation guide for bot reasoning guardrails with clear examples and proper structure |
Sequence Diagram
sequenceDiagram
participant User
participant NeMo as NeMo Guardrails
participant LLM as Reasoning LLM
participant Rail as Output Rail
participant Action as Custom Action
User->>NeMo: Send user message
NeMo->>LLM: Generate response
LLM-->>NeMo: Response + reasoning trace
NeMo->>NeMo: Extract reasoning to bot_thinking
alt Output Rail with Pattern Matching
NeMo->>Rail: Check bot_thinking variable
Rail->>Rail: Match patterns (e.g., "confidential")
alt Pattern found
Rail-->>NeMo: Block response
NeMo-->>User: Refusal message
else Pattern not found
Rail-->>NeMo: Allow response
NeMo-->>User: Original response
end
end
alt Output Rail with Custom Action
NeMo->>Action: Execute check_reasoning_quality(context)
Action->>Action: Access context.get("bot_thinking")
Action->>Action: Validate against forbidden patterns
Action-->>NeMo: Return True/False
alt Action returns False
NeMo-->>User: Refusal message
else Action returns True
NeMo-->>User: Original response
end
end
alt Self-Check Output
NeMo->>Rail: Trigger self check output
Rail->>Rail: Render prompt with bot_thinking
Rail->>LLM: Send moderation request
LLM-->>Rail: Should block? (Yes/No)
alt Should block
Rail-->>NeMo: Block response
NeMo-->>User: Refusal message
else Should not block
Rail-->>NeMo: Allow response
NeMo-->>User: Original response
end
end
1 file reviewed, no comments
Related Issue(s)
#1427
#1431
#1432
#1434
Checklist