-
Notifications
You must be signed in to change notification settings - Fork 18
Description
Hi everyone :)
I see that PR #51 was recently added to make jailbreak guardrails more rigorous, thank you!
I wanted to ask if there was the possibility to add the same functionality to off topic prompts, or even if both these LLM-based guardrails could be parameterized by something like "context length"/"message history length". I see the jailbreak check already uses a constant variable "MAX_CONTEXT_TURNS", which instead could be passed as argument when initializing LLM guardrails. This would enhance usability:
- LLM guardrail strength/rigor is directly parameterized through conversation length,
- token (and hence cost) management is more easily actionable.
I can try opening a PR for this myself based on MAX_CONTEXT_TURNS, but since this param would go up to the very entry point of guardrail initialization (maybe into the guardrail config?), I would like to ask you for your opinion and help.
Please let me know what you think! :)