@srivatsan0611

Purpose of this PR

The NRP and PERSON PII entities use regex patterns that are far too broad and produce frequent false positives in production. This breaks non-English language support in particular and makes the pre-flight masking mode effectively unusable.

Current broken behavior

// Input (Spanish text)
"crea un nuevo cliente con email [email protected]"

// Current output
"<NRP> <NRP> con email <EMAIL_ADDRESS>"

// Problem: "crea un" and "nuevo cliente" are incorrectly flagged as PII

Why these patterns are problematic

NRP pattern: /\b[A-Za-z]+ [A-Za-z]+\b/g

  • Matches literally any two consecutive words
  • Examples: "crea un", "nuevo cliente", "hello world", "the user"

PERSON pattern: /\b[A-Z][a-z]+ [A-Z][a-z]+\b/g

  • Matches any two capitalized words
  • Examples: "New York", "The User", "European Union", "United States"
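
To make the over-matching concrete, here is a minimal sketch that runs the two patterns quoted above against sample text. The helper `findMatches` is a hypothetical name used only for illustration and is not part of the library.

```typescript
// The two patterns, copied verbatim from this PR description.
const NRP_PATTERN = /\b[A-Za-z]+ [A-Za-z]+\b/g;
const PERSON_PATTERN = /\b[A-Z][a-z]+ [A-Z][a-z]+\b/g;

// Hypothetical helper: returns every substring a pattern would flag.
function findMatches(pattern: RegExp, text: string): string[] {
  // With the `g` flag, String.prototype.match returns all matches or null.
  return text.match(pattern) ?? [];
}

const nrpHits = findMatches(NRP_PATTERN, "crea un nuevo cliente");
const personHits = findMatches(
  PERSON_PATTERN,
  "Welcome to New York, The User can access the system"
);

console.log(nrpHits); // ["crea un", "nuevo cliente"]
console.log(personHits); // ["New York", "The User"]
```

Neither input contains a person's name or a registration number, yet each pattern flags two spans.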

Impact

  • Breaks pre-flight masking for non-English content (Spanish, French, Italian, etc.)
  • Masks legitimate text like city names, country names, and common phrases
  • Makes default configuration unusable for international applications
  • No documentation explaining what "NRP" means or its limitations

Solution

1. Remove from default entity list

Keep the patterns available but exclude them from defaults:

const DEFAULT_PII_ENTITIES = Object.values(PIIEntity).filter(
  (entity) => entity !== PIIEntity.NRP && entity !== PIIEntity.PERSON
);

This makes the default configuration usable out of the box while leaving explicit entity configurations untouched.
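
The following sketch shows the effect of the filter on a small stand-in enum. The enum members here and the `resolveEntities` helper are assumptions for illustration; the real `PIIEntity` enum in `src/checks/pii.ts` has more members, and the actual config handling may differ.

```typescript
// Hypothetical stand-in for the library's PIIEntity enum (the real one is larger).
enum PIIEntity {
  EMAIL_ADDRESS = "EMAIL_ADDRESS",
  PHONE_NUMBER = "PHONE_NUMBER",
  NRP = "NRP",
  PERSON = "PERSON",
}

// Entities that stay available but are no longer enabled by default.
const EXCLUDED_FROM_DEFAULTS = new Set<PIIEntity>([PIIEntity.NRP, PIIEntity.PERSON]);

const DEFAULT_PII_ENTITIES: PIIEntity[] = Object.values(PIIEntity).filter(
  (entity) => !EXCLUDED_FROM_DEFAULTS.has(entity)
);

// Hypothetical helper: an explicit list always wins over the defaults.
function resolveEntities(configured?: PIIEntity[]): PIIEntity[] {
  return configured ?? DEFAULT_PII_ENTITIES;
}

console.log(resolveEntities()); // defaults, without NRP and PERSON
console.log(resolveEntities([PIIEntity.EMAIL_ADDRESS, PIIEntity.PERSON])); // explicit opt-in preserved
```

Under these assumptions, a caller who never listed entities stops getting NRP and PERSON masking, while a caller who listed them explicitly sees no change.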

2. Add deprecation warnings

When users explicitly include these entities, show a clear warning:

console.warn(
  `[openai-guardrails-js] DEPRECATION WARNING: PIIEntity.${entity} has been removed from default entities due to high false positive rates.
  - This pattern causes false positives in normal conversation, especially in non-English languages.
  - Consider using more specific region-based patterns like SG_NRIC_FIN, UK_NINO, etc.
  - See: https://github.com/openai/openai-guardrails-js/issues/47`
);

The warning is shown only once per entity per session to avoid log spam.
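
One way to get once-per-entity behavior is a module-level `Set` of entities that have already been warned about. This is a minimal sketch of that idea, not necessarily how the PR implements it; `warnDeprecatedEntityOnce` and the injectable `log` parameter are hypothetical names, and the message is shortened.

```typescript
// Entities that have already triggered a warning in this process.
const warnedEntities = new Set<string>();

// Hypothetical helper. `log` is injectable so the behavior can be tested
// without touching the real console. Returns true if a warning was emitted.
function warnDeprecatedEntityOnce(
  entity: string,
  log: (message: string) => void = console.warn
): boolean {
  if (warnedEntities.has(entity)) {
    return false; // already warned for this entity
  }
  warnedEntities.add(entity);
  log(
    `[openai-guardrails-js] DEPRECATION WARNING: PIIEntity.${entity} has been removed from default entities due to high false positive rates.`
  );
  return true;
}

// Usage: collect messages instead of printing them.
const logged: string[] = [];
const collect = (message: string): void => {
  logged.push(message);
};
warnDeprecatedEntityOnce("NRP", collect);
warnDeprecatedEntityOnce("NRP", collect); // suppressed, same entity
warnDeprecatedEntityOnce("PERSON", collect);

console.log(logged.length); // 2
```

Because the set lives at module scope, "session" here means the lifetime of the process that loaded the module.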

3. Update documentation

Added clear documentation explaining:

  • Why these entities were removed from defaults
  • Migration path for users who need similar functionality
  • Region-specific alternatives (SG_NRIC_FIN, UK_NINO, FI_PERSONAL_IDENTITY_CODE, KR_RRN)
  • Recommendation to use NER services for actual person name detection

Why this works long-term

Backward compatibility: Explicit entity configurations continue to work exactly as before; only the default entity list changes.

Better defaults: The default configuration now works for international applications without masking normal text.

Clear migration path: Users who actually need person name detection or national registration numbers have better alternatives:

  • Use region-specific patterns (recommended)
  • Use NER services or libraries such as the OpenAI API or spaCy (best accuracy)
  • Explicitly opt in to NRP/PERSON with warnings (if they accept false positives)
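
To illustrate why a format-specific pattern is preferable, the sketch below contrasts the broad NRP pattern with a narrow, structured identifier pattern. The narrow regex is an assumption for illustration (a prefix letter, seven digits, a trailing letter); it is not the library's `SG_NRIC_FIN` pattern and performs no checksum validation.

```typescript
// Illustrative only: NOT the library's SG_NRIC_FIN pattern, no checksum check.
const ILLUSTRATIVE_SG_ID = /\b[STFGM]\d{7}[A-Z]\b/g;
// The broad NRP pattern quoted earlier in this PR description.
const BROAD_NRP = /\b[A-Za-z]+ [A-Za-z]+\b/g;

// Hypothetical helper: counts how many spans a pattern would flag.
function countMatches(pattern: RegExp, text: string): number {
  return (text.match(pattern) ?? []).length;
}

const spanishText = "crea un nuevo cliente";
const textWithId = "ID S1234567D on file";

console.log(countMatches(BROAD_NRP, spanishText)); // 2 false positives
console.log(countMatches(ILLUSTRATIVE_SG_ID, spanishText)); // 0
console.log(countMatches(ILLUSTRATIVE_SG_ID, textWithId)); // 1
```

A pattern anchored to a fixed identifier shape ignores ordinary prose in any language, which is the property the broad patterns lack.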

Prevents future issues: Documentation and warnings educate users upfront about the limitations.

Testing

All tests pass (27/27 including 8 new tests):

  • Verify NRP and PERSON are excluded from defaults
  • Verify Spanish text no longer produces false positives
  • Verify capitalized phrases like city names are no longer masked
  • Verify explicit opt-in still works with deprecation warning
  • Verify warning deduplication
  • All existing PII detection tests continue passing

Manual verification with examples from the issue

Test 1: Spanish text (from issue #47)
Input:  "crea un nuevo cliente con email [email protected]"
Output: "crea un nuevo cliente con email <EMAIL_ADDRESS>"
Result: PASS - only email masked

Test 2: Capitalized phrases (from issue #47)
Input:  "Welcome to New York, The User can access the system"
Output: "Welcome to New York, The User can access the system"
Result: PASS - no false positives

Test 3: Other PII still works
Input:  "Contact me at [email protected]"
Output: "Contact me at <EMAIL_ADDRESS>"
Result: PASS - email detection still works

Files changed

  • src/checks/pii.ts - Core implementation with new defaults and deprecation warnings
  • src/__tests__/unit/checks/pii.test.ts - Added 8 comprehensive tests
  • docs/ref/checks/pii.md - Updated with migration guide

Closes #47

@srivatsan0611

Hi @gabor-openai, please review this PR if possible. I've been using OpenAI tools a lot and I really appreciate the work you all do in the Agentic Workspace. Please consider this a very humble contribution :)
