PII Protection Guide
This guide covers how to protect personally identifiable information (PII) in your Sagewai agents. You will learn how to detect, redact, and monitor PII in both agent inputs and outputs.
Why PII Protection Matters
When users interact with AI agents, they may inadvertently share sensitive information:
- Email addresses in support tickets
- Phone numbers in contact requests
- Social security numbers in financial documents
- Credit card numbers in payment conversations
Without protection, this data flows to the LLM provider's API, creating compliance and privacy risks.
Quick Setup
Add PIIGuard to any agent in one line:
from sagewai.engines.universal import UniversalAgent
from sagewai.safety.pii import PIIGuard
agent = UniversalAgent(
    name="support-agent",
    model="gpt-4o",
    guardrails=[PIIGuard(action="redact")],
)
This detects and redacts all 7 PII entity types on both input and output.
Choosing an Action
PIIGuard supports 5 actions, depending on your compliance requirements:
block — Reject the message entirely
PIIGuard(action="block")
The agent raises GuardrailViolationError if PII is detected. The message never reaches the LLM. Use this when PII must never be processed under any circumstances.
from sagewai.safety.guardrails import GuardrailViolationError
# Run this inside an async function, since agent.chat is awaited.
try:
    response = await agent.chat("My SSN is 123-45-6789")
except GuardrailViolationError as e:
    print(f"Blocked: {e}")
    # Handle the blocked message
redact — Replace PII with labels
PIIGuard(action="redact")
PII is replaced with labels like [REDACTED_EMAIL] before the message reaches the LLM. The agent can still process the request, but without the sensitive data.
Input: "Contact me at john@example.com or 555-123-4567"
What the LLM sees: "Contact me at [REDACTED_EMAIL] or [REDACTED_PHONE]"
Note: The redact action currently blocks (raises an error) rather than passing the redacted text. For pass-through redaction, use the detect and redact methods directly.
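The workaround in the note can be sketched as follows. This is a minimal sketch, not Sagewai's implementation: StubGuard is a hypothetical stand-in that only mimics the redact method shown in this guide (using one illustrative email regex), and send stands for whatever coroutine forwards text to the agent, such as agent.chat.

```python
import asyncio
import re
from typing import Awaitable, Callable


class StubGuard:
    """Hypothetical stand-in for PIIGuard; only mimics redact() for emails."""

    _EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

    def redact(self, text: str) -> str:
        return self._EMAIL.sub("[REDACTED_EMAIL]", text)


async def chat_with_redaction(
    guard, send: Callable[[str], Awaitable[str]], message: str
) -> str:
    """Redact the message first, then forward only the clean text."""
    return await send(guard.redact(message))


async def echo(text: str) -> str:
    # Stand-in for agent.chat: returns exactly what the LLM would receive.
    return text


seen = asyncio.run(
    chat_with_redaction(StubGuard(), echo, "Contact me at john@example.com")
)
print(seen)  # Contact me at [REDACTED_EMAIL]
```

With a real setup you would pass a PIIGuard instance and agent.chat instead of the stubs, assuming agent.chat accepts a single string as in the block example above.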
warn — Log and allow through
PIIGuard(action="warn")
The PII violation is logged but the message is sent to the LLM unchanged. Use this during development or for audit trails where you need to track PII exposure without blocking.
escalate — Emit event and allow
PIIGuard(action="escalate")
Emits a GUARDRAIL_ESCALATION event that can be handled by event listeners. The message is sent to the LLM unchanged.
from sagewai.core.events import AgentEvent
async def handle_escalation(event: AgentEvent, data: dict):
    if event == AgentEvent.GUARDRAIL_ESCALATION:
        print(f"PII escalation: {data}")
        # Log to compliance system, notify admin, etc.
agent.on_event(handle_escalation)
log_only — Detect and log silently
PIIGuard(action="log_only")
Behaves like warn — detects and logs PII but does not block or modify the message.
Selecting Entity Types
By default, PIIGuard detects all 7 entity types. You can narrow the scope:
from sagewai.safety.pii import PIIGuard, PIIEntityType
# Only detect financial PII
financial_guard = PIIGuard(
    action="block",
    entity_types=[
        PIIEntityType.CREDIT_CARD,
        PIIEntityType.IBAN,
        PIIEntityType.SSN,
    ],
)
# Only detect contact information
contact_guard = PIIGuard(
    action="redact",
    entity_types=[
        PIIEntityType.EMAIL,
        PIIEntityType.PHONE,
    ],
)
Entity Types and Patterns
| Entity | Example | Regex Pattern |
|---|---|---|
| EMAIL | user@example.com | Standard email format |
| PHONE | (555) 123-4567, +1-555-123-4567 | US phone with optional country code |
| SSN | 123-45-6789 | US Social Security Number format |
| CREDIT_CARD | 4111 1111 1111 1111 | 16-digit card with optional separators |
| IBAN | DE89370400440532013000 | International Bank Account Number |
| IP_ADDRESS | 192.168.1.1 | IPv4 address |
| PASSPORT | AB1234567 | Alphanumeric passport number |
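For intuition about what pattern-based detection involves, here is a sketch of regexes for three of the entity types above. These patterns were written for this example and are assumptions, not the expressions PIIGuard ships with; the detect function here is likewise a local helper, not the library method.

```python
import re

# Illustrative patterns only; these are assumptions, not Sagewai's regexes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def detect(text: str) -> list[tuple[str, str]]:
    """Return (entity_name, matched_text) pairs, grouped in pattern order."""
    return [
        (name, match.group())
        for name, pattern in PATTERNS.items()
        for match in pattern.finditer(text)
    ]


sample = "Mail user@example.com, SSN 123-45-6789, host 192.168.1.1"
findings = detect(sample)
print(findings)
```

Simple patterns like these trade precision for speed: the IPv4 sketch, for instance, also accepts out-of-range octets such as 999.1.1.1, which is one reason to test any pattern set against your own data.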
Standalone PII Detection
Use PIIGuard outside of an agent for direct text processing:
from sagewai.safety.pii import PIIGuard, PIIEntityType
guard = PIIGuard(action="redact")
# Detect PII in text
text = "Send invoice to john@acme.com, call 555-123-4567, card 4111-1111-1111-1111"
findings = guard.detect(text)
for entity_type, matched_text in findings:
    print(f" {entity_type.value}: {matched_text}")
# Output:
# EMAIL: john@acme.com
# PHONE: 555-123-4567
# CREDIT_CARD: 4111-1111-1111-1111
# Redact PII
clean = guard.redact(text)
print(clean)
# "Send invoice to [REDACTED_EMAIL], call [REDACTED_PHONE], card [REDACTED_CARD]"
Batch Processing
documents = ["Doc 1 with email@test.com", "Doc 2 with 555-0123", "Doc 3 clean"]
# Reuses the guard created in the previous example.
for doc in documents:
    findings = guard.detect(doc)
    if findings:
        print(f"PII found: {len(findings)} entities")
        clean_doc = guard.redact(doc)
        # Process clean_doc instead of original
    else:
        # Safe to process as-is
        pass
Combining with Other Guardrails
PIIGuard works alongside other guardrails. Order matters — put PIIGuard first:
from sagewai.engines.universal import UniversalAgent
from sagewai.safety.guardrails import ContentFilter, TokenBudgetGuard
from sagewai.safety.hallucination import HallucinationGuard
from sagewai.safety.pii import PIIGuard, PIIEntityType
agent = UniversalAgent(
    name="production-agent",
    model="gpt-4o",
    guardrails=[
        # 1. PII protection first: redact before anything else
        PIIGuard(action="redact", entity_types=[
            PIIEntityType.EMAIL,
            PIIEntityType.SSN,
            PIIEntityType.CREDIT_CARD,
        ]),
        # 2. Content filtering: block injection attacks
        ContentFilter(blocklist=["DROP TABLE", "DELETE FROM"]),
        # 3. Hallucination guard: check output grounding
        HallucinationGuard(threshold=0.3, action="warn"),
        # 4. Budget guard: prevent cost overruns
        TokenBudgetGuard(max_usd=2.0),
    ],
)
Monitoring PII Events
Track PII detections using the Analytics API:
from sagewai.admin.analytics import AnalyticsStore
store = AnalyticsStore()
# Record PII events (called automatically by PIIGuard with escalate/log_only)
store.record_guardrail_event(
    agent_name="support-agent",
    event_type="pii",
    entity_types=["EMAIL", "PHONE"],
    count=2,
)
# Query PII risk metrics
risks = store.get_risks(agent_name="support-agent")
print(f"PII events: {risks['pii_events']}")
print(f"Total guardrail events: {risks['total_events']}")
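If you run detection yourself, as in Standalone PII Detection, you may want to summarise the findings before recording them. The sketch below tallies findings per entity type with collections.Counter. The findings list is hard-coded sample data in the (entity, text) shape used in this guide, with plain strings standing in for PIIEntityType members, and the event dictionary simply mirrors the keyword arguments shown in the example above.

```python
from collections import Counter

# Sample findings shaped like the detect() output in this guide, with
# plain strings standing in for PIIEntityType members (an assumption).
findings = [
    ("EMAIL", "john@acme.com"),
    ("PHONE", "555-123-4567"),
    ("EMAIL", "jane@acme.com"),
]

# Count occurrences per entity type; the matched text itself is discarded
# so the summary contains no PII.
counts = Counter(entity for entity, _ in findings)

# Keyword arguments in the shape record_guardrail_event takes above.
event = {
    "agent_name": "support-agent",
    "event_type": "pii",
    "entity_types": sorted(counts),
    "count": sum(counts.values()),
}
print(event)
```

Assuming the signature shown earlier, the summary could then be recorded with store.record_guardrail_event(**event).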
Best Practices
- Use block for high-sensitivity agents (financial, healthcare). It is better to reject a message than to risk PII exposure.
- Use redact for general-purpose agents where PII in the input is likely accidental and the request can still be fulfilled without it.
- Use escalate in production to track PII exposure without disrupting the user experience. Feed events into your compliance monitoring system.
- Select only relevant entity types. If your agent only handles financial data, there is no need to detect passport numbers.
- Test with realistic data. Verify that the regex patterns match the PII formats relevant to your users and regions.
- Layer PIIGuard with ContentFilter. PIIGuard catches structured PII (emails, SSNs). ContentFilter catches domain-specific sensitive terms (project names, internal codes).
- Audit regularly. Review PII event logs from the Analytics API to understand exposure patterns and adjust your guardrail configuration.
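The "test with realistic data" practice can be sketched as a small table-driven check. This is a minimal sketch under stated assumptions: stub_detect is a hypothetical detector with a single illustrative SSN regex, and find_misses is a helper invented for this example. To check a real guard, adapt its detect output to (name, text) pairs, for example by using entity_type.value as the name.

```python
import re

# Hypothetical detector under test; replace with an adapter around guard.detect.
_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def stub_detect(text: str) -> list[tuple[str, str]]:
    return [("SSN", m.group()) for m in _SSN.finditer(text)]


def find_misses(detect, cases):
    """Return the cases whose detected entity names differ from the expected set."""
    misses = []
    for text, expected in cases:
        found = {name for name, _ in detect(text)}
        if found != expected:
            misses.append((text, expected, found))
    return misses


# Each case pairs a sample input with the entity names it should trigger.
CASES = [
    ("SSN 123-45-6789", {"SSN"}),
    ("SSN 123 45 6789", {"SSN"}),  # space-separated variant users may type
    ("Order 12345", set()),        # must not be flagged
]

misses = find_misses(stub_detect, CASES)
for text, expected, found in misses:
    print(f"MISS: {text!r} expected={expected} found={found}")
```

Here the space-separated SSN is reported as a miss, which is the kind of regional or formatting gap this check is meant to surface before production.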