# Safety & Guardrails

Guardrails validate agent inputs and outputs. Attach them to any agent to enforce content policies, detect PII, flag potential hallucinations, and control costs.

```python
from sagewai import UniversalAgent, ContentFilter, PIIGuard

agent = UniversalAgent(
    name="safe-agent",
    model="gpt-4o",
    guardrails=[
        ContentFilter(blocklist=["password"], action="block"),
        PIIGuard(action="redact"),
    ],
)
```

## Guardrail (ABC)

Abstract base class for all guardrails. Subclass it and implement `check_input` and `check_output` to create a custom guardrail.

```python
from sagewai import Guardrail, GuardrailResult

class MyGuardrail(Guardrail):
    async def check_input(self, message: str, context: dict) -> GuardrailResult:
        if "forbidden" in message:
            return GuardrailResult(passed=False, violation="Forbidden content")
        return GuardrailResult(passed=True)

    async def check_output(self, response: str, context: dict) -> GuardrailResult:
        return GuardrailResult(passed=True)
```

### Required Methods

| Method | Signature | Returns | Description |
|---|---|---|---|
| `check_input` | `async check_input(message: str, context: dict)` | `GuardrailResult` | Validate input before the LLM call |
| `check_output` | `async check_output(response: str, context: dict)` | `GuardrailResult` | Validate output before returning it to the user |

## GuardrailResult

Result of a guardrail check.

```python
from sagewai import GuardrailResult

# Passed
result = GuardrailResult(passed=True)

# Failed
result = GuardrailResult(
    passed=False,
    violation="Blocked content detected",
    action="block",
)
```

### Fields

| Field | Type | Default | Description |
|---|---|---|---|
| `passed` | `bool` | required | Whether the check passed |
| `violation` | `str \| None` | `None` | Description of the violation |
| `action` | `Literal["block", "warn", "escalate"]` | `"block"` | Action to take on violation |

## GuardrailViolationError

Raised when a guardrail with `action="block"` rejects an input or output.

```python
from sagewai import GuardrailViolationError

try:
    response = await agent.chat("forbidden content")
except GuardrailViolationError as e:
    print(e.result.violation)
```

### Attributes

| Attribute | Type | Description |
|---|---|---|
| `result` | `GuardrailResult` | The guardrail result that triggered the error |

## ContentFilter

Blocks messages containing forbidden words or matching regex patterns.

```python
from sagewai import ContentFilter

guard = ContentFilter(
    blocklist=["password", "secret"],
    patterns=[r"\d{3}-\d{2}-\d{4}"],  # SSN pattern
    action="block",
)
```

### Constructor

| Parameter | Type | Default | Description |
|---|---|---|---|
| `blocklist` | `list[str] \| None` | `None` | Forbidden words (case-insensitive) |
| `patterns` | `list[str] \| None` | `None` | Regex patterns to block |
| `action` | `Literal["block", "warn", "escalate"]` | `"block"` | Action on detection |

## PIIGuard

Detects and handles personally identifiable information (PII) in agent inputs and outputs.

```python
from sagewai import PIIGuard
from sagewai.safety.pii import PIIEntityType

guard = PIIGuard(
    action="redact",
    entity_types=[PIIEntityType.EMAIL, PIIEntityType.PHONE],
)

# Direct detection
findings = guard.detect("Contact user@example.com")
# Returns: [(PIIEntityType.EMAIL, "user@example.com")]

# Redaction
clean = guard.redact("Contact user@example.com")
# Returns: "Contact [REDACTED_EMAIL]"
```

### Constructor

| Parameter | Type | Default | Description |
|---|---|---|---|
| `action` | `Literal["block", "redact", "warn", "escalate", "log_only"]` | `"block"` | How to handle detected PII |
| `entity_types` | `list[PIIEntityType] \| None` | `None` | Which PII types to detect (all by default) |

### PIIEntityType Enum

| Value | Redaction Label |
|---|---|
| `EMAIL` | `[REDACTED_EMAIL]` |
| `PHONE` | `[REDACTED_PHONE]` |
| `SSN` | `[REDACTED_SSN]` |
| `CREDIT_CARD` | `[REDACTED_CARD]` |
| `IBAN` | `[REDACTED_IBAN]` |
| `IP_ADDRESS` | `[REDACTED_IP]` |
| `PASSPORT` | `[REDACTED_PASSPORT]` |

### Methods

| Method | Returns | Description |
|---|---|---|
| `detect(text)` | `list[tuple[PIIEntityType, str]]` | Detect all PII in text |
| `redact(text)` | `str` | Replace PII with redaction labels |
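
For intuition, a regex-based `detect`/`redact` pair in plain Python behaves like the methods above. The patterns here are simplified illustrations; the library's actual detectors are presumably more thorough:

```python
import re

# Illustrative patterns only — real PII detection needs far broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def detect(text: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs, like PIIGuard.detect."""
    findings = []
    for entity, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((entity, match.group()))
    return findings

def redact(text: str) -> str:
    """Replace each match with its redaction label, like PIIGuard.redact."""
    for entity, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{entity}]", text)
    return text

print(redact("Contact user@example.com or 555-123-4567"))
# Contact [REDACTED_EMAIL] or [REDACTED_PHONE]
```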

## HallucinationGuard

Detects potential hallucinations by checking how well a response is grounded in the RAG context. Uses keyword-overlap scoring, which is lightweight and requires no additional LLM calls.

```python
from sagewai import HallucinationGuard

guard = HallucinationGuard(threshold=0.3, action="warn")
```

### Constructor

| Parameter | Type | Default | Description |
|---|---|---|---|
| `threshold` | `float` | `0.3` | Minimum grounding score (0–1); scores below this trigger a violation |
| `action` | `Literal["block", "warn", "escalate"]` | `"warn"` | Action on low grounding |
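
The keyword-overlap idea can be sketched as the fraction of response words that also appear in the retrieved context. This is an illustrative approximation only; the library's scorer likely applies its own tokenization and stopword filtering:

```python
def grounding_score(response: str, context: str) -> float:
    """Fraction of response words that also appear in the RAG context."""
    response_words = {w.lower().strip(".,!?") for w in response.split()}
    context_words = {w.lower().strip(".,!?") for w in context.split()}
    if not response_words:
        return 1.0  # an empty response cannot contradict the context
    return len(response_words & context_words) / len(response_words)

context = "The Eiffel Tower is 330 metres tall and stands in Paris."
print(grounding_score("The tower is 330 metres tall", context))   # 1.0
print(grounding_score("It was painted bright green", context))    # 0.0
```

With the default `threshold=0.3`, the second response would trigger a violation while the first passes.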

## TokenBudgetGuard

Blocks requests that would exceed a cost budget.

```python
from sagewai import TokenBudgetGuard

guard = TokenBudgetGuard(max_usd=1.0)
```

### Constructor

| Parameter | Type | Default | Description |
|---|---|---|---|
| `max_usd` | `float` | required | Maximum cost in USD before blocking |
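
A minimal sketch of the budget check, assuming a rough chars-per-token heuristic and a hypothetical per-token price — both are illustrative assumptions, not the library's actual accounting:

```python
PRICE_PER_1K_TOKENS_USD = 0.005  # hypothetical model price, for illustration

def estimated_cost_usd(prompt: str) -> float:
    tokens = len(prompt) / 4  # common ~4-chars-per-token approximation
    return tokens / 1000 * PRICE_PER_1K_TOKENS_USD

def within_budget(prompt: str, max_usd: float) -> bool:
    """Mirror the guard's behavior: block when the estimate exceeds max_usd."""
    return estimated_cost_usd(prompt) <= max_usd

print(within_budget("short prompt", max_usd=1.0))  # True
```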

## OutputSchemaGuard

Validates that agent output conforms to a JSON schema. Checks that the output is valid JSON and contains all required fields.

```python
from sagewai import OutputSchemaGuard

guard = OutputSchemaGuard(
    schema={"type": "object", "required": ["title", "body"]},
)
```

### Constructor

| Parameter | Type | Default | Description |
|---|---|---|---|
| `schema` | `dict[str, Any]` | required | JSON Schema to validate against |