Firewall rules

Rules are the building blocks of a firewall. A rule is a short classification prompt the LLM judge runs against the user message; the answer is mapped to a binary match and surfaced to the firewall's mode logic.
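As a rough illustration of the answer-to-match step, the sketch below normalizes a judge reply into a boolean. The helper name `answer_to_match` is hypothetical, and so is the choice to treat anything other than a clear "yes" as a non-match; neither is documented firewall behavior.

```python
def answer_to_match(answer: str) -> bool:
    """Map a judge reply to a binary match (hypothetical helper)."""
    # Normalize whitespace, surrounding quotes or punctuation, and case.
    normalized = answer.strip().strip("'\".!").lower()
    # Assumption: only an explicit "yes" counts as a match; everything
    # else, including ambiguous replies, is treated as no match.
    return normalized == "yes"
```

Treating ambiguous replies as non-matches is only one possible policy; a stricter firewall might prefer to flag them for review instead.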

Rule fields

Field	Notes
slug	URL-safe identifier. Used in the firewall rule list.
name	Display name shown in the catalog and the rule editor.
description	One-line summary of the rule intent.
prompt	The evaluation prompt sent to the LLM judge.
source	Origin of the rule (e.g. community, internal). Used for filtering and provenance.
category	Logical grouping: safety, privacy, compliance, abuse, etc.
default_severity	low | medium | high | critical. Influences the matching output and the dashboard badge.
recommended	Marks a rule as a sensible default. The catalog shows a Recommended filter.
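The fields above could be modeled in client code roughly as follows. This is a minimal sketch, not the product's schema: the `Rule` class, the slug pattern, and the decision to make `source` and `recommended` optional are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Severity values as listed in the field table.
SEVERITIES = ("low", "medium", "high", "critical")
# Assumption: "URL-safe" is read here as lowercase letters, digits, and hyphens.
SLUG_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


@dataclass
class Rule:
    slug: str
    name: str
    description: str
    prompt: str
    category: str
    default_severity: str
    source: str = ""           # assumption: optional
    recommended: bool = False  # assumption: optional, off by default

    def __post_init__(self) -> None:
        # Reject slugs that would need escaping in a URL.
        if not SLUG_PATTERN.match(self.slug):
            raise ValueError(f"slug is not URL-safe: {self.slug!r}")
        # Reject severities outside the documented set.
        if self.default_severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.default_severity!r}")
```

A rule file parsed with `json.load` could then be checked with `Rule(**data)`, which raises `ValueError` on a malformed slug or an unknown severity.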

rule.json

{
  "slug": "no-pii",
  "name": "No PII",
  "description": "Reject prompts that ask for personally identifiable information.",
  "prompt": "Does the following prompt request personally identifiable information (PII) such as full names, email addresses, phone numbers, government IDs, or financial account numbers? Answer with 'yes' or 'no'.",
  "category": "privacy",
  "default_severity": "high"
}

Writing good prompts

The judge is a small chat instance, so keep the rule prompt short and binary. "Answer with yes or no" framing works best; open-ended scoring produces noisy matches. Reference the specific category (PII, medical, financial) rather than abstract terms like "harmful".
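The guidance above can be turned into a simple pre-publish check. The sketch below is hypothetical: `lint_rule_prompt`, the 400-character limit, and every vague term other than "harmful" are illustrative assumptions, not limits enforced by the firewall.

```python
import re

MAX_PROMPT_CHARS = 400  # assumption: arbitrary illustrative limit
# "harmful" comes from the guidance; the other terms are illustrative.
VAGUE_TERMS = ("harmful", "bad", "inappropriate")


def lint_rule_prompt(prompt: str) -> list[str]:
    """Return warnings for a rule prompt (hypothetical helper)."""
    warnings = []
    lowered = prompt.lower()
    # Short prompts suit a small judge.
    if len(prompt) > MAX_PROMPT_CHARS:
        warnings.append("prompt is long; consider shortening it")
    # Binary framing: the prompt should mention both "yes" and "no".
    if not (re.search(r"\byes\b", lowered) and re.search(r"\bno\b", lowered)):
        warnings.append("prompt does not ask for a yes/no answer")
    # Abstract terms tend to produce noisy matches.
    for term in VAGUE_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", lowered):
            warnings.append(f"abstract term {term!r}; name a specific category instead")
    return warnings
```

A keyword check like this is only a heuristic; it catches missing yes/no framing but cannot tell whether the question itself is well scoped.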

Rules are reusable across firewalls. A single no-pii rule can sit behind every customer-facing firewall, and a category filter on the rule catalog makes it easy to build a complete set for a new product line.
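Building a rule set from the catalog by category might look like the sketch below. It assumes the catalog is available as a list of dictionaries with the fields described earlier; `select_rules` and every slug other than `no-pii` are hypothetical.

```python
from typing import Optional


def select_rules(
    catalog: list[dict],
    category: Optional[str] = None,
    recommended_only: bool = False,
) -> list[str]:
    """Return the slugs of catalog rules that pass the filters."""
    selected = []
    for rule in catalog:
        # Skip rules outside the requested category, if one was given.
        if category is not None and rule.get("category") != category:
            continue
        # Assumption: a missing "recommended" field means not recommended.
        if recommended_only and not rule.get("recommended", False):
            continue
        selected.append(rule["slug"])
    return selected


# Illustrative catalog; only "no-pii" appears in this documentation.
catalog = [
    {"slug": "no-pii", "category": "privacy", "recommended": True},
    {"slug": "no-medical-advice", "category": "safety"},
    {"slug": "no-account-numbers", "category": "privacy"},
]
privacy_rules = select_rules(catalog, category="privacy")
```

Because rules are referenced by slug, the same `no-pii` entry can appear in the selection for several firewalls without being copied.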