PII Detection
The PII detection endpoint identifies personally identifiable information (PII) in text and returns structured spans along with a ready-to-use redacted string. It is built on OpenAI Privacy Filter, a 1.5B sparse-MoE token classifier.
Detected categories: account_number, private_address, private_email, private_person, private_phone, private_url, private_date, secret
Quick start
```bash
curl https://api.casola.ai/openai/v1/pii \
  -H "Authorization: Bearer $CASOLA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/privacy-filter",
    "input": "My name is Harry Potter and my email is harry.potter@hogwarts.edu"
  }'
```
Response:
```json
{
  "object": "pii.detection",
  "model": "openai/privacy-filter",
  "results": [
    {
      "spans": [
        {
          "label": "private_person",
          "start": 11,
          "end": 23,
          "score": 0.9987,
          "text": "Harry Potter"
        },
        {
          "label": "private_email",
          "start": 40,
          "end": 65,
          "score": 0.9994,
          "text": "harry.potter@hogwarts.edu"
        }
      ],
      "redacted_text": "My name is [PRIVATE_PERSON] and my email is [PRIVATE_EMAIL]"
    }
  ]
}
```
Batch inputs
Pass an array of strings to process multiple texts in a single request:
```json
{
  "model": "openai/privacy-filter",
  "input": [
    "Call me at 555-867-5309.",
    "Ship to 1 Infinite Loop, Cupertino, CA 95014."
  ]
}
```
The results array matches the order of the input array.
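As a rough illustration, a batch call might look like the following Python sketch, which uses only the standard library. The endpoint, headers, and request fields come from the curl example; the helper names `build_payload`, `detect_pii`, and `pair_results` are illustrative and not part of any official client, and error handling is left out.

```python
import json
import os
import urllib.request

API_URL = "https://api.casola.ai/openai/v1/pii"  # endpoint from the curl example
MODEL = "openai/privacy-filter"


def build_payload(texts):
    """Build the request body for a batch of input strings."""
    return {"model": MODEL, "input": list(texts)}


def detect_pii(texts, api_key=None):
    """POST a batch to the PII endpoint and return the parsed JSON response."""
    api_key = api_key or os.environ["CASOLA_API_KEY"]
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(texts)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def pair_results(texts, response):
    """Pair each input with its result, relying on the documented ordering."""
    results = response["results"]
    if len(results) != len(texts):
        raise ValueError("expected one result per input")
    return list(zip(texts, results))
```

Calling `pair_results(texts, detect_pii(texts))` then yields one `(input, result)` tuple per string, so each `redacted_text` can be matched back to the text it came from.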
Response fields
| Field | Type | Description |
|---|---|---|
| results[].spans | array | Detected PII spans (see below) |
| results[].redacted_text | string | Input text with each span replaced by its uppercased label in brackets (e.g. [PRIVATE_EMAIL]) |
Each span has:
| Field | Type | Description |
|---|---|---|
| label | string | PII category (e.g. private_email) |
| start | integer | Character start offset (inclusive) |
| end | integer | Character end offset (exclusive) |
| score | float | Average token-level confidence for this span |
| text | string | The matched substring |
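The half-open `start`/`end` offsets map directly onto string slicing, which makes it straightforward to apply your own redaction policy instead of using `redacted_text`. The sketch below is one way to do that. It assumes that offsets count Unicode code points (as Python string indices do) and that spans do not overlap; neither is confirmed here, so `check_spans` is included to verify the offsets against the returned `text`. The `min_score` threshold and `placeholder` template are illustrative parameters, not API options.

```python
def check_spans(text, spans):
    """Return True if every span's text equals text[start:end]."""
    return all(text[s["start"]:s["end"]] == s["text"] for s in spans)


def redact(text, spans, min_score=0.0, placeholder="[{label}]"):
    """Replace each span scoring at least min_score with a placeholder."""
    kept = [s for s in spans if s["score"] >= min_score]
    # Work right to left so earlier offsets stay valid after each replacement.
    for span in sorted(kept, key=lambda s: s["start"], reverse=True):
        tag = placeholder.format(label=span["label"].upper())
        text = text[:span["start"]] + tag + text[span["end"]:]
    return text


# Input and spans copied from the quick start response.
text = "My name is Harry Potter and my email is harry.potter@hogwarts.edu"
spans = [
    {"label": "private_person", "start": 11, "end": 23,
     "score": 0.9987, "text": "Harry Potter"},
    {"label": "private_email", "start": 40, "end": 65,
     "score": 0.9994, "text": "harry.potter@hogwarts.edu"},
]

print(check_spans(text, spans))
print(redact(text, spans))
print(redact(text, spans, min_score=0.999))
```

With the default threshold, the output reproduces the `redacted_text` from the quick start; raising `min_score` to 0.999 leaves the name in place and redacts only the email address.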
Async mode
For large inputs or batch workloads, add "async": true to get a job ID you can poll:
```bash
curl https://api.casola.ai/openai/v1/pii \
  -H "Authorization: Bearer $CASOLA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"openai/privacy-filter","input":"...","async":true}'
```
Poll the returned job ID at GET /api/jobs/{id}.
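A polling loop for the job endpoint could be sketched as below. The shape of the job response is not described in this section, so the helper takes two caller-supplied functions rather than hard-coding field names: `fetch_job` is expected to perform `GET /api/jobs/{id}` and return the parsed JSON, and `is_finished` decides when to stop. The `status` field and `"completed"` value mentioned afterwards are hypothetical placeholders.

```python
import time


def poll_job(fetch_job, job_id, is_finished,
             interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_job(job_id) until is_finished(job) is true or timeout elapses."""
    waited = 0.0
    while True:
        job = fetch_job(job_id)
        if is_finished(job):
            return job
        if waited >= timeout:
            raise TimeoutError(
                f"job {job_id} not finished after {timeout} seconds"
            )
        sleep(interval)
        waited += interval
```

A call might look like `poll_job(fetch_job, job_id, lambda job: job.get("status") == "completed")`, with the predicate adjusted to whatever the job response actually contains. The `sleep` parameter exists so the loop can be exercised in tests without real delays.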
Model notes
- Context limit: the underlying banded-attention model supports up to 512 tokens per call; longer inputs are silently truncated. Split very long documents into paragraphs before processing.
- Language support: optimized for English. Cross-lingual detection is partially supported for proper names and email/URL/phone patterns.
- Not a generative model: Privacy Filter runs as a pure token classifier; it does not generate text and has no system prompt.
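The paragraph-splitting advice in the context-limit note can be sketched as follows. `split_paragraphs` breaks a document on blank lines and remembers where each paragraph started, and `merge_spans` shifts the per-paragraph span offsets back into whole-document coordinates. Both helpers are illustrative. Splitting on blank lines does not by itself guarantee that a paragraph fits in 512 tokens, since the tokenizer is not specified here; an unusually long paragraph would need further splitting.

```python
import re


def split_paragraphs(text):
    """Return (offset, paragraph) pairs for blank-line-separated paragraphs."""
    chunks = []
    start = 0
    for sep in re.finditer(r"\n\s*\n", text):
        if sep.start() > start:
            chunks.append((start, text[start:sep.start()]))
        start = sep.end()
    if start < len(text):
        chunks.append((start, text[start:]))
    return chunks


def merge_spans(chunks, results):
    """Shift each chunk's span offsets back into whole-document coordinates."""
    merged = []
    for (offset, _), result in zip(chunks, results):
        for span in result["spans"]:
            merged.append({
                **span,
                "start": span["start"] + offset,
                "end": span["end"] + offset,
            })
    return merged
```

The paragraphs from `split_paragraphs` can be sent as one batch request; because results come back in input order, zipping them with the chunk list is enough to recover document-level offsets.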