A redaction tool scans for patterns. It finds a name and replaces it with [NAME-1]. It finds an account number and replaces it with [ACCOUNT-1]. Anything it doesn’t recognise as a pattern, it leaves intact.
A tokenisation tool replaces known entities with opaque identifiers. Same strength. Same weakness. The category of data it doesn’t know to protect stays fully exposed.
Neither approach sees a woman in London with 101 dogs. The quasi-identifier pattern is invisible to regex and to the named-entity models most DLP tools are built on. A combination of postcode, age, and job title identifies one specific employee without any of those three fields being flagged as sensitive in isolation. A sentence where the event itself is the identifier, like a heart attack during a specific televised match, cannot be anonymised by removing names.
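The blind spot is easy to reproduce. Here is a toy sketch of a pattern-based redactor; the patterns and the sample note are hypothetical illustrations, not the rules of any real DLP product:

```python
import re

# Toy pattern-based redactor. These patterns are illustrative
# assumptions, not drawn from any real tool.
PATTERNS = {
    "NAME": re.compile(r"\b(?:Mr|Ms|Dr)\.? [A-Z][a-z]+\b"),
    "ACCOUNT": re.compile(r"\b\d{8,12}\b"),
}

def redact(text: str) -> str:
    """Replace each recognised entity with a stable token like [NAME-1]."""
    for label, pattern in PATTERNS.items():
        seen: dict[str, str] = {}  # entity text -> assigned token

        def replace(m, label=label, seen=seen):
            if m.group(0) not in seen:
                seen[m.group(0)] = f"[{label}-{len(seen) + 1}]"
            return seen[m.group(0)]

        text = pattern.sub(replace, text)
    return text

note = ("Dr Patel treated a woman in London who keeps 101 dogs; "
        "account 12345678 is on file.")
print(redact(note))
# The name and the account number are tokenised, but 'a woman in
# London who keeps 101 dogs' passes through untouched: no single
# token matches a sensitive pattern, yet the combination is unique.
```

The failure is structural, not a matter of better regexes: each pattern inspects one span at a time, so a combination of individually innocuous spans can never trigger a match.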
These aren’t edge cases. In clinical notes, legal filings, complaint records, and performance reports, the most commercially valuable data is precisely the contextual narrative that conventional tools cannot redact without destroying.