Why haven’t the hyperscalers built this?

It’s a fair question. Three of the largest research budgets on Earth have been addressing this problem since LLMs became a board-level priority. Thousands of engineers. Direct commercial access to every regulated enterprise on the planet. If semantic pseudonymisation with reversibility were buildable from inside a hyperscaler, it would already exist. So what’s stopping them?

They tried. They chose redaction.

AWS ships Comprehend. Microsoft ships Presidio as open source and Purview in Azure. Google Cloud ships Sensitive Data Protection with format-preserving encryption. All three identified PII protection as a priority. All three built production tooling. All three stopped at redaction or structural tokenisation.

None shipped semantic pseudonymisation with reversibility.

This isn’t a gap in anyone’s roadmap. It’s the outcome of a product decision three different companies made independently. They built redaction because redaction fits a hyperscaler business model. Semantic pseudonymisation does not.
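The distinction matters enough to make concrete. Below is a minimal sketch of the difference, with invented entity values and surrogate names; real systems use NER models and secure mapping vaults, not string replacement. It shows only the data shapes: redaction destroys the signal irreversibly, while semantic pseudonymisation substitutes realistic surrogates and keeps a mapping so the model's response can be restored.

```python
# Toy sketch: redaction vs. reversible semantic pseudonymisation.
# Entity values, surrogates, and the replacement mechanics are
# illustrative only.

def redact(text: str, entities: dict[str, str]) -> str:
    """Redaction: replace each entity with its type label.
    Irreversible, and the model loses the semantic signal."""
    for value, label in entities.items():
        text = text.replace(value, f"[{label}]")
    return text

def pseudonymise(text: str, entities: dict[str, str],
                 surrogates: dict[str, str]) -> tuple[str, dict[str, str]]:
    """Semantic pseudonymisation: swap each entity for a realistic
    surrogate of the same type, keeping a mapping for reversal."""
    mapping = {}
    for value, label in entities.items():
        surrogate = surrogates[label]
        text = text.replace(value, surrogate)
        mapping[surrogate] = value
    return text, mapping

def restore(response: str, mapping: dict[str, str]) -> str:
    """Reversal: map surrogates in the model's response back to
    the original values before it re-enters the enterprise."""
    for surrogate, original in mapping.items():
        response = response.replace(surrogate, original)
    return response

note = "Payment of EUR 5m from Acme GmbH, contact Jane Doe."
entities = {"Acme GmbH": "ORG", "Jane Doe": "PERSON"}
surrogates = {"ORG": "Birchwood AG", "PERSON": "Eva Novak"}

print(redact(note, entities))
masked, mapping = pseudonymise(note, entities, surrogates)
print(masked)
print(restore("Draft a reply to Eva Novak at Birchwood AG.", mapping))
```

The redacted text reads "Payment of EUR 5m from [ORG], contact [PERSON]." and cannot be reversed; the pseudonymised text reads like a real payment note, and the mapping restores "Jane Doe" and "Acme GmbH" in the reply.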

A hyperscaler cannot sit across hyperscalers.

No regulated enterprise that matters runs on one cloud. A typical Tier-1 bank runs workloads on AWS for one business unit, GCP for another, Azure for Microsoft-tied productivity, Snowflake for analytics, Databricks for machine learning, and an on-premises estate for everything that cannot leave the building.

The privacy transformation layer has to sit across all of them. An AWS-native product cannot protect data flowing into GCP. An Azure-native product cannot transform a Snowflake query. The problem is structurally cross-vendor. The solution has to be architecturally outside every vendor.

That’s not a business-model choice. It’s a constraint.

A hyperscaler cannot run a sidecar inside your cluster.

Every hyperscaler privacy product has the same architectural shape. An SDK in a language your developers use, or an API your application calls, or a managed service behind that API. All of them require your developers to remember to call them. All of them can be bypassed by a developer who forgets.

Presential integrates differently. An Envoy sidecar runs inside your cluster, with a WASM filter at the network layer. Traffic crossing the pod boundary passes through the filter whether or not any developer remembered to call a privacy API. The enforcement is structural, not procedural.
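Mechanically, that shape is a WASM module in the sidecar's HTTP filter chain. The fragment below uses Envoy's standard v3 WASM filter configuration; the filter name and module path are hypothetical, not Presential's actual deployment.

```yaml
# Sketch of an Envoy sidecar HTTP filter chain with a WASM filter.
# Filter name and module path are illustrative.
http_filters:
- name: envoy.filters.http.wasm
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
    config:
      name: pseudonymisation_filter          # hypothetical filter name
      vm_config:
        runtime: envoy.wasm.runtime.v8
        code:
          local:
            filename: /etc/envoy/filter.wasm # hypothetical module path
- name: envoy.filters.http.router
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
```

Because the filter sits in the proxy's chain ahead of routing, every request leaving the pod traverses it. There is no client library for a developer to forget.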

A hyperscaler cannot ship this without running inside your cluster. Running inside your cluster means running code your security team inspected, on infrastructure your team operates, under policies you defined. That is the inverse of the hyperscaler business model, which is selling you infrastructure you rent from them.

The incentives point the other way.

Cloud providers want enterprise data flowing through their stack. Any layer that reduces the data crossing their perimeter reduces telemetry, downstream service attach, and the lock-in that makes the hyperscaler business work. Pseudonymising data before it leaves the enterprise works against their commercial interest.

Model providers have the opposite incentive. They want maximum context to produce maximum accuracy. A privacy layer that transforms PII reduces the signal their models receive. Building that layer themselves would mean building a product designed to make their core product slightly worse.

This is the same structural dynamic that protected Stripe from banks and Twilio from telcos. Both sides need the function. Neither side wants to own it.

It’s a compliance product, not an infrastructure product.

Semantic pseudonymisation requires knowing that a counterparty reference in a trade confirmation is PII, but the same phrase in a legal contract is not. That a patient identifier in a clinical note is PII, but a diagnostic code is not. That a regulator reference in a supervisory filing is metadata, not a named entity.

That knowledge is specific to industry and regulator, and it compounds with every deployment. It cannot be built from outside the enterprise. A hyperscaler can build detection. It cannot build judgement.
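A toy illustration of why judgement, not detection, is the hard part. The rules below are invented for the example; the point is the shape: the treatment of an entity depends on document type and regulatory context, not just on the surface pattern a detector matches.

```python
# Toy sketch: the same entity kind gets different treatment in
# different document types. These rules are invented for illustration;
# a real system encodes per-industry, per-regulator policy that
# compounds with every deployment.

POLICY = {
    # (entity_kind, doc_type) -> treatment
    ("counterparty_ref", "trade_confirmation"): "pseudonymise",
    ("counterparty_ref", "legal_contract"):     "pass_through",
    ("patient_id",       "clinical_note"):      "pseudonymise",
    ("diagnostic_code",  "clinical_note"):      "pass_through",
    ("regulator_ref",    "supervisory_filing"): "pass_through",
}

def treatment(entity_kind: str, doc_type: str) -> str:
    # Unknown combinations default to the safe side.
    return POLICY.get((entity_kind, doc_type), "pseudonymise")

print(treatment("counterparty_ref", "trade_confirmation"))
print(treatment("counterparty_ref", "legal_contract"))
```

A detector sees the same counterparty reference in both documents; only the policy table, built from inside the industry, knows that one is PII and the other is not.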

What this means for your decision.

The question isn’t whether a hyperscaler could, in principle, build what Presential builds. The question is whether it’s rational for them to do so against the grain of their business model, an architecture that cannot sit across rival clouds, and their structural incentives.

The answer has been the same since LLMs became a board-level priority. They haven’t. They won’t. The infrastructure layer between every enterprise and every model is a category that has to be built by someone who isn’t any of the parties in that transaction.

If you’ve heard a version of this objection and want to stress-test the argument, get in touch. We’d rather be pushed on it than not.