Ontology Calls for Human Verification in AI Training Data Without Sacrificing Privacy


Ontology is calling attention to a growing problem in the AI world: how do you prove that a piece of training data came from a real person without turning the whole process into a privacy nightmare?

In a recent post, the project argued that the answer should not be more surveillance. Instead of asking contributors to hand over selfies, IDs, biometric scans, and other personal details, Ontology says the industry should lean on verifiable credentials and selective disclosure so people can prove they are human without giving away everything about themselves.

That idea matters more now than it did a year ago. The AI training-data conversation has clearly shifted. It used to be mostly about scale, volume, and how much data you could gather. Now the bigger question is where that data came from, whether it is actually human-made, and how much of it has already been polluted by synthetic content.

That concern is no longer a niche issue. It has become one of the biggest headaches facing AI teams trying to build cleaner, more reliable models. Ontology says the market is already starting to treat proof of personhood like a valuable asset. Verified human data is becoming something companies may have to pay extra for.

The demand is rising, but the supply is limited, and the way many platforms plan to verify people is, in the company’s view, deeply flawed. The easiest path for most platforms is also the most invasive one.

To find out whether someone is human, they usually ask for more and more personal information. They may require a selfie, a government ID, a liveness check, behavioral tracking, device fingerprinting, or some combination of these.

Each layer may raise confidence in the verification, but it also means the user gives up more privacy. Over time, the person trying to prove they are real gets broken down into a set of data points stored on someone else’s systems. Ontology argues that this is the wrong tradeoff.

The company says the problem is not that people need to be verified. The problem is that the current model assumes verification has to come with permanent exposure. That is what happens when the industry uses centralized tools designed to collect as much data as possible. In practice, the human becomes the cost of trust.

The Real Breakthrough

The alternative Ontology is pointing to is built around the W3C Verifiable Credentials Data Model 2.0, which was announced as a Recommendation in May 2025. The idea is pretty simple, even if the cryptography behind it is not: a trusted issuer, such as a government, bank, or verification provider, can confirm something about a person once, and that credential can live on the user’s own device.
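To make the issuer and credential roles concrete, here is a rough sketch of what a proof-of-personhood credential might look like in the JSON shape defined by the VC Data Model 2.0, written as a Python dict. The `ProofOfPersonhoodCredential` type, the `did:example` identifiers, the `isHuman` claim, and the `has_basic_vc_shape` helper are illustrative assumptions, not names taken from Ontology or from the specification. A real credential would also carry a cryptographic proof from the issuer, which is omitted here.

```python
# Illustrative credential in the VC Data Model 2.0 JSON shape.
# Assumptions: the credential type, both DIDs, and the isHuman claim
# are made up for this example. The issuer's proof is omitted.
V2_CONTEXT = "https://www.w3.org/ns/credentials/v2"

credential = {
    "@context": [V2_CONTEXT],
    "type": ["VerifiableCredential", "ProofOfPersonhoodCredential"],
    "issuer": "did:example:issuer",
    "validFrom": "2025-06-01T00:00:00Z",
    "credentialSubject": {
        "id": "did:example:holder",
        "isHuman": True,
    },
}


def has_basic_vc_shape(vc: dict) -> bool:
    """Rough structural check only; this does not verify any signature."""
    context = vc.get("@context") or []
    types = vc.get("type") or []
    return (
        isinstance(context, list)
        and context[:1] == [V2_CONTEXT]      # v2 context must come first
        and "VerifiableCredential" in types  # base type must be present
        and "issuer" in vc
        and "credentialSubject" in vc
    )
```

The check covers only the outer structure. Trust in the credential comes from the issuer's signature, which a verifier would validate separately.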

When a platform later needs to know whether that person is human, the user can present a cryptographic proof instead of handing over the whole underlying record. That means the verifier gets what it needs, and nothing more.

It learns that a trusted issuer has confirmed the person is human. It does not see the person’s full identity file, biometric data, or other extra details. The issuer does not need to be contacted every time the credential is used, and the user does not end up leaving a trail of linkable identifiers across different platforms.

Ontology says the real breakthrough here is selective disclosure. That is what makes the system genuinely privacy-preserving. A credential can contain lots of information, but the user only reveals the pieces that matter for the specific request. So if a platform only needs proof of personhood, it gets exactly that and nothing else.
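One common way to build selective disclosure is with salted hash commitments, the approach used by formats such as SD-JWT. The sketch below is a toy version of that idea and is not Ontology's implementation: the issuer signs only a list of salted digests, the holder keeps the salts and values, and the verifier checks whichever claims the holder chooses to reveal. All function names and claim names are hypothetical, and an HMAC stands in for the issuer's signature purely to keep the example within the standard library. A real system would use an asymmetric signature so the verifier needs only the issuer's public key.

```python
import hashlib
import hmac
import json
import secrets

# Assumption: stand-in for the issuer's signing key. A real issuer would use
# an asymmetric key pair, and the verifier would hold only the public key.
ISSUER_KEY = secrets.token_bytes(32)


def _digest(salt, name, value):
    """Salted commitment to a single claim."""
    payload = json.dumps([salt, name, value], separators=(",", ":"))
    return hashlib.sha256(payload.encode()).hexdigest()


def _sign(digests):
    """Toy 'signature' over the sorted list of claim digests."""
    message = json.dumps(sorted(digests)).encode()
    return hmac.new(ISSUER_KEY, message, hashlib.sha256).hexdigest()


def issue(claims):
    """Issuer: commit to every claim, then sign only the commitments."""
    disclosures = {n: (secrets.token_hex(16), v) for n, v in claims.items()}
    digests = sorted(_digest(s, n, v) for n, (s, v) in disclosures.items())
    credential = {"digests": digests, "signature": _sign(digests)}
    return credential, disclosures  # the holder stores both on-device


def present(credential, disclosures, reveal):
    """Holder: reveal salt and value only for the requested claims."""
    return {
        "credential": credential,
        "disclosed": {name: disclosures[name] for name in reveal},
    }


def verify(presentation):
    """Verifier: check the signature, then each revealed claim's digest."""
    cred = presentation["credential"]
    if not hmac.compare_digest(cred["signature"], _sign(cred["digests"])):
        raise ValueError("issuer signature does not match")
    verified = {}
    for name, (salt, value) in presentation["disclosed"].items():
        if _digest(salt, name, value) not in cred["digests"]:
            raise ValueError(f"claim {name!r} was not issued")
        verified[name] = value
    return verified


credential, disclosures = issue(
    {"is_human": True, "name": "Alice Example", "date_of_birth": "1990-01-01"}
)
presentation = present(credential, disclosures, ["is_human"])
result = verify(presentation)  # contains only the is_human claim
```

The verifier sees digests for the undisclosed claims but cannot read them without the salts. Note that this salted-hash approach on its own still exposes the same digests and signature each time the credential is shown, so avoiding linkable identifiers across platforms calls for additional measures, such as unlinkable signature schemes.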

No extra personal data, no biometrics, no reusable profile fragments that could be stitched together later. The company also pointed to its own work in decentralized identity, including ONT ID and the ONTO Wallet, as examples of this approach in practice.

According to Ontology, these tools are designed to keep credentials on-device and let users generate proofs locally, without exposing their private data to issuers or verifiers. The larger point, though, is not just about Ontology. It is about where AI infrastructure is heading.

As companies race to clean up their training data and figure out what can still be trusted, the pressure to verify human contributors is only going to grow. The real question is whether the industry solves that problem by building more surveillance into the stack, or by using systems that let people prove they are real without giving up their privacy in the process.

Ontology is clearly betting on the second option. And with AI companies now worrying more about provenance than raw quantity, that bet may start looking less like a niche privacy argument and more like a practical requirement for the next phase of AI data collection.

Disclaimer: This article is copyrighted by the original author and does not represent MyToken’s views and positions. If you have any questions regarding content or copyright, please contact us (www.mytokencap.com).