Curvestone AI
Point of view

The AI trust gap is mislabelled. It's an evidence problem.

Yes, AI is coming for first-pass review. That is the wrong thing to panic about.

Ask most people in compliance whether AI will take the work junior analysts do, and you get a lot of careful hedging. I would rather be blunt about it.

"Yes, and anyone claiming otherwise is being dishonest. First-pass file review, checking promotions line by line, reconciling documents across a case, this is precisely the work AI does well at scale, and it is precisely the work juniors once cut their teeth on."

That admission tends to end the conversation, when it should start it. The interesting question is not whether the first-pass layer gets automated. It plainly does. The interesting question is what that automation actually changes, and the honest answer is that it changes far more than a headcount line.

Most of the noise about AI in financial services is aimed at a different anxiety entirely. It asks how consumers feel about robo-advice, or whether a chatbot can be trusted to move money. Those are real questions for the front office. They are not the question a compliance director is actually being paid to answer, and treating them as the same thing is how firms end up solving for the wrong problem.

The sampling model is obsolete, not just the junior

Here is the part that gets missed. When a junior reviewed a five percent sample of case files, the sample was not a virtue. It was a rationing decision forced by the fact that a human can only read so many files in a day.

"But 'displacement' misses the bigger trend. The five percent sample review was the junior's job, so AI does not simply make that junior faster, it makes the sampling model itself obsolete. The challenge in compliance was never how many files a junior could read in a day. It was consistency and coverage, and full coverage is now achievable. The change is to the function, not to any single role. The genuine risk is generational rather than immediate."

Sit with the numbers for a moment. Plenty of networks still review only a low-teens percentage of their files by hand, which means the large majority are never checked by anyone. Sampling was always a bet that the unread files looked like the read ones. Once you can check every file rather than a slice, the bet disappears, and so does the whole mental model built around it. This is the same shift we describe in "AI's real job in wealth management is compliance, not advice": the win is not a faster reviewer, it is coverage that used to be impossible.
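To put a rough number on that bet, here is a minimal sketch. It assumes a simple random sample, and the figures in the usage note are invented for illustration rather than taken from any real network. The function name prob_sample_misses_all is hypothetical.

```python
from math import comb


def prob_sample_misses_all(total_files: int, defective: int, sample_size: int) -> float:
    """Chance that a simple random sample contains none of the defective files.

    Hypergeometric probability of drawing zero defective files when
    sampling without replacement.
    """
    if sample_size > total_files - defective:
        # The sample is too large to avoid every defective file.
        return 0.0
    return comb(total_files - defective, sample_size) / comb(total_files, sample_size)


# Illustrative numbers only: 1,000 files, 10 with a problem, a 5% sample.
p_miss = prob_sample_misses_all(total_files=1000, defective=10, sample_size=50)
print(f"Chance the sample sees none of the problem files: {p_miss:.2f}")
```

Under those made-up numbers the sample misses every problem file more often than it finds one. Set sample_size equal to total_files and the chance of missing them all falls to zero, which is the whole point of full coverage.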

That is why "displacement" is too small a word. You are not replacing a person with a quicker person. You are retiring a way of working.

The real risk is generational

If the first-pass layer is where juniors learned the craft, automating it raises an obvious problem: where does the senior reviewer of 2035 come from?

"You do not learn to judge a case by approving an AI's output. Firms have to rebuild training deliberately. Use the AI as a teaching instrument, show the trainee the flag and the reasoning behind it and make them adjudicate rather than rubber-stamp, and route people through the genuinely hard cases the AI cannot resolve on its own. The firms that treat AI purely as a headcount cut will, in a decade, have nobody left who can actually supervise the AI."

In most firms, real compliance judgement already sits with two or three senior people who learned it the slow way. Lose them and quality drops the same week. The instinct to treat AI as a way to thin the junior bench makes that concentration worse, not better, because it removes the exact work through which the next generation would have earned their judgement.

The constructive move is to automate the completeness layer and push people up into suitability and the genuinely hard calls. When the system shows a reviewer the flag and the reasoning behind it and asks them to adjudicate, the same mechanism that produces the audit trail doubles as a training instrument. Used that way, AI is not the thing that hollows out the bench. It is the thing that rebuilds it.

Trust is not a feeling, it is an evidence trail

Now to the phrase everyone reaches for. Search for the "AI trust gap in financial services" and you will find survey after survey measuring how consumers feel. Only 19% of Americans say they trust AI in financial services, and the sector ranks last of every industry tested (YouGov, 2026). Interesting, but it answers a question no regulator will ever ask.

"Most 'responsible AI' messaging aims at the wrong anxiety. It reassures people about ethics in the abstract while the actual question keeping a compliance director awake is, 'Can I reproduce this decision in eighteen months when the Financial Ombudsman asks why we acted on it?' You close the gap by making outputs auditable and reproducible by design, and keeping a human visibly accountable in the loop."

Consumer surveys measure a feeling. Regulators do not audit feelings; they audit records. In regulated finance the gap that matters is not between a brand and its customers, it is between a firm and the regulator, the auditor, or the Financial Ombudsman. It is closed with evidence, not reassurance.

Reproducible by design is a concrete thing, not a slogan. Every output should trace to the rule that was applied, the evidence the system saw, and the human decision that followed. Overrides should carry a documented reason. A follow-up check should show a clear diff of exactly what changed between versions. Underneath that, the operational hygiene has to be real: ISO 27001 certification, UK and EEA data residency, independent CREST-certified penetration testing, and a full audit trail. On a live residential-mortgage deployment, our automated checks reached 95 to 99% accuracy across roughly 1,500 data points with no high-severity errors. Capability, in other words, is not the blocker. A system that is 99% accurate and 0% reproducible is still undeployable in regulated finance, because reproducibility, not accuracy, is the property a regulator can test.
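The traceability described above can be sketched as a small data structure. This is an illustration under stated assumptions, not a description of any production schema: the CheckRecord type, its field names, and the helpers fingerprint and diff_records are all hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import hashlib


@dataclass(frozen=True)
class CheckRecord:
    """One automated check plus the human decision that followed it (hypothetical schema)."""
    case_id: str
    rule_id: str            # the rule that was applied
    rule_version: str
    evidence_sha256: str    # fingerprint of the exact evidence the system saw
    ai_outcome: str         # e.g. "flag" or "pass"
    reviewer: str           # the accountable human
    human_decision: str     # "accept" or "override"
    override_reason: Optional[str] = None

    def __post_init__(self) -> None:
        # An override without a documented reason is rejected at creation time.
        if self.human_decision == "override" and not self.override_reason:
            raise ValueError("override requires a documented reason")


def fingerprint(evidence: bytes) -> str:
    """Hash the evidence so a later reviewer can confirm what was actually seen."""
    return hashlib.sha256(evidence).hexdigest()


def diff_records(old: CheckRecord, new: CheckRecord) -> dict:
    """Return only the fields that changed between two versions of a check."""
    before, after = asdict(old), asdict(new)
    return {key: (before[key], after[key]) for key in before if before[key] != after[key]}
```

Storing a hash of the evidence, rather than trusting that the file on disk is unchanged, is one way to show eighteen months later that the decision was made on exactly the documents in the record. Making the record immutable and refusing reasonless overrides pushes the discipline into the structure instead of leaving it to habit.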

One problem, not three

Job displacement, agentic automation, and the trust gap are usually discussed as three separate dilemmas. They are one. As the machine takes on more of the work, the only thing that keeps a firm safe is being able to evidence what it did and to show that a human remained accountable for it. More autonomy raises the stakes of that evidence trail; it does not change the principle.

So the instruction for anyone buying compliance AI right now is simple. Stop shopping for reassurance and start demanding reproducibility. Ask a vendor to show you, on your own files, how a decision made today could be reconstructed in eighteen months. Treat AI as a teaching instrument for the people who will one day supervise it, not as a line to cut. That is the problem we build for at Curvestone. If reproducibility is what your regulator will eventually ask about, that is the conversation worth having.

Sources
  1. Finextra: Wealth's three inseparable dilemmas: job displacement, agentic automation, and the AI trust gap
  2. YouGov: Americans still don't trust banking sector AI use
  3. Financial Conduct Authority: Consumer Duty
  4. Financial Ombudsman Service
Written by

Dawid Kotur

CEO and co-founder, Curvestone

Dawid co-founded Curvestone in 2024 after a decade working at the intersection of financial services and applied machine learning. He writes about the strategic direction of regulated-industry AI, the FCA's evolving approach to model risk, and the operational changes UK lenders are making in response to Consumer Duty. He sits on the FCA Smart Data Accelerator advisory cohort.
