
The Trust Deficit in AI: Why It Matters

Loren Cossette · March 16, 2026 · 13 min read

Tags: artificial intelligence, AI trust, explainability, AI governance, human-centered AI, organizational leadership, enterprise AI, responsible technology, ethical AI, algorithmic accountability, AI failure, institutional trust, centaur scientists, AI design, future of work, AI strategy, technology ethics, digital governance

The thing about trust in AI that most organizations miss — and it's the blind spot that derails their most ambitious projects — is that trust isn't a technical problem with a technical solution. Herremans discovered this when she analyzed why 34% of AI initiatives fail outright, finding that companies keep throwing engineering talent at what's fundamentally a human perception challenge. The numbers tell one story: only 15% of AI projects deliver their intended business value. But the real story lives in the gap between what these systems can do and what people believe they'll do with that capability.

When Da Silva and her team studied AI governance in healthcare, they found something that extends far beyond medical settings: the technologies that work flawlessly in controlled environments often collapse when they encounter the messy realities of human judgment and institutional politics. A radiologist might trust an AI system to flag potential tumors, but that same doctor will resist algorithmic recommendations about treatment protocols because the stakes feel different, the context more complex, the margin for error more personal. This isn't irrationality — it's pattern recognition of a different kind, one that accounts for variables the AI never learned to see.

The trust deficit operates on multiple levels simultaneously. Sigfrids and his colleagues identified how current human-centered AI frameworks focus on individual user experiences while ignoring the broader social systems where these tools actually function. You can perfect the interface, optimize the accuracy metrics, and satisfy every regulatory requirement, but if the community doesn't understand why the AI makes the decisions it makes, trust remains fragile. The system works until it doesn't, and when it fails, it fails spectacularly because no one knew how to intervene.

This creates what Thaler calls the "centaur scientist" problem, though it applies far beyond scientific research: the people best positioned to bridge human judgment and machine capability are exactly the people most organizations struggle to develop or retain. They need deep technical understanding and social intuition, domain expertise and systems thinking. The MIT programs Thaler describes produce students who can navigate both worlds, but they're still rare enough that most organizations operate with translators who understand one side deeply and the other only superficially.

But here's where the trust problem gets genuinely interesting: it's not just about explaining what AI does; it's about revealing what it fundamentally cannot do. The chatbots that Brehm's students design for young adults work precisely because they acknowledge their limitations upfront — News Nest doesn't pretend to replace human editorial judgment; it provides transparency about sources and political leanings so users can make informed decisions. The trust emerges from clarity about boundaries, not claims about omniscience.

The question that keeps surfacing across all these contexts — whether it's healthcare AI that needs to integrate with human clinical reasoning, financial systems that must operate within regulatory frameworks, or educational tools that shape how young people engage with information — isn't whether we can make AI more trustworthy, but whether we can make trust itself more intelligent.

Explainability as a Cornerstone of Trust

The moment AI systems can articulate why they made a particular decision, something fundamental shifts in the relationship between human and machine intelligence. De Santis and his team discovered this when they built concept bottleneck models that could extract and explain the internal logic driving computer vision systems — not just delivering accurate predictions about bird species or skin lesions, but walking users through the five specific visual concepts that led to each conclusion. What emerged wasn't just better performance than traditional black-box models, but a different kind of interaction entirely, one where the AI's reasoning process becomes visible and therefore debatable, refinable, contestable.
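To make the mechanism concrete, here is a minimal sketch of the concept-bottleneck idea: the model is forced to route every prediction through a small set of named, human-readable concept scores, so each output can be traced back to the concepts that produced it. This is an illustrative toy, not De Santis's actual architecture; the concept names, dimensions, and the choice of PyTorch are assumptions.

```python
# Minimal concept-bottleneck sketch (illustrative, not the published model):
# every prediction must pass through named, human-readable concept scores.
import torch
import torch.nn as nn

CONCEPTS = ["wing_bar", "curved_beak", "red_crown", "long_tail", "webbed_feet"]  # invented names

class ConceptBottleneck(nn.Module):
    def __init__(self, in_dim: int, n_classes: int):
        super().__init__()
        self.to_concepts = nn.Linear(in_dim, len(CONCEPTS))   # image features -> concept scores
        self.to_label = nn.Linear(len(CONCEPTS), n_classes)   # concepts alone -> species label

    def forward(self, x):
        concepts = torch.sigmoid(self.to_concepts(x))  # one score in [0, 1] per named concept
        return self.to_label(concepts), concepts

model = ConceptBottleneck(in_dim=512, n_classes=10)
features = torch.randn(1, 512)                  # stand-in for a CNN feature vector
logits, concepts = model(features)
for name, score in zip(CONCEPTS, concepts[0].tolist()):
    print(f"{name}: {score:.2f}")               # the "explanation" a user would see
print("predicted class:", logits.argmax(dim=1).item())
```

Because the classifier only ever sees the concept scores, disagreeing with a prediction means disagreeing with a specific, nameable concept, which is what makes the reasoning debatable rather than opaque.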

But explainability proves more complex than simply opening the algorithmic hood and pointing to gears. Thaler's work with interdisciplinary scientists reveals that true explainability requires what he calls "centaur scientists" — researchers who can bridge the gap between AI pattern recognition and human scientific reasoning, translating between the mathematical abstractions that drive machine learning and the conceptual frameworks that guide human understanding. The most sophisticated AI systems learn to identify patterns that humans haven't even named yet, which means explaining their decisions often requires inventing new vocabulary for phenomena we're only beginning to recognize. When an AI system identifies early markers of cardiac events in ECG data that cardiologists haven't learned to see, explainability becomes less about transparency and more about expanding human knowledge itself.

This is where the technical challenge of explainability crashes into the social reality of trust. Sigfrids and his colleagues found that even perfectly explainable AI systems fail to build trust when the explanations themselves feel alien to human decision-making processes, or when the explanations reveal biases that users find unacceptable. An AI hiring system that can perfectly articulate why it ranked certain candidates higher — detailing the specific resume keywords, educational backgrounds, and work experiences that drove its recommendations — might actually erode trust if those explanations expose patterns of discrimination that mirror historical hiring biases. Explainability becomes a double-edged tool: it can validate AI decisions by making them comprehensible, but it can also illuminate uncomfortable truths about both the data we've fed these systems and the assumptions we've embedded in their design.

The students in Brehm's MIT class discovered this tension when building News Nest, their chatbot designed to help young adults engage with credible news sources. Making the AI's news curation process explainable meant revealing not just which articles it recommended, but why it weighted certain sources over others, how it balanced different political perspectives, and what assumptions it made about user preferences and information needs. The explainability requirement forced them to confront questions they hadn't anticipated: whose definition of credibility should the system use, how should it handle the gap between what users say they want and what they actually click on, and how transparent should it be about the psychological techniques it uses to keep users engaged without falling into the attention-economy traps they're trying to avoid.
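One way to picture what explainable curation might look like in practice is a recommendation object that carries its own provenance, so the signals behind each suggestion are visible to the user rather than buried in a ranking function. This is a hypothetical sketch; the field names and example values are invented for illustration, not taken from the students' actual implementation.

```python
# Hypothetical sketch of explainable curation: every recommendation carries
# the signals that produced it. Field names and values are illustrative,
# not News Nest's actual design.
from dataclasses import dataclass

@dataclass
class Recommendation:
    headline: str
    source: str
    political_lean: str      # disclosed rather than hidden, e.g. "center-left"
    credibility_note: str    # whose definition of credibility was applied
    weight_rationale: str    # why this source was weighted over alternatives

rec = Recommendation(
    headline="City council votes on transit expansion",
    source="Example Tribune",
    political_lean="center",
    credibility_note="Rated reliable by a third-party fact-check list (itself an editorial choice)",
    weight_rationale="Boosted for local relevance; click data alone would have buried it",
)
print(rec)
```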

Explainability, it turns out, doesn't just make AI systems more trustworthy — it makes visible the entire ecosystem of human values, institutional priorities, and social assumptions that these systems inevitably encode. Which raises a more fundamental question: if we can build AI systems that perfectly explain their reasoning, but that reasoning reflects biases and priorities we're not prepared to defend, what kind of governance structures do we need to ensure that explainable AI actually serves human flourishing?

Human-Centered Governance: Bridging the Trust Gap

The problem with explainable AI isn't that the explanations are wrong — it's that they're answering the wrong question entirely. When Sigfrids and his research team examined human-centered AI governance across different organizational contexts, they discovered that people don't just want to know how an algorithm arrived at a decision; they want to know whether they can trust the entire system that produced, deployed, and oversees that algorithm. This shifts the conversation from technical transparency to institutional accountability, from debugging code to building governance structures that can hold up under the weight of genuine democratic scrutiny.

What emerges from the work at Fidelity, where Kadıoğlu has spent years building modular AI systems at enterprise scale, is a recognition that trust operates at multiple layers simultaneously. The technical architecture matters — those twelve open-source libraries need to talk to each other reliably — but the organizational architecture matters more. When Da Silva and her colleagues studied AI governance in healthcare settings, they found that even perfectly functional AI tools failed to gain adoption when the governance structures around them felt opaque or unresponsive to the people actually using the technology. The nurses spending 25% of their time on administrative tasks didn't need another black box; they needed systems designed with their workflow realities built into the decision-making process from the beginning.

This is where Herremans's findings about AI project failure rates become particularly illuminating. That 34% failure rate isn't primarily a technical problem — it's a governance problem. The companies that succeed with AI aren't necessarily the ones with the most sophisticated algorithms; they're the ones that figure out how to align technical capabilities with human decision-making processes in ways that feel sustainable to everyone involved. Brehm's work with MIT students designing chatbots like "News Nest" offers a glimpse of what this looks like in practice: AI systems that don't just provide information but actively support the kind of critical thinking that healthy democratic participation requires.

The concept bottleneck models that De Santis developed point toward something deeper than just better explainability — they suggest a way of building AI systems that make their reasoning legible not just to experts but to the communities that will live with the consequences of their decisions. When an AI system can articulate its reasoning through concepts that humans actually use to think about the world, it becomes possible to have meaningful conversations about whether those concepts are the right ones, whether the reasoning is sound, whether the system is serving the values it claims to serve.

But even this technical achievement runs up against a more fundamental challenge: trust isn't just about understanding what AI systems do, it's about believing that the people and institutions controlling those systems are acting in good faith. The question hovering over every attempt at human-centered AI governance is whether these technologies will amplify existing power structures or create new possibilities for genuine democratic participation in the decisions that shape our lives.

Real-World Applications and Their Impact

The most telling test of any governance framework isn't how it handles the clean cases but how it performs when the systems escape the laboratory and collide with messy reality. Kadıoğlu's team at Fidelity discovered this when they attempted to scale their modular AI architecture across financial services — what looked elegant in design documents became a sprawling coordination challenge involving twelve different open-source libraries, each with its own update cycles, compatibility quirks, and failure modes. The promise of modularity, that components could be swapped in and out like Lego blocks, ran headlong into the fact that financial systems don't pause for maintenance windows, and a single library upgrade could cascade through dependencies in ways that took days to trace.

This isn't a failure of the modular approach so much as a revelation about what real-world deployment actually demands. When Brehm's students at MIT designed their News Nest chatbot with ten different bird characters representing news categories, they built something that worked beautifully in controlled settings — users engaged with credible sources, avoided doomscrolling, understood the political leanings of their information. But the moment you try to scale that kind of nuanced interaction design across millions of users with different media literacies, cultural contexts, and information needs, the elegant simplicity starts to buckle. The gap between what works in the lab and what survives contact with human behavior at scale reveals something fundamental about how we've been thinking about AI deployment.

Da Silva and her collaborators found similar tensions when they tried to implement human-centered AI governance in healthcare settings. Nurses spending 25% of their time on administrative tasks seemed like a perfect target for AI assistance, but the governance frameworks designed to ensure safety and accountability often made the systems so cumbersome that they created more administrative burden, not less. The regulatory approach that protected patients by requiring extensive documentation and approval processes collided with the contextual approach that prioritized usability and workflow integration.

What emerges from these real-world collisions is a pattern that Thaler's vision of "centaur scientists" anticipated but didn't fully solve: the people who can bridge technical capability and human context are rare, and the institutional structures needed to support them are even rarer. De Santis and his team's breakthrough in concept bottleneck models — where AI systems explain their decisions using concepts they've learned rather than concepts humans pre-define — points toward a different kind of solution. Instead of requiring humans to translate between AI logic and human understanding, these systems develop explanatory frameworks that emerge from their own learning process, creating a middle ground that neither purely technical nor purely human-centered approaches could reach alone.

The question this raises cuts deeper than implementation strategy: if the most promising advances in AI explainability come from systems that generate their own conceptual frameworks, what happens to the human-centered governance structures we've been so carefully constructing?

Building a Future of Trustworthy AI

The real question isn't whether we can build trustworthy AI, but whether we can build trustworthy institutions around AI — and the gap between those two challenges turns out to be where most governance frameworks fall apart. Thaler's work mapping the intersection of AI and the mathematical sciences points toward something crucial that most policy discussions miss: the infrastructure for trustworthy AI isn't just computational or regulatory, it's fundamentally about creating new kinds of collaborative expertise. His concept of "centaur scientists" — researchers fluent in both AI and traditional disciplines — suggests that trust emerges not from perfect algorithms but from hybrid human-machine teams that can navigate complexity in real time, and this insight extends far beyond scientific research into every domain where AI makes consequential decisions.

What De Santis and his colleagues discovered while building concept bottleneck models illuminates why this hybrid approach matters so much. Their technique for making AI explain its reasoning through human-understandable concepts doesn't just improve transparency — it creates a new form of accountability where the model's logic becomes a site for ongoing negotiation between human judgment and machine pattern-recognition. When their system restricts itself to using only five concepts per prediction, it's not just being more interpretable; it's creating space for the kind of collaborative reasoning that Brehm's students are exploring in their chatbot designs, where AI becomes a partner in thinking rather than a replacement for it.
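That five-concept restriction can be read as a hard sparsity constraint on the bottleneck. A rough sketch of one way to impose it, under the assumption that the published mechanism may differ: keep only the top-k concept activations and zero out the rest before the classifier sees them.

```python
# Sketch of an "at most five concepts per prediction" constraint: mask all
# but the top-k concept activations, so the explanation a user sees is
# exactly the handful of concepts the model was allowed to use.
# Shapes and k=5 are illustrative assumptions.
import torch

def topk_concept_mask(concept_scores: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Zero out everything except the k strongest concept activations per example."""
    topk = concept_scores.topk(k, dim=-1)
    mask = torch.zeros_like(concept_scores)
    mask.scatter_(-1, topk.indices, 1.0)
    return concept_scores * mask

concept_scores = torch.rand(1, 64)               # scores for 64 learned concepts
sparse = topk_concept_mask(concept_scores, k=5)  # only 5 survive to the classifier
print("concepts used:", sparse.nonzero(as_tuple=True)[1].tolist())
```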

This partnership model runs directly counter to most current approaches to AI governance, which treat trust as something you either have or don't have rather than something you build through ongoing interaction. Kadıoğlu's work at Fidelity demonstrates what happens when you design for this kind of evolving trust from the ground up — their modular framework doesn't just make AI systems more scalable, it makes them more accountable because each component can be understood, tested, and modified independently. The fact that their open-source components have been downloaded over two million times suggests something powerful about how trust spreads: not through certification or compliance, but through transparency that invites participation.
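As a concrete illustration of that accountability claim (not Fidelity's actual framework or any of its named libraries), modularity in this sense usually means each component sits behind a small, explicit contract that can be tested on its own, so a swapped-in piece can be validated without re-auditing the whole system.

```python
# Illustrative pattern only: a component behind a minimal contract can be
# inspected, unit-tested, and replaced independently of the rest of the system.
from typing import Protocol

class Ranker(Protocol):
    def rank(self, candidates: list[str]) -> list[str]: ...

class AlphabeticalRanker:
    def rank(self, candidates: list[str]) -> list[str]:
        return sorted(candidates)

def test_ranker(ranker: Ranker) -> None:
    # The test exercises the contract, not the implementation, so any
    # swapped-in component can be validated on its own.
    assert ranker.rank(["b", "a"]) == ["a", "b"]

test_ranker(AlphabeticalRanker())
print("component passes its contract test")
```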

The students in Brehm's class understand this intuitively. Their News Nest project doesn't just combat misinformation by providing credible sources — it builds media literacy by making the sources and their political leanings visible in real time, creating what amounts to a continuous education in how information gets constructed and distributed. This approach acknowledges that trustworthy AI isn't a destination but a practice, one that requires users who are equipped to engage critically with the systems they're using.

Herremans's research on why AI projects fail reveals the institutional dimension of this challenge: 34% of AI initiatives collapse not because the technology doesn't work but because organizations haven't built the capacity to integrate AI meaningfully into their decision-making processes. Building trustworthy AI requires building trustworthy institutions — places where human expertise and machine capability can evolve together, where failure becomes learning rather than abandonment, where trust gets earned through transparency rather than proclaimed through marketing.

