By now we all know, or should know, that AI hallucinates. That's because large language models are designed to produce coherence from massive amounts of training data. Context doesn't matter, and neither does truth; all that matters is what fits the pattern.
The bigger problem is that when AI fails, it fails convincingly. That's far more dangerous than failing spectacularly, because the errors are buried inside otherwise coherent output. For businesses, the pace of change has been part of the problem: tools get adopted faster than anyone can verify what they actually do. So we need to step back and ask: Are we even using AI correctly?
Fabricating data isn’t the only risk. AI can also encode bias into decisions, show the wrong people the wrong information, or connect dots that land you in court, all while appearing to work perfectly. I’ve been working through these risks at Reliable Controls, and there’s no easy answer because each department has different risk exposure and different ways the errors can hide.
Accounting and Financial Analysis
I use AI for research. It’s powerful for finding information, providing reference links, pulling together material, and making unexpected connections. But would I connect it to my QuickBooks and ask for a quarterly analysis? Hell, no.
The danger is that when you feed an AI real data, the output becomes more convincing because it’s populated with your actual figures. It might invent a trend, fill in gaps with guesses, or summarize incomplete data with confidence.
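If you do hand a model real numbers, the obvious mitigation is to verify every figure it asserts against the books before anyone downstream sees it. The sketch below is illustrative only; the ledger entries, figures, and summary text are made up, and the regex is a simple assumption about how amounts appear, not a standard.

```python
import re

# Hypothetical ledger totals pulled from your own books; names and figures
# are illustrative, not a real schema.
ledger_totals = {
    "Q1 revenue": 412_350.00,
    "Q1 expenses": 298_140.00,
}

# The model's summary of that same data.
ai_summary = (
    "Q1 revenue came in at $412,350, up 9% on the prior quarter, "
    "while expenses held steady at $301,000."
)

# Pull out every dollar amount the summary asserts.
cited = [
    float(m.replace(",", ""))
    for m in re.findall(r"\$([\d,]+(?:\.\d+)?)", ai_summary)
]

known = set(ledger_totals.values())
for amount in cited:
    if amount not in known:
        print(f"UNVERIFIED FIGURE: ${amount:,.2f} is not in the ledger")
# Trend claims ("up 9%") need the same treatment against the underlying series.
```

A check this crude still catches the invented expense figure before it reaches a board deck, which is the point: the review step has to live outside the model.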
If models were people, you might even call them good liars. Social research shows that liars who blend truth and deception are significantly harder to detect than those who fabricate entirely. The same goes for AI: you don't catch the lie because nothing looks wrong. Just ask Deloitte Australia.
In October 2025, the global consultant admitted to submitting a 237-page government report that contained fabricated academic references, research papers, and even quotes attributed to a federal court judge. What happened to the human fact-checkers? The firm was forced to issue a partial refund and acknowledge using GPT-4o to produce parts of the report.
In some domains, though, the stakes are higher. (And incidentally, research suggests models hallucinate more often in specialized domains like law and medicine where accuracy matters the most.)
Human Resources
“Output” is such a cold word when it comes to people. We are talking about sensitive information that includes medical conditions, performance evaluations, and personal histories, all of it heavily regulated.
In HR especially, the risk is that AI will leak sensitive data, make illegal inferences about candidates, or bypass privacy controls that keep protected information restricted to the right personnel. For example, a hiring tool could infer pregnancy from patterns in the data (age, gaps in employment, search history) and factor that into recommendations, even though that’s illegal to consider.
This happens because AI does not understand context, and guardrails are usually bolted on after the fact. When hiring tools discriminate against applicants based on age, race, or sex, it is often because the model learned the wrong patterns from the data it was trained on. Optimizing for "candidates who look like past successful hires" then extends and even amplifies the human bias in the historical record.
For example, a lawsuit against Workday, now certified as a nationwide collective action, alleges age discrimination in hiring tools used by hundreds of employers. If it proceeds, millions of applicants could be included.
Legal commentary around a separate Equal Employment Opportunity Commission case made the point that employers cannot delegate compliance to vendors. Even if liability is established against the software provider, that doesn’t absolve the company that deployed the tool. The same principle applies to any HR function where AI touches employee data.
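That makes it worth running your own checks on whatever tool you deploy rather than trusting a vendor's assurances. One long-standing screen is the four-fifths (adverse-impact) rule: compare selection rates across groups and flag any group selected at less than 80 percent of the top group's rate. The sketch below uses hypothetical group labels and counts; it is an illustration, not legal advice or a complete fairness audit.

```python
from collections import Counter

# Hypothetical decision log exported from a hiring tool: (group, was_recommended).
decisions = (
    [("under_40", True)] * 90 + [("under_40", False)] * 110
    + [("over_40", True)] * 45 + [("over_40", False)] * 155
)

recommended = Counter(group for group, ok in decisions if ok)
totals = Counter(group for group, _ in decisions)
rates = {group: recommended[group] / totals[group] for group in totals}

# Four-fifths rule of thumb: flag any group selected at under 80% of the top rate.
top_rate = max(rates.values())
for group, rate in rates.items():
    ratio = rate / top_rate
    flag = "REVIEW" if ratio < 0.8 else "ok"
    print(f"{group}: selection rate {rate:.1%}, impact ratio {ratio:.2f} [{flag}]")
```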
The Operational Audit
The great virtue of AI is its ability to compound: outputs feed into other systems, scale across processes, build on themselves. But that’s also the risk. If I ask an AI for a quarterly analysis and it hallucinates a trend, that analysis goes to my CFO, shapes a board presentation, and informs budget allocation.
If we're honest, most leaders right now couldn't say with confidence whether their AI is working correctly. Often, the models are working exactly as designed. And that is the issue. Each department needs its own audit, with humans put back in the loop and the answers written down (one simple way to record them follows the list). The questions to ask are:
- What data does AI have access to?
- What decisions is it informing or making?
- Where can errors hide?
- Who reviews output before it’s acted on?
- What’s the liability if it’s wrong?
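As an illustration of what keeping score might look like, the sketch below records those five answers per AI system; the class name, fields, and example values are all hypothetical, not a compliance framework.

```python
from dataclasses import dataclass

# Illustrative record of the five audit questions, kept per AI system.
@dataclass
class AIAuditRecord:
    system: str                    # which tool or model
    data_access: list[str]         # what data it can reach
    decisions_informed: list[str]  # what it informs or decides
    error_hiding_spots: list[str]  # where errors can hide
    human_reviewer: str            # who reviews output before it's acted on
    liability_if_wrong: str        # exposure if it's wrong

# Hypothetical example for the quarterly-analysis scenario above.
quarterly_analysis = AIAuditRecord(
    system="LLM-assisted quarterly analysis",
    data_access=["general ledger export"],
    decisions_informed=["board presentation", "budget allocation"],
    error_hiding_spots=["invented trends", "gap-filled figures"],
    human_reviewer="controller checks every figure against the ledger",
    liability_if_wrong="misallocated budget, restated reporting",
)
print(quarterly_analysis)
```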
In accounting, the risk is hallucinated figures buried in real data. In HR, it's encoded bias and exposed personal data. The legal profession has already seen many cases of fabricated citations. Each function has its own failure mode, and each requires someone who understands the work well enough to catch what the model misses.
We know AI hallucinates, but without human backup and auditing systems, that black box poses a risk to business that may be as big as AI’s original promise.

