When you start a conversation with ChatGPT, you might notice some text at the bottom of the screen: “ChatGPT can make mistakes. Check important info.” That’s still the case with the new GPT-5 model, a senior OpenAI executive reiterated this week.
“The thing, though, with reliability is that there’s a strong discontinuity between very reliable and 100 percent reliable, in terms of the way that you conceive of the product,” Nick Turley, head of ChatGPT at OpenAI, said on The Verge’s Decoder podcast. “Until I think we are provably more reliable than a human expert on all domains, not just some domains, I think we’re going to continue to advise you to double-check your answer.”
That’s advice we’ve been giving about chatbots for a long while now across our AI coverage, and OpenAI gives it too.
Always double-check. Always verify. While the chatbot might be useful for some tasks, it might also totally make stuff up.
Turley is hopeful for improvement on that front.
“I think people are going to continue to leverage ChatGPT as a second opinion, versus necessarily their primary source of fact,” he said.
The problem is that it’s really tempting to take a chatbot response, or an AI Overview in Google search results, at face value. But generative AI tools (not just ChatGPT) have a tendency to “hallucinate,” or make things up. They do this because they’re built primarily to predict what the answer to a query should look like, based on the information in their training data, and they have no concrete understanding of truth. If you talk to a doctor, a therapist or a financial adviser, that person should be able to give you the correct answer for your situation, not just the most likely one. AI, for the most part, gives you the answer it determines is most probable, with no real domain expertise to check it against.
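To make that distinction concrete, here’s a toy sketch, with entirely made-up probabilities and no resemblance to how real models are actually built, of what “most likely” versus “correct” means in practice:

```python
# Toy illustration only: a language model scores possible continuations by
# how probable they look given its training data, then returns the most
# likely one. Truth is never consulted.

# Hypothetical probabilities for completing a factual question.
continuations = {
    "The game is scheduled for September": 0.46,   # plausible, but wrong
    "The game is scheduled for November": 0.31,    # the actual answer
    "The game is scheduled for next spring": 0.23,
}

def most_likely(candidates: dict[str, float]) -> str:
    """Pick the highest-probability continuation; accuracy plays no part."""
    return max(candidates, key=candidates.get)

print(most_likely(continuations))  # -> "The game is scheduled for September"
```

The most probable answer wins even when a less probable one happens to be true, and that gap is the heart of the hallucination problem.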
While AI is pretty good at guessing, it’s still, for the most part, just guessing. Turley acknowledged that the tool does best when paired with something that provides a better grasp of the facts, like a traditional search engine or a company’s specific internal data. “I still believe that, no question, the right product is LLMs connected to ground truth, and that’s why we brought search to ChatGPT and I think that makes a huge difference,” he said.
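As a rough sketch of what “LLMs connected to ground truth” looks like in practice, here’s the retrieve-then-answer pattern in miniature. Everything here, `search_index`, `grounded_answer` and the sample data, is a hypothetical stand-in, not an OpenAI API:

```python
# A minimal sketch of grounding: retrieve text from a trusted source first,
# then have the model answer only from what was retrieved.

def search_index(query: str) -> list[str]:
    """Hypothetical search step returning snippets from a trusted source."""
    knowledge_base = {
        "kickoff": "Kickoff is scheduled for November 8, per the team's official site.",
    }
    return [text for key, text in knowledge_base.items() if key in query.lower()]

def grounded_answer(query: str, sources: list[str]) -> str:
    """Hypothetical model call, constrained to the retrieved snippets."""
    if not sources:
        return "I couldn't find a reliable source for that."
    return f"Based on the retrieved source: {sources[0]}"

query = "When is kickoff?"
print(grounded_answer(query, search_index(query)))
```

The point is that the model’s fluent guessing gets anchored to text it actually retrieved, which is why Turley says bringing search to ChatGPT makes such a difference.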
(Disclosure: Ziff Davis, CNET’s parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)
Don’t expect ChatGPT to get everything right just yet
Turley said GPT-5, the newest large language model that undergirds ChatGPT, is a “huge improvement” in terms of hallucination, but it’s still far from perfect. “I’m confident we’ll eventually solve hallucinations and I’m confident we’re not going to do it in the next quarter,” he said.
In my testing of GPT-5, I’ve already seen it make a few errors. When I tried out the model’s new personalities, it got confused about a college football schedule and said games spread throughout the fall would all happen in September.
Be sure to check whatever information you get from a chatbot against a reliable source of truth. That could be an expert, like a doctor, or a trusted source on the internet. And even if a chatbot gives you information with a link to a source, don’t trust that the bot summarized that source accurately. It may have mangled the facts on its way to you.
If you’re going to make a decision based on that information, double-check what the AI tells you, unless you truly don’t care about getting it right.