
    Tests reveal that ChatGPT-5 hallucinates less than GPT-4o did – and Grok is still the king of making stuff up



    • ChatGPT-5 scores a low 1.4% on the Hallucination Leaderboard
    • This puts it ahead of ChatGPT-4, which scores 1.8%, and GPT-4o, which scores 1.49%
    • Grok 4 is much higher at 4.8%, while Gemini-2.5 Pro is at 2.6%

    When OpenAI launched ChatGPT-5 on Thursday last week, one of the big selling points that CEO Sam Altman emphasized was that ChatGPT-5 was the most “powerful, smart, fastest, reliable and robust version of ChatGPT that we’ve ever shipped”. In the presentation, OpenAI staff also emphasized that ChatGPT-5 would “mitigate hallucinations”.

    When an AI makes something up, it’s called a hallucination. While hallucination rates are dropping across all LLMs, hallucinations are still surprisingly common, and they are one of the main reasons that we can’t trust AI to perform a task without human supervision.



