
    OpenAI’s Deep Research smashes records for the world’s hardest AI exam, with ChatGPT o3-mini and DeepSeek left in its wake




    • The accuracy achieved by the top-scoring AI on the world’s hardest benchmark has improved by 183% in just two weeks
    • ChatGPT o3-mini now scores up to 13% accuracy depending on capacity
    • OpenAI Deep Research obliterates competition with 26.6% accuracy result

    The world’s hardest AI exam, Humanity’s Last Exam, launched less than two weeks ago, and we’ve already seen a huge jump in accuracy, with ChatGPT o3-mini and now OpenAI’s Deep Research topping the leaderboard.

    The AI benchmark, created by experts from around the world, contains some of the hardest reasoning problems and questions known to man – it’s so hard that when I previously wrote about Humanity’s Last Exam, I couldn’t even understand one of the questions, let alone answer it.





    john-anthony.disotto@futurenet.com (John-Anthony Disotto)
