
    Could you pass ‘Humanity’s Last Exam’? Probably not, but neither can AI


    Did you know that some of the smartest people on the planet create benchmarks to test how well AI can replicate human intelligence? Scarily enough, AI models breeze through most of those benchmarks, showing just how capable the likes of ChatGPT’s GPT-4o, Google’s Gemini 1.5, and even the new o3-mini really are.

    In the quest to create the hardest benchmark possible, Scale AI and the Center for AI Safety (CAIS) have teamed up to create Humanity’s Last Exam, a test they’re calling a “groundbreaking new AI benchmark that was designed to test the limits of AI knowledge at the frontiers of human expertise.”

    john-anthony.disotto@futurenet.com (John-Anthony Disotto)
