
    I compared ChatGPT 4.1 to o3 and 4o to find the most logical AI model – the result seems almost irrational



OpenAI's release of GPT-4.1 for ChatGPT came quietly but represents an impressive upgrade, albeit one focused specifically on logical reasoning and coding. Its enormous context window and grasp of structured thinking could open doors for a lot of new programming and puzzle-solving. But OpenAI often brags about the coding abilities of its models in ways that the not-so-technically-minded find tedious at best.

I decided it might be more interesting to apply the natural extension of logical coding to more human interests: specifically, riddles and logic puzzles. Rather than simply seeing how GPT-4.1 performed on its own, I decided to run it against a couple of other ChatGPT models. I picked GPT-4o, the default choice available to every ChatGPT user, as well as o3, OpenAI's high-octane reasoning model designed to chew through math, code, and puzzles with scalpel-like precision. This Logic Olympics wouldn't be particularly scientific, but it would show at least a flavor of how the models stack up.

    Cat in a box

[Image: https://cdn.mos.cms.futurecdn.net/dxJ4XkfkKijr8wSWeyfksA.jpg]



By Eric Hal Schwartz
