
    I tried the most realistic AI voice companion ever created – if ChatGPT or Gemini ever gets this good, reality is in trouble


    I have spent a lot of time talking to AI. I’ve tested every voice assistant, every chatbot, and every “next-generation” conversational AI that tech companies love to hype up. But I’ve never encountered anything quite like Sesame. This AI companion isn’t just good; it’s eerily accurate at mimicking how people talk, precisely because it imitates the imperfections of human speech.

    Let’s start with what Sesame actually is. Unlike the AI voices we’ve come to know from ChatGPT and Gemini, or, going back further, the early days of Siri and Alexa, Sesame is designed to sound human, flaws included, rather than like a perfect customer service agent. The AI’s speech is fluid, expressive, and unpredictably human. It briefly chuckles when it says something mildly amusing, hesitates before answering a question, and even seems to change its ‘mind’ mid-sentence, pausing and then starting over with a new sentence. It not only lets me interrupt it, it can interrupt me as well, and will even apologize for doing so.


    (Image credit: Sesame)

    The secret sauce is Sesame’s Conversational Speech Model (CSM), which blends text and audio into a single process, meaning that it doesn’t just generate a sentence and then “read it out.” Instead, it creates speech in a way that mirrors how humans actually talk, with pauses, ums, tonal shifts, and all. ChatGPT and Gemini’s voice options, while impressive, still operate in a structured way, generating text and then converting it into speech. Sesame, on the other hand, speaks as if it’s thinking, making its responses feel incredibly natural.

    Author: Eric Hal Schwartz
