    Apple embraces Nvidia GPUs to accelerate LLM inference via its open source ReDrafter tech




    • ReDrafter delivers 2.7x more tokens per second compared to traditional auto-regression
    • ReDrafter could reduce latency for users while using fewer GPUs
    • Apple hasn’t said when ReDrafter will be deployed on rival AI GPUs from AMD and Intel

    Apple has announced a collaboration with Nvidia to accelerate large language model (LLM) inference using Apple's open source technology, Recurrent Drafter (ReDrafter for short).

    The partnership aims to address the computational challenges of auto-regressive token generation, where each new token depends on all the tokens before it. Overcoming these challenges is crucial for improving efficiency and reducing latency in real-time LLM applications.
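To make the bottleneck concrete, here is a minimal toy sketch of the general draft-and-verify idea that speculative decoding methods build on. It is not ReDrafter's actual algorithm, and it does not use Apple's or Nvidia's code. The two "models" (`target_next`, `draft_next`) are hypothetical stand-in functions invented for this illustration, and counting one "pass" per verification step assumes a real LLM would check all drafted positions in a single batched forward pass.

```python
from typing import List, Tuple

Token = int


def target_next(prefix: List[Token]) -> Token:
    """Hypothetical stand-in for the large model: a deterministic toy rule."""
    return (sum(prefix) * 31 + len(prefix)) % 10


def draft_next(prefix: List[Token]) -> Token:
    """Hypothetical stand-in for a cheap draft model.

    It agrees with the target except when the prefix length is a multiple of 4.
    """
    guess = target_next(prefix)
    return (guess + 1) % 10 if len(prefix) % 4 == 0 else guess


def autoregressive(prompt: List[Token], n_new: int) -> Tuple[List[Token], int]:
    """Plain auto-regression: one large-model pass per generated token."""
    tokens = list(prompt)
    passes = 0
    for _ in range(n_new):
        tokens.append(target_next(tokens))
        passes += 1
    return tokens, passes


def draft_and_verify(
    prompt: List[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy draft-and-verify: the output matches plain auto-regression."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    passes = 0
    while len(tokens) < goal:
        # 1. The cheap draft model proposes k tokens, one after another.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))

        # 2. The large model checks the proposal. This toy calls it per
        #    position, but counts the whole check as ONE pass (assumption:
        #    a real LLM scores all drafted positions in one forward pass).
        passes += 1
        accepted: List[Token] = []
        for guess in proposal:
            truth = target_next(tokens + accepted)
            accepted.append(truth)  # the large model's token is always kept
            if truth != guess:
                break  # first mismatch: discard the rest of the draft
        else:
            # Every drafted token matched, so the same pass also yields
            # one extra token from the large model.
            accepted.append(target_next(tokens + accepted))

        tokens.extend(accepted)
    return tokens[:goal], passes


if __name__ == "__main__":
    base, base_passes = autoregressive([1, 2, 3], 20)
    spec, spec_passes = draft_and_verify([1, 2, 3], 20, k=4)
    print("same output:", base == spec)
    print("large-model passes:", base_passes, "vs", spec_passes)
```

The point of the sketch is that both functions produce the same tokens, but the draft-and-verify loop needs fewer large-model passes whenever the draft model guesses well. How much is saved in practice depends on the draft model's accuracy and on hardware, which this toy does not model.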




    waynewilliams@onmail.com (Wayne Williams)
