    ‘A virtual DPU within a GPU’: Could clever hardware hack be behind DeepSeek’s groundbreaking AI efficiency?



    • A new approach called DualPipe seems to be the key to DeepSeek’s success
    • One expert describes it as an on-GPU virtual DPU that maximizes bandwidth efficiency
    • While DeepSeek has used Nvidia GPUs only, one wonders how AMD’s Instinct would fare

    China’s DeepSeek AI chatbot has stunned the tech industry, representing a credible alternative to OpenAI’s ChatGPT at a fraction of the cost.

    A recent paper revealed that DeepSeek V3 was trained on a cluster of 2,048 Nvidia H800 GPUs – crippled versions of the H100 (we can only imagine how much more powerful it would be running on AMD Instinct accelerators!). Training reportedly required 2.79 million GPU-hours in total, covering pretraining on 14.8 trillion tokens and subsequent fine-tuning, and cost – according to calculations made by The Next Platform – a mere $5.58 million.
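
    To put those figures in perspective, the short sketch below works out what they imply. It uses only the numbers quoted above; the function names are illustrative, and the wall-clock estimate assumes all 2,048 GPUs were busy for the entire run, which is a simplification rather than something the paper states.

```python
# Figures as reported in the article; nothing here is measured.
GPU_COUNT = 2_048          # Nvidia H800 GPUs in the cluster
GPU_HOURS = 2.79e6         # reported total GPU-hours
TOTAL_COST_USD = 5.58e6    # The Next Platform's cost estimate


def implied_rate_usd_per_gpu_hour(total_cost: float, gpu_hours: float) -> float:
    """Hourly rate implied by dividing the cost estimate by GPU-hours."""
    return total_cost / gpu_hours


def wall_clock_days(gpu_hours: float, gpu_count: int) -> float:
    """Elapsed days, assuming every GPU is busy for the whole run."""
    return gpu_hours / gpu_count / 24


if __name__ == "__main__":
    rate = implied_rate_usd_per_gpu_hour(TOTAL_COST_USD, GPU_HOURS)
    days = wall_clock_days(GPU_HOURS, GPU_COUNT)
    print(f"Implied rate: ${rate:.2f} per GPU-hour")
    print(f"Wall-clock time: about {days:.0f} days")
```

    On the article's figures, this works out to roughly $2 per GPU-hour and a little under two months of continuous training.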




    waynewilliams@onmail.com (Wayne Williams)
