DeepSeek x Cerebras: How the most controversial AI model right now is getting supercharged by the most powerful AI superchip ever built

Maker of fastest AI chip in the world makes a splash with DeepSeek onboarding
Cerebras says the solution will run 57x faster than on GPUs but doesn’t mention which GPUs
DeepSeek R1 will run on Cerebras cloud and the data will remain in the USA

In a not-so-surprising move, Cerebras has announced that it will support DeepSeek, more specifically the R1 70B reasoning model. The move comes after Groq and Microsoft confirmed they would also bring the new kid on the AI block to their respective clouds. AWS and Google Cloud have yet to do so, but anybody can run the open-source model anywhere, even locally.

The AI inference chip specialist will run DeepSeek R1 70B at 1,600 tokens/second, which it claims is 57x faster than any R1 provider using GPUs; one can deduce that roughly 28 tokens/second is what GPU-in-the-cloud solutions (in this case DeepInfra) apparently reach. Serendipitously, Cerebras’ latest chip is 57x bigger than the H100. I have reached out to Cerebras to find out more about that claim.
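For readers who want to check the deduction, here is a minimal sketch of the arithmetic. The two constants are the figures Cerebras quotes; the 1,000-token response length is a hypothetical value chosen purely for illustration, and the calculation ignores time to first token and any network overhead.

```python
# Figures quoted in the article (Cerebras' own claims, not independent measurements).
CEREBRAS_TOKENS_PER_SECOND = 1600
CLAIMED_SPEEDUP = 57


def implied_baseline_tps(fast_tps: float, speedup: float) -> float:
    """Return the baseline throughput implied by a claimed speedup."""
    return fast_tps / speedup


def seconds_to_generate(tokens: int, tps: float) -> float:
    """Return the time needed to stream `tokens` at a steady `tps` rate."""
    return tokens / tps


gpu_tps = implied_baseline_tps(CEREBRAS_TOKENS_PER_SECOND, CLAIMED_SPEEDUP)

# Hypothetical response length, used only to make the gap tangible.
RESPONSE_TOKENS = 1000
cerebras_seconds = seconds_to_generate(RESPONSE_TOKENS, CEREBRAS_TOKENS_PER_SECOND)
gpu_seconds = seconds_to_generate(RESPONSE_TOKENS, gpu_tps)

print(f"Implied GPU throughput: {gpu_tps:.1f} tokens/second")
print(f"1,000 tokens on Cerebras: {cerebras_seconds:.3f} s")
print(f"1,000 tokens on the implied GPU baseline: {gpu_seconds:.3f} s")
```

Under those assumptions, a response that streams in well under a second on Cerebras would take over half a minute at the implied GPU rate.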

Research by Cerebras also demonstrated that DeepSeek is more accurate than OpenAI models on a number of tests. The model will run on Cerebras hardware in US-based datacentres to assuage the privacy concerns that many experts have expressed. DeepSeek – the app – will send your data (and metadata) to China where it will most likely be stored. Nothing surprising here as almost all apps – especially free ones – capture user data for legitimate reasons.

Cerebras’ wafer-scale solution positions it uniquely to benefit from the impending AI cloud inference boom. WSE-3, which is the fastest AI chip (or HPC accelerator) in the world, has almost one million cores and a staggering four trillion transistors. More importantly though, it has 44GB of SRAM, which is the fastest memory available, even faster than the HBM found on Nvidia’s GPUs. Since WSE-3 is just one huge die, the available memory bandwidth is huge, several orders of magnitude bigger than what the Nvidia H100 (and for that matter the H200) can muster.

A price war is brewing ahead of WSE-4 launch

No pricing has been disclosed yet but Cerebras, which is usually coy about that particular detail, did divulge last year that Llama 3.1 405B on Cerebras Inference would cost $6/million input tokens and $12/million output tokens. Expect DeepSeek to be available for far less.
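To put those per-million-token rates in context, the sketch below prices a single request at the Llama 3.1 405B figures quoted above. The request size (2,000 input tokens, 500 output tokens) is a hypothetical example, and the function name is illustrative; DeepSeek pricing has not been disclosed, so this says nothing about what R1 will cost.

```python
# Published Llama 3.1 405B rates on Cerebras Inference, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 6.0
OUTPUT_PRICE_PER_MILLION = 12.0


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars of one request at the rates above."""
    input_cost = input_tokens * INPUT_PRICE_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_MILLION / 1_000_000
    return input_cost + output_cost


# Hypothetical request size, chosen only for illustration.
cost = request_cost(input_tokens=2_000, output_tokens=500)
print(f"Cost of one request: ${cost:.3f}")
```

At these rates, that hypothetical request comes to a little under two cents, split two-thirds input and one-third output.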

WSE-4 is the next iteration of WSE-3 and should deliver a significant boost in the performance of DeepSeek and similar reasoning models when it launches, which is expected in 2026 or 2027 (depending on market conditions).

The arrival of DeepSeek is also likely to shake the proverbial AI money tree, bringing more competition to established players like OpenAI or Anthropic and pushing prices down.

A quick look at Docsbot.ai’s LLM API calculator shows OpenAI is almost always the most expensive in all configurations, sometimes by several orders of magnitude.

Cerebras tokens per second on Llama 3.1 405B

(Image credit: Cerebras)

Seconds to first token received on Llama 3.1 405B

(Image credit: Cerebras)

desire.athow@futurenet.com (Desire Athow)
