Explainer | What DeepSeek’s success means for Nvidia and costly GPU-driven AI growth
While Nvidia rebounded in pre-market trading on Tuesday, analysts noted a shift in perceptions of the company's role in costly AI development driven by graphics processing units (GPUs), a change that threatens one of the world's most valuable technology titans.
What has DeepSeek achieved?
DeepSeek claims to have pre-trained its V3 model on just 2,048 Nvidia H800 GPUs over a two-month period, with each chip costing about US$2 per hour to run. The full training run consumed about 2.8 million GPU hours, putting the total cost at roughly US$5.6 million, far less than that of rival models.
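Those figures hang together as simple arithmetic: multiplying the reported GPU hours by the per-hour rate gives the headline cost, and spreading the hours across 2,048 chips implies a run of roughly two months. The sketch below is a minimal illustration using only the numbers cited above; the US$2 hourly rate is DeepSeek's own rental-equivalent assumption, not a measured figure.

```python
# Back-of-envelope check of the reported DeepSeek V3 pre-training cost.
# All inputs are the publicly cited figures above, not measured values.

num_gpus = 2_048              # reported Nvidia H800 chips used for pre-training
gpu_hours = 2_800_000         # reported total GPU hours (~2.8 million)
rate_usd_per_hour = 2.0       # DeepSeek's assumed rental-equivalent cost per GPU per hour

total_cost_usd = gpu_hours * rate_usd_per_hour
hours_per_gpu = gpu_hours / num_gpus
run_length_days = hours_per_gpu / 24

print(f"Estimated cost: ${total_cost_usd / 1e6:.1f} million")   # ~$5.6 million
print(f"Implied run length: {run_length_days:.0f} days")        # ~57 days, i.e. about two months
```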
Meanwhile, its open-source reasoning model, R1, released earlier this month, has demonstrated capabilities comparable to those of leading models from OpenAI, Anthropic and Google, but at significantly lower training cost.
Does DeepSeek prove Nvidia chips are not indispensable?
Not yet. In a 2023 interview with Chinese media outlet Latepost, DeepSeek founder Liang Wenfeng said the company had gradually built up a stockpile of more than 10,000 Nvidia GPUs, making it one of the top owners of computing resources among Chinese AI start-ups.