
DeepSeek has taken the world by storm.

DeepSeek, a Chinese AI lab, drew widespread attention after its chatbot app rose to the top of the Apple App Store and Google Play charts. The surge has Wall Street analysts and technologists questioning whether the US can maintain its lead in the AI race and whether demand for AI chips will hold up.

However, the question remains: where did DeepSeek come from, and what led to its rapid rise to international fame?

The Origins of DeepSeek

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions. AI enthusiast Liang Wenfeng founded High-Flyer in 2015.

Liang, who began exploring trading while a student at Zhejiang University, launched High-Flyer Capital Management as a hedge fund in 2019, focused on developing and deploying AI algorithms. In 2023, High-Flyer established DeepSeek as a lab dedicated to researching AI tools, separate from its financial business.

With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek. From its inception, DeepSeek built its own data center clusters for model training. It has been affected by US export bans on hardware, however, which forced the company to use Nvidia H800 chips, a less powerful version of the H100 chip available to US companies.

DeepSeek’s technical team is known to be relatively young. The company aggressively recruits AI researchers with doctorates from top Chinese universities, and it also hires people without a computer science background to help its technology better understand a wide range of subjects.

DeepSeek’s Impressive Models

DeepSeek unveiled its first set of models, including DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat, in November 2023, but it wasn’t until the release of its next-gen DeepSeek-V2 family of models that the AI industry took notice.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on various AI benchmarks and was significantly cheaper to run than comparable models at the time. That forced DeepSeek’s domestic competitors to cut prices and make some models free.

The launch of DeepSeek-V3 in December 2024 further solidified DeepSeek’s reputation. According to the company’s internal benchmark testing, DeepSeek-V3 outperforms both downloadable models like Meta’s Llama and “closed” models like OpenAI’s GPT-4o.

DeepSeek says its R1 “reasoning” model, released in January 2025, performs as well as OpenAI’s o1 model on key benchmarks. As a reasoning model, R1 effectively fact-checks itself, which helps it avoid some common pitfalls and tends to make it more reliable in domains such as physics, science, and math.

However, because DeepSeek’s models are developed in China, they are subject to benchmarking by China’s internet regulator to ensure that their responses align with “core socialist values.” In DeepSeek’s chatbot app, for example, R1 won’t answer questions about sensitive topics like Tiananmen Square or Taiwan’s autonomy.

A Disruptive Approach

DeepSeek’s business model is unclear. The company prices some products and services well below market value and gives others away for free, and it cites efficiency breakthroughs as the reason for its cost competitiveness. Some experts dispute the figures the company has supplied.

Developers have taken to DeepSeek’s models, which are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers there have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success has been described variously as “upending AI” and as “over-hyped.” It was partly responsible for an 18% drop in Nvidia’s stock price, and it elicited a public response from OpenAI CEO Sam Altman.

Microsoft announced that DeepSeek is available on its Azure AI Foundry service. When asked about DeepSeek’s impact on Meta’s AI spending, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a “strategic advantage” for Meta.

However, some companies are banning DeepSeek, and some governments, including South Korea’s, are blocking the app over concerns about data risks tied to China. New York state has also banned DeepSeek from government devices.

What the future holds for DeepSeek is uncertain. Improved models are expected, but the US government appears to be growing wary of what it perceives as harmful foreign influence.


This story was originally published January 28, 2025, and will be updated continuously with more information.

