DeepSeek: The “Pinduoduo of AI” Takes the Spotlight
In just a few days, an app called DeepSeek has skyrocketed to the top of the free app download rankings in Apple’s U.S. App Store, surpassing the highly popular ChatGPT. In the realm of general-purpose AI models, the U.S. is considered ChatGPT’s home turf. So, what makes DeepSeek capable of turning the tables?
For those unfamiliar with DeepSeek, here’s a simple way to describe it: It’s the “Pinduoduo” of the AI world.
How Affordable Is DeepSeek?
OpenAI CEO Sam Altman has previously revealed that training GPT-4 cost approximately $78 million, while an incomplete six-month training cycle for the GPT-5 model has already consumed around $500 million. In stark contrast, the training cost for DeepSeek-V3, a large-scale AI model, was just ¥5.58 million (~$800,000).
This budget-friendly strategy extends to its API services as well. According to DeepSeek’s official pricing table:
• Cache-hit input: ¥0.1 per 1 million tokens
• Cache-miss input: ¥1 per 1 million tokens
• Output: ¥2 per 1 million tokens
This pricing is among the lowest in the AI large model market. (Note: 1 token is roughly equivalent to 1.5 Chinese characters or 3 English letters.)
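To make the pricing concrete, here is a minimal cost estimator built from the rates listed above. The prices are taken from the article's pricing table; the traffic figures in the example call are hypothetical, chosen only to illustrate the arithmetic.

```python
# Rates from the pricing list above, in CNY per 1 million tokens.
PRICE_INPUT_CACHE_HIT = 0.1   # input tokens served from cache
PRICE_INPUT_CACHE_MISS = 1.0  # input tokens not in cache
PRICE_OUTPUT = 2.0            # generated output tokens

def api_cost_cny(input_tokens: int, output_tokens: int, cache_hit_ratio: float) -> float:
    """Estimated bill in CNY for a batch of requests (hypothetical usage)."""
    hits = input_tokens * cache_hit_ratio
    misses = input_tokens - hits
    cost = (hits * PRICE_INPUT_CACHE_HIT
            + misses * PRICE_INPUT_CACHE_MISS
            + output_tokens * PRICE_OUTPUT) / 1_000_000
    return round(cost, 4)

# Hypothetical month: 10M input tokens (half cached) and 2M output tokens.
print(api_cost_cny(10_000_000, 2_000_000, 0.5))  # → 9.5 (i.e. ¥9.5)
```

Even in this illustrative scenario, 12 million tokens of traffic cost under ¥10, which is the point the pricing table is making.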

DeepSeek-R1: A Game-Changer in AI
Released on January 20, the DeepSeek-R1 model has taken things to the next level. Its inference performance rivals that of OpenAI’s o1, and it goes a step further by open-sourcing its model weights, meaning anyone can download and deploy the model independently. DeepSeek even provides a detailed research paper outlining the training steps and key techniques, along with a mini version that can run on mobile devices.
What further sets DeepSeek-R1 apart is its real-time internet search, which lets it surface the latest information as it becomes available. The momentum has clearly caught the attention of OpenAI CEO Sam Altman, who has already hinted at plans for an “o3-mini”. Even ChatGPT Plus subscribers, with their daily limit of 100 queries, may find it hard to ignore that DeepSeek is both free and more versatile.
With its affordability, usability, speed, and access to the latest information, DeepSeek-R1 has understandably stirred up excitement in overseas markets. Who wouldn’t want a free, powerful, and cutting-edge AI at their fingertips?
DeepSeek: Pioneering a Unique Path in AI Development
Unlike most of its peers, which replicate the Llama architecture, DeepSeek has charted its own course. Founder Liang Wenfeng has repeatedly emphasized the company’s commitment to a differentiated technological approach rather than mirroring OpenAI’s playbook, and DeepSeek is determined to devise more efficient methods for training its models.
According to the information released about DeepSeek-R1, the company made extensive use of reinforcement learning during the post-training phase. This allowed for significant improvements in the model’s inference capabilities, even with minimal labeled data.
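The core idea of improving a model from reward signals alone, without labeled targets, can be illustrated with a toy policy-gradient loop. This is a generic REINFORCE-style sketch on a two-armed bandit, not DeepSeek-R1’s actual training recipe; the reward probabilities, learning rate, and step count are all illustrative.

```python
import math
import random

random.seed(42)
logits = [0.0, 0.0]        # policy parameters for actions 0 and 1
TRUE_REWARD = [0.2, 0.8]   # action 1 succeeds more often; no labels say so
LR = 0.1

def softmax(values):
    z = [math.exp(v) for v in values]
    s = sum(z)
    return [v / s for v in z]

for _ in range(2000):
    probs = softmax(logits)
    action = 0 if random.random() < probs[0] else 1
    # Reward is the only training signal: +1 on success, -1 on failure.
    reward = 1.0 if random.random() < TRUE_REWARD[action] else -1.0
    # Policy-gradient update: raise the log-probability of rewarded actions.
    for i in range(2):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += LR * reward * grad

print(softmax(logits))  # probability mass shifts toward the better action
```

The policy ends up strongly preferring the higher-reward action using nothing but trial-and-error feedback, which is the same shape of signal, scaled up enormously, that reward-driven post-training exploits.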
DeepSeek has also disclosed major advancements in two key areas:
• MLA (Multi-Head Latent Attention): An attention mechanism that compresses the key-value cache into compact latent vectors, sharply cutting memory use during inference.
• DeepSeekMoE (Mixture of Experts) Structure: A sparse design that activates only a subset of expert sub-networks per token, significantly reducing the computational resources required for training.
These innovations make DeepSeek models more cost-effective and enhance training efficiency, setting them apart in the competitive landscape of AI technology.
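The efficiency argument behind Mixture-of-Experts layers can be seen in a minimal sketch of top-k expert routing: each token is dispatched to only a few expert networks, so most parameters sit idle per token. The dimensions, expert count, and top-k value below are illustrative toys, not DeepSeekMoE’s actual configuration.

```python
import math
import random

random.seed(0)
D_MODEL, N_EXPERTS, TOP_K = 8, 4, 2  # toy sizes, not DeepSeek's

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

gate = rand_matrix(N_EXPERTS, D_MODEL)                 # router weights
experts = [rand_matrix(D_MODEL, D_MODEL) for _ in range(N_EXPERTS)]

def moe_forward(x):
    """Route token vector x to its TOP_K experts and mix their outputs."""
    logits = matvec(gate, x)                           # one score per expert
    top = sorted(range(N_EXPERTS), key=lambda i: logits[i])[-TOP_K:]
    z = sum(math.exp(logits[i]) for i in top)
    weights = {i: math.exp(logits[i]) / z for i in top}  # softmax over chosen
    out = [0.0] * D_MODEL
    for i in top:                                      # only TOP_K experts run
        y = matvec(experts[i], x)
        out = [o + weights[i] * yj for o, yj in zip(out, y)]
    return out

token = [random.gauss(0, 1) for _ in range(D_MODEL)]
print(len(moe_forward(token)))  # → 8
```

Here only 2 of 4 experts do any work for a given token; at production scale the same principle lets a model carry a huge parameter count while spending compute on only a fraction of it per token.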
Meta’s Reaction Fuels DeepSeek’s Meteoric Rise
While DeepSeek’s inherent strengths have secured its foothold in the market, its sudden explosion in popularity might have been propelled by none other than Meta. Recently, a Meta employee on Blind, the U.S. anonymous workplace platform, revealed that DeepSeek’s recent breakthroughs have thrown Meta’s generative AI team into a state of panic.
DeepSeek’s low-cost training methods have put Meta’s team in a difficult position, as they struggle to justify their high-budget expenditures. Meta engineers are reportedly racing against the clock to analyze DeepSeek’s technology, trying to replicate any feasible advancements.
This industry buzz not only underscores the disruptive potential of DeepSeek’s cost-effective approach but also highlights its growing influence, making it a focal point in the AI community.
Global Recognition for DeepSeek: A Rising Force in AI
Microsoft CEO Satya Nadella, speaking at the World Economic Forum in Davos, Switzerland, remarked that DeepSeek’s new model is highly impressive. He noted that the company has developed a genuinely effective open-source model that excels in inference computation and achieves remarkable efficiency. He emphasized the need to take China’s advancements in this area very, very seriously.
Similarly, Demis Hassabis, CEO of Google DeepMind, stated, “We need to think about how to maintain the West’s leading position in cutting-edge AI models. While I believe the West is still ahead, it is undeniable that China has extraordinary engineering and scaling capabilities.”
International media outlets have also paid significant attention to DeepSeek. The UK-based Financial Times published an article titled “Chinese Startups Like DeepSeek Are Challenging Global AI Giants”, offering high praise for the company. The article highlighted how DeepSeek’s V3 model has stunned the international tech community, delivering performance that rivals well-funded U.S. competitors like OpenAI. It also noted that the R1 model left a lasting impression as an ambitious step into the AI inference domain.
DeepSeek’s breakthroughs signal a shift in the global AI landscape, underscoring China’s growing influence in this highly competitive field.
We will bring you more tech updates in the future.