Is DeepSeek Really Open Source? The Truth Behind the Industry Standard
The AI research company DeepSeek recently released its large language model (LLM) under the MIT License, providing model weights, inference code, and technical documentation. However, the company did not release its training code, sparking a heated debate about whether DeepSeek can truly be considered "open-source."
This controversy stems from differing interpretations of what constitutes open-source in the context of large language models. While some argue that without training code, a model cannot be considered fully open-source, others highlight that DeepSeek’s approach aligns with industry norms followed by leading AI companies like Meta, Google, and Alibaba.
Key Takeaways
DeepSeek’s Open-Source Approach
- Released model weights under the MIT License
- Provided inference code and technical documentation
- Did not release training code, leading to debates over its open-source credibility
Industry Standard for Open-Source LLMs
- Most companies (Meta, Google, Alibaba) follow a similar model
- Standard practice includes sharing weights and inference code but not training code
- Full open-source releases (including training code) are rare
Practical Considerations
- Training costs for LLMs are extremely high (DeepSeek-V3’s training reportedly cost around 30M RMB)
- Model weights are hosted on Hugging Face due to their large file sizes
- Community benefits from access to weights, enabling fine-tuning and experimentation
Community Reactions
- Some criticize the lack of training code, arguing it limits transparency
- Others emphasize the practical benefits of open weights and local deployment
- Similar criticisms have been raised against major AI companies, including OpenAI
Deep Analysis: Industry Context and Implications
A Broader Look at Open-Source in the AI Industry
DeepSeek is not an anomaly in how it approaches open-source AI. Releasing model weights without training code has been the industry norm since Meta’s release of Llama 2. Google (Gemma), Alibaba (Qwen), and Zhipu AI (the GLM-4 series) follow similar policies. Even Llama 2 carries commercial restrictions: companies with more than 700 million monthly active users must request a separate license from Meta.
Why don’t companies release training code? The answer lies in cost, complexity, and competitive advantage. Even a comparatively efficient run like DeepSeek-V3’s costs millions of dollars in compute, and frontier models cost far more. Additionally, AI firms guard their training methodologies as trade secrets, ensuring their models remain competitive.
Does the Lack of Training Code Matter?
While critics argue that training code is necessary for full transparency, most LLM users do not need it. Open weights alone allow developers to (a short code sketch follows this list):
- Fine-tune models for specific tasks
- Deploy models locally
- Conduct experiments and create downstream applications
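As a concrete illustration, here is a minimal local-inference sketch using Hugging Face transformers. The repo id is an assumption (check DeepSeek’s organization page on Hugging Face for the actual model cards), and the larger checkpoints need far more GPU memory than this snippet implies.

```python
# Minimal sketch: running an open-weights model locally with
# Hugging Face transformers. The repo id below is illustrative.
# device_map="auto" requires the `accelerate` package.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/deepseek-llm-7b-chat"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to cut memory use
    device_map="auto",           # spread layers across available devices
)

prompt = "Explain the difference between open weights and open source."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same checkpoint can feed a fine-tuning loop (for example, with LoRA adapters) without any access to the original training pipeline.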
Furthermore, most released models target standard frameworks such as PyTorch, Hugging Face transformers, and vLLM, so architectural details and inference behavior can be read directly from the published configuration and modeling code, even without the training scripts.
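For instance, the published configuration file alone reveals the core architecture. A small sketch, again with an illustrative repo id:

```python
# Minimal sketch: reading architectural details from a model's
# published config -- no training scripts needed. Attribute names
# below are standard for LLaMA-style decoder configs.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("deepseek-ai/deepseek-llm-7b-chat")
print(config.model_type)           # architecture family, e.g. "llama"
print(config.num_hidden_layers)    # transformer depth
print(config.hidden_size)          # embedding width
print(config.num_attention_heads)  # attention heads per layer
```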
Community Perspectives and Double Standards
One emerging concern is whether DeepSeek and other Chinese AI firms face disproportionate scrutiny compared to Western companies. Critics note that OpenAI, despite the word "open" in its name, releases no weights for its flagship models at all, yet DeepSeek draws harsher criticism while following the same playbook as Meta and Google.
The discussion reflects a familiar pattern in tech debates: initial hype, followed by backlash, and then a more balanced reassessment. DeepSeek’s release has followed this cycle, with early excitement about its capabilities giving way to criticism over its open-source claims.
Did You Know? Lesser-Known Facts About Open-Source AI
- OLMo, from the Allen Institute for AI, is one of the few truly open-source LLMs, releasing not just weights but also training code and data. Fully open models like it remain niche, used primarily for research and education.
- AI model weights are typically hosted on Hugging Face rather than GitHub because of their massive file sizes, which also makes direct access difficult for some users in China (see the download sketch after this list).
- The debate over open-source AI isn’t new. Discussions about "openness" date back to OpenAI’s early days, when it transitioned from an open research lab to a commercial AI powerhouse.
- Training costs for large AI models are astronomical. Training GPT-4 reportedly cost over $100 million, making it impractical for most organizations to replicate the run even if full training code were available.
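On the hosting point above: weights are usually pulled from the Hugging Face Hub rather than cloned from a Git repository. Below is a minimal download sketch using the huggingface_hub client; the repo id is illustrative, and users behind restrictive networks often point the HF_ENDPOINT environment variable at a mirror.

```python
# Minimal sketch: fetching model weights from the Hugging Face Hub.
# The repo id is an assumption -- check the DeepSeek organization
# page for the actual model cards. Large checkpoints ship as many
# multi-gigabyte safetensors shards, hence Hub hosting over GitHub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="deepseek-ai/deepseek-llm-7b-chat",  # illustrative repo id
    allow_patterns=["*.json", "*.safetensors"],  # configs and weights only
)
print("Snapshot saved to:", local_dir)
```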
Final Thoughts
DeepSeek’s approach to open-source AI follows industry norms, even if it does not align with traditional definitions of open-source software. The key question is whether open-source in the LLM space should prioritize full transparency (training code, data, and weights) or practical accessibility (model weights and inference capabilities). For now, most AI developers benefit from open weights, allowing for real-world applications and innovations.
The debate over what "open-source" means in AI will continue, but DeepSeek is far from alone in its approach. As AI research evolves, so too will the definition of openness in this rapidly growing field.