Key insights:
From a modest background as the son of a primary school teacher to becoming one of China's most influential AI innovators, Liang Wenfeng's journey shows that groundbreaking achievements don't always require massive resources. His company DeepSeek is now challenging Silicon Valley giants with AI models that deliver impressive results using minimal computing power.
Born in 1985 in Guangdong Province, Liang showed an early aptitude for mathematics and problem-solving. While other kids played sports, he found joy in untangling complex equations, a passion that would later shape his approach to AI development.
At Zhejiang University, Liang studied electronic information engineering, pairing his mathematical talent with hands-on engineering work. His professors quickly recognized his abilities and assigned him advanced projects that bridged theory and real-world applications.
In a pivotal moment, Liang declined a partnership offer from drone company DJI, believing AI held greater potential. This decision, while risky at the time, demonstrated his long-term vision for technology's future.
During the 2008 financial crisis, Liang saw an opportunity to apply his mathematical expertise to financial markets. He gathered a team to explore machine-learning applications in quantitative trading, work that led to the founding of High-Flyer.
Under Liang's leadership, High-Flyer developed sophisticated AI trading systems that stayed profitable through volatile market conditions. By 2019, the firm ranked among China's top four quantitative trading firms, managing over 1 billion yuan in assets.
The breakthrough came with the Fire-Flyer supercomputer series, built with investments of over 1.2 billion yuan. These clusters provided the computing foundation for Liang's broader AI ambitions.
In July 2023, Liang launched DeepSeek with a clear mission: create human-level AI without the massive resource requirements typically associated with such endeavors.
DeepSeek's V3 model achieved performance comparable to GPT-4 using roughly 2,000 NVIDIA H800 GPUs, a fraction of the hardware typically required. This efficiency comes from innovative approaches such as a mixture-of-experts architecture that activates only a subset of the model's parameters for each token, multi-head latent attention that shrinks the memory footprint of inference, and low-precision (FP8) training.
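To make the mixture-of-experts idea concrete, here is a minimal, illustrative sketch of top-k expert routing in PyTorch. This is not DeepSeek's code: the class name TinyMoELayer, the layer sizes, and the expert count are assumptions chosen for readability. The point it demonstrates is that each token is routed to only a few experts, so most of the model's parameters stay idle on any given step.

```python
# Toy top-k mixture-of-experts routing (illustrative only, not DeepSeek's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyMoELayer(nn.Module):
    """Minimal sparse MoE layer: each token uses only top_k of n_experts."""

    def __init__(self, d_model: int = 64, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        # Each "expert" is a small independent feed-forward network.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        scores = self.router(x)                           # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)    # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token, so most parameters
        # stay idle on any given forward pass -- the source of the efficiency.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out


if __name__ == "__main__":
    layer = TinyMoELayer()
    tokens = torch.randn(10, 64)   # a batch of 10 token embeddings
    print(layer(tokens).shape)     # torch.Size([10, 64])
```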
The company's success stems from clever engineering and efficient algorithms rather than raw computing power. Training costs for DeepSeek V3 were approximately $8 million, compared to $100 million for comparable models.
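As a rough illustration of where training-cost figures in this range come from, the sketch below multiplies GPU count by training time and an assumed hourly rental rate. The inputs are hypothetical placeholders, not DeepSeek's published numbers; the point is only that a few thousand GPUs running for weeks lands in the single-digit millions of dollars, while much larger runs climb toward the $100 million range.

```python
# Back-of-the-envelope estimate of large-model training cost.
# All inputs below are illustrative assumptions, not DeepSeek's actual figures.

def training_cost_usd(num_gpus: int, training_days: float, usd_per_gpu_hour: float) -> float:
    """Cost = total GPU-hours x assumed rental price per GPU-hour."""
    gpu_hours = num_gpus * training_days * 24
    return gpu_hours * usd_per_gpu_hour


if __name__ == "__main__":
    # Hypothetical example: 2,000 GPUs for ~60 days at $2 per GPU-hour.
    estimate = training_cost_usd(num_gpus=2_000, training_days=60, usd_per_gpu_hour=2.0)
    print(f"Rough training cost: ${estimate / 1e6:.1f} million")  # -> Rough training cost: $5.8 million
```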
If you're interested in learning more about AI development and its practical applications, consider exploring Futurise's ChatGPT Course, where you can learn to become a Generative AI Prompt Engineer.
To dive deeper into this fascinating story and see more details about DeepSeek's journey, check out the full video on the East Money YouTube channel below.