Published November 20, 2024 8:33 AM PST
A Chinese lab has introduced what it claims is one of the first “reasoning” AI models to rival OpenAI’s o1.
On Wednesday, DeepSeek, an AI research firm backed by quantitative trading investors, unveiled a preview of DeepSeek-R1, which the company describes as a reasoning model competitive with OpenAI’s o1.
Reasoning models differ from standard AI systems in that they effectively fact-check themselves. They spend more time on each query, which helps them avoid some of the errors that trip up other models. Like OpenAI’s o1, DeepSeek-R1 plans ahead and performs a sequence of actions to arrive at an answer, often requiring several seconds to respond to complex questions.
According to DeepSeek, its model matches OpenAI’s o1-preview on two prominent AI benchmarks: AIME, a set of problems drawn from the American Invitational Mathematics Examination, and MATH, a test set of word problems. However, early feedback suggests it is not flawless. Some users on X (formerly Twitter) observed that DeepSeek-R1, like o1, struggles with logic-based games such as tic-tac-toe.
The model also appears vulnerable to “jailbreaking,” in which users craft prompts that bypass its safeguards. For instance, one user reportedly prompted it to provide detailed instructions for making methamphetamine.
Additionally, DeepSeek-R1 declines to answer questions on politically sensitive topics, such as Chinese leader Xi Jinping, Tiananmen Square, or the implications of a Chinese invasion of Taiwan. This behavior likely reflects regulatory pressure: AI models in China are required to comply with government standards emphasizing “core socialist values” and are subject to strict content monitoring by regulators.
This development comes amid debates over the limits of traditional “scaling laws,” the assumption that increasing data and computing power leads to continuous improvements in AI. Recent reports suggest major AI developers, including OpenAI, Google, and Anthropic, are seeing diminishing returns from this approach, prompting a shift toward new techniques such as test-time compute. This method, used by both o1 and DeepSeek-R1, gives a model additional processing time while it works on a task, potentially boosting performance.
Microsoft CEO Satya Nadella recently highlighted test-time compute as a transformative advancement, calling it a “new scaling law” during a keynote at the Microsoft Ignite conference.
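To make the idea of test-time compute concrete, here is a minimal Python sketch of one simple, publicly known form of it: sampling several answers and taking a majority vote. It is an illustration only and does not describe how o1 or DeepSeek-R1 work internally. The function `noisy_solver` is a hypothetical stand-in for a single model sample, and the 60 percent per-sample accuracy is an assumed figure chosen for the example.

```python
import random
from collections import Counter


def noisy_solver(question: int, rng: random.Random, accuracy: float = 0.6) -> int:
    """Hypothetical stand-in for one model sample.

    The "task" is squaring a number. The solver returns the right answer
    with probability `accuracy` (an assumed value), otherwise a nearby
    wrong answer.
    """
    correct = question * question
    if rng.random() < accuracy:
        return correct
    return correct + rng.choice([-2, -1, 1, 2])


def answer_with_budget(question: int, n_samples: int, rng: random.Random) -> int:
    """Spend more inference compute: draw n_samples answers, return the most common."""
    votes = Counter(noisy_solver(question, rng) for _ in range(n_samples))
    return votes.most_common(1)[0][0]


def accuracy_at_budget(n_samples: int, trials: int = 2000, seed: int = 0) -> float:
    """Estimate how often the voted answer is correct for a given sample budget."""
    rng = random.Random(seed)
    hits = sum(answer_with_budget(7, n_samples, rng) == 49 for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    for budget in (1, 5, 15):
        print(f"samples per question: {budget:2d}  accuracy: {accuracy_at_budget(budget):.3f}")
```

In this toy setup the same solver answers correctly more often as the per-question sample budget grows, which is the basic trade the technique makes: more computation at answer time in exchange for fewer errors.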
DeepSeek plans to open-source DeepSeek-R1 and release an API, marking a significant move in the competitive AI landscape. The company is supported by High-Flyer Capital Management, a Chinese quantitative hedge fund leveraging AI for trading strategies.
DeepSeek’s earlier model, DeepSeek-V2, a general-purpose text and image analysis tool, disrupted the market by forcing rivals such as ByteDance, Baidu, and Alibaba to lower their prices or offer free access to certain AI models.
High-Flyer has invested heavily in infrastructure, reportedly building a server cluster with 10,000 Nvidia A100 GPUs at a cost of $138 million. The company, led by Liang Wenfeng, aims to develop “superintelligent” AI through its DeepSeek division.