DeepSeek releases the V4 open-source preview, with a Codeforces score of 3206 surpassing GPT-5.4

MarketWhisper

DeepSeek V4 open-source preview

DeepSeek officially released the V4 preview series on April 24. It is open-sourced under the MIT license, with the model weights published simultaneously on Hugging Face and ModelScope. According to the DeepSeek V4 technical report, V4-Pro-Max (the strongest inference mode) scored 3206 on the Codeforces benchmark, surpassing GPT-5.4.

Specifications for Two MoE Model Architectures

According to the DeepSeek V4 technical report, the V4 series includes two mixture-of-experts (MoE) models:

V4-Pro: Total parameters 1.6T, 49B activated per token, supports a 1M token context

V4-Flash: Total parameters 284B, 13B activated per token, also supports a 1M token context
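The headline numbers above reflect the defining property of MoE models: only a small set of experts, and hence a small fraction of total parameters, is activated for each token. The sketch below illustrates generic top-k expert routing; the expert count and k are illustrative values, not figures from the report.

```python
# Generic top-k MoE routing sketch: a router scores all experts for a
# token, and only the k highest-scoring experts are activated.
# n_experts and k here are illustrative, not DeepSeek V4's actual config.
import random

def route_token(router_scores, k):
    """Return indices of the top-k experts selected for one token."""
    return sorted(range(len(router_scores)), key=router_scores.__getitem__)[-k:]

random.seed(0)
n_experts, k = 64, 4
scores = [random.gauss(0, 1) for _ in range(n_experts)]  # router scores
active = route_token(scores, k)
print(len(active), "of", n_experts, "experts active")  # 4 of 64
```

Because each token touches only the selected experts, inference cost tracks the activated parameter count (49B for V4-Pro) rather than the total (1.6T).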

According to the technical report, at a 1M context V4-Pro's per-token inference FLOPs are only 27% of V3.2's, and its KV cache is reduced to 10% of V3.2's. This is mainly attributed to an upgraded hybrid attention architecture combining compressed sparse attention (CSA) with heavily compressed attention (HCA). The pretraining corpus exceeds 32T tokens, and the training optimizer has been switched to Muon.
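A quick back-of-the-envelope check puts the reported numbers in perspective. The ratios below are computed directly from figures in the article; the "effective uncompressed tokens" framing is an illustrative interpretation, not the report's own accounting.

```python
# Sanity-check the ratios reported in the article (values from the text;
# the framing of the KV-cache line is illustrative, not the actual design).
pro_total, pro_active = 1.6e12, 49e9
flash_total, flash_active = 284e9, 13e9

print(f"V4-Pro active fraction:   {pro_active / pro_total:.1%}")     # ~3.1%
print(f"V4-Flash active fraction: {flash_active / flash_total:.1%}") # ~4.6%

# If the hybrid attention stores 10% of V3.2's KV cache, a 1M-token
# context costs roughly the memory of 100k uncompressed tokens.
context, kv_ratio = 1_000_000, 0.10
print(f"effective uncompressed tokens: {int(context * kv_ratio):,}")
```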

Post-Training Methodology: Online Policy Distillation Replaces Mixed Reinforcement Learning

According to the DeepSeek V4 technical report, the core change in V4 post-training is that on-policy distillation (OPD) completely replaces the mixed reinforcement learning (mixed RL) stage used in V3.2. The new pipeline has two steps: first, domain experts for areas such as math, code, agents, and instruction following are trained separately (SFT plus GRPO reinforcement learning); then the capabilities of more than a dozen experts are distilled into a single unified model via multi-teacher OPD, with logit alignment used to avoid the capability conflicts common in traditional approaches.
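The distillation step above can be sketched in its common formulation: the student samples its own trajectories, and training minimizes a per-token KL divergence between the student's and teacher's output distributions over those samples. This is a minimal illustration of that general technique; the report's exact loss and multi-teacher weighting are not public, and all names here are assumed.

```python
# Minimal on-policy distillation (OPD) loss sketch: reverse KL between
# the student's and a teacher's per-token distributions, evaluated on
# positions the student itself generated. Illustrative only.
import numpy as np

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def opd_loss(student_logits, teacher_logits):
    """Mean reverse KL(student || teacher) over positions (logit alignment)."""
    p = softmax(student_logits)               # student's own distribution
    log_p = np.log(p)
    log_q = np.log(softmax(teacher_logits))   # teacher's distribution
    return float((p * (log_p - log_q)).sum(axis=-1).mean())

rng = np.random.default_rng(0)
s = rng.normal(size=(8, 32))                  # 8 positions, 32-token toy vocab
assert opd_loss(s, s) < 1e-9                  # identical models -> zero loss
assert opd_loss(s, rng.normal(size=(8, 32))) > 0.0
```

Reverse KL penalizes the student for placing probability mass where the teacher does not, which is one plausible way to reconcile several teachers without their signals conflicting.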

The report also introduces a generative reward model (GRM). For tasks that are hard to verify with rules, it is trained on a small amount of diverse human-annotated data, so that a single model can both generate responses and evaluate them.
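One common way to use a generative model as a reward model is to have it write a free-form critique ending in a structured verdict, which is then parsed into a scalar reward. The format and helper below are assumptions for illustration; the report does not specify GRM's output format.

```python
# Hypothetical GRM usage sketch: the model emits a critique ending in
# "Score: <0-10>", which is parsed into a reward in [0, 1].
# The verdict format is an assumption, not from the DeepSeek report.
import re

def parse_reward(critique: str) -> float:
    """Extract 'Score: <0-10>' from a generated critique; map to [0, 1]."""
    m = re.search(r"Score:\s*(\d+(?:\.\d+)?)", critique)
    if m is None:
        raise ValueError("no verdict found in critique")
    return min(float(m.group(1)), 10.0) / 10.0

critique = "The answer is correct but omits edge cases. Score: 7"
print(parse_reward(critique))  # 0.7
```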

Benchmark Results: Leading on Coding, Still a Gap in Knowledge Reasoning

According to the DeepSeek V4 technical report, V4-Pro-Max compares with Opus 4.6 Max, GPT-5.4 xHigh, and Gemini 3.1 Pro High as follows (the recently released GPT-5.5 and Opus 4.7 are excluded):

Codeforces: 3206 (GPT-5.4: 3168 / Gemini 3.1 Pro: 3052) → Highest across the board

LiveCodeBench: 93.5 → Highest across the board

SWE Verified: 80.6, behind Opus 4.6’s 80.8 by 0.2 percentage points

GPQA Diamond: 90.1, behind Gemini 3.1 Pro’s 94.3

SimpleQA-Verified: 57.9, behind Gemini 3.1 Pro’s 75.6

HLE: 37.7, behind Gemini 3.1 Pro’s 44.4

The technical report notes that these comparisons exclude the recently released GPT-5.5 and Opus 4.7, so the gap between V4 and the newest generation of closed-source models still awaits third-party evaluation.

Frequently Asked Questions

What are the open-source license terms for the DeepSeek V4 preview version, and where can I obtain them?

According to DeepSeek’s official announcement on April 24, the V4 series is open-sourced under the MIT license, which permits both commercial and academic use. Model weights are available on Hugging Face and ModelScope.

What are the differences in parameter scale between DeepSeek V4-Pro and V4-Flash?

According to the DeepSeek V4 technical report, V4-Pro has total parameters of 1.6T, with 49B activated per token; V4-Flash has total parameters of 284B, with 13B activated per token. Both models support a 1M token context.

What are the benchmark comparison results for DeepSeek V4-Pro-Max versus GPT-5.4 and Gemini 3.1 Pro?

According to the DeepSeek V4 technical report, V4-Pro-Max surpasses GPT-5.4 and Gemini 3.1 Pro on the Codeforces (3206) and LiveCodeBench (93.5) benchmarks, but still lags behind Gemini 3.1 Pro on knowledge-intensive benchmarks (GPQA Diamond, SimpleQA-Verified, HLE). The comparison set excludes GPT-5.5 and Opus 4.7.

