Xiaomi Reveals MiMo-V2-Pro Training Details: 1T Model Parameters, Thousands of GPUs Deployed

Gate News message, April 24 — Xiaomi’s large language model team lead Luo Fuli disclosed in an in-depth interview that the MiMo-V2-Pro model has 1 trillion parameters in total and required thousands of GPUs for training. She noted that the 1T scale represents the minimum threshold to achieve performance approaching Claude Opus 4.6 level and secure a competitive entry ticket for the next phase of AI agents.

Technically, the Pro version employs an extreme sparse attention mechanism with a 7:1 ratio between global attention and sliding window attention, controlling inference costs for long-context processing. The model also retains the MTP (Multi-Token Prediction) architecture to leverage surplus compute power for faster inference.

On the management side, the 100-person MiMo team has only 30-40 people directly engaged in core iterations. The team operates without formal hierarchies or explicit sub-group divisions and delivery deadlines. When encountering unstable numerical issues such as training loss spikes, the team prioritizes halting training for investigation, even if it means stopping operations for one or two weeks and incurring millions of dollars in compute costs.

Disclaimer: The information on this page may come from third parties and does not represent the views or opinions of Gate. The content displayed on this page is for reference only and does not constitute any financial, investment, or legal advice. Gate does not guarantee the accuracy or completeness of the information and shall not be liable for any losses arising from the use of this information. Virtual asset investments carry high risks and are subject to significant price volatility. You may lose all of your invested principal. Please fully understand the relevant risks and make prudent decisions based on your own financial situation and risk tolerance. For details, please refer to Disclaimer.

Related Articles

Web3 AI Infrastructure AIW3 Raises $2M in Seed Funding Led by Buffalo Capital

Gate News message, April 24 — Web3 AI infrastructure platform AIW3 announced the completion of a $2 million seed round funding. The round was led by Buffalo Capital, with GalaXin Capital and Three-stones Ventures participating as co-investors. AIW3 is transitioning toward an Agent-as-a-Service

GateNews6m ago

Cohere Acquires German AI Firm Aleph Alpha, Secures $600M Investment for European Expansion

Gate News message, April 24 — Canadian AI company Cohere announced plans to acquire German AI firm Aleph Alpha to strengthen its presence in Europe. Schwarz Group, a backer of Aleph Alpha, plans to invest $600 million in Cohere's Series E funding round. The funding round is expected to close in 202

GateNews48m ago

Xpeng, Xiaomi Lead In-Car AI Push at Beijing Auto Show

Gate News message, April 24 — Chinese automakers showcased advanced in-car AI systems at the Beijing Auto Show on April 24, as the country accelerates its AI Plus strategy and seeks greater independence from foreign semiconductors. Xpeng demonstrated voice-controlled parking that allows drivers to

GateNews1h ago

Former ByteDance Seed Engineer: ByteDance AI Iteration Takes Six Months vs Google's Three Months

Gate News message, April 24 — Zhang Chi, a former engineer at ByteDance's Seed team and current assistant professor at Peking University, revealed on the podcast "Into Asia" that ByteDance requires approximately six months to complete one full cycle of large language model training (pretraining

GateNews1h ago

OpenAI Engineer Clive Chan Challenges V4 Hardware Recommendations, Citing Errors and Vagueness vs. V3

Gate News message, April 24 — OpenAI engineer Clive Chan has raised detailed objections to the hardware recommendations chapter in the V4 technical report, calling it "surprisingly mediocre and error-prone" compared to the acclaimed V3 version. V3's hardware guidance, which included Q&A sessions

GateNews2h ago

Naver Launches AI Tab Beta as Google Gemini Enters South Korea Search Market

Gate News message, April 24 — Naver announced the start of a closed beta for AI Tab, its new conversational search feature, following Google's launch of Gemini in Chrome in South Korea. AI Tab will appear alongside Naver's existing search tabs, offering users a dedicated space for conversational

GateNews2h ago
Comment
0/400
No comments