Xiaomi open-sources MiMo-V2.5: a 1.02-trillion-parameter pure-text MoE Pro and a 310-billion-parameter native multimodal model, MIT-licensed for commercial use, with notable benchmarks and a token incentive program

Abstract: This article reports on Xiaomi's release of MiMo-V2.5, a pair of MIT-licensed large models for commercial deployment and ongoing fine-tuning. MiMo-V2.5-Pro is a 1.02T pure-text MoE, while MiMo-V2.5 is a 310B multimodal model. The release includes performance benchmarks, a notable case study at Peking University, and a token-creator incentive program with weights available on Hugging Face.

AirdropBlackHole

2026-05-02 21:00:16

According to monitoring by Dongcha Beating, the Xiaomi MiMo team has open-sourced the MiMo-V2.5 series of large models. The series comprises two models, both released under the MIT license, which permits commercial deployment, continued training, and fine-tuning, and both with a context window of up to 1 million tokens. MiMo-V2.5-Pro is a pure-text Mixture-of-Experts (MoE) model with 1.02 trillion total parameters and 42 billion active parameters. MiMo-V2.5 is a native multimodal model with 310 billion total parameters and 15 billion active parameters, supporting text, image, video, and audio understanding.

MiMo-V2.5-Pro primarily targets complex agent and programming tasks. In the ClawEval evaluation, V2.5-Pro achieved a Pass^3 of 64% while consuming only about 70,000 tokens per task trajectory, approximately 40% to 60% fewer than Claude Opus 4.6, Gemini 3.1 Pro, and GPT-5.4 at a comparable performance level. Its SWE-bench Verified score is 78.9. In a case showcased on the official blog, V2.5-Pro autonomously implemented a complete SysY-to-RISC-V compiler for a compiler principles project at Peking University, taking 4.3 hours and 672 tool calls and achieving a perfect score of 233/233 on a hidden test set.

MiMo-V2.5 is designed for multimodal agent scenarios. It is equipped with a dedicated visual encoder (a ViT with 729 million parameters) and an audio encoder (261 million parameters), and scores 62.3 on the Claw-Eval general subset.

Both models use a hybrid architecture that combines sliding window attention (SWA) with global attention (GA), along with a 3-layer multi-token prediction (MTP) module, which predicts multiple tokens at once to accelerate inference. The weights have been released on Hugging Face.

Alongside the open-source release, the MiMo team has launched the "Orbit Trillion Token Creator Incentive Program", which offers global users a total quota of 100 trillion tokens free of charge over 30 days. Individual developers, teams, and enterprises can apply on the event page, with an evaluation period of about 3 working days. Upon approval, benefits are credited as a Token Plan or as grants, which can be used directly with programming tools such as Claude Code and Cursor.
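
As a rough sanity check on the figures reported above, the short Python sketch below works out what share of each model's parameters is active per token and what token budget the "40% to 60% less" claim would imply for the compared models. The helper names are illustrative, and the sketch assumes the percentage saving is measured relative to each compared model's own token count, which the article does not state explicitly.

```python
# Back-of-the-envelope arithmetic on figures quoted in the article.
# Helper names are illustrative and not part of any MiMo tooling.

def active_fraction(active_params: float, total_params: float) -> float:
    """Share of an MoE model's parameters that is active per token."""
    return active_params / total_params


def implied_baseline_tokens(tokens: float, saving: float) -> float:
    """Token count of a baseline, given that `tokens` is `saving` (0 to 1)
    less than that baseline. Assumes the saving is relative to the baseline."""
    return tokens / (1.0 - saving)


# Parameter counts as reported: 42B active of 1.02T, and 15B active of 310B.
pro = active_fraction(42e9, 1.02e12)
base = active_fraction(15e9, 310e9)

# About 70,000 tokens per trajectory, said to be 40% to 60% less than others.
low = implied_baseline_tokens(70_000, 0.40)
high = implied_baseline_tokens(70_000, 0.60)

print(f"MiMo-V2.5-Pro active share: {pro:.1%}")
print(f"MiMo-V2.5 active share:     {base:.1%}")
print(f"Implied baseline tokens per trajectory: {low:,.0f} to {high:,.0f}")
```

Under these assumptions, roughly 4.1% of MiMo-V2.5-Pro's parameters and 4.8% of MiMo-V2.5's are active per token, and the compared models would be using on the order of 117,000 to 175,000 tokens per task trajectory.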
