Xiaomi MiMo-V2.5 Series Open-Sourced: 1T Parameters under MIT License, Token Efficiency Surpassing GPT-5.4 on ClawEval

According to monitoring by Dongcha Beating, the Xiaomi MiMo team has open-sourced the MiMo-V2.5 series of large models. The series comprises two models, both released under the MIT license, which permits commercial deployment, continued training, and fine-tuning, and both support a context window of up to 1 million tokens. MiMo-V2.5-Pro is a text-only Mixture of Experts (MoE) model with 1.02 trillion total parameters and 42 billion active parameters; MiMo-V2.5 is a natively multimodal model with 310 billion total parameters and 15 billion active parameters that supports text, image, video, and audio understanding.

MiMo-V2.5-Pro primarily targets complex agent and programming tasks. In the ClawEval evaluation, V2.5-Pro achieved a Pass^3 of 64%, a level comparable to Claude Opus 4.6, Gemini 3.1 Pro, and GPT-5.4, while consuming only about 70,000 tokens per task trajectory, roughly 40% to 60% fewer than those models. Its SWE-bench Verified score is 78.9. In a case showcased on the official blog, V2.5-Pro autonomously implemented a complete SysY-to-RISC-V compiler for a compiler principles course project at Peking University, taking 4.3 hours and 672 tool calls and achieving a perfect score of 233/233 on a hidden test set.

MiMo-V2.5 is designed for multimodal agent scenarios. It is equipped with a dedicated visual encoder (a 729-million-parameter ViT) and an audio encoder (261 million parameters), and scores 62.3 on the Claw-Eval general subset.

Both models use a hybrid architecture that combines sliding window attention (SWA) with global attention (GA), along with a 3-layer multi-token prediction (MTP) module, which predicts multiple tokens at once to accelerate inference. Weights have been released on Hugging Face.

Alongside the open-source release, the MiMo team has launched the ‘Orbit Trillion Token Creator Incentive Program’, offering a total quota of 100 trillion tokens free to users worldwide over 30 days. Individual developers, teams, and enterprises can apply on the event page, with a review period of about 3 working days. Upon approval, benefits are credited in the form of a Token Plan or grants, which can be used directly with programming tools such as Claude Code and Cursor.
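
The sparsity and token-efficiency figures quoted above can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that "40% to 60% less" means V2.5-Pro's roughly 70,000 tokens equal a baseline's token count reduced by that percentage. The article does not report the other models' actual token counts, so the implied baselines are estimates rather than measured values.

```python
# Figures as quoted in the article (parameter counts in billions).
MODELS = {
    "MiMo-V2.5-Pro": {"total_b": 1020, "active_b": 42},
    "MiMo-V2.5": {"total_b": 310, "active_b": 15},
}

def active_fraction(total_b, active_b):
    """Share of an MoE model's parameters that are active per token."""
    return active_b / total_b

def implied_baseline_tokens(tokens, saving):
    """Baseline token count, assuming `tokens` is `saving` (0..1) below it."""
    return tokens / (1.0 - saving)

for name, params in MODELS.items():
    share = active_fraction(**params)
    print(f"{name}: about {share:.1%} of parameters active per token")

# Assumption: the 40%-60% saving is measured relative to each baseline model.
for saving in (0.40, 0.60):
    baseline = implied_baseline_tokens(70_000, saving)
    print(f"{saving:.0%} saving implies a baseline of about {baseline:,.0f} tokens")
```

Under these assumptions, both models activate only around 4% to 5% of their parameters per token, and the compared models would use roughly 117,000 to 175,000 tokens per task trajectory.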
