Xiaomi's New MiMo-V2.5-Pro AI Can See, Hear, and Act: All in One Model

In brief

  • Xiaomi unveiled MiMo-V2.5 and V2.5-Pro, combining text, image, audio, and video capabilities into a single multimodal AI model.
  • The Pro version rivals top frontier models on coding and agentic benchmarks while significantly improving token efficiency and cutting costs.
  • The new models mark Xiaomi’s rapid AI push, with open-source plans and aggressive iteration following strong adoption on platforms like OpenRouter.

Xiaomi just launched a new AI model family. Again. A few weeks ago, the company dropped MiMo-V2-Pro—a trillion-parameter model that had been quietly circulating on OpenRouter under the alias “Hunter Alpha” before Xiaomi revealed its identity. It went from anonymous to top-tier overnight. We tested it, and it was impressive. Now Xiaomi is back with MiMo-V2.5 and MiMo-V2.5-Pro, a two-model family that adds something the previous generation never had in a single package: eyes, ears, and the ability to process video. Oh, and the company plans to open source the models in the near future.

The V2-Pro was text-and-code only. Multimodal capability existed in its sibling model, MiMo-V2-Omni, but that was a separate product with lower benchmark scores. MiMo-V2.5 collapses all of that into one model: faster, more capable, and with native image, video, and audio understanding baked in from the start. That matters more for regular users than it might sound. You can now upload a photo of your fridge and ask for dinner recipes, drop in a video tutorial and get a step-by-step summary, or record a meeting and have it pull out action items, all in one place, without juggling separate tools and separate models with different pricing. Xiaomi claims MiMo-V2.5-Pro represents “a major leap from MiMo-V2-Pro in general agentic capabilities, complex software engineering, and long-horizon tasks,” and says it now matches frontier models like Claude Opus 4.6 and GPT-5.4 across most coding and agent benchmarks. The numbers largely back that up, with some gaps still visible on harder reasoning tasks.
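To give a concrete sense of what "one model for everything" means in practice, here is a sketch of a multimodal chat request, assuming an OpenAI-compatible chat API. The endpoint shape, model slug, and image-URL field are assumptions for illustration, not confirmed details of Xiaomi's API; the sketch only builds the JSON payload, since actually sending it would require real credentials:

```python
import json

# Hypothetical model slug; Xiaomi's actual API naming may differ.
MODEL = "mimo-v2.5"

def fridge_recipe_request(image_url: str) -> str:
    """Build an OpenAI-style multimodal chat payload mixing text and an image."""
    payload = {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                # Text part of the prompt.
                {"type": "text",
                 "text": "Here is my fridge. Suggest three dinner recipes."},
                # Image part: a URL the model would fetch and analyze.
                {"type": "image_url",
                 "image_url": {"url": image_url}},
            ],
        }],
    }
    return json.dumps(payload, indent=2)

print(fridge_recipe_request("https://example.com/fridge.jpg"))
```

Audio or video inputs would follow the same pattern: extra content parts in the same message, rather than a separate model and a separate bill.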

The base and Pro models serve different purposes. MiMo-V2.5-Pro is the heavy lifter: Xiaomi says it can “autonomously complete professional tasks involving 1,000+ tool calls, work that would take human experts days.” That positions it for developers running complex, multi-step automated workflows. It runs at 60–80 tokens per second and costs $1.00 input / $3.00 output per million tokens. MiMo-V2.5 is the everyday version: faster (100–150 tokens per second), cheaper ($0.40 input / $2.00 output per million tokens), and it supports all modalities (image, audio, and video) that the Pro tier skips. Both models carry a 1M-token context window, enough to hold roughly 750,000 words in a single conversation.

On SWE-bench Pro, a coding benchmark where models fix real bugs in actual startup codebases, scored as a pass rate out of 100, MiMo-V2.5-Pro resolves 57.2% of tasks. That’s near the top of the field; the average model manages around 25%. The story is similar on τ3-bench and ClawEval, where it lands within a few points of Claude Opus 4.6 and GPT-5.4. The gap opens up on Humanity’s Last Exam, a gauntlet of graduate-level problems across dozens of academic fields: MiMo scores 48.0% versus GPT-5.4’s 58.7%, a deficit of nearly 11 points that’s hard to paper over.

Where it genuinely stands out is token efficiency. Xiaomi says MiMo-V2.5-Pro uses 42% fewer tokens than Kimi K2.6 at equivalent benchmark scores, and MiMo-V2.5 uses nearly half the tokens of Muse Spark for similar results. For anyone running these models at scale, say, developers processing thousands of requests daily, that difference is real money. On multimodal tasks, MiMo-V2.5 posts results on par with GPT-5.4 and Gemini 3.1 Pro, and quite close to Opus 4.6.
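To see how the quoted list prices play out, here is a quick back-of-the-envelope cost sketch using the per-million-token rates above. The workload numbers (request count, token counts) are invented purely for illustration:

```python
# Quoted list prices, USD per million tokens (from the article).
PRICES = {
    "mimo-v2.5-pro": {"input": 1.00, "output": 3.00},
    "mimo-v2.5":     {"input": 0.40, "output": 2.00},
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the quoted per-million-token rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Hypothetical daily agent workload: 2,000 requests,
# each averaging 8,000 input and 1,500 output tokens.
daily_pro  = 2_000 * cost_usd("mimo-v2.5-pro", 8_000, 1_500)
daily_base = 2_000 * cost_usd("mimo-v2.5",     8_000, 1_500)

print(f"Pro:  ${daily_pro:,.2f}/day")   # Pro:  $25.00/day
print(f"Base: ${daily_base:,.2f}/day")  # Base: $12.40/day
```

The claimed token efficiency compounds this: if the model genuinely solves the same tasks with 42% fewer tokens, the effective per-task cost drops well below what the sticker price alone suggests.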

Since December 2025, Xiaomi has completed three major model releases: the efficient MiMo-V2-Flash in December, the V2-Pro/Omni/TTS trio in March, and now the V2.5 series. The company has committed at least $8.7 billion to AI investment over the next three years, a figure CEO Lei Jun announced the day after V2-Pro launched, and the release cadence suggests that budget is already moving. Context also helps explain the speed: according to Digital Applied, as of early April, Xiaomi’s models accounted for roughly 21% of all traffic on OpenRouter, with traffic growing over 42% in the preceding seven days. When your previous model becomes one of the most competitive options on the world’s largest AI routing platform, you have both the resources and the pressure to iterate fast.

Much of that growth likely traces to the boom of the agentic AI tool Hermes and its arrangement with Xiaomi, which gave users free access to MiMo-V2-Pro for a limited time. That window has since closed, but the exposure was enough to put Xiaomi firmly in the game.

Thank you for your love ❤️❤️ https://t.co/mA1WV1GAia

— Xiaomi MiMo (@XiaomiMiMo) April 11, 2026

Those who still want to use Hermes for free can test the new Step 3.5 Flash via the Nous API, or fall back on OpenRouter's free models with more limited usage.

Token plan pricing also got a refresh. MiMo-V2.5 runs at a 1x credit rate and MiMo-V2.5-Pro at 2x, and Xiaomi no longer charges an extra multiplier for using the full 1-million-token context window, which makes long-document analysis noticeably cheaper. Existing users also get a full credit reset as a launch bonus.

Xiaomi says the model is available in its AI Studio. We tried to access it there immediately after launch, with no luck. It is, however, already live via the Xiaomi MiMo API, which is where most developers will actually use it. The company says it’s already training the next generation, with “deeper reasoning, tighter tool integration, and richer real-world grounding.” At the rate Xiaomi is moving, that announcement is probably closer than you’d expect.
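The credit math is simple enough to sketch: the base model burns credits at 1x, the Pro at 2x, and there is no longer a surcharge for using the full 1M-token context. A minimal sketch, where the credits-per-1K-tokens baseline is a made-up assumption (real plans are priced differently):

```python
# Credit multipliers from the article; the per-token baseline is invented
# purely for illustration.
MULTIPLIER = {"mimo-v2.5": 1, "mimo-v2.5-pro": 2}
CREDITS_PER_1K_TOKENS = 1  # hypothetical baseline rate

def credits_used(model: str, tokens: int) -> int:
    """Credits for one request; no long-context surcharge applies anymore."""
    return MULTIPLIER[model] * (tokens // 1_000) * CREDITS_PER_1K_TOKENS

# A full 1M-token context on Pro now costs exactly 2x the base rate,
# instead of 2x plus an extra long-context multiplier.
print(credits_used("mimo-v2.5", 1_000_000))      # 1000
print(credits_used("mimo-v2.5-pro", 1_000_000))  # 2000
```

For long-document work, dropping the context multiplier is the part that matters: the cost of a 1M-token analysis now scales linearly with the model tier rather than jumping at a context threshold.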
