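OpenBMB has released MiniCPM5-1B, a one-billion-parameter AI model designed for local deployment on resource-constrained hardware and now available on Hugging Face. The model averages 42.57 across agentic and reasoning benchmarks, ahead of the next-best 1B-class competitor at 35.61. MiniCPM5-1B supports the Model Context Protocol (MCP) and native tool calling, enabling local agentic workflows on consumer devices without a cloud connection. The model fits within smartphone memory limits while maintaining a 128K-token context window, roughly 96,000 words of continuous text in a single pass.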
MiniCPM5-1B builds on the architectural backbone of MiniCPM4, developed by teams at THUNLP, Tsinghua University, and ModelBest. The core innovation is InfLLM v2, a trainable attention mechanism that processes each token against fewer than 5% of surrounding tokens during long-context inference, reducing computation without meaningful accuracy loss.
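To make the "fewer than 5%" figure concrete, here is a minimal toy sketch of block-sparse attention in pure Python. It is not the InfLLM v2 implementation: the block size, the selection rule (scoring each block by its mean key), and every function name are assumptions chosen for illustration only.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def select_blocks(query, keys, block_size, top_k):
    """Return indices of the top_k key blocks, scored by mean-key similarity."""
    scores = []
    for start in range(0, len(keys), block_size):
        block = keys[start:start + block_size]
        mean_key = [sum(k[d] for k in block) / len(block) for d in range(len(query))]
        scores.append((dot(query, mean_key), start // block_size))
    scores.sort(reverse=True)
    return sorted(idx for _, idx in scores[:top_k])


def sparse_attention(query, keys, values, block_size=64, top_k=2):
    """Softmax attention restricted to tokens inside the selected blocks."""
    chosen = select_blocks(query, keys, block_size, top_k)
    token_ids = [
        i
        for b in chosen
        for i in range(b * block_size, min((b + 1) * block_size, len(keys)))
    ]
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, keys[i]) * scale for i in token_ids]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    output = [
        sum(w * values[i][d] for w, i in zip(weights, token_ids)) / total
        for d in range(len(values[0]))
    ]
    # Second return value: share of the context this query actually attended to.
    return output, len(token_ids) / len(keys)


random.seed(0)
n_tokens, dim = 4096, 8
keys = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
values = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]
query = [random.gauss(0, 1) for _ in range(dim)]

output, fraction = sparse_attention(query, keys, values)
print(f"attended to {fraction:.2%} of tokens")
```

With these assumed settings the query reads 128 of 4,096 tokens, which is under the 5% budget; dense attention would read all of them.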
The training pipeline introduced UltraClean, a data-filtering system that let the model reach competitive performance with 8 trillion training tokens, compared with the 36 trillion consumed by Qwen 3. Post-training combined reinforcement learning with efficient distillation techniques, raising benchmark scores on math, code, and instruction-following by 16 points while reducing runaway-length responses by 29 percentage points.
Testing confirmed that MiniCPM5-1B supports both MCP and tool calling, placing it on a short list of sub-2-billion-parameter models capable of local agentic workflows without cloud infrastructure. Practical deployment scenarios include on-device agents that query calendars, search local databases, or call web research MCP servers, all without cloud-hosted inference.
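MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the `tools/call` method. The sketch below builds such a request and parses a reply using only the standard library. The tool name `search_calendar`, its arguments, and the sample response are hypothetical; they are not taken from MiniCPM5-1B or from any particular MCP server.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def build_tool_call(tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def extract_text(response):
    """Collect text parts from a tool result, raising on JSON-RPC errors."""
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    parts = response["result"].get("content", [])
    return "\n".join(p["text"] for p in parts if p.get("type") == "text")


# Hypothetical tool and arguments, for illustration only.
request = build_tool_call("search_calendar", {"date": "2025-01-15"})
wire_message = json.dumps(request)

# Hypothetical reply that a local MCP server might send back.
sample_response = json.loads(
    '{"jsonrpc": "2.0", "id": 1, "result": '
    '{"content": [{"type": "text", "text": "10:00 Design review"}]}}'
)
print(wire_message)
print(extract_text(sample_response))
```

In a real agent the model decides the tool name and arguments, and the host application handles transport to the server.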
The 128K-token context window enables persistent memory across extended interactions—sufficient for roleplay sessions spanning dozens or hundreds of exchanges, document digestion, or multi-step agent tasks without context reset.
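The estimate of about 96,000 words for a 128K window is consistent with a common rule of thumb of roughly 0.75 English words per token. The sketch below applies that ratio. The ratio itself, the reading of 128K as 128,000 tokens, and the 300-word exchange size are all assumptions; real tokenizers vary by language and content.

```python
WORDS_PER_TOKEN = 0.75    # assumed rule of thumb for English text
CONTEXT_TOKENS = 128_000  # assumes "128K" means 128,000 tokens


def words_that_fit(context_tokens=CONTEXT_TOKENS, ratio=WORDS_PER_TOKEN):
    """Approximate number of words that fit in the context window."""
    return int(context_tokens * ratio)


def exchanges_that_fit(words_per_exchange, context_tokens=CONTEXT_TOKENS,
                       ratio=WORDS_PER_TOKEN):
    """Approximate number of prompt-plus-reply exchanges before the window fills."""
    tokens_per_exchange = words_per_exchange / ratio
    return int(context_tokens // tokens_per_exchange)


print(words_that_fit())          # word capacity under the assumed ratio
print(exchanges_that_fit(300))   # exchanges of ~300 words each
```

Under these assumptions, a few hundred medium-length exchanges fit before any context reset is needed.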
OpenBMB's capability benchmark compares MiniCPM5-1B against Alibaba's Qwen3-0.6B, Qwen3.5-0.8B, and Liquid AI's LFM2.5-1.2B-Thinking across seven categories: general knowledge, domain knowledge, coding, instruction-following, math reasoning, logical reasoning, and agentic tasks. MiniCPM5-1B leads across all seven, with the most pronounced margins in agentic performance and general knowledge.
Three evaluations were conducted:
Logic Trap Test: When asked whether it is legal for a man to marry his widow's sister according to Falkland Islands law, the model produced a detailed breakdown of marital law but missed the logical trap: a man who has a widow is already dead. It treated the prompt as a straightforward jurisdictional question rather than recognizing the impossibility.
A/B Choice Test: When asked which industry, Crypto or AI, would dominate the economy in 2100, the model hedged into a both-sides answer rather than reasoning decisively. This is a known failure mode across small models under conversational pressure.
Tool Calling Test: When asked for the current Bitcoin price and three stock recommendations, the model successfully called the tool. Recommendations provided were Amazon, Microsoft, and Nvidia.
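The tool-calling test can be pictured as a dispatch loop: the model emits a structured call, and the host application runs the matching function and returns the result. The sketch below assumes a JSON payload with `name` and `arguments` fields, which may differ from MiniCPM5-1B's actual output format. `get_bitcoin_price` is a hypothetical stub that returns a placeholder, not live data.

```python
import json


def get_bitcoin_price(currency="USD"):
    """Hypothetical stub; a real tool would query a price API."""
    return {"currency": currency, "price": None, "note": "stub, no live data"}


# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"get_bitcoin_price": get_bitcoin_price}


def dispatch(raw_call):
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        return {"error": f"unknown tool: {name}"}
    arguments = call.get("arguments", {})
    if isinstance(arguments, str):  # some formats encode arguments as a JSON string
        arguments = json.loads(arguments)
    return TOOLS[name](**arguments)


# Assumed shape of the model's output for the Bitcoin price request.
model_output = '{"name": "get_bitcoin_price", "arguments": {"currency": "USD"}}'
result = dispatch(model_output)
print(json.dumps(result))
```

The tool result would then be appended to the conversation so the model can compose its final answer from it.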
Pairing MiniCPM5-1B with an MCP server for web research substantially mitigates hallucination on obscure factual questions.
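MiniCPM5-1B is available on Hugging Face under the Apache 2.0 license. The model is compatible with vLLM, SGLang, and the standard Transformers inference framework. Users who need agentic features must configure the additional options available in the model's GitHub repository.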