MiniMax Open-Sources Sparse Attention Library for Blackwell, M3 Weights Coming Friday

According to Ryan Lee, MiniMax's head of developer relations, the company has open-sourced MiniMax Sparse Attention (MSA), a high-performance attention library for NVIDIA Blackwell (SM100) GPUs, under the MIT license. Lee also said the M3 model weights will be released on Friday, June 13.

When applied to MiniMax-M3's million-token context inference, MSA cuts attention computation by a factor of 28.4 compared with dense grouped-query attention (GQA) in an equivalent configuration. On H800 GPUs, the library delivered a 14.2x prefill speedup and a 7.6x decoding speedup.
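
For a rough sense of scale, the back-of-envelope sketch below estimates how many keys each query could attend to for a 28.4x reduction at a million-token context. It is not MSA's algorithm or API: it assumes a simple model in which attention cost grows linearly with the number of keys each query reads, and the function names, `head_dim`, and `num_heads` values are hypothetical placeholders (they cancel out of the ratio).

```python
def decode_attention_flops(keys_attended, head_dim=128, num_heads=64):
    """Approximate FLOPs for one decoded token's attention.

    Assumption: QK^T and the weighted sum over V each cost about
    2 * head_dim FLOPs per (query, key) pair. head_dim and num_heads
    are illustrative values, not MiniMax-M3's real configuration.
    """
    return 2 * 2 * head_dim * keys_attended * num_heads


def reduction_factor(context_len, keys_kept):
    """Ratio of dense to sparse attention cost under the model above."""
    dense = decode_attention_flops(context_len)
    sparse = decode_attention_flops(keys_kept)
    return dense / sparse


# Hypothetical example: keeping about 35,211 of 1,000,000 keys per query
# gives a ratio close to the reported 28.4x under this simplified model.
ratio = reduction_factor(1_000_000, 35_211)
print(f"{ratio:.2f}x")
```

Under this simplified model, the reported figure would correspond to each query reading roughly 3.5% of the context; the real library may reach its reduction differently, for example with selection overhead that this sketch ignores.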
