DeepSeek releases new paper with Liang Wenfeng among the authors, proposing the mHC architecture to improve large-model training stability
PANews, January 1: According to Jin10, DeepSeek has released a new paper proposing an architecture called Manifold-Constrained Hyper-Connections (mHC). It targets the training instability and limited scalability that arise in Hyper-Connections (HC) when the identity-mapping property of residual connections is disrupted. mHC restores that property by projecting HC's residual connection space onto a specific manifold, and pairs this with rigorous infrastructure optimization to keep it efficient; the paper reports significant performance improvements and better scalability. DeepSeek expects mHC, as a flexible and practical extension of HC, to contribute to a deeper understanding of topological architecture design and to point to promising directions for the evolution of foundation models. The paper is co-authored by Zhenda Xie (解振达), Yixuan Wei (韦毅轩), and Huanqi Cao, with Liang Wenfeng also listed among the authors.