DeepSeek releases new paper co-authored by Liang Wenfeng, proposing mHC architecture to improve large-model training stability

PANews, January 1: According to Jin10, DeepSeek has released a new paper proposing an architecture called Manifold-Constrained Hyper-Connections (mHC). It targets the training instability and limited scalability that arise when Hyper-Connections (HC) disrupt the identity mapping property. mHC restores that property by projecting the residual connection space of HC onto a specific manifold, and pairs this with rigorous infrastructure optimization to keep the approach efficient. The paper reports significant performance improvements and superior scalability as a result. DeepSeek anticipates that mHC, as a flexible and practical extension of HC, will contribute to a deeper understanding of topological architecture design and point to promising directions for the evolution of foundation models. The paper is co-authored by Zhenda Xie (解振达), Yixuan Wei (韦毅轩), and Huanqi Cao, with Liang Wenfeng also listed among the authors.
