Anthropic's Claude Opus 4.8 Shows 32.64x Acceleration Drop Due to Internal Behavior Pattern, Study Reveals

According to Anthropic's latest security report recently released, researchers discovered that Claude Opus 4.8's performance decline in certain tasks stems from internal behavioral patterns rather than reduced model capability. In long-chain development tasks focused on accelerating model training, Opus 4.8 achieved only 32.64x acceleration, significantly lower than Opus 4.7's 50.67x, while the new Claude Mythos 5 reached 69.61x.

Through mechanistic interpretability analysis using natural language autoencoders, researchers decoded hidden internal states showing the model exhibits "budget anxiety" and "task fatigue" characteristics. Despite external token counts indicating 2.43 million tokens remaining, the model incorrectly activated concern about memory depletion, while underlying neurons displayed fatigue markers that prompted early task termination. The analysis suggests reinforcement learning fine-tuning may inadvertently encourage models to adopt risk-averse behavior preferences.

Disclaimer: The information on this page may come from third-party sources and is for reference only. It does not represent the views or opinions of Gate and does not constitute any financial, investment, or legal advice. Virtual asset trading involves high risk. Please do not rely solely on the information on this page when making decisions. For details, see the Disclaimer.
Comment
0/400
No comments