Tag: Mathematical Reasoning
All the articles with the tag "Mathematical Reasoning".
-
The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs
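This paper takes a modular approach that exploits the separability of large language model parameters for mathematical reasoning and multilingual capability. It proposes strategies such as Layer-Swapping that significantly outperform non-modular baselines in cross-lingual transfer to low-resource languages, with the strongest results in data-constrained settings.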
-
Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving
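This paper uses the ZeroTIR framework to train base large language models with reinforcement learning so that they spontaneously execute Python code when solving math problems. It reveals positive correlations between training steps and code usage frequency, response length, and task accuracy (the Agent RL Scaling Law), and significantly outperforms tool-free baselines on math benchmarks.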
-
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
This paper demonstrates through meta-analysis and experiments that Chain-of-Thought (CoT) prompting significantly enhances large language model performance on math and symbolic reasoning tasks, but offers limited benefits for non-symbolic tasks and underperforms compared to tool-augmented approaches.
-
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math
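This paper proposes a multi-stage training recipe, comprising large-scale distillation, rollout preference optimization, and reinforcement learning with verifiable rewards, that significantly improves the performance of small language models on mathematical reasoning tasks. With it, the 3.8B-parameter Phi-4-Mini-Reasoning model outperforms open-source baseline models with nearly twice as many parameters.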