Tag: Tool-Integrated Reasoning

All the articles with the tag "Tool-Integrated Reasoning".

Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving

Published: 22 May, 2025 at 11:12 AM

89.47 🤔

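This paper introduces the ZeroTIR framework, which uses reinforcement learning to train base large language models to spontaneously execute Python code when solving math problems. It finds that the number of training steps correlates positively with code usage frequency, response length, and task accuracy (the Agent RL Scaling Law), and that the resulting models significantly outperform tool-free baselines on math benchmarks.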