Tag: Reasoning
All the articles with the tag "Reasoning".
-
Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation
This paper introduces Adaptive Difficulty Curriculum Learning (ADCL) and Expert-Guided Self-Reformulation (EGSR) to enhance LLM reasoning by dynamically adjusting training curricula and guiding models to reformulate expert solutions, achieving significant performance improvements over standard RL baselines on mathematical reasoning benchmarks.
-
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning
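ReMA uses multi-agent reinforcement learning to separate meta-thinking from reasoning, improving large language model performance on mathematical reasoning and LLM-as-a-Judge tasks, with especially strong out-of-distribution generalization; however, it is sensitive to hyperparameters and faces stability challenges in multi-turn settings.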
-
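This paper proposes Reasoning CPT, which adds synthetic hidden-thought data to continual pretraining and significantly improves large language models in cross-domain reasoning, hard-problem solving, and reasoning efficiency, notably achieving an overall gain of up to 3.3% on the MMLU benchmark and an improvement of about 8% on hard problems.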
-
From Distributional to Overton Pluralism: Investigating Large Language Model Alignment
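By analyzing how LLM output distributions change before and after alignment, this paper shows that alignment reduces distributional pluralism but achieves Overton pluralism through longer responses, and that base models can effectively imitate aligned-model behavior via in-context learning, supporting the Superficial Alignment Hypothesis.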
-
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation
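This paper introduces the DC-CoT benchmark, which systematically evaluates data augmentation, selection, and mixing strategies for chain-of-thought (CoT) distillation, shows that data augmentation (such as reverse thinking) substantially improves the reasoning ability of small student models, and offers practical guidance for developing efficient reasoning models.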