Tag: Reasoning

All the articles with the tag "Reasoning".

Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation

Published: 17 May, 2025 at 11:01 AM

85.16 🤔

This paper introduces Adaptive Difficulty Curriculum Learning (ADCL) and Expert-Guided Self-Reformulation (EGSR) to enhance LLM reasoning by dynamically adjusting training curricula and guiding models to reformulate expert solutions, achieving significant performance improvements over standard RL baselines on mathematical reasoning benchmarks.

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

Published: 1 Jun, 2025 at 11:53 AM

85.15 🤔

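ReMA uses multi-agent reinforcement learning to separate meta-thinking from reasoning, improving large language model performance on mathematical reasoning and LLM-as-a-Judge tasks, with especially strong out-of-distribution generalization; however, it is sensitive to hyperparameters and faces stability challenges in multi-turn settings.

Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning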

Published: 18 May, 2025 at 11:14 AM

85.14 🤔

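This paper proposes Reasoning CPT, which adds synthetic hidden-thought data to continual pretraining and significantly improves large language models' cross-domain reasoning, hard-problem solving, and reasoning efficiency, including an overall gain of up to 3.3% on the MMLU benchmark and an improvement of roughly 8% on hard problems.

From Distributional to Overton Pluralism: Investigating Large Language Model Alignment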

Published: 18 May, 2025 at 11:16 AM

85.12 🤔

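By analyzing how LLM output distributions change before and after alignment, this paper shows that alignment reduces distributional pluralism but achieves Overton pluralism through longer responses, and that base models can effectively mimic aligned-model behavior via in-context learning, supporting the Superficial Alignment Hypothesis.

The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation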

Published: 30 May, 2025 at 11:18 AM

85.11 🤔

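This paper proposes the DC-CoT benchmark, which systematically evaluates data augmentation, selection, and mixing strategies for chain-of-thought (CoT) distillation, shows that data augmentation (such as reverse thinking) markedly improves the reasoning ability of small student models, and offers practical guidance for developing efficient reasoning models.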
