Tag: Reasoning

All the articles with the tag "Reasoning".

How does Transformer Learn Implicit Reasoning?

Published: 31 May, 2025 at 11:22 AM

85.05 🤔

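By training Transformer models from scratch in a controlled symbolic environment, this paper reveals a three-stage developmental trajectory of implicit multi-hop reasoning. Using two tools, cross-query semantic patching and a cosine-based representational lens, it clarifies the link between reasoning ability and clustering in hidden space, offering new insights for model interpretability.

Unveiling the Mechanisms of Explicit CoT Training: How CoT Enhances Reasoning Generalization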

Published: 6 May, 2025 at 11:21 PM

85.04 🤔

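Through controlled experiments, analysis of internal mechanisms, and theoretical derivation, this paper shows that explicit chain-of-thought (CoT) training significantly improves both the in-distribution (ID) and out-of-distribution (OOD) reasoning generalization of large language models by forming a two-stage generalizing circuit, and it verifies the robustness of this effect under noisy data.

Concise Reasoning, Big Gains: Pruning Long Reasoning Trace with Difficulty-Aware Prompting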

Published: 28 May, 2025 at 11:20 AM

85.03 🤔

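This paper proposes Difficulty-Aware Prompting (DAP), which dynamically adjusts the length of reasoning traces to build the compact LiteCoT dataset (100K samples, averaging 720 tokens). The Liter models trained on it significantly outperform traditional long-CoT methods on multiple reasoning benchmarks while substantially reducing training and inference costs.

DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs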

Published: 18 May, 2025 at 11:17 AM

83.58 🤔

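This paper proposes DialogueReason, a dialogue-based reasoning pattern that trains large language models with PPO and rule-based reward functions to improve reasoning diversity and coherence on complex compound question-answering tasks. On the MATH, AIME, and GPQA datasets it shows stronger robustness than monologue-style reasoning.

Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking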

Published: 6 May, 2025 at 11:15 PM

83.31 🤔

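This paper proposes InteRank, which uses knowledge distillation and reinforcement learning to train a small 3B-parameter language model that generates explanations in reasoning-intensive document re-ranking. It achieves performance comparable to models with 70B+ parameters and ranks third on the BRIGHT benchmark.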
