Tag: Reasoning
All the articles with the tag "Reasoning".
-
Belief Injection for Epistemic Control in Linguistic State Space
This paper proposes belief injection as a proactive epistemic control mechanism to shape AI agents' internal linguistic belief states within the Semantic Manifold framework, offering diverse strategies for guiding reasoning and alignment, though it lacks empirical validation.
-
Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging
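This paper uses model merging to integrate fast-thinking and slow-reasoning capabilities for long-to-short reasoning, compressing response length by up to 55% on 7B models while preserving performance, and offers an efficient solution to the overthinking problem in large language models.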
-
Why Distillation can Outperform Zero-RL: The Role of Flexible Reasoning
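By applying supervised fine-tuning to the Qwen2.5-32B base model with only 920 distilled examples, this paper significantly outperforms resource-intensive Zero-RL methods and reveals that distilled models achieve more flexible reasoning through anthropomorphic language and advanced cognitive behaviors.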
-
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
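This paper uses the pass@k metric to systematically evaluate how RLVR affects the reasoning capability boundary of large language models, finding that RLVR only improves sampling efficiency without introducing new reasoning patterns and that its capability remains bounded by the base model, and it stresses the need for improved RL paradigms to elicit genuinely new reasoning abilities.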
-
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs
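This paper proposes Universal Reasoner (UniR), a lightweight, composable reasoning module that converts predefined rewards into token-level guidance signals to efficiently enhance the reasoning capability of frozen large language models, and it shows performance better than some baselines, along with cross-model transferability, on mathematical reasoning and machine translation tasks.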