Tag: Reasoning
All the articles with the tag "Reasoning".
-
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Insight-V introduces a scalable data generation pipeline and a multi-agent system with iterative DPO training to significantly enhance long-chain visual reasoning in MLLMs, achieving up to 7.0% performance gains on challenging benchmarks while maintaining perception capabilities.
-
Trace-of-Thought Prompting: Investigating Prompt-Based Knowledge Distillation Through Question Decomposition
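This paper proposes Trace-of-Thought Prompting, a prompt-based knowledge distillation framework that decomposes complex problems into manageable steps, transferring the reasoning ability of high-resource models to low-resource models and markedly improving the latter's performance on arithmetic reasoning tasks without extensive fine-tuning.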
-
A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?
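This survey proposes a four-dimensional taxonomy (what to scale, how to scale, where to scale, and how well to scale) to systematically review research on test-time scaling (TTS) in large language models, offering a structured perspective and practical guidance for understanding and applying inference-time compute scaling.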
-
Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs
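This empirical study finds that large language models "overthink" easy problems and "underthink" hard ones on reasoning tasks, that reasoning length has a non-monotonic relationship with correctness, and that simply preferring shorter responses can substantially reduce generation length while maintaining accuracy.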
-
Weight Ensembling Improves Reasoning in Language Models
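This paper finds that supervised fine-tuning causes a diversity collapse in reasoning models that hurts Pass@K, and proposes interpolating early and late SFT checkpoints (WiSE-FT), which effectively increases model diversity, improves both Pass@1 and Pass@K, and in turn improves test-time scaling and reinforcement learning results.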
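
The checkpoint interpolation mentioned in the last summary can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `interpolate_checkpoints`, the mixing coefficient `alpha`, and the representation of a checkpoint as a plain dictionary of float lists are all assumptions made for illustration. It only shows the general idea of linearly blending two sets of weights.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def interpolate_checkpoints(early: Weights, late: Weights, alpha: float = 0.5) -> Weights:
    """Linearly blend two checkpoints: (1 - alpha) * early + alpha * late.

    alpha = 0 returns the early checkpoint, alpha = 1 the late one.
    The default of 0.5 is an illustrative assumption, not a value from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if early.keys() != late.keys():
        raise ValueError("checkpoints must contain the same parameter names")

    merged: Weights = {}
    for name in early:
        if len(early[name]) != len(late[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two weight vectors.
        merged[name] = [
            (1.0 - alpha) * e + alpha * l
            for e, l in zip(early[name], late[name])
        ]
    return merged


if __name__ == "__main__":
    early_ckpt = {"layer.weight": [0.0, 2.0]}
    late_ckpt = {"layer.weight": [1.0, 4.0]}
    print(interpolate_checkpoints(early_ckpt, late_ckpt, alpha=0.5))
```

In practice the same blend would be applied to a framework's tensors rather than Python lists, and the coefficient would be chosen by validation.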