Tag: Reasoning

All the articles with the tag "Reasoning".

Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions

Published: 2 Jun, 2025 at 11:32 AM

90.51 🤔

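Through a meta-analysis of 92 open-source language models, this paper proposes a performance-prediction framework that goes beyond scaling laws. It shows that data composition (for example, a 15-25% share of code) and architectural decisions significantly affect downstream task performance, and it improves prediction accuracy by a relative 3-28%.

Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching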

Published: 20 May, 2025 at 11:12 AM

90.50 🤔

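This paper proposes the SELF-TUNING framework, which uses a self-teaching strategy (SELF-TEACHING) to significantly improve the ability of large language models to acquire knowledge from new documents. It achieves strong results on memorization, extraction, and reasoning tasks while maintaining good knowledge retention.

RL of Thoughts: Navigating LLM Reasoning with Inference-time Reinforcement Learning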

Published: 23 May, 2025 at 11:16 AM

90.48 🤔

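This paper proposes RL-of-Thoughts (RLoT), which uses reinforcement learning to train a lightweight navigator model that dynamically constructs task-specific logical structures at inference time. The method significantly improves large language models on reasoning tasks across multiple domains and transfers well across models and tasks.

Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs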

Published: 20 May, 2025 at 11:09 AM

90.43 🤔

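Through systematic experiments, this paper shows that pure reinforcement learning (RL) training not only improves the complex reasoning ability of large language models but also implicitly develops process reward model (PRM) capability. It proposes the Self-PRM framework to further improve performance, while also revealing that framework's low precision on high-difficulty problems.

Talking Heads: Understanding Inter-layer Communication in Transformer Language Models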

Published: 13 May, 2025 at 11:21 AM

90.20 🤔

This paper investigates inter-layer communication in Transformer LMs by identifying low-rank communication channels via SVD, demonstrating their causal role in prompt sensitivity through interventions that significantly improve performance on context retrieval tasks like the Laundry List task.
