Tag: Reasoning

All the articles with the tag "Reasoning".

How Well Can a Long Sequence Model Model Long Sequences? Comparing Architechtural Inductive Biases on Long-Context Abilities

Published: 25 May, 2025 at 11:24 AM

86.84 🤔

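Through comparative experiments, this paper shows that although long-sequence models such as Mamba2 theoretically support unbounded context length, in practice they face significant limitations on long-context tasks, just as Transformer models do. They perform poorly especially when the position of the information or the data format changes, and the causes call for further research.

Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation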

Published: 8 May, 2025 at 06:13 PM

86.84 🤔

This paper proposes Recall with Reasoning (RwR), a method that enhances Mamba's long-context memory and extrapolation by distilling chain-of-thought summarization from a teacher model, achieving significant performance improvements on LONGMEMEVAL and HELMET benchmarks while preserving short-context capabilities.

TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs

Published: 17 May, 2025 at 11:21 PM

86.82 🤔

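This paper proposes a framework based on multi-head tensorisation and Tucker decomposition that structurally denoises and compresses the multi-head attention weights of large language models by enforcing a shared higher-dimensional subspace, significantly improving reasoning ability and achieving compression rates of up to 247x.

Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning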

Published: 20 May, 2025 at 11:24 AM

86.82 🤔

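Through novel task design and analysis of Pythia training checkpoints, this paper shows that in-context learning (ICL) in large language models is neither pure memorization nor a symbolic algorithm, but a limited generalization ability that relies on statistical properties. It also examines ICL's training dynamics and their connection to internal mechanisms.

Can Large Reasoning Models Self-Train?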

Published: 1 Jun, 2025 at 11:43 AM

86.73 🤔

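This paper proposes Self-Rewarded Training (SRT), which uses the model's self-consistency to drive reinforcement learning and improve mathematical reasoning without supervision. Early in training it matches supervised methods, but reward hacking causes performance to collapse under prolonged training. The paper also explores mitigation strategies such as early stopping and curriculum learning.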
