Tag: Large Language Model
All the articles with the tag "Large Language Model".
-
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
This paper trains LLMs of varying scales with RL and SFT, finding that RL promotes explicit ToM reasoning in larger models but causes reasoning collapse in smaller ones, while SFT unexpectedly achieves high performance — revealing that current ToM benchmarks may be solvable without explicit human-like reasoning.
-
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
RADLADS introduces a cost-effective three-step distillation protocol to convert softmax attention transformers into linear attention models using only 350-700M tokens, achieving near-teacher performance on benchmarks and setting a new state-of-the-art for pure RNNs with models up to 72B parameters.
-
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning
This paper proposes the R1-Code-Interpreter framework, which trains large language models to dynamically generate and execute code via supervised fine-tuning and reinforcement learning, significantly improving accuracy across 144 reasoning and planning tasks; R1-CI-14B reaches 64.1%, approaching the performance of GPT-4o with Code Interpreter.
-
Rodimus*: Breaking the Accuracy-Efficiency Trade-Off with Efficient Attentions
This paper proposes the Rodimus and Rodimus+ models, which use data-dependent tempered selection (DDTS) and sliding-window shared-key attention (SW-SKA) to significantly reduce the computational and memory complexity of large language models while preserving performance, challenging the accuracy-efficiency trade-off.
-
MaskSearch: A Universal Pre-Training Framework to Enhance Agentic Search Capability
This paper proposes the MASKSEARCH framework, which combines the Retrieval-Augmented Mask Prediction (RAMP) pre-training task with supervised fine-tuning and reinforcement learning to significantly improve the agentic search capability of large language models on open-domain multi-hop question answering tasks.