Tag: Pre-training
All the articles with the tag "Pre-training".
-
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
SelfBudgeter uses adaptive token-budget prediction and reinforcement-learning optimization to achieve 74.47% response-length compression on the MATH dataset while maintaining near-original accuracy, significantly improving the efficiency of large reasoning models.
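A minimal sketch of what a budget-aware reward could look like, assuming (purely as an illustration, not the paper's formulation) a reward that combines answer correctness with a penalty for deviating from a self-predicted token budget; the function name and the weights `alpha` and `beta` are hypothetical.

```python
def budget_reward(correct: bool, predicted_budget: int, actual_tokens: int,
                  alpha: float = 1.0, beta: float = 0.001) -> float:
    """Toy reward: reward correctness, penalize deviation from the
    self-predicted token budget. alpha/beta are assumed weights."""
    accuracy_term = alpha if correct else 0.0
    budget_term = -beta * abs(actual_tokens - predicted_budget)
    return accuracy_term + budget_term

# Example: a correct 180-token answer against a 200-token self-budget.
print(budget_reward(correct=True, predicted_budget=200, actual_tokens=180))
```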
-
MoL for LLMs: Dual-Loss Optimization to Enhance Domain Expertise While Preserving General Capabilities
This paper proposes the MoL framework, a dual-loss optimization strategy that applies CE loss to domain corpora and KL-divergence loss to general corpora, significantly enhancing the domain expertise of large language models while effectively preserving their general capabilities, and achieving strong results on medical-domain tasks.
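A minimal PyTorch-style sketch of the dual-loss idea, assuming a frozen reference model for the KL term and Hugging-Face-style model outputs; the helper name `mol_step`, the batch layout, and the mixing weight `lambda_general` are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def mol_step(model, ref_model, domain_batch, general_batch, lambda_general=1.0):
    """One illustrative MoL-style training step: CE loss on domain tokens,
    KL divergence to a frozen reference model on general tokens."""
    # Cross-entropy on the domain corpus drives domain expertise.
    domain_logits = model(domain_batch["input_ids"]).logits
    ce_loss = F.cross_entropy(
        domain_logits.view(-1, domain_logits.size(-1)),
        domain_batch["labels"].view(-1),
        ignore_index=-100,
    )

    # KL divergence to the reference model on the general corpus
    # discourages drift away from general capabilities.
    general_logits = model(general_batch["input_ids"]).logits
    with torch.no_grad():
        ref_logits = ref_model(general_batch["input_ids"]).logits
    kl_loss = F.kl_div(
        F.log_softmax(general_logits, dim=-1),
        F.log_softmax(ref_logits, dim=-1),
        log_target=True,
        reduction="batchmean",
    )

    return ce_loss + lambda_general * kl_loss
```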
-
Not All Correct Answers Are Equal: Why Your Distillation Source Matters
By distilling 1.89 million reasoning examples from three top-tier large language models, this paper systematically studies how the distillation source affects student-model performance, finding that data distilled from AM-Thinking-v1 significantly improves student models across multiple reasoning benchmarks and exhibits adaptive generation-length behavior.
-
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models
This paper introduces a systematic approach to enhance large reasoning models by aligning them with deduction, induction, and abduction meta-abilities through a three-stage pipeline of individual training, parameter merging, and domain-specific RL, achieving up to 4% performance gains over instruction-tuned baselines across math, coding, and science benchmarks.
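As a toy illustration of the parameter-merging stage, the sketch below averages the weights of several specialist checkpoints (e.g. deduction-, induction-, and abduction-aligned models); the uniform weighting, the state-dict interface, and the file names are assumptions, not the paper's exact recipe.

```python
import torch

def merge_checkpoints(state_dicts, weights=None):
    """Weighted average of model state dicts from specialist checkpoints.
    Uniform weights by default; the merging scheme is illustrative."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage (hypothetical checkpoint paths):
# sds = [torch.load(p, map_location="cpu")
#        for p in ["deduction.pt", "induction.pt", "abduction.pt"]]
# merged = merge_checkpoints(sds)
```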
-
Illusion or Algorithm? Investigating Memorization, Emergence, and Symbolic Processing in In-Context Learning
Using novel task designs and analysis of Pythia training checkpoints, this paper shows that in-context learning (ICL) in large language models is neither pure memorization nor a symbolic algorithm, but a form of limited generalization that relies on statistical properties of the data, and it further explores ICL's training dynamics and its links to internal mechanisms.