Tag: Scaling Laws
All the articles with the tag "Scaling Laws".
-
LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection
LENSLLM introduces a Hessian-based PAC-Bayes framework and NTK-based scaling model for LLM selection, achieving up to 91.1% accuracy and 88.5% computational cost reduction by modeling fine-tuning dynamics across diverse tasks.
-
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
This paper introduces a taxonomy of language model memorization into recitation, reconstruction, and recollection, demonstrating through experiments with Pythia models that different factors influence each category, with a taxonomy-based predictive model outperforming baselines in predicting memorization likelihood.
-
Contextures: Representations from Contexts
This paper introduces the contexture theory, unifying representation learning across paradigms by targeting top singular functions of a context-induced expectation operator, demonstrating high alignment in neural representations and proposing a task-agnostic metric for context evaluation with strong empirical correlation to performance on various datasets.
-
A Survey on Test-Time Scaling in Large Language Models: What, How, Where, and How Well?
本文通过提出一个四维度分类框架(什么扩展、如何扩展、哪里扩展、扩展效果如何),系统综述了测试时扩展(TTS)在大型语言模型中的研究现状,为理解和应用推理阶段计算扩展提供了结构化视角和实践指导。
-
EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
本文提出 EAGLE-3 方法,通过移除特征预测约束和多层特征融合技术,显著提高了大语言模型的推理加速比,并在实验中实现了高达 6.5 倍的无损速度提升。