Tag: Efficiency
All the articles with the tag "Efficiency".
-
AI agents may be worth the hype but not the resources (yet): An initial exploration of machine translation quality and costs in three language pairs in the legal and news domains
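This paper empirically evaluates five machine translation paradigms and finds that reasoning-enhanced large language models (such as o1-preview) perform strongly in human evaluation, surpassing traditional NMT, while multi-agent systems, though promising, are limited by high computational cost and inconsistent performance across language pairs.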
-
Activated LoRA: Fine-tuned LLMs for Intrinsics
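This paper proposes Activated LoRA (aLoRA), a modified LoRA framework that adapts weights only for tokens after activation and reuses the base model's KV cache, enabling efficient dynamic adaptation. It maintains performance comparable to standard LoRA on multiple tasks while significantly reducing inference cost.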
-
LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection
LENSLLM introduces a Hessian-based PAC-Bayes framework and NTK-based scaling model for LLM selection, achieving up to 91.1% accuracy and 88.5% computational cost reduction by modeling fine-tuning dynamics across diverse tasks.
-
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
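This paper explores offline reinforcement learning methods (LD-DPO) and achieves an average 3.3% improvement in reasoning performance on the DeepDistill-32B model, including a 10.1% gain on the Arena-Hard benchmark. It also highlights the importance of balancing reasoning length against semantic richness.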
-
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
This paper introduces ModelSwitch, a multi-LLM repeated sampling strategy that leverages answer consistency to dynamically switch models, achieving superior performance and 34% sample efficiency over single-LLM self-consistency across diverse datasets.