Tag: Inference Acceleration
All the articles with the tag "Inference Acceleration".
- EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test
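This paper proposes EAGLE-3, which removes the feature-prediction constraint and fuses features from multiple layers. Together these changes substantially raise the inference speedup ratio of large language models, with experiments showing lossless speedups of up to 6.5x.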