Tag: Attention Distribution
All the articles with the tag "Attention Distribution".
-
ZeroTuning: Unlocking the Initial Token's Power to Enhance Large Language Models Without Training
ZeroTuning proposes a training-free method that adjusts the attention distribution over the initial token in large language models, yielding significant gains on text classification, question answering, and multi-turn dialogue tasks, while remaining robust under resource constraints and with long contexts.
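To illustrate the general idea behind steering attention toward or away from the initial token, the sketch below rescales the first key column of a post-softmax attention map and renormalizes it. This is a minimal illustrative example, not the paper's exact procedure; the function name, the scaling factor `gamma`, and the renormalization step are assumptions for demonstration only.

```python
import torch

def rescale_initial_token_attention(attn_weights: torch.Tensor, gamma: float) -> torch.Tensor:
    """Rescale the attention mass assigned to the initial (first) key token.

    attn_weights: post-softmax attention of shape (batch, heads, q_len, k_len).
    gamma: values > 1 boost attention to the initial token, values < 1 suppress it.
    The result is renormalized so each query's weights still sum to 1.
    """
    scaled = attn_weights.clone()
    scaled[..., 0] = scaled[..., 0] * gamma              # scale the first key column
    scaled = scaled / scaled.sum(dim=-1, keepdim=True)   # renormalize per query
    return scaled

# Usage: a dummy attention map for 1 sequence, 2 heads, 4 query/key positions.
attn = torch.softmax(torch.randn(1, 2, 4, 4), dim=-1)
adjusted = rescale_initial_token_attention(attn, gamma=0.5)
print(adjusted.sum(dim=-1))  # all ones: each row is still a valid distribution
```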