Tag: In-Context Learning
All articles tagged "In-Context Learning".
-
Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism
This paper proposes the Gather-and-Aggregate (G&A) mechanism, showing that the gap between Transformer and SSM models in in-context retrieval stems mainly from how a few critical heads are implemented, and uses hybrid-model experiments to confirm that attention can improve SSM retrieval.
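A minimal sketch of the head-ablation style of analysis such a finding implies: knock out one head at a time and rank heads by the retrieval accuracy they account for. The model layout, the `head_mask` hook, and the `retrieval_accuracy` helper are all illustrative assumptions, not the paper's code.

```python
import torch

def critical_heads(model, eval_batch, retrieval_accuracy, top_k=5):
    """Hypothetical probe: zero out one attention head at a time and
    record how much in-context retrieval accuracy drops."""
    baseline = retrieval_accuracy(model, eval_batch)
    drops = {}
    for layer_idx, layer in enumerate(model.layers):       # assumed layout
        n_heads = layer.attn.num_heads                     # assumed attribute
        for head in range(n_heads):
            mask = torch.ones(n_heads)
            mask[head] = 0.0
            layer.attn.head_mask = mask                    # assumed ablation hook
            drops[(layer_idx, head)] = baseline - retrieval_accuracy(model, eval_batch)
            layer.attn.head_mask = None                    # restore
    # the few (layer, head) pairs with the largest drops are the "critical" heads
    return sorted(drops.items(), key=lambda kv: -kv[1])[:top_k]
```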
-
MateICL: Mitigating Attention Dispersion in Large-Scale In-Context Learning
This paper proposes the MateICL framework, which mitigates attention dispersion in large-scale in-context learning by splitting the context window and introducing an attention calibration layer; experiments show consistent performance gains and stable behavior across a range of NLP tasks.
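A rough sketch of the split-and-recalibrate idea, assuming a model that returns next-token logits and a small calibration module that weights the windows; the shapes, API, and `calib` layer are illustrative assumptions rather than MateICL's actual implementation:

```python
import torch

def split_window_logits(model, demo_windows, query, calib):
    """Score the query against each demonstration window independently,
    then recombine with calibration weights instead of attending over
    one oversized context. All interfaces here are assumed."""
    per_window = []
    for window in demo_windows:                  # each: list of token ids
        logits = model(window + query)[-1]       # next-token logits (assumed API)
        per_window.append(logits)
    stacked = torch.stack(per_window)            # (n_windows, vocab_size)
    weights = torch.softmax(calib(stacked), dim=0)   # (n_windows, 1), assumed
    return (weights * stacked).sum(dim=0)        # calibrated combination
```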
-
Competition Dynamics Shape Algorithmic Phases of In-Context Learning
This paper introduces a synthetic sequence-modeling task based on finite mixtures of Markov chains to unify the study of in-context learning (ICL), identifying four competing algorithms whose interplay explains model behavior and phase transitions, and offering insight into ICL's transient nature and broader phenomenology.
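A toy generator for the kind of data the task describes: each sequence is sampled from one of a small, fixed set of random Markov chains. The dimensions and the Dirichlet prior are arbitrary choices for illustration, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_markov_mixture(n_chains=4, n_states=8, seq_len=256):
    """Draw one sequence from a finite mixture of random Markov chains.
    A real setup would fix the chains once and reuse them across the
    whole dataset; regenerating them here keeps the sketch short."""
    # one row-stochastic transition matrix per mixture component
    chains = rng.dirichlet(np.ones(n_states), size=(n_chains, n_states))
    k = rng.integers(n_chains)                   # latent mixture component
    seq = [rng.integers(n_states)]
    for _ in range(seq_len - 1):
        seq.append(rng.choice(n_states, p=chains[k, seq[-1]]))
    return np.array(seq), k
```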
-
ICLR: In-Context Learning of Representations
Using an in-context graph tracing task, this paper shows that large language models can emergently reorganize concept representations to fit new semantics as the amount of context grows, and proposes an energy-minimization hypothesis to explain the process.
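A small sketch of how such a graph tracing prompt can be built: nodes of a fixed graph are relabeled with unrelated everyday tokens, and the context is a random walk over the relabeled graph, so any structure the model recovers must come from context alone. The token mapping and prompt formatting are assumptions for illustration:

```python
import random

def graph_tracing_prompt(edges, tokens, walk_len=50, seed=0):
    """Build an in-context graph tracing prompt: a random walk over a
    fixed graph whose nodes are shown to the model as arbitrary words."""
    random.seed(seed)
    adj = {}
    for u, v in edges:                  # undirected adjacency list
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    node = random.choice(list(adj))
    walk = [node]
    for _ in range(walk_len - 1):
        node = random.choice(adj[node])
        walk.append(node)
    return " ".join(tokens[n] for n in walk)

# e.g. a 4-cycle whose nodes appear to the model as everyday words
prompt = graph_tracing_prompt(
    edges=[(0, 1), (1, 2), (2, 3), (3, 0)],
    tokens={0: "apple", 1: "bird", 2: "car", 3: "desk"},
)
```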
-
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models
This paper introduces a recursive summarization method to enhance long-term dialogue memory in LLMs, achieving marginal quantitative improvements and notable qualitative gains in consistency and coherence across multiple models and datasets.
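A compact sketch of the recursive scheme the paper describes: fold each chunk of new turns into a running summary so the prompt carries a bounded memory rather than the full transcript. The `summarize` callable and the prompt wording are hypothetical stand-ins for an actual LLM call:

```python
def recursive_memory(turns, summarize, chunk_size=10):
    """Recursively compress a dialogue: each pass merges the previous
    summary with the next chunk of turns into an updated summary.
    `summarize` is a hypothetical LLM call (prompt in, summary out)."""
    memory = ""
    for i in range(0, len(turns), chunk_size):
        chunk = "\n".join(turns[i:i + chunk_size])
        memory = summarize(
            f"Previous summary:\n{memory}\n\nNew turns:\n{chunk}\n\n"
            "Update the summary to include the new information."
        )
    return memory  # prepend to the prompt when generating the next reply
```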