Tag: Long Context
All the articles with the tag "Long Context".
-
MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores
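This paper proposes MOOSComp, which adds an inter-class cosine similarity loss during training to mitigate over-smoothing and integrates outlier scores during compression to retain key tokens, significantly improving task-agnostic long-context compression performance and generalization.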
-
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference
RetroInfer reimagines the KV cache as a vector storage system, using an attention-aware wave index and wave buffer to achieve up to 4.5x speedup over full attention and 10.5x over sparse baselines for long-context LLM inference, while preserving near-full-attention accuracy.
-
An Empirical Study of Evaluating Long-form Question Answering
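This paper empirically studies automatic evaluation metrics for long-form question answering, demonstrating the advantages of LLM-based metrics in accuracy and stability while analyzing their biases and strategies for improvement.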
-
State Space Models are Strong Text Rerankers
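This paper comprehensively benchmarks state space models such as Mamba against Transformers on text reranking, comparing performance and efficiency. It finds that Mamba models achieve comparable performance but are less efficient, and it highlights directions for future optimization.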
-
RWKV-X: A Linear Complexity Hybrid Language Model
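This paper proposes RWKV-X, a linear-complexity hybrid language model that combines RWKV with a sparse attention mechanism to improve long-context modeling while maintaining efficiency and short-context performance.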