Posts
All the articles I've posted.
-
Multiple Weaks Win Single Strong: Large Language Models Ensemble Weak Reinforcement Learning Agents into a Supreme One
This paper proposes LLM-Ens, a framework that uses large language models (LLMs) to strengthen reinforcement learning model ensembles through semantic state classification and dynamic agent selection, significantly improving performance on the Atari benchmark, with gains of up to 51.2% over baseline methods.
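The routing idea is simple enough to sketch. Below is a minimal Python illustration of ensembling by semantic state category, assuming an LLM classifier stub and a per-category running-reward tally; the category names, class names, and bookkeeping are my own illustration, not the paper's code.

```python
import random
from collections import defaultdict

def llm_classify_state(state) -> str:
    """Stand-in for an LLM call mapping a raw observation to a semantic
    category; the category names here are hypothetical, not the paper's."""
    return random.choice(["enemy_nearby", "open_field", "low_health"])

class LLMEns:
    """Per semantic category, route to the weak agent with the best
    running mean reward observed so far in that category."""

    def __init__(self, agents):
        self.agents = agents                                  # callables: state -> action
        self.mean = defaultdict(lambda: [0.0] * len(agents))  # category -> per-agent mean reward
        self.n = defaultdict(lambda: [0] * len(agents))       # category -> per-agent sample count

    def act(self, state):
        cat = llm_classify_state(state)
        i = max(range(len(self.agents)), key=lambda j: self.mean[cat][j])
        return cat, i, self.agents[i](state)

    def update(self, cat, i, reward):
        self.n[cat][i] += 1
        self.mean[cat][i] += (reward - self.mean[cat][i]) / self.n[cat][i]
```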
-
Who Taught You That? Tracing Teachers in Model Distillation
This paper proposes a method based on syntactic patterns (part-of-speech templates) that identifies a student model's teacher from high-order linguistic features of the student's outputs. It outperforms traditional similarity- and perplexity-based methods across multiple tasks and datasets, though its accuracy still leaves room for improvement.
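As a rough Python sketch of the idea (the trigram profile, cosine matching, and function names are my own assumptions; the paper's actual template extraction may differ):

```python
# Requires: pip install nltk
# plus: nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")
from collections import Counter
import nltk

def pos_profile(texts, n=3):
    """Frequency distribution of PoS tag n-grams ("templates") over model outputs."""
    counts = Counter()
    for text in texts:
        tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(text))]
        counts.update(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))
    total = sum(counts.values()) or 1
    return {k: v / total for k, v in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse n-gram distributions."""
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in set(p) | set(q))
    norm = (sum(v * v for v in p.values()) ** 0.5) * (sum(v * v for v in q.values()) ** 0.5)
    return dot / norm if norm else 0.0

def guess_teacher(student_texts, teacher_corpora):
    """Return the candidate teacher whose PoS profile best matches the student's."""
    sp = pos_profile(student_texts)
    return max(teacher_corpora, key=lambda name: cosine(sp, pos_profile(teacher_corpora[name])))
```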
-
Contrastive Learning for Task-Independent SpeechLLM-Pretraining
This paper proposes a task-independent pretraining method for SpeechLLMs based on contrastive learning. By aligning speech and text representations, it significantly improves ASR, speech translation, and spoken question answering in low-resource settings, outperforming several specialized models.
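The alignment objective is presumably a standard symmetric contrastive (InfoNCE-style) loss over paired speech/text embeddings. A minimal PyTorch sketch, where the temperature value and mean-pooled inputs are my assumptions:

```python
import torch
import torch.nn.functional as F

def speech_text_contrastive_loss(speech_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings.
    speech_emb, text_emb: (batch, dim) pooled representations of each modality."""
    s = F.normalize(speech_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = s @ t.T / temperature                      # (batch, batch) similarities
    labels = torch.arange(s.size(0), device=s.device)   # positives on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels))
```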
-
Hide & Seek: Transformer Symmetries Obscure Sharpness & Riemannian Geometry Finds It
This paper introduces geodesic sharpness, a measure based on Riemannian geometry that accounts for transformer parameter symmetries by working on a quotient manifold. Across diagonal networks, vision transformers, and language models, it correlates more strongly with generalization than traditional adaptive sharpness.
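For context, adaptive sharpness measures the worst-case loss increase over a weight-scaled perturbation ball; my loose paraphrase of the paper's move is to take that maximum over a geodesic ball on the symmetry quotient instead. The notation below is mine, not the paper's:

```latex
% Adaptive sharpness (Kwon et al., 2021): worst-case loss increase over a
% weight-scaled perturbation ball (\oslash denotes elementwise division):
S_{\rho}(w) = \max_{\|\epsilon \oslash |w|\|_2 \le \rho} L(w + \epsilon) - L(w)

% Geodesic sharpness, loosely: the same maximum taken over a geodesic ball on
% the quotient manifold \mathcal{M}/G, where G is the group of parameter
% symmetries, so symmetry-equivalent weights score identically:
S^{\mathrm{geo}}_{\rho}([w]) = \max_{v \in T_{[w]}(\mathcal{M}/G),\ \|v\| \le \rho}
    L(\exp_{[w]}(v)) - L([w])
```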
-
LoLA: Low-Rank Linear Attention With Sparse Caching
LoLA combines three forms of memory at inference time (linear attention, a sliding window, and a sparse cache) to mitigate memory collisions, substantially improving linear-attention models on long-context associative recall and language modeling while keeping memory usage efficient.
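A toy decode-time sketch of how the three memory tiers might interact, assuming a reconstruction-error test decides which evicted pairs are "hard" enough to keep exact; the feature map, thresholds, and 50/50 read mix are my placeholders, not LoLA's actual design:

```python
import torch

class LoLAMemorySketch:
    """Three-tier memory: exact sliding-window KV pairs, a small sparse cache
    of hard KV pairs, and a linear-attention state absorbing everything else."""

    def __init__(self, dim, window=64, cache_size=32):
        self.window, self.cache_size = window, cache_size
        self.recent = []                # exact sliding-window (k, v) pairs
        self.cache = []                 # sparse cache of hard (k, v) pairs
        self.S = torch.zeros(dim, dim)  # linear-attention state: sum phi(k) v^T
        self.z = torch.zeros(dim)       # normalizer:             sum phi(k)

    @staticmethod
    def phi(x):
        return torch.relu(x) + 1e-6     # placeholder positive feature map

    def write(self, k, v):
        self.recent.append((k, v))
        if len(self.recent) > self.window:
            k_old, v_old = self.recent.pop(0)
            # "Memory collision" proxy: if the linear state reconstructs v_old
            # badly, keep the pair exact in the sparse cache instead.
            pred = (self.phi(k_old) @ self.S) / (self.phi(k_old) @ self.z + 1e-6)
            if torch.norm(pred - v_old) > 1.0 and len(self.cache) < self.cache_size:
                self.cache.append((k_old, v_old))
            else:
                self.S += torch.outer(self.phi(k_old), v_old)
                self.z += self.phi(k_old)

    def read(self, q):
        """Combine exact attention over window + cache with the linear readout."""
        lin = (self.phi(q) @ self.S) / (self.phi(q) @ self.z + 1e-6)
        pairs = self.recent + self.cache
        if not pairs:
            return lin
        K = torch.stack([k for k, _ in pairs])
        V = torch.stack([v for _, v in pairs])
        w = torch.softmax(K @ q / K.size(1) ** 0.5, dim=0)
        return 0.5 * lin + 0.5 * (w @ V)  # naive even mix, purely illustrative
```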