Posts
All the articles I've posted.
-
IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment
IDEAL proposes a gradient-based, iterative data-equilibrium adaptation framework that dynamically optimizes the mixing ratios of multi-domain datasets during supervised fine-tuning (SFT), significantly improving the multi-task performance of large language models within two iterations and raising average scores by roughly 7%.
-
Lost in Transmission: When and Why LLMs Fail to Reason Globally
This paper introduces the BAPO model to quantify the internal communication bandwidth limits of large language models (LLMs), proves theoretically and verifies experimentally that LLMs fail on tasks with high bandwidth demands, and shows that chain-of-thought (CoT) reasoning can lower bandwidth requirements to partially mitigate the problem.
-
Navigating the Accuracy-Size Trade-Off with Flexible Model Merging
FlexMerge proposes a data-free, flexible model-merging framework that greedily merges fine-tuned models block by block, supports producing merged models of arbitrary size, and shows a favorable accuracy-size trade-off: substantial accuracy gains early on and accuracy approaching that of full fine-tuning.
-
Next Token Perception Score: Analytical Assessment of your LLM Perception Skills
This paper proposes the Next Token Perception Score (NTPS), a metric that quantifies the alignment between the feature subspaces of autoregressive pretraining and downstream perception tasks. Its correlation with linear-probe performance is established theoretically and validated experimentally, and it proves practical for predicting LoRA fine-tuning gains.
-
Long Term Memory: The Foundation of AI Self-Evolution
This paper proposes Long-Term Memory (LTM) as a cornerstone of AI self-evolution, demonstrating through multi-agent frameworks such as OMNE and diverse experiments that LTM enables personalized, adaptive learning in LLMs during inference, achieving top performance on benchmarks such as GAIA.