Tag: Continual Learning
All the articles with the tag "Continual Learning".
-
Less, but Better: Efficient Multilingual Expansion for LLMs via Layer-wise Mixture-of-Experts
This paper proposes the LayerMoE algorithm, which uses layer-wise language-similarity-based expert allocation together with a routing classifier to expand multilingual LLMs efficiently, significantly improving performance on new languages with fewer added parameters while reducing forgetting of previously learned ones.
-
Recurrent Knowledge Identification and Fusion for Language Model Continual Learning
This paper proposes the Recurrent-KIF framework, which uses an inner-outer loop mechanism to dynamically estimate parameter importance and iteratively fuse new and old knowledge, effectively mitigating catastrophic forgetting and promoting knowledge transfer in continual learning; experiments confirm its performance advantages across multiple large language models.
-
Scalable Strategies for Continual Learning with Replay
This paper proposes three strategies, low-rank adaptation (LoRA), consolidation, and sequential merging, to make replay-based continual learning more scalable; by cutting the number of replay samples required (by up to 65%) and combining replay with parameter-efficient fine-tuning, it significantly improves performance on image classification tasks. A minimal replay sketch follows.
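For orientation only, the sketch below shows the generic replay component that such strategies aim to make cheaper: a reservoir-sampled buffer that mixes old examples into each new-task batch. The class name, `capacity` parameter, and sampling scheme are illustrative assumptions, not taken from the paper.

```python
import random

class ReservoirReplayBuffer:
    """Illustrative replay buffer using reservoir sampling (not the paper's method)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0  # total number of examples offered to the buffer

    def add(self, example):
        # Maintain a uniform random sample over all examples seen so far.
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            idx = random.randrange(self.seen)
            if idx < self.capacity:
                self.buffer[idx] = example

    def sample(self, k: int):
        # Draw replayed examples to mix into the current task's batch.
        return random.sample(self.buffer, min(k, len(self.buffer)))
```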
-
MoL for LLMs: Dual-Loss Optimization to Enhance Domain Expertise While Preserving General Capabilities
This paper proposes the MoL framework, a dual-loss optimization strategy that applies cross-entropy (CE) loss to domain corpora and KL-divergence loss to general corpora, significantly improving the domain expertise of large language models while preserving their general capabilities, and achieving strong results on medical-domain tasks.
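As a rough illustration of that dual-loss idea, here is a minimal sketch: cross-entropy on a domain batch plus a KL-divergence penalty against a frozen copy of the base model on a general batch. It assumes a Hugging Face-style causal-LM interface; the function name, `kl_weight`, and batching details are assumptions for illustration, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def dual_loss(model, ref_model, domain_batch, general_batch, kl_weight=1.0):
    # Cross-entropy (standard next-token loss) on the domain corpus.
    # For simplicity, padding tokens are not masked out here.
    ce_loss = model(
        input_ids=domain_batch["input_ids"],
        attention_mask=domain_batch["attention_mask"],
        labels=domain_batch["input_ids"],
    ).loss

    # KL divergence to a frozen reference (base) model on the general corpus,
    # penalizing drift away from general-purpose behaviour.
    logits = model(
        input_ids=general_batch["input_ids"],
        attention_mask=general_batch["attention_mask"],
    ).logits
    with torch.no_grad():
        ref_logits = ref_model(
            input_ids=general_batch["input_ids"],
            attention_mask=general_batch["attention_mask"],
        ).logits
    kl_loss = F.kl_div(
        F.log_softmax(logits, dim=-1),
        F.softmax(ref_logits, dim=-1),
        reduction="batchmean",
    )
    return ce_loss + kl_weight * kl_loss
```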
-
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
This paper introduces TiC-LM, a web-scale benchmark for time-continual LLM pretraining built from 114 Common Crawl dumps, demonstrating that replay and autoregressive schedules can match oracle retraining on general web data with less compute, though trade-offs persist across domains.