Tag: Regularization
All the articles with the tag "Regularization".
Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2
This paper shows that applying Elastic Weight Consolidation (EWC) during full-parameter, autoregressive continual pre-training of the Gemma2 2B LLM on CulturaX data mitigates catastrophic forgetting on English tasks while improving performance on Lithuanian-language benchmarks.