Tag: Pre-training
All the articles with the tag "Pre-training".
-
Towards Safer Pretraining: Analyzing and Filtering Harmful Content in Webscale datasets for Responsible LLMs
This paper proposes a three-dimensional harm taxonomy and develops the TTP and HarmFormer tools to filter harmful content from web-scale LLM pretraining datasets, revealing a significant prevalence of toxic content and persistent safety gaps through benchmarks such as HAVOC.
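As a rough illustration of the score-and-threshold filtering pass described above, the sketch below runs a per-dimension harm scorer over documents and keeps only those under a threshold. The keyword scorer, the `HARM_DIMENSIONS` tuple, and the threshold are toy placeholders, not the paper's taxonomy or the actual TTP/HarmFormer tools.

```python
# Toy sketch of threshold-based filtering of pretraining documents.
# The scorer is a keyword heuristic standing in for a learned classifier.

from typing import Iterable, Iterator

HARM_DIMENSIONS = ("hate", "sexual", "violence")  # illustrative only

def harm_scores(text: str) -> dict[str, float]:
    """Toy per-dimension scorer: fraction of flagged keywords (placeholder for a model)."""
    flagged = {"hate": {"slur"}, "sexual": {"explicit"}, "violence": {"gore"}}
    tokens = text.lower().split()
    return {
        dim: sum(tok in words for tok in tokens) / max(len(tokens), 1)
        for dim, words in flagged.items()
    }

def filter_documents(docs: Iterable[str], threshold: float = 0.01) -> Iterator[str]:
    """Keep documents whose maximum harm score stays below the threshold."""
    for doc in docs:
        if max(harm_scores(doc).values()) < threshold:
            yield doc

if __name__ == "__main__":
    corpus = ["a benign paragraph about pretraining data", "text containing a slur and gore"]
    print(list(filter_documents(corpus)))  # only the benign document survives
```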
-
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
RADLADS introduces a cost-effective three-step distillation protocol to convert softmax attention transformers into linear attention models using only 350-700M tokens, achieving near-teacher performance on benchmarks and setting a new state-of-the-art for pure RNNs with models up to 72B parameters.
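To clarify why distilling into linear attention yields a pure RNN, the sketch below contrasts standard causal softmax attention with the generic kernelized linear-attention recurrence, which carries a constant-size state across time steps. This is the textbook form, not the specific RWKV-style decoder architectures RADLADS actually targets.

```python
# Contrast: O(T^2) softmax attention vs. an O(T) linear-attention recurrence.
import numpy as np

def softmax_attention(Q, K, V):
    """Standard causal softmax attention: quadratic in sequence length."""
    T = Q.shape[0]
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    mask = np.tril(np.ones((T, T), dtype=bool))
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    """Kernelized attention as a recurrence: constant-size state, linear in length."""
    d, dv = Q.shape[1], V.shape[1]
    S = np.zeros((d, dv))   # running sum of phi(k) v^T
    z = np.zeros(d)         # running sum of phi(k) for normalization
    out = np.zeros_like(V)
    for t in range(Q.shape[0]):
        S += np.outer(phi(K[t]), V[t])
        z += phi(K[t])
        q = phi(Q[t])
        out[t] = (q @ S) / (q @ z + 1e-6)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
    print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)
```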
-
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
This paper investigates zero RL training on diverse open base models, achieving significant improvements in accuracy and response length while identifying key factors, such as reward design and data difficulty, that influence the emergence of reasoning behaviors.
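The reward-design point above typically refers to simple rule-based correctness rewards. The sketch below shows one such reward for math-style answers; the extraction rules, reward values, and any format terms used in the paper may differ.

```python
# Minimal rule-based correctness reward of the kind used in zero RL training.
import re

def extract_answer(response: str) -> str | None:
    """Take the content of the last \\boxed{...} if present, else the last number."""
    boxed = re.findall(r"\\boxed\{([^}]*)\}", response)
    if boxed:
        return boxed[-1].strip()
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    return numbers[-1] if numbers else None

def correctness_reward(response: str, reference: str) -> float:
    """1.0 for a matching final answer, 0.0 otherwise."""
    predicted = extract_answer(response)
    return 1.0 if predicted is not None and predicted == reference.strip() else 0.0

if __name__ == "__main__":
    print(correctness_reward("... so the total is \\boxed{42}", "42"))  # 1.0
    print(correctness_reward("the answer might be 41", "42"))           # 0.0
```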
-
Dynamic Fisher-weighted Model Merging via Bayesian Optimization
This paper proposes Dynamic Fisher-weighted Merging (DF-Merge), which uses Bayesian optimization to dynamically adjust the scaling coefficients of fine-tuned models and applies Fisher-information weighting to the scaled models, efficiently producing multi-task models that significantly outperform existing baselines.
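A simplified sketch of the merge step is shown below: a diagonal Fisher approximation from squared gradients, and a per-parameter weighted average of the checkpoints with per-model coefficients. In the paper those coefficients come from a Bayesian-optimization loop over validation performance; here they are fixed, and the exact way scaling and Fisher weighting are combined is an assumption.

```python
# Sketch of Fisher-weighted merging of fine-tuned checkpoints with scaling coefficients.
import numpy as np

def diagonal_fisher(grads: list[np.ndarray]) -> np.ndarray:
    """Approximate diagonal Fisher information as the mean squared gradient."""
    return np.mean([g ** 2 for g in grads], axis=0)

def df_merge(params: list[np.ndarray], fishers: list[np.ndarray],
             lambdas: list[float], eps: float = 1e-8) -> np.ndarray:
    """Merge checkpoints, weighting each parameter by its Fisher value and model coefficient."""
    num = sum(lam * f * p for p, f, lam in zip(params, fishers, lambdas))
    den = sum(lam * f for f, lam in zip(fishers, lambdas)) + eps
    return num / den

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    params = [rng.normal(size=10) for _ in range(2)]          # two fine-tuned models
    fishers = [diagonal_fisher([rng.normal(size=10) for _ in range(4)]) for _ in range(2)]
    merged = df_merge(params, fishers, lambdas=[0.6, 0.4])    # coefficients chosen by BO in practice
    print(merged.shape)
```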
-
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
This paper introduces a taxonomy that separates language model memorization into recitation, reconstruction, and recollection, and shows through experiments with Pythia models that different factors drive each category; a taxonomy-based predictive model outperforms baselines at predicting memorization likelihood.
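For context, memorization studies with the Pythia suite commonly test whether greedy decoding from a training-set prefix reproduces the true continuation verbatim. The sketch below shows that basic check with a stand-in generation function; the paper's subsequent labeling of hits as recitation, reconstruction, or recollection is not shown.

```python
# Minimal verbatim-memorization check: does greedy generation from a training
# prefix reproduce the true continuation exactly?
from typing import Callable, Sequence

def is_memorized(prefix: Sequence[int], continuation: Sequence[int],
                 generate_fn: Callable[[Sequence[int], int], Sequence[int]]) -> bool:
    """True if greedy generation from the prefix matches the true continuation."""
    generated = generate_fn(prefix, len(continuation))
    return list(generated) == list(continuation)

if __name__ == "__main__":
    # Stand-in for model.generate: returns a fixed sequence, just to exercise the check.
    def fake_generate(prefix, n):
        return [7, 8, 9][:n]

    print(is_memorized([1, 2, 3], [7, 8, 9], fake_generate))  # True
    print(is_memorized([1, 2, 3], [7, 8, 0], fake_generate))  # False
```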