Tag: Data Augmentation

All the articles with the tag "Data Augmentation".

Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning

Published: 28 May, 2025 at 11:25 AM

92.52 🤔

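This paper uses theoretical analysis and a Re-distillation technique to reveal the efficiency bottleneck of small-scale SFT in R1-style RL. With very few samples (<1K), it approaches RL performance on the K&K and MATH datasets, significantly improving data efficiency.

TL;DR: Too Long, Do Re-weighting for Effcient LLM Reasoning Compression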

Published: 5 Jun, 2025 at 11:22 AM

92.10 🤔

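This paper proposes the TLDR method, which dynamically re-weights System 1 and System 2 reasoning data to cut the number of output tokens in large language model reasoning by about 40%, while largely preserving accuracy on math tasks of varying difficulty.

AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection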

Published: 17 May, 2025 at 11:20 PM

91.07 🤔

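This paper proposes AttentionInfluence, a method that uses the attention head mechanisms of a pretrained model to select reasoning-intensive data without supervision. It significantly improves the performance of a 7B-parameter model on knowledge and reasoning tasks and demonstrates weak-to-strong scaling potential.

Not-Just-Scaling Laws: Towards a Better Understanding of the Downstream Impact of Language Model Design Decisions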

Published: 2 Jun, 2025 at 11:32 AM

90.51 🤔

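Through a meta-analysis of 92 open-source language models, this paper proposes a performance prediction framework that goes beyond scaling laws. It reveals the significant impact of data composition (e.g., a code proportion of 15-25%) and architectural decisions on downstream task performance, with a relative improvement in prediction accuracy of 3-28%.

Cyber Security Data Science: Machine Learning Methods and their Performance on Imbalanced Datasets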

Published: 15 May, 2025 at 11:06 AM

90.25 🤔

This paper systematically evaluates machine learning classifiers and imbalance learning techniques on two cybersecurity datasets, revealing that XGB and RF perform robustly, while sampling and ensembling effects vary, emphasizing the need for dataset-specific method selection.
