Tag: Data Augmentation
All the articles with the tag "Data Augmentation".
-
When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars
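This paper uses synthetic data generated from context-free grammars to study how metadata conditioning affects language model pre-training. It finds that conditioning helps on tasks with long prompts but hurts on tasks with short prompts, revealing a trade-off in latent semantics inference.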
-
Think, Prune, Train, Improve: Scaling Reasoning without Scaling Models
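This paper proposes the Think, Prune, Train framework, which uses iterative supervised fine-tuning and correctness-based data pruning to improve a model's reasoning ability without increasing its size, while avoiding model collapse.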
-
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
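This paper proposes constraint back-translation, which extracts the implicit constraints from existing instruction-response pairs to build CRAB, a high-quality dataset of complex instructions, and combines it with reverse training to significantly improve the performance of large language models on complex instruction-following tasks.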
-
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Insight-V introduces a scalable data generation pipeline and a multi-agent system with iterative DPO training to significantly enhance long-chain visual reasoning in MLLMs, achieving up to 7.0% performance gains on challenging benchmarks while maintaining perception capabilities.
-
Phi-4-reasoning Technical Report
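This paper develops the small LLMs Phi-4-reasoning and Phi-4-reasoning-plus through data-centric supervised fine-tuning and reinforcement learning, improving their performance on complex reasoning tasks to a level competitive with larger models.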