Tag: Self-Supervised Learning
All the articles with the tag "Self-Supervised Learning".
-
This paper proposes Self-Reasoning Language Models (SRLM), which use a small amount of reasoning-catalyst data to bootstrap the model into generating longer reasoning chains and then iteratively self-train on them, yielding an average gain of +2.5 percentage points across multiple reasoning benchmarks and showing the potential of exploring deeper and more creative reasoning paths.
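A minimal sketch of the iterative self-training loop this summary describes, under the assumption that the "catalyst" demonstrations are simply prepended to each prompt; `generate_chain`, `verify`, and `fine_tune` are hypothetical callables standing in for the model, not the paper's actual API.

```python
def srlm_self_training(generate_chain, verify, fine_tune, catalyst, problems, rounds=3):
    """Iterative self-training: sample reasoning chains seeded with catalyst demos,
    keep only verified ones, fine-tune on them, and repeat.
    generate_chain(catalyst, question) -> (chain, answer); verify(answer, gold) -> bool;
    fine_tune(examples) updates the underlying model. All three are user-supplied."""
    for _ in range(rounds):
        new_data = []
        for question, gold in problems:
            chain, answer = generate_chain(catalyst, question)  # catalyst demos encourage longer chains
            if verify(answer, gold):                            # keep only correct traces
                new_data.append((question, chain, answer))
        fine_tune(new_data)                                     # self-train on the model's own chains
```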
-
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs
Through systematic experiments, this paper shows that pure reinforcement learning (RL) training not only improves large language models' complex reasoning but also implicitly induces process reward model (PRM) capability; it proposes the Self-PRM framework to further improve performance, while also revealing its low precision on high-difficulty problems.
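A rough sketch of the reranking idea behind Self-PRM as summarized above: the same RL-trained policy scores the intermediate steps of its own sampled solutions, and the best-scoring candidate is kept. The callables `generate_solutions` and `score_step` are placeholders for the model, not the paper's actual interface.

```python
def self_prm_rerank(generate_solutions, score_step, question, n_candidates=8):
    """Rerank sampled solutions by the model's own step-level (process) scores.
    generate_solutions(question, n) -> list of solutions, each a list of step strings;
    score_step(question, steps_so_far) -> float. Both are user-supplied placeholders."""
    best, best_score = None, float("-inf")
    for steps in generate_solutions(question, n_candidates):
        # Score every prefix of the solution, then aggregate into one process reward.
        prefix_scores = [score_step(question, steps[:i + 1]) for i in range(len(steps))]
        score = sum(prefix_scores) / max(len(prefix_scores), 1)
        if score > best_score:
            best, best_score = steps, score
    return best
```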
-
Always Skip Attention
This paper theoretically demonstrates that Self-Attention Blocks in Vision Transformers are ill-conditioned without skip connections, highlights the regularizing role of skip connections, and proposes Token Graying (SVD- and DCT-based) to improve the conditioning of input tokens, achieving modest performance gains in supervised and self-supervised tasks.
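The DCT variant lends itself to a short sketch. The snippet below illustrates the general idea rather than the paper's exact algorithm: damp the high-frequency DCT components of the input token matrix before the first self-attention block to improve its conditioning (the `keep` and `damp` parameters are illustrative).

```python
import numpy as np
from scipy.fft import dct, idct

def dct_token_graying(tokens: np.ndarray, keep: float = 0.9, damp: float = 0.1) -> np.ndarray:
    """tokens: (num_tokens, dim) matrix fed to the first attention block.
    Attenuates the highest-frequency DCT components along the token axis."""
    coeffs = dct(tokens, type=2, axis=0, norm="ortho")   # frequency domain over the token axis
    cutoff = int(keep * coeffs.shape[0])
    coeffs[cutoff:] *= damp                              # damp, rather than zero, high frequencies
    return idct(coeffs, type=2, axis=0, norm="ortho")    # back to token space

# Example: smooth a ViT-Base token matrix (197 tokens, 768 dims).
smoothed = dct_token_graying(np.random.randn(197, 768))
```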
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
This paper proposes a method that uses reinforcement learning (GRPO) to optimize large language models' self-reflection ability, achieving significant gains on function-calling and math-equation tasks (9.0% and 16.0% on average, respectively), and shows that small models, once trained, can outperform larger untrained models.
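A minimal sketch of one training episode as summarized above: reward is assigned to the self-reflection only when the second attempt succeeds, and the GRPO update over groups of such episodes is omitted. `generate` and `is_correct` are hypothetical stand-ins for the model and the task verifier.

```python
def reflect_retry_episode(generate, is_correct, prompt, gold):
    """One episode: answer, reflect on failure, retry; credit the reflection tokens
    only if the retry succeeds. `generate` and `is_correct` are user-supplied."""
    first = generate(prompt)
    if is_correct(first, gold):
        return None                                        # solved on the first try: no update
    reflection = generate(prompt + first + "\nReflect on what went wrong:")
    second = generate(prompt + reflection)                  # retry conditioned on the reflection
    reward = 1.0 if is_correct(second, gold) else 0.0
    return reflection, reward                               # reward attached to the reflection
```

Batches of such (reflection, reward) pairs would then feed a GRPO-style policy update, with advantages computed relative to the group of sampled reflections for the same prompt.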
-
Foundation Models For Seismic Data Processing: An Extensive Review
This paper presents an extensive review of natural-image foundation models for seismic data processing, showing that hierarchical models such as Swin and ConvNeXt, especially with self-supervised pre-training, outperform non-hierarchical ones on demultiple, interpolation, and denoising tasks, and highlighting the benefits and limitations of natural-image pre-training for seismic applications.