Tag: Self-Supervised Learning
All the articles with the tag "Self-Supervised Learning".
-
This paper proposes Self-Reasoning Language Models (SRLM), which use a small amount of reasoning-catalyst data to bootstrap the model into generating longer reasoning chains and then iteratively self-train on them, yielding an average gain of +2.5 percentage points across multiple reasoning benchmarks and showing the potential of exploring deeper and more creative reasoning paths.
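A minimal sketch of the iterative self-training loop this summary describes, under the assumption that the "catalyst" demonstrations are simply prepended to each prompt; `generate_chain`, `verify`, and `fine_tune` are hypothetical callables standing in for the model, not the paper's actual API.

```python
def srlm_self_training(generate_chain, verify, fine_tune, catalyst, problems, rounds=3):
    """Iterative self-training: sample reasoning chains seeded with catalyst demos,
    keep only verified ones, fine-tune on them, and repeat.
    generate_chain(catalyst, question) -> (chain, answer); verify(answer, gold) -> bool;
    fine_tune(examples) updates the underlying model. All three are user-supplied."""
    for _ in range(rounds):
        new_data = []
        for question, gold in problems:
            chain, answer = generate_chain(catalyst, question)  # catalyst demos encourage longer chains
            if verify(answer, gold):                            # keep only correct traces
                new_data.append((question, chain, answer))
        fine_tune(new_data)                                     # self-train on the model's own chains
```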
-
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs
Through systematic experiments, this paper shows that pure reinforcement learning (RL) training not only improves large language models' complex reasoning but also implicitly induces process reward model (PRM) capability; it proposes the Self-PRM framework to further improve performance, while also revealing its low precision on high-difficulty problems.
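A rough sketch of the reranking idea behind Self-PRM as summarized above: the same RL-trained policy scores the intermediate steps of its own sampled solutions, and the best-scoring candidate is kept. The callables `generate_solutions` and `score_step` are placeholders for the model, not the paper's actual interface.

```python
def self_prm_rerank(generate_solutions, score_step, question, n_candidates=8):
    """Rerank sampled solutions by the model's own step-level (process) scores.
    generate_solutions(question, n) -> list of solutions, each a list of step strings;
    score_step(question, steps_so_far) -> float. Both are user-supplied placeholders."""
    best, best_score = None, float("-inf")
    for steps in generate_solutions(question, n_candidates):
        # Score every prefix of the solution, then aggregate into one process reward.
        prefix_scores = [score_step(question, steps[:i + 1]) for i in range(len(steps))]
        score = sum(prefix_scores) / max(len(prefix_scores), 1)
        if score > best_score:
            best, best_score = steps, score
    return best
```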
-
Always Skip Attention
This paper theoretically demonstrates that Self-Attention Blocks in Vision Transformers are ill-conditioned without skip connections, highlights the regularizing role of skip connections, and proposes Token Graying (SVD- and DCT-based) to improve the conditioning of input tokens, achieving modest performance gains in supervised and self-supervised tasks.
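The DCT variant lends itself to a short sketch. The snippet below illustrates the general idea rather than the paper's exact algorithm: damp the high-frequency DCT components of the input token matrix before the first self-attention block to improve its conditioning (the `keep` and `damp` parameters are illustrative).

```python
import numpy as np
from scipy.fft import dct, idct

def dct_token_graying(tokens: np.ndarray, keep: float = 0.9, damp: float = 0.1) -> np.ndarray:
    """tokens: (num_tokens, dim) matrix fed to the first attention block.
    Attenuates the highest-frequency DCT components along the token axis."""
    coeffs = dct(tokens, type=2, axis=0, norm="ortho")   # frequency domain over the token axis
    cutoff = int(keep * coeffs.shape[0])
    coeffs[cutoff:] *= damp                              # damp, rather than zero, high frequencies
    return idct(coeffs, type=2, axis=0, norm="ortho")    # back to token space

# Example: smooth a ViT-Base token matrix (197 tokens, 768 dims).
smoothed = dct_token_graying(np.random.randn(197, 768))
```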
-
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
This paper proposes a method that uses reinforcement learning (GRPO) to optimize large language models' self-reflection ability, achieving significant gains on function-calling and math-equation tasks (9.0% and 16.0% on average, respectively), and shows that small models, once trained, can outperform larger untrained models.
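A minimal sketch of one training episode as summarized above: reward is assigned to the self-reflection only when the second attempt succeeds, and the GRPO update over groups of such episodes is omitted. `generate` and `is_correct` are hypothetical stand-ins for the model and the task verifier.

```python
def reflect_retry_episode(generate, is_correct, prompt, gold):
    """One episode: answer, reflect on failure, retry; credit the reflection tokens
    only if the retry succeeds. `generate` and `is_correct` are user-supplied."""
    first = generate(prompt)
    if is_correct(first, gold):
        return None                                        # solved on the first try: no update
    reflection = generate(prompt + first + "\nReflect on what went wrong:")
    second = generate(prompt + reflection)                  # retry conditioned on the reflection
    reward = 1.0 if is_correct(second, gold) else 0.0
    return reflection, reward                               # reward attached to the reflection
```

Batches of such (reflection, reward) pairs would then feed a GRPO-style policy update, with advantages computed relative to the group of sampled reflections for the same prompt.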
-
Foundation Models For Seismic Data Processing: An Extensive Review
This paper presents an extensive review of natural-image foundation models for seismic data processing, showing that hierarchical models such as Swin and ConvNeXt, especially with self-supervised pre-training, outperform non-hierarchical ones on demultiple, interpolation, and denoising tasks, and highlighting the benefits and limitations of natural-image pre-training for seismic applications.