Posts
All the articles I've posted.
-
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
This paper presents a systematic survey of reinforcement-learning-based reasoning methods in multimodal large language models (MLLMs), analyzing algorithm design, reward mechanisms, and applications; it identifies challenges such as cross-modal reasoning and reward sparsity, and proposes future directions including hierarchical rewards and interactive RL.
-
Layered Unlearning for Adversarial Relearning
This paper proposes Layered Unlearning (LU), which unlearns subsets of the data in successive stages and induces distinct suppression mechanisms, strengthening large language models' robustness to adversarial relearning, though the method remains vulnerable to corpus-based attacks.
-
MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores
This paper proposes MOOSComp, which mitigates over-smoothing by adding an inter-class cosine similarity loss during training and integrates outlier scores during compression to retain critical tokens, markedly improving task-agnostic long-context compression performance and generalization.
-
Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation
This paper proposes a quality-guided multi-agent framework that distills high-quality supervision signals from a small amount of labeled data via prompt induction, retrieval-augmented synthesis, and reward filtering, improving LLM performance on low-resource structured reasoning tasks.
-
Latte: Transfering LLMs' Latent-level Knowledge for Few-shot Tabular Learning
This paper introduces Latte, a framework that transfers latent-level knowledge from large language models during training to enhance few-shot tabular learning; by leveraging unlabeled data and mitigating overfitting, it outperforms baselines across diverse classification and regression tasks.