Tag: Robustness

All the articles with the tag "Robustness".

Do LLMs Memorize Recommendation Datasets? A Preliminary Study on MovieLens-1M

Published: 19 May, 2025 at 11:16 AM

83.59 🤔

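This paper uses a prompt-based approach to conduct a preliminary study of how much large language models (LLMs) have memorized the MovieLens-1M recommendation dataset. It finds that all tested models exhibit some memorization, that the degree of memorization correlates positively with recommendation performance and model size, and that the models show a popularity bias.

Layered Unlearning for Adversarial Relearning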

Published: 19 May, 2025 at 11:17 AM

77.78 🤔

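This paper proposes Layered Unlearning (LU), which unlearns subsets of the data progressively over multiple stages and induces distinct inhibitory mechanisms, making large language models more robust to adversarial relearning, although the method remains vulnerable to corpus-based attacks.

MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores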

Published: 4 May, 2025 at 04:29 PM

77.68 🤔

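This paper proposes MOOSComp, which mitigates over-smoothing by adding an inter-class cosine similarity loss during training and retains key tokens by integrating outlier scores during compression, significantly improving task-agnostic long-context compression performance and generalization.

Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes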

Published: 12 May, 2025 at 11:14 AM

76.90 🤔

This paper introduces Latent Preference Coding (LPC), a framework that uses discrete latent codes to model multifaceted human preferences, consistently improving the performance of offline alignment algorithms like DPO, SimPO, and IPO across multiple LLMs and benchmarks.

The Illusion of Role Separation: Hidden Shortcuts in LLM Role Learning (and How to Fix Them)

Published: 4 May, 2025 at 04:33 PM

75.58 🤔

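This paper proposes PFT, a method based on manipulating position IDs, to expose and resolve the problem of LLMs relying on shortcuts when learning role separation, improving model robustness and safety while maintaining performance.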
