Tag: Large Language Model
All the articles with the tag "Large Language Model".
-
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
This paper presents a systematic survey of reinforcement-learning-based reasoning methods in multimodal large language models (MLLMs), analyzing algorithm design, reward mechanisms, and applications, identifying challenges such as cross-modal reasoning and reward sparsity, and outlining future directions including hierarchical rewards and interactive RL.
-
Layered Unlearning for Adversarial Relearning
This paper proposes Layered Unlearning (LU), a method that unlearns data subsets progressively across multiple stages and induces distinct inhibition mechanisms, strengthening the robustness of large language models against adversarial relearning, although it remains vulnerable to corpus-based attacks.
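A minimal sketch of the staged-unlearning schedule described above, under loose assumptions: the toy linear model, random data, fold partitioning, and the `unlearn_stage` helper are illustrative, not the paper's implementation.

```python
# Illustrative sketch of a layered (staged) unlearning schedule; losses,
# model, and data are toy stand-ins, not the paper's actual setup.
import torch
import torch.nn as nn

def unlearn_stage(model, forget_batches, retain_batches, lr=1e-2, steps=50):
    """One stage: push loss up on the folds being forgotten while
    keeping loss low on the retained data."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        for (xf, yf), (xr, yr) in zip(forget_batches, retain_batches):
            opt.zero_grad()
            forget_loss = loss_fn(model(xf), yf)   # should increase
            retain_loss = loss_fn(model(xr), yr)   # should stay low
            (-forget_loss + retain_loss).backward()
            opt.step()

# Toy setup: four "forget" folds plus a retain set.
torch.manual_seed(0)
model = nn.Linear(16, 4)
folds = [[(torch.randn(32, 16), torch.randint(0, 4, (32,)))] for _ in range(4)]
retain = [(torch.randn(32, 16), torch.randint(0, 4, (32,)))]

# Layered schedule: stage i unlearns folds 0..i while retaining the rest,
# so each fold ends up suppressed by a different mix of stages.
for i in range(len(folds)):
    forget_so_far = [b for fold in folds[: i + 1] for b in fold]
    unlearn_stage(model, forget_so_far, retain * len(forget_so_far))
```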
-
Latte: Transferring LLMs' Latent-level Knowledge for Few-shot Tabular Learning
The paper introduces 'Latte', a framework that transfers latent-level knowledge from Large Language Models during training to enhance few-shot tabular learning, outperforming baselines by leveraging unlabeled data and mitigating overfitting across diverse classification and regression tasks.
-
Large Language Model Compression with Global Rank and Sparsity Optimization
This paper introduces a two-stage LLM compression method that uses Robust PCA (RPCA) to decompose weights into low-rank and sparse components, followed by probabilistic pruning via policy gradient; it outperforms state-of-the-art techniques at a 50% compression ratio while automatically adapting to layer-wise redundancy without manual thresholds or extensive fine-tuning.
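As a rough illustration of the low-rank-plus-sparse idea behind the first stage: the alternating SVD/thresholding routine below is a simplified, GoDec-style heuristic, not the paper's RPCA solver, and the policy-gradient pruning stage is not shown.

```python
# Simplified sketch: decompose a weight matrix W ~ L + S with L low-rank and
# S sparse, via naive alternating projections (heuristic stand-in for RPCA).
import numpy as np

def lowrank_plus_sparse(W, rank=8, sparsity=0.05, iters=20):
    L = np.zeros_like(W)
    S = np.zeros_like(W)
    for _ in range(iters):
        # Low-rank step: best rank-r approximation of the residual W - S.
        U, sv, Vt = np.linalg.svd(W - S, full_matrices=False)
        L = (U[:, :rank] * sv[:rank]) @ Vt[:rank]
        # Sparse step: keep only the largest-magnitude entries of W - L.
        R = W - L
        k = int(sparsity * R.size)
        thresh = np.partition(np.abs(R).ravel(), -k)[-k]
        S = np.where(np.abs(R) >= thresh, R, 0.0)
    return L, S

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512)).astype(np.float32)
L, S = lowrank_plus_sparse(W)
# A compressed layer would store L's factors plus S's nonzeros instead of W.
print(np.linalg.norm(W - (L + S)) / np.linalg.norm(W))
```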
-
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes
This paper introduces Latent Preference Coding (LPC), a framework that uses discrete latent codes to model multifaceted human preferences, consistently improving the performance of offline alignment algorithms like DPO, SimPO, and IPO across multiple LLMs and benchmarks.
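A hedged sketch of the "discrete latent codes for preferences" idea: a learned codebook plus a Gumbel-softmax selection over codes, whose embedding could then condition a DPO-style objective. The class name, dimensions, and overall wiring are illustrative assumptions, not the paper's architecture.

```python
# Sketch of a discrete preference codebook (assumed wiring, not LPC's exact design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentPreferenceCodebook(nn.Module):
    def __init__(self, num_codes=16, prompt_dim=64, code_dim=32):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)   # discrete preference codes
        self.prior = nn.Linear(prompt_dim, num_codes)       # prompt -> code logits

    def forward(self, prompt_repr, tau=1.0):
        logits = self.prior(prompt_repr)
        # Differentiable (soft) selection over the discrete codes.
        weights = F.gumbel_softmax(logits, tau=tau, hard=False)
        z = weights @ self.codebook.weight                   # mixture of code embeddings
        return z, weights

# Toy usage: z would condition the model that scores chosen vs. rejected
# responses inside an offline alignment loss such as DPO.
model = LatentPreferenceCodebook()
prompt_repr = torch.randn(4, 64)
z, weights = model(prompt_repr)
print(z.shape, weights.shape)  # torch.Size([4, 32]) torch.Size([4, 16])
```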