Posts
All the articles I've posted.
-
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning
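This paper proposes the EMORL framework, which uses ensemble learning to train single-objective models separately and aggregates them at the hidden-state level, with a hierarchical grid search to optimize the aggregation weights. On a counselor reflection generation task, it achieves performance comparable to conventional methods while significantly improving training efficiency, scalability, and interpretability.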
-
Sparse-Group Boosting with Balanced Selection Frequencies: A Simulation-Based Approach and R Implementation
This paper introduces sparse-group boosting and a simulation-based group balancing algorithm within the 'sgboost' R package to mitigate variable selection bias in high-dimensional grouped data, demonstrating improved fairness and interpretability through simulations and ecological data analysis.
-
Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks
ASTRA introduces an efficient defense for Vision Language Models by adaptively steering activations away from adversarial directions using image attribution, achieving state-of-the-art performance in mitigating jailbreak attacks with minimal impact on benign utility and high inference efficiency.
-
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
RADLADS introduces a cost-effective three-step distillation protocol to convert softmax attention transformers into linear attention models using only 350-700M tokens, achieving near-teacher performance on benchmarks and setting a new state-of-the-art for pure RNNs with models up to 72B parameters.
-
Communicating Activations Between Language Model Agents
This paper introduces Activation Communication (AC), a method for inter-LLM communication that exchanges intermediate activations instead of natural language, achieving up to 27% performance improvement over natural language communication with significantly reduced compute across coordination games and reasoning benchmarks.