Posts
All the articles I've posted.
-
Budget-Adaptive Adapter Tuning in Orthogonal Subspaces for Continual Learning in LLMs
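This paper proposes OA-Adapter, a novel parameter-efficient method for continual learning in large language models that combines dynamic budget allocation with orthogonal subspace learning in a single end-to-end training stage, achieving higher accuracy on standard benchmarks while using 58.5% fewer parameters.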
-
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs
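This paper proposes the Meta-RFFT framework, which significantly improves the length generalization of large language models on unseen tasks through multi-task rule-following pre-training and minimal downstream adaptation. The 32B model reaches 98% accuracy on addition tasks of length 30, surpassing existing long-chain reasoning models.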
-
Enhancing Safety Standards in Automated Systems Using Dynamic Bayesian Networks
This paper proposes a Dynamic Bayesian Network framework for autonomous vehicles that enhances safety in cut-in maneuvers by integrating lateral evidence and probabilistic safety assessments, achieving superior crash avoidance in high-speed scenarios (9.22% crash rate) compared to baseline models in the JRC-FSM simulator.
-
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
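This paper proposes REARANK, a listwise reranking agent trained with reinforcement learning. Through explicit reasoning and data augmentation, it uses only 179 annotated queries to significantly outperform baselines on multiple information retrieval benchmarks and to match or even surpass GPT-4, with especially strong results on reasoning-intensive tasks.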
-
LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications
LiteWebAgent is an open-source suite for VLM-based web agents that addresses the lack of production-ready solutions by offering an extensible framework with decoupled action generation and grounding, advanced planning, memory, tree search, and practical deployments via Vercel and a Chrome extension.