Posts

All the articles I've posted.

Budget-Adaptive Adapter Tuning in Orthogonal Subspaces for Continual Learning in LLMs

Published: 4 Jun, 2025 at 11:26 AM

90.65 🤔

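This paper proposes OA-Adapter, a new parameter-efficient method for continual learning in large language models that combines dynamic budget allocation with orthogonal subspace learning in a single-stage, end-to-end training process, achieving higher accuracy on standard benchmarks while using 58.5% fewer parameters.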
Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs

Published: 23 May, 2025 at 11:14 AM

90.65 🤔

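This paper proposes the Meta-RFFT framework, which uses multi-task rule-following pre-training and minimal downstream adaptation to significantly improve the length generalization of large language models on unseen tasks; a 32B model reaches 98% accuracy on addition tasks of length 30, surpassing existing long-chain reasoning models.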
Enhancing Safety Standards in Automated Systems Using Dynamic Bayesian Networks

Published: 8 May, 2025 at 11:06 AM

90.63 🤔

This paper proposes a Dynamic Bayesian Network framework for autonomous vehicles that enhances safety in cut-in maneuvers by integrating lateral evidence and probabilistic safety assessments, achieving superior crash avoidance in high-speed scenarios (9.22% crash rate) compared to baseline models in the JRC-FSM simulator.
REARANK: Reasoning Re-ranking Agent via Reinforcement Learning

Published: 30 May, 2025 at 11:19 AM

90.58 🤔

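This paper proposes REARANK, a listwise reranking agent based on reinforcement learning that uses explicit reasoning and data augmentation to significantly outperform baselines on multiple information retrieval benchmarks with only 179 annotated queries, matching or even surpassing GPT-4, with especially strong results on reasoning-intensive tasks.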
LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications

Published: 14 May, 2025 at 11:12 AM

90.54 🤔

LiteWebAgent is an open-source suite for VLM-based web agents that bridges the gap in production-ready solutions by offering an extensible framework with decoupled action generation and grounding, advanced planning, memory, tree search, and practical deployments via Vercel and Chrome extension.
