Tag: Agent

All the articles with the tag "Agent".

Structured Agent Distillation for Large Language Model

Published: 28 May, 2025 at 11:23 AM

85.73 🤔

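This paper proposes a Structured Agent Distillation framework that segments the trajectories of large language model agents into reasoning and action spans and applies segment-specific supervision. When compressing models, it significantly improves task success rate, reasoning efficiency, and consistency, outperforming token-level baselines.
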
LiteWebAgent: The Open-Source Suite for VLM-Based Web-Agent Applications

Published: 14 May, 2025 at 11:12 AM

90.54 🤔

LiteWebAgent is an open-source suite for VLM-based web agents that addresses the lack of production-ready solutions by offering an extensible framework with decoupled action generation and grounding, advanced planning, memory, and tree search, along with practical deployments via Vercel and a Chrome extension.

Agent RL Scaling Law: Agent RL with Spontaneous Code Execution for Mathematical Problem Solving

Published: 22 May, 2025 at 11:12 AM

89.47 🤔

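This paper uses the ZeroTIR framework to train base large language models with reinforcement learning so that they spontaneously execute Python code to solve math problems. It reveals a positive correlation between the number of training steps and code usage frequency, response length, and task accuracy (the Agent RL Scaling Law), and significantly outperforms tool-free baselines on math benchmarks.
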
MELON: Provable Indirect Prompt Injection Defense via Masked Re-execution and Tool Comparison

Published: 8 May, 2025 at 10:22 AM

89.40 🤔

MELON introduces a novel training-free defense against indirect prompt injection attacks on LLM agents. It uses masked re-execution to detect tool calls that are independent of the user input, achieving better attack prevention (0.24% ASR on GPT-4o) and utility preservation (58.78% UA on GPT-4o) than existing methods.

Pre-Act: Multi-Step Planning and Reasoning Improves Acting in LLM Agents

Published: 25 May, 2025 at 11:24 AM

87.43 🤔

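This paper proposes Pre-Act, a method that improves LLM agent performance through multi-step planning and detailed reasoning. By fine-tuning smaller models (such as Llama 3.1 70B), it achieves 69.5% higher action accuracy and a 28% higher goal completion rate than GPT-4 on the Almita dataset.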
