Tag: Human-AI Interaction

All the articles with the tag "Human-AI Interaction".

Adversarial Attacks on LLM-as-a-Judge Systems: Insights from Prompt Injections

Published: 4 May, 2025 at 04:30 PM

72.07 🤔

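This paper proposes an attack framework and evaluates it experimentally, exposing the prompt injection vulnerabilities of LLM-as-a-judge systems, and recommends strategies such as multi-model committees to improve robustness.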
SEM: Reinforcement Learning for Search-Efficient Large Language Models

Published: 18 May, 2025 at 11:14 AM

71.64 🤔

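This paper proposes the *SEM* framework, which uses reinforcement learning to optimize the search behavior of large language models, reducing redundant searches while improving answer accuracy and significantly increasing reasoning efficiency.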
Large Language Models Think Too Fast To Explore Effectively

Published: 18 May, 2025 at 11:17 AM

71.14 🤔

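This paper evaluates the exploration ability of large language models (LLMs) using the game Little Alchemy 2. It finds that most LLMs underperform humans because they decide prematurely and rely too heavily on uncertainty-driven strategies, whereas o1 and DeepSeek-R1 significantly outperform humans by balancing empowerment with in-depth reasoning, which highlights the importance of reasoning depth and architectural design for open-ended exploration.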
MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism

Published: 4 May, 2025 at 04:30 PM

70.65 🤔

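This paper presents the MegaScale-Infer system, which optimizes inference efficiency for large-scale MoE models through a parallelism strategy that disaggregates attention modules from FFN modules, together with an efficient M2N communication library, achieving up to 1.90x higher throughput.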
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks

Published: 4 May, 2025 at 04:26 PM

70.33 🤔

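This paper proposes the PLAN-AND-ACT framework, which improves the performance of LLM agents on complex long-horizon tasks by separating the planning and execution modules, training on synthetic data, and replanning dynamically, and achieves state-of-the-art results on web navigation benchmarks.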
