Tag: Adaptive Systems

All articles tagged "Adaptive Systems".

Adaptive Layer-skipping in Pre-trained LLMs

Published: 4 May, 2025 at 04:28 PM

62.55 🤔

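This paper proposes FlexiDepth, which uses plug-in routers and adapters to enable adaptive layer-skipping in pre-trained LLMs, improving computational efficiency while preserving generation performance. Its experiments also reveal how token type affects computational demand.
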
Streaming, Fast and Slow: Cognitive Load-Aware Streaming for Efficient LLM Serving

Published: 4 May, 2025 at 04:30 PM

60.43 🤔

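This paper proposes a cognitive load-aware adaptive streaming framework for optimizing LLM serving. By dynamically adjusting output speed, it reduces computational resource consumption by up to 16.8% while maintaining user satisfaction.
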
EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

Published: 4 May, 2025 at 04:27 PM

60.29 🤔

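This paper proposes EPO, which uses reinforcement learning to optimize a dedicated strategic reasoning model that assists arbitrary LLM agents in achieving long-term goal alignment in dynamic environments, improving their strategic reasoning ability.
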
MARFT: Multi-Agent Reinforcement Fine-Tuning

Published: 4 May, 2025 at 04:28 PM

56.39 🤔

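This paper proposes the MARFT framework, which uses sequential decision-making and trust-region optimization to achieve efficient reinforcement fine-tuning in LLM-based multi-agent systems, improving agent collaboration and addressing the applicability problems of traditional MARL.
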
Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs

Published: 6 May, 2025 at 01:18 AM

89.54 😐

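Through an empirical study, this paper finds that large language models tend to "overthink" easy problems and "underthink" hard ones in reasoning tasks, that reasoning length has a non-monotonic relationship with correctness, and that simply preferring shorter responses can significantly reduce generation length while maintaining accuracy.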
