Tag: Reasoning

All the articles with the tag "Reasoning".

SEM: Reinforcement Learning for Search-Efficient Large Language Models

Published: 18 May, 2025 at 11:14 AM

71.64 🤔

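This paper proposes the *SEM* framework, which uses reinforcement learning to optimize the search behavior of large language models, reducing redundant searches while improving answer accuracy and significantly increasing reasoning efficiency.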
Large Language Models Think Too Fast To Explore Effectively

Published: 18 May, 2025 at 11:17 AM

71.14 🤔

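This paper uses the game *Little Alchemy 2* to evaluate the exploration abilities of large language models (LLMs). It finds that most LLMs underperform humans because they commit to decisions too early and rely too heavily on uncertainty-driven strategies, whereas o1 and DeepSeek-R1 significantly outperform humans by balancing empowerment with deeper reasoning, which highlights the importance of reasoning depth and architectural design for open-ended exploration.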
CCSK:Cognitive Convection of Self-Knowledge Based Retrieval Augmentation for Large Language Models

Published: 7 May, 2025 at 08:43 AM

70.69 🤔

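This paper proposes the CCSK framework, which uses a Siamese Network and a Response Quality Model to dynamically combine query similarity and response quality, optimizing the information retrieval decisions of large language models and significantly improving F1 scores and accuracy on multiple question-answering datasets.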
Reward-SQL: Boosting Text-to-SQL via Stepwise Reasoning and Process-Supervised Rewards

Published: 12 May, 2025 at 11:15 AM

70.53 🤔

REWARD-SQL introduces a framework for Text-to-SQL by decomposing queries into Chain-of-CTEs and using Process Reward Models (PRMs) with GRPO and Best-of-N sampling, achieving a state-of-the-art 68.9% execution accuracy on the BIRD dataset with a 7B model.
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks

Published: 4 May, 2025 at 04:26 PM

70.33 🤔

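This paper proposes the PLAN-AND-ACT framework, which improves the performance of LLM agents on complex long-horizon tasks by separating the planning and execution modules, training on synthetic data, and replanning dynamically, achieving state-of-the-art results on web navigation benchmarks.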
