Tag: Large Language Model

All the articles with the tag "Large Language Model".

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Published: 4 May, 2025 at 04:26 PM

76.52 🤔

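This paper presents the first systematic survey of progress in efficient reasoning for large language models, categorizing approaches as model-based, output-based, and prompt-based, and examining strategies that reduce the "overthinking" phenomenon to improve computational efficiency while preserving reasoning capability.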
Reward Guidance for Reinforcement Learning Tasks Based on Large Language Models: The LMGT Framework

Published: 5 May, 2025 at 11:16 PM

76.51 🤔

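This paper proposes the LMGT framework, which uses the prior knowledge of large language models to dynamically adjust rewards in reinforcement learning, effectively balancing exploration and exploitation, significantly improving sample efficiency, and lowering training cost; its effectiveness is validated across a range of environments and algorithms as well as complex scenarios such as robotics and recommender systems.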
Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning

Published: 13 May, 2025 at 11:12 AM

76.49 🤔

ARTIST, a novel framework unifying agentic reasoning, reinforcement learning, and tool integration, enables LLMs to autonomously orchestrate external tools within multi-turn reasoning, achieving up to 22% accuracy gains on complex math tasks and significant improvements in multi-turn function calling over baselines.
Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think

Published: 7 May, 2025 at 12:12 AM

76.38 🤔

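This paper proposes a method that segments a large language model's reasoning trace into subthoughts, generates multiple reasoning paths from the intermediate states, and aggregates the resulting answers by taking the mode, significantly improving accuracy on mathematical reasoning tasks (by up to 13%) and revealing a correlation between answer consistency and correctness.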
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress?

Published: 12 May, 2025 at 11:20 AM

76.16 🤔

This paper introduces a framework to classify algorithmic innovations in LLMs as compute-dependent or compute-independent, demonstrating through small-scale GPT-2 experiments that compute-independent advancements like FlashAttention can yield up to 3.5× compute-equivalent gains even under hardware constraints, challenging the efficacy of hardware-focused AI regulation.
