Tag: Exploration Exploitation

All the articles with the tag "Exploration Exploitation".

Reward Guidance for Reinforcement Learning Tasks Based on Large Language Models: The LMGT Framework

Published: 5 May, 2025 at 11:16 PM

76.51 🤔

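This paper proposes the LMGT framework, which uses the prior knowledge of large language models to dynamically adjust rewards in reinforcement learning. The approach effectively balances exploration and exploitation, significantly improves sample efficiency, and reduces training cost. Its effectiveness is validated across multiple environments and algorithms, as well as in complex scenarios such as robotics and recommender systems.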