Tag: Human-AI Interaction
All the articles with the tag "Human-AI Interaction".
-
SmallPlan: Leverage Small Language Models for Sequential Path Planning with Simulation-Powered, LLM-Guided Distillation
This paper proposes the SmallPlan framework, which combines LLM-guided distillation with SFT and RL driven by simulation-environment feedback to train lightweight small language models (SLMs) for efficient high-level robot path planning, achieving near-LLM performance on resource-constrained edge devices.
-
The Promise and Limits of LLMs in Constructing Proofs and Hints for Logic Problems in Intelligent Tutoring Systems
This paper evaluates LLMs for intelligent tutoring systems in propositional logic, demonstrating DeepSeek-V3's promising accuracy in proof construction (up to 86.7%) and hint generation (75%) while revealing significant pedagogical limitations in justification and subgoaling, which call for hybrid approaches to educational integration.
-
MAC-Tuning: LLM Multi-Compositional Problem Reasoning with Enhanced Knowledge Boundary Awareness
This paper proposes MAC-Tuning, which separates answer prediction from confidence estimation through step-wise fine-tuning, improving LLMs' awareness of their knowledge boundaries in multi-problem settings and significantly reducing hallucinations while improving performance.
-
Evidence of conceptual mastery in the application of rules by Large Language Models
Through psychological experiments, this paper shows that large language models exhibit conceptual mastery when applying rules, generalizing to novel scenarios and partially mirroring human sensitivity to contextual factors such as time pressure.
-
How do Humans and Language Models Reason About Creativity? A Comparative Analysis
This paper conducts a comparative analysis of creativity evaluation in STEM, revealing that human experts and LLMs prioritize different facets of originality (cleverness vs. remoteness/uncommonness) and are differentially influenced by contextual examples, with LLMs showing higher predictive accuracy but poorer construct validity due to homogenized facet correlations.