Tag: Human-AI Interaction

All the articles with the tag "Human-AI Interaction".

A Large-Scale Empirical Analysis of Custom GPTs' Vulnerabilities in the OpenAI Ecosystem

Published: 16 May, 2025 at 11:13 AM

94.41 🤔

This paper conducts a large-scale empirical analysis of 14,904 custom GPTs in the OpenAI store, revealing that over 95% lack adequate security against attacks like roleplay (96.51%) and phishing (91.22%), introduces a multi-metric popularity ranking system, and highlights the need for enhanced security in both custom and base models.

Distilling LLM Agent into Small Models with Retrieval and Code Tools

Published: 28 May, 2025 at 11:25 AM

93.11 🤔

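This paper proposes the Agent Distillation framework, which distills the interactive behavior of LLM agents into sLMs and combines it with the first-thought prefix and self-consistent action generation methods, giving small models significant performance gains on factual and mathematical reasoning tasks, approaching or even surpassing larger CoT-distilled models.

Multilingual Performance of a Multimodal Artificial Intelligence System on Multisubject Physics Concept Inventories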

Published: 16 May, 2025 at 11:10 AM

92.78 🤔

This exploratory study evaluates GPT-4o's multilingual and multimodal performance on physics concept inventories, revealing strong results in English and text-based tasks but significant weaknesses in visual interpretation and non-Western languages, highlighting implications for equitable AI integration in education.

Thinking Short and Right Over Thinking Long: Serving LLM Reasoning Efficiently and Accurately

Published: 23 May, 2025 at 11:10 AM

90.81 🤔

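This paper proposes the SART framework, which uses redundant sampling with early stopping and a two-phase dynamic pruning method to significantly improve the efficiency of large language model reasoning serving (by up to 28.2x) while maintaining accuracy close to the baseline.

AdaptThink: Reasoning Models Can Learn When to Think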

Published: 24 May, 2025 at 11:11 AM

90.77 🤔

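This paper proposes *AdaptThink*, a reinforcement-learning-based algorithm that adaptively selects between *Thinking* and *NoThinking* modes, significantly reducing the response length of reasoning models (by 40-53% on average) while improving accuracy (by 2.3-2.4% on average), showing a good balance between efficiency and performance on math tasks.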
