Tag: Large Language Model
All the articles with the tag "Large Language Model".
-
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
This paper introduces Gaussian Concept Subspace (GCS), a framework to model concept representations in LLMs as Gaussian distributions, demonstrating improved robustness, faithfulness, and plausibility over single vector methods, with effective application in emotion steering tasks.
-
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game
This paper introduces MM-Escape, a benchmark using the customizable 3D environment EscapeCraft to evaluate multimodal reasoning in MLLMs through room escape tasks, revealing that while models like GPT-4o achieve high success in simple scenarios, performance drops significantly with increased difficulty, exposing distinct limitations in reasoning and spatial awareness.
-
CCSK:Cognitive Convection of Self-Knowledge Based Retrieval Augmentation for Large Language Models
本文提出CCSK框架,通过Siamese Network和Response Quality Model动态融合查询相似性和响应质量,优化大型语言模型的信息检索决策,在多个问答数据集上显著提升了F1分数和准确率。
-
SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning
This paper introduces SIMPLEMIX, a simple method to mix on- and off-policy data in language model preference optimization, demonstrating that their complementary strengths—on-policy for reasoning tasks and off-policy for open-ended tasks—lead to a 6.03% average improvement over single-source methods on Alpaca Eval 2.0.
-
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
本文首次系统调查了大型语言模型高效推理的进展,通过分类模型、输出和提示-based方法,探讨了减少"过度思考"现象的策略,以优化计算效率并保持推理能力。