Tag: Embeddings
All the articles with the tag "Embeddings".
-
Let's Predict Sentence by Sentence
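This paper proposes a sentence-level reasoning framework that lifts pretrained language models into an abstract reasoning space by autoregressively predicting continuous sentence embeddings. In continuous inference mode, contextual embeddings perform on par with Chain-of-Thought (CoT) while cutting inference compute cost in half on average.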
-
Pre-training vs. Fine-tuning: A Reproducibility Study on Dense Retrieval Knowledge Acquisition
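Using linear probing and neuron activation analysis, this paper reproduces and extends a study of the roles pre-training and fine-tuning play in knowledge acquisition for dense retrieval models. It finds that pre-trained knowledge dominates retrieval effectiveness in DPR models and that fine-tuning disperses knowledge, but these conclusions do not hold for other architectures (e.g., Contriever, RepLlama) or representation strategies.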
-
An Analysis for Reasoning Bias of Language Models with Small Initialization
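Through theoretical analysis and experimental validation, this paper shows how a small parameter initialization scale, by shaping the embedding space and training dynamics, biases large language models toward reasoning tasks rather than memorization tasks.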
-
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
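This paper proposes the UniME framework, which uses textual discriminative knowledge distillation and hard-negative-enhanced instruction tuning to learn universal multimodal embeddings with multimodal large language models, improving discriminative and compositional capability on downstream tasks.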
-
Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution
This paper introduces Gaussian Concept Subspace (GCS), a framework to model concept representations in LLMs as Gaussian distributions, demonstrating improved robustness, faithfulness, and plausibility over single vector methods, with effective application in emotion steering tasks.