Tag: Instruction Tuning
All the articles with the tag "Instruction Tuning".
-
SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning
This paper introduces SEFE, a method combining Answer Style Diversification (ASD) to mitigate superficial forgetting and RegLoRA to address essential forgetting in Multimodal Continual Instruction Tuning, achieving state-of-the-art performance on the CoIN benchmark.
-
Domain Regeneration: How well do LLMs match syntactic properties of text domains?
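Using an "LLM-regeneration" paradigm, this paper has Llama models generate Wikipedia and news text and finds that the generated text differs systematically from the original on syntactic complexity metrics, with shifted means, reduced variance, and a thinner long tail, revealing limits in the models' ability to match a domain.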
-
ComPO: Preference Alignment via Comparison Oracles
This paper introduces ComPO, a novel preference alignment method for LLMs using comparison oracles to effectively utilize noisy preference pairs, demonstrating reduced verbosity and likelihood displacement across multiple models and benchmarks.
-
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
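This paper proposes a reward-augmented dataset method that relabels preference pairs so that the LLM is conditioned on reward values and learns the full spectrum of response quality, significantly improving Direct Preference Optimization (DPO) performance and mitigating two of its limitations: unlearning high-quality rejected responses and indiscriminately learning low-quality chosen responses.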
-
Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
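This paper proposes the UniME framework, which uses textual discriminative knowledge distillation and hard-negative-enhanced instruction tuning to learn universal multimodal embeddings with multimodal LLMs, improving discriminative and compositional capabilities on downstream tasks.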