Tag: Few-Shot Learning
All the articles with the tag "Few-Shot Learning".
-
Beyond Output Matching: Bidirectional Alignment for Enhanced In-Context Learning
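This paper proposes Bidirectional Alignment (BiAlign), which aligns the student model with the teacher model on both token-level output distributions and input preferences, significantly improving the student's in-context learning ability and outperforming baselines across a range of tasks.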
-
Scalable Model Merging with Progressive Layer-wise Distillation
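This paper proposes ProDistill, an algorithm that efficiently merges large pretrained models through progressive layer-wise teacher-student distillation. It proves theoretically that domain-specific data is necessary, and achieves significant performance gains (6.14%-6.61%) on vision and language tasks with superior memory and compute efficiency.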
-
AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking
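AdaReasoner uses a reinforcement learning framework to adaptively adjust a large language model's reasoning configuration (generation temperature, number of reasoning steps, and instruction format). It significantly outperforms fixed-configuration baselines across diverse tasks, with fast convergence and out-of-distribution robustness.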
-
Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation
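This paper proposes a quality-guided multi-agent framework that distills high-quality supervision signals from a small amount of labeled data through prompt induction, retrieval-augmented synthesis, and reward filtering, improving LLM performance on low-resource structured reasoning tasks.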
-
Latte: Transfering LLMs' Latent-level Knowledge for Few-shot Tabular Learning
The paper introduces 'Latte', a framework that transfers latent-level knowledge from Large Language Models during training to enhance few-shot tabular learning, outperforming baselines by leveraging unlabeled data and mitigating overfitting across diverse classification and regression tasks.