Tag: Fine-tuning

All the articles with the tag "Fine-tuning".

Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging

Published: 31 May, 2025 at 11:35 AM

95.81 🤔

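This paper uses model merging to combine fast-thinking and slow-reasoning capabilities, achieving long-to-short reasoning. On 7B models it compresses response length by up to 55% while preserving performance, offering an efficient remedy for the overthinking problem in large language models.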
Scaling Reasoning without Attention

Published: 4 Jun, 2025 at 11:25 AM

85.88 🤔

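This paper proposes PROMPTCOT-MAMBA, an attention-free language model built on the Mamba-2 state space model. Using two-stage curriculum fine-tuning and the PROMPTCOT synthesis paradigm, it outperforms Transformer models of the same and even larger scale on math and code reasoning tasks, while providing fixed memory usage and efficient inference.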
Navigating the Accuracy-Size Trade-Off with Flexible Model Merging

Published: 4 Jun, 2025 at 11:26 AM

85.58 🤔

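FlexMerge proposes a data-free, flexible model merging framework that greedily merges fine-tuned models block by block and can produce merged models of any size. On the accuracy-size trade-off, it shows substantial accuracy gains at the smallest sizes and the potential to approach the accuracy of the individual fine-tuned models.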
Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs

Published: 4 Jun, 2025 at 11:26 AM

85.53 🤔

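This paper proposes DCoT, a method that generates multiple diverse reasoning chains within a single inference step and refines its own answer across them. It significantly improves large language model performance on complex reasoning tasks, especially those with a large result space.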
Understanding Overadaptation in Supervised Fine-Tuning: The Role of Ensemble Methods

Published: 4 Jun, 2025 at 11:59 AM

85.17 🤔

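Through theoretical and empirical analysis, this paper shows that model ensembling effectively mitigates overadaptation in supervised fine-tuning by balancing the bias-variance trade-off, improving downstream task performance and reducing forgetting of pre-trained knowledge.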
