Tag: Fine-tuning
All articles tagged "Fine-tuning".
-
Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning Eliciting Efficient Reasoning in Large Language Models
This paper introduces Long-Short Chain-of-Thought Mixture Supervised Fine-Tuning (LS-Mixture SFT), which combines long and short CoT datasets to fine-tune non-reasoning LLMs, achieving a 2.3% average accuracy improvement and 47.61% response length reduction on reasoning benchmarks.
-
EMORL: Ensemble Multi-Objective Reinforcement Learning for Efficient and Flexible LLM Fine-Tuning
This paper proposes the EMORL framework, which uses ensemble learning to train single-objective models separately and aggregate them at the hidden-state level, with hierarchical grid search to optimize the aggregation weights; on a counselor reflection generation task it matches the performance of conventional methods while significantly improving training efficiency, scalability, and interpretability.
-
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
RADLADS introduces a cost-effective three-step distillation protocol to convert softmax attention transformers into linear attention models using only 350-700M tokens, achieving near-teacher performance on benchmarks and setting a new state-of-the-art for pure RNNs with models up to 72B parameters.
-
Recall with Reasoning: Chain-of-Thought Distillation for Mamba's Long-Context Memory and Extrapolation
This paper proposes Recall with Reasoning (RwR), a method that enhances Mamba's long-context memory and extrapolation by distilling chain-of-thought summarization from a teacher model, achieving significant performance improvements on the LongMemEval and HELMET benchmarks while preserving short-context capabilities.
-
LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection
LENSLLM introduces a Hessian-based PAC-Bayes framework and NTK-based scaling model for LLM selection, achieving up to 91.1% accuracy and 88.5% computational cost reduction by modeling fine-tuning dynamics across diverse tasks.