Posts
All the articles I've posted.
-
Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL
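This paper combines supervised fine-tuning (SFT), reinforcement learning (RL), and fine-grained reward functions (such as QATCH) to significantly improve the reasoning ability and performance of small LLMs on the Text2SQL task. The Think2SQL-7B model outperforms models with 400B+ parameters on the BIRD dataset.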
-
Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
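This paper proposes LoRA-SB, which improves low-rank fine-tuning with an initialization strategy based on approximating the first-step gradient of full fine-tuning. With 27-90x fewer parameters, it significantly outperforms LoRA-XS and approaches the performance of full fine-tuning.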
-
Foundation Models For Seismic Data Processing: An Extensive Review
This paper extensively reviews natural-image foundation models for seismic data processing. It shows that hierarchical models such as Swin and ConvNeXt, especially with self-supervised pre-training, outperform non-hierarchical ones on demultiple, interpolation, and denoising tasks, and it highlights the benefits and limitations of natural-image pre-training for seismic applications.
-
FlashThink: An Early Exit Method For Efficient Reasoning
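FlashThink uses a verification model to decide dynamically whether the reasoning process can end early. It significantly shortens the reasoning content (an average efficiency gain of about 77%) while preserving the accuracy of large language models, and FT² fine-tuning further improves performance.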
-
Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures
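This paper proposes CoLM, which builds mini-batch coresets that match the gradient of large batches. With 2x lower memory requirements, LLM fine-tuning with CoLM outperforms regular training with a 4x larger batch size, while also converging faster.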