Posts
All the articles I've posted.
-
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
EfficientQAT proposes an efficient quantization-aware training framework that combines block-wise full-parameter training (Block-AP) with end-to-end training of quantization parameters (E2E-QP), significantly improving the quantization performance of large language models in low-bit settings while greatly reducing training resource requirements.
-
Dynamic Fisher-weighted Model Merging via Bayesian Optimization
This paper proposes Dynamic Fisher-weighted Merging (DF-Merge), which uses Bayesian optimization to dynamically tune the scaling coefficients of fine-tuned models and then merges the scaled models with Fisher-information weighting, efficiently producing multi-task models that significantly outperform existing baselines.
-
Graceful Forgetting in Generative Language Models
This paper proposes the Learning With Forgetting (LWF) framework, which achieves graceful forgetting during the fine-tuning of generative language models through self-generated knowledge, forgetting-confidence computation weighted by the Fisher information matrix, and a periodic forgetting strategy; experiments show it significantly improves performance on most domain-specific question-answering tasks.
-
DeepSeek vs. o3-mini: How Well can Reasoning LLMs Evaluate MT and Summarization?
This paper presents the first systematic comparison of reasoning and non-reasoning large language models on natural language generation evaluation, finding that the benefit of reasoning is highly architecture-dependent: OpenAI o3-mini significantly outperforms non-reasoning models on machine translation evaluation, whereas DeepSeek-R1 stands out only on summarization consistency evaluation; distilled models remain effective at the 32B parameter scale.
-
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
ZeroSearch introduces a reinforcement learning framework that strengthens LLMs' search capabilities by simulating search engines with fine-tuned LLMs; using a curriculum-based rollout strategy, it matches or exceeds the performance of real search engines without incurring API costs.