Tag: Large Language Model

All the articles with the tag "Large Language Model".

Fine-tuning Quantized Neural Networks with Zeroth-order Optimization

Published: 25 May, 2025 at 11:24 AM

85.17 🤔

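This paper proposes Quantized Zeroth-order Optimization (QZO), which fine-tunes quantized neural networks with zeroth-order optimization by perturbing the quantization scale parameters and clipping the directional derivative. It cuts memory usage by more than 18x and shows notable memory efficiency and some performance gains on LLMs and Stable Diffusion.

Scalable Complexity Control Facilitates Reasoning Ability of LLMs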

Published: 3 Jun, 2025 at 11:29 AM

85.16 🤔

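This paper controls the complexity of large language models by adjusting the initialization rate and the weight decay coefficient, which significantly improves reasoning ability, especially on mathematical tasks, and yields better scaling-law behavior.

Learning Like Humans: Advancing LLM Reasoning Capabilities via Adaptive Difficulty Curriculum Learning and Expert-Guided Self-Reformulation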

Published: 17 May, 2025 at 11:01 AM

85.16 🤔

This paper introduces Adaptive Difficulty Curriculum Learning (ADCL) and Expert-Guided Self-Reformulation (EGSR) to enhance LLM reasoning by dynamically adjusting training curricula and guiding models to reformulate expert solutions, achieving significant performance improvements over standard RL baselines on mathematical reasoning benchmarks.

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

Published: 1 Jun, 2025 at 11:53 AM

85.15 🤔

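ReMA uses multi-agent reinforcement learning to separate meta-thinking from reasoning, improving the performance of large language models on mathematical reasoning and LLM-as-a-Judge tasks, with especially strong out-of-distribution generalization. It is, however, sensitive to hyperparameters, and the multi-turn setting poses stability challenges.

Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning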

Published: 18 May, 2025 at 11:14 AM

85.14 🤔

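This paper proposes Reasoning CPT, which adds synthetic hidden-thought data to continual pretraining and significantly improves large language models in cross-domain reasoning, hard-problem solving, and reasoning efficiency, with an overall gain of up to 3.3% on the MMLU benchmark and an improvement of about 8% on hard problems.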
