Tag: Reasoning
All the articles with the tag "Reasoning".
-
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
本文提出MAET方法,通过提取语言无关的能力相关权重并跨语言转移,构建多语言能力增强的大型语言模型,在数学和科学任务上以60%的计算资源实现约10%的性能提升,优于多种基线方法。
-
Learning to Plan Before Answering: Self-Teaching LLMs to Learn Abstract Plans for Problem Solving
本文提出LEPA自训练算法,通过训练LLM生成预期计划作为抽象元知识来提升问题解决泛化能力,并在多个推理基准上显著优于现有方法。
-
Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs
本文提出了低秩知识遗忘(LoKU)框架,包含反向铰链损失(IHL)和 Fisher 加权低秩适配器初始化(FILA),以实现鲁棒且参数高效的大语言模型知识遗忘,有效移除敏感信息同时保持模型原有能力。
-
Efficient Single-Pass Training for Multi-Turn Reasoning
本文提出了一种通过响应令牌复制和自定义注意力掩码来实现多轮推理对话单次前向传递训练的方法,显著提高了训练效率,同时维护了推理可见性和位置一致性。
-
HAIR: Hardness-Aware Inverse Reinforcement Learning with Introspective Reasoning for LLM Alignment
HAIR introduces a novel LLM alignment method using hardness-aware inverse reinforcement learning and introspective reasoning, constructing a balanced safety dataset and training category-specific reward models with GRPO-S, achieving state-of-the-art harmlessness while preserving usefulness across multiple benchmarks.