Tag: Reasoning

All the articles with the tag "Reasoning".

How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game

Published: 10 May, 2025 at 10:59 AM

68.75 🤔

This paper introduces MM-Escape, a benchmark using the customizable 3D environment EscapeCraft to evaluate multimodal reasoning in MLLMs through room escape tasks, revealing that while models like GPT-4o achieve high success in simple scenarios, performance drops significantly with increased difficulty, exposing distinct limitations in reasoning and spatial awareness.

Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models

Published: 7 May, 2025 at 12:17 AM

68.20 🤔

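This paper proposes MAET, which extracts language-agnostic, ability-related weights and transfers them across languages to build multilingual ability-enhanced large language models, achieving roughly a 10% performance gain on mathematical and scientific tasks with 60% of the computational resources and outperforming multiple baseline methods.

HAIR: Hardness-Aware Inverse Reinforcement Learning with Introspective Reasoning for LLM Alignment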

Published: 11 May, 2025 at 11:12 AM

67.37 🤔

HAIR introduces a novel LLM alignment method using hardness-aware inverse reinforcement learning and introspective reasoning, constructing a balanced safety dataset and training category-specific reward models with GRPO-S, achieving state-of-the-art harmlessness while preserving usefulness across multiple benchmarks.

LZ Penalty: An information-theoretic repetition penalty for autoregressive language models

Published: 6 May, 2025 at 11:19 PM

67.26 🤔

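This paper proposes the LZ penalty, which dynamically adjusts the sampling distribution of autoregressive language models based on changes in codelength under the LZ77 compression algorithm, effectively eliminating degenerate repetition under greedy decoding while preserving performance on reasoning benchmarks.

Waking Up an AI: A Quantitative Framework for Prompt-Induced Phase Transition in Large Language Models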

Published: 7 May, 2025 at 09:31 AM

64.57 🤔

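This paper proposes a dual-prompt framework (TIP and TQP) to quantify cognitive phase transitions in large language models (LLMs), finding that LLMs' emotional responses to concept-fusion prompts differ markedly from human intuition, which reveals a potential gap between AI and human cognition in conceptual integration.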
