Posts

All the articles I've posted.

CRANE: Reasoning with constrained LLM generation

Published: 9 May, 2025 at 11:08 AM

61.03 🤔

This paper introduces CRANE, a reasoning-augmented constrained decoding algorithm that alternates between unconstrained and constrained generation to preserve LLM reasoning capabilities while ensuring syntactic correctness, achieving up to 10% accuracy improvement on symbolic reasoning benchmarks like GSM-Symbolic and FOLIO.
Splitwiser: Efficient LM inference with constrained resources

Published: 11 May, 2025 at 11:14 AM

60.85 🤔

Splitwiser introduces a method to split LLM inference phases on a single GPU using multiprocessing and NVIDIA MPS, achieving modest latency reductions (up to 18.2%) and throughput improvements (up to 1.42x) on Hugging Face and vLLM pipelines, though constrained by overheads and scalability issues.
How do Humans and Language Models Reason About Creativity? A Comparative Analysis

Published: 10 May, 2025 at 10:59 AM

60.58 🤔

This paper conducts a comparative analysis of creativity evaluation in STEM, revealing that human experts and LLMs prioritize different facets of originality (cleverness vs. remoteness/uncommonness) and are differentially influenced by contextual examples, with LLMs showing higher predictive accuracy but poorer construct validity due to homogenized facet correlations.
Streaming, Fast and Slow: Cognitive Load-Aware Streaming for Efficient LLM Serving

Published: 4 May, 2025 at 04:30 PM

60.43 🤔

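This paper proposes a cognitive load-aware adaptive streaming framework for optimizing LLM serving, which dynamically adjusts output speed to reduce compute resource consumption by up to 16.8% while maintaining user satisfaction.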
EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

Published: 4 May, 2025 at 04:27 PM

60.29 🤔

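This paper proposes EPO, a method that uses reinforcement learning to optimize a dedicated strategic reasoning model, which assists arbitrary LLM agents in achieving long-term goal alignment in dynamic environments and improves their strategic reasoning ability.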
