Tag: Large Language Model

All the articles with the tag "Large Language Model".

Efficient Reasoning for LLMs through Speculative Chain-of-Thought

Published: 6 May, 2025 at 01:19 AM

79.97 🤔

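This paper proposes the Speculative Chain-of-Thought (SCoT) framework: a lightweight draft model generates multiple chain-of-thought drafts in parallel, and a fine-tuned target large model either selects the best draft or decides to rethink, significantly reducing LLM reasoning latency while keeping accuracy close to that of the large model.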
StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation

Published: 4 May, 2025 at 04:29 PM

79.93 🤔

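This paper proposes the StreamRL framework, which optimizes RL training through a disaggregated stream generation architecture, resolving pipeline bubbles and skewness bubbles and improving the throughput and cost efficiency of RL training for LLMs.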
Radio: Rate-Distortion Optimization for Large Language Model Compression

Published: 9 May, 2025 at 11:09 AM

79.84 🤔

This paper introduces Radio, a rate-distortion optimization framework for LLM compression. By iteratively optimizing bit depths and applying companding quantization after training, it outperforms existing quantization methods in perplexity and downstream task accuracy, particularly at lower bit depths.
Beyond Next Token Prediction: Patch-Level Training for Large Language Models

Published: 19 May, 2025 at 11:18 AM

79.83 🤔

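This paper proposes patch-level training, which aggregates multiple tokens into high-information-density patches and trains large language models in stages, maintaining or even slightly improving model performance while halving training cost.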
Toward Efficient Exploration by Large Language Model Agents

Published: 4 May, 2025 at 04:31 PM

79.45 🤔

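By using LLMs to explicitly implement a posterior sampling RL algorithm, this paper significantly improves the exploration efficiency of LLM agents in natural-language environments while retaining the statistical performance advantages of the classical algorithm.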
