Tag: Large Language Model
All the articles with the tag "Large Language Model".
-
Efficient Single-Pass Training for Multi-Turn Reasoning
This paper proposes a method that enables single-forward-pass training on multi-turn reasoning conversations through response token duplication and a custom attention mask, significantly improving training efficiency while preserving reasoning visibility and positional consistency.
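As a rough illustration of the attention-masking idea described above, the sketch below builds a causal mask for a packed multi-turn sequence and additionally hides each turn's reasoning tokens from later turns. The segment kinds, turn layout, and `build_multiturn_mask` helper are assumptions made for illustration, not the paper's exact construction (which also relies on response token duplication).

```python
import torch

def build_multiturn_mask(turns):
    """Build a boolean attention mask (True = may attend) for a packed
    multi-turn sequence.

    `turns` is a list of turns; each turn is a list of (length, kind)
    segments in order, with kind in {"prompt", "reasoning", "response"}.
    The mask is causal, and additionally hides each turn's "reasoning"
    tokens from all *later* turns, so hidden reasoning stays visible only
    within the turn that produced it.
    """
    # Flatten into spans annotated with their turn index.
    spans, offset = [], 0
    for turn_idx, turn in enumerate(turns):
        for length, kind in turn:
            spans.append((offset, offset + length, kind, turn_idx))
            offset += length

    idx = torch.arange(offset)
    mask = idx[:, None] >= idx[None, :]  # causal base: query pos >= key pos

    # Block attention from any later turn to an earlier turn's reasoning tokens.
    for q_start, q_end, _, q_turn in spans:
        for k_start, k_end, k_kind, k_turn in spans:
            if k_kind == "reasoning" and k_turn < q_turn:
                mask[q_start:q_end, k_start:k_end] = False
    return mask

# Example: two turns, each with a prompt, hidden reasoning, and a response.
turns = [[(4, "prompt"), (3, "reasoning"), (2, "response")],
         [(4, "prompt"), (3, "reasoning"), (2, "response")]]
print(build_multiturn_mask(turns).int())  # turn-2 rows cannot see turn-1 reasoning
```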
-
ComPO: Preference Alignment via Comparison Oracles
This paper introduces ComPO, a novel preference alignment method for LLMs that uses comparison oracles to make effective use of noisy preference pairs, mitigating verbosity and likelihood displacement across multiple models and benchmarks.
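The summary above mentions optimization driven by comparison oracles. The following is a generic comparison-oracle update loop on a toy objective, meant only to show what learning from noisy pairwise comparisons can look like; `comparison_oracle`, `comparison_step`, and the probe-averaging rule are illustrative assumptions, not ComPO's actual algorithm.

```python
import numpy as np

def comparison_oracle(score_fn, x, y):
    """Return +1 if x is preferred to y, else -1.

    The oracle is simulated with a noisy scalar score; in preference
    alignment the signal would instead come from (possibly noisy)
    pairwise comparisons of model outputs.
    """
    noise = np.random.normal(scale=0.1, size=2)
    return 1 if score_fn(x) + noise[0] >= score_fn(y) + noise[1] else -1

def comparison_step(theta, score_fn, step_size=0.05, num_probes=16):
    """One comparison-oracle update: probe random directions, keep only the
    oracle's +1/-1 verdicts, and move along their signed average."""
    direction = np.zeros_like(theta)
    for _ in range(num_probes):
        delta = np.random.normal(size=theta.shape)
        delta /= np.linalg.norm(delta)
        sign = comparison_oracle(score_fn, theta + step_size * delta,
                                 theta - step_size * delta)
        direction += sign * delta
    direction /= num_probes
    return theta + step_size * direction

# Toy usage: maximize a quadratic "preference score" from comparisons alone.
score = lambda t: -np.sum((t - 1.0) ** 2)
theta = np.zeros(8)
for _ in range(200):
    theta = comparison_step(theta, score)
print(np.round(theta, 2))  # should drift toward the optimum at 1.0
```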
-
Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models
This paper proposes a wireless federated LoRA fine-tuning framework that uses Sparsified Orthogonal Fine-Tuning (SOFT) and a Two-Stage Federated Algorithm (TSFA) to optimize parameter sparsification and dynamic resource allocation, improving communication efficiency and learning performance.
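To make the communication-efficiency angle concrete, here is a minimal sketch of top-k sparsification of client LoRA updates before uplink aggregation. The `sparsify_topk` and `federated_round` helpers and the keep ratio are assumptions for illustration; the paper's SOFT and TSFA components are more involved.

```python
import numpy as np

def sparsify_topk(update, keep_ratio=0.1):
    """Keep only the largest-magnitude entries of a client's LoRA update,
    zeroing the rest, so far fewer values need to be sent over the uplink."""
    flat = update.ravel()
    k = max(1, int(keep_ratio * flat.size))
    threshold = np.partition(np.abs(flat), -k)[-k]
    mask = np.abs(update) >= threshold
    return update * mask

def federated_round(client_updates, keep_ratio=0.1):
    """Sparsify each client's update locally, then average on the server."""
    sparse = [sparsify_topk(u, keep_ratio) for u in client_updates]
    return np.mean(sparse, axis=0)

# Toy usage: three clients, each holding a (rank x dim) LoRA factor update.
rng = np.random.default_rng(0)
updates = [rng.normal(size=(8, 64)) for _ in range(3)]
aggregated = federated_round(updates, keep_ratio=0.05)
print("aggregated shape:", aggregated.shape)
print("values kept per client:", max(1, int(0.05 * 8 * 64)), "of", 8 * 64)
```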
-
Cache-Efficient Posterior Sampling for Reinforcement Learning with LLM-Derived Priors Across Discrete and Continuous Domains
This paper proposes a cache-efficient posterior sampling framework that reuses LLM-derived priors through a meta-learning-optimized caching mechanism, substantially reducing computational cost in reinforcement learning (3.8-4.7x fewer queries, 4.0-12.0x lower latency) while retaining 96-98% of performance on text-based and continuous-control tasks.
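A minimal sketch of the caching idea follows: LLM-derived action priors are stored under coarse state keys and reused on cache hits, so repeated LLM queries are avoided. The `CachedPriorSampler` class, the key function, and the Dirichlet-style posterior are illustrative assumptions, not the paper's meta-learned cache mechanism.

```python
import numpy as np

def llm_prior(state_key, num_actions):
    """Stub for an expensive LLM call returning a prior over actions.
    Here it is simulated deterministically from the state key."""
    rng = np.random.default_rng(hash(state_key) % (2**32))
    p = rng.random(num_actions)
    return p / p.sum()

class CachedPriorSampler:
    """Posterior sampling that reuses cached LLM priors for similar states."""
    def __init__(self, num_actions, key_fn=lambda s: s):
        self.num_actions = num_actions
        self.key_fn = key_fn
        self.cache = {}          # cache_key -> prior over actions
        self.counts = {}         # cache_key -> per-action reward counts
        self.llm_queries = 0

    def act(self, state):
        key = self.key_fn(state)
        if key not in self.cache:                 # cache miss: pay for an LLM call
            self.cache[key] = llm_prior(key, self.num_actions)
            self.counts[key] = np.ones(self.num_actions)
            self.llm_queries += 1
        # Thompson-style draw from a posterior seeded by the cached prior.
        posterior = np.random.dirichlet(10.0 * self.cache[key] + self.counts[key])
        return int(np.argmax(posterior)), key

    def update(self, key, action, reward):
        self.counts[key][action] += reward

# Toy usage: 100 steps over 5 coarse state keys cost only 5 LLM queries.
sampler = CachedPriorSampler(num_actions=4, key_fn=lambda s: s % 5)
for step in range(100):
    action, key = sampler.act(state=step)
    sampler.update(key, action, reward=float(action == key % 4))
print("LLM queries:", sampler.llm_queries)
```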
-
RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference
RetroInfer reimagines the KV cache as a vector storage system, using an attention-aware wave index and wave buffer to achieve up to 4.5x speedup over full attention and 10.5x over sparse baselines for long-context LLM inference, while preserving near-full-attention accuracy.
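As a rough sketch of treating the KV cache like a vector store, the code below summarizes key blocks by centroids, retrieves only the highest-scoring blocks for a query, and runs exact attention on that subset. The `blockwise_sparse_attention` helper and the centroid index are assumptions for illustration; RetroInfer's wave index and wave buffer are considerably more sophisticated.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def blockwise_sparse_attention(query, keys, values, block_size=64, top_blocks=4):
    """Attend only over the KV blocks whose key centroids score highest
    against the query, treating the KV cache like a vector store."""
    n = keys.shape[0]
    num_blocks = (n + block_size - 1) // block_size
    # Summarize each block by its key centroid (the "index" over the cache).
    centroids = np.stack([keys[b * block_size:(b + 1) * block_size].mean(axis=0)
                          for b in range(num_blocks)])
    # Retrieve the most relevant blocks for this query.
    block_scores = centroids @ query
    chosen = np.argsort(block_scores)[-top_blocks:]
    idx = np.concatenate([np.arange(b * block_size, min((b + 1) * block_size, n))
                          for b in chosen])
    # Exact attention restricted to the retrieved keys/values.
    weights = softmax(keys[idx] @ query / np.sqrt(query.shape[0]))
    return weights @ values[idx]

# Toy usage: a 10,000-token cache, but attention touches only 4 blocks of 64.
rng = np.random.default_rng(0)
keys = rng.normal(size=(10_000, 128))
values = rng.normal(size=(10_000, 128))
query = rng.normal(size=128)
out = blockwise_sparse_attention(query, keys, values)
print(out.shape)  # (128,)
```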