Posts

All the articles I've posted.

Large Language Model Compression with Global Rank and Sparsity Optimization

Published: 11 May, 2025 at 11:14 AM

77.26 🤔

This paper introduces a two-stage LLM compression method using RPCA for low-rank and sparse decomposition and probabilistic pruning via policy gradient, outperforming state-of-the-art techniques at a 50% compression ratio while automatically adapting to layer-wise redundancy without manual thresholds or extensive fine-tuning.
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes

Published: 12 May, 2025 at 11:14 AM

76.90 🤔

This paper introduces Latent Preference Coding (LPC), a framework that uses discrete latent codes to model multifaceted human preferences, consistently improving the performance of offline alignment algorithms like DPO, SimPO, and IPO across multiple LLMs and benchmarks.
Rethinking Meta-Learning from a Learning Lens

Published: 12 May, 2025 at 11:18 AM

76.64 🤔

This paper rethinks meta-learning from a 'learning' lens, proposing TRLearner, a plug-and-play method that leverages task relations to calibrate optimization, demonstrating significant performance improvements across regression, classification, drug activity, pose prediction, and OOD generalization tasks.
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Published: 4 May, 2025 at 04:26 PM

76.52 🤔

本文首次系统调查了大型语言模型高效推理的进展，通过分类模型、输出和提示-based方法，探讨了减少"过度思考"现象的策略，以优化计算效率并保持推理能力。
Reward Guidance for Reinforcement Learning Tasks Based on Large Language Models: The LMGT Framework

Published: 5 May, 2025 at 11:16 PM

76.51 🤔

本文提出了LMGT框架，通过利用大型语言模型的先验知识对强化学习的奖励进行动态调整，有效平衡了探索与利用，显著提高了样本效率并降低了训练成本，并在多种环境、算法以及机器人和推荐系统等复杂场景中验证了其有效性。

Posts

Large Language Model Compression with Global Rank and Sparsity Optimization

Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes

Rethinking Meta-Learning from a Learning Lens

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Reward Guidance for Reinforcement Learning Tasks Based on Large Language Models: The LMGT Framework