Posts
All the articles I've posted.
-
Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
This paper reformulates natural language understanding tasks as reinforcement learning problems and fine-tunes small- to medium-scale LLMs with the PPO algorithm, significantly improving performance on the GLUE and SuperGLUE benchmarks, surpassing supervised fine-tuning and BERT-large, and exhibiting zero-shot generalization superior to GPT-4o.
-
Can Pruning Improve Reasoning? Revisiting Long-CoT Compression with Capability in Mind for Better Reasoning
This paper proposes the Prune-on-Logic framework, which converts long chain-of-thought (Long-CoT) reasoning into logic graphs and selectively prunes low-utility verification steps, improving the reasoning accuracy of small language models (SLMs) while reducing inference cost, and revealing the potential of pruning as a capability-alignment strategy.
-
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
SelfBudgeter combines adaptive token-budget prediction with reinforcement-learning optimization to achieve 74.47% response-length compression on the MATH dataset while maintaining near-original accuracy, significantly improving the efficiency of large reasoning models.
-
Deformable Beta Splatting
Deformable Beta Splatting (DBS) enhances real-time radiance field rendering by introducing deformable Beta Kernels for superior geometric fidelity, Spherical Beta for efficient color encoding, and kernel-agnostic MCMC optimization, achieving state-of-the-art visual quality with 45% fewer parameters and 1.5x faster rendering than 3DGS-MCMC.
-
UnifyFL: Enabling Decentralized Cross-Silo Federated Learning
UnifyFL is a decentralized cross-silo federated learning framework that uses the Ethereum blockchain and IPFS to enable trust-based collaboration among organizations, achieving accuracy comparable to centralized FL with flexible aggregation policies and efficient handling of stragglers through synchronous and asynchronous modes.