Tag: Efficiency
All the articles with the tag "Efficiency".
-
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition
This paper explores effective distillation of HuBERT for ASR by comparing student model structures, introducing a discriminative loss for improved low-resource performance, and proposing front-end distillation from waveform to Fbank features, achieving 17% parameter reduction and doubled inference speed with minor performance degradation.
-
Quantum-Enhanced LLM Efficient Fine Tuning
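This paper proposes Quantum Tensor Hybrid Adaptation (QTHA), which integrates quantum neural networks and tensor networks for parameter-efficient fine-tuning of LLMs, significantly reducing the parameter count while improving performance and laying groundwork for quantum-enhanced artificial intelligence.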
-
Block Circulant Adapter for Large Language Models
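This paper proposes a block circulant adapter that uses block circulant matrices and the FFT to optimize LLM fine-tuning, significantly reducing storage and computation costs, while a learning-rate adjustment keeps training stable.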
-
SEM: Reinforcement Learning for Search-Efficient Large Language Models
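This paper proposes the *SEM* framework, which uses reinforcement learning to optimize the search behavior of large language models, reducing redundant searches while improving answer accuracy and significantly improving inference efficiency.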
-
Better Estimation of the KL Divergence Between Language Models
This paper introduces a Rao-Blackwellized Monte Carlo estimator for KL divergence between language models, achieving unbiased estimates with provably lower variance than standard Monte Carlo methods, and demonstrates improved stability and performance in RLHF fine-tuning for sentiment-controlled generation.
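As a rough illustration of the idea in the last entry, the sketch below compares a standard Monte Carlo estimate of KL(p || q) with a Rao-Blackwellized one on two toy bigram language models. It is not the paper's code: the bigram models, the fixed sequence length, and the names `make_bigram_lm` and `estimate_kl` are assumptions made for this example. The Rao-Blackwellized version replaces the sampled log-ratio at each step with the exact KL divergence between the two next-token distributions given the sampled prefix.

```python
import numpy as np


def make_bigram_lm(vocab_size, rng):
    """Hypothetical toy LM: row i is the next-token distribution after token i.

    The extra last row is used as the start-of-sequence context.
    """
    logits = rng.normal(size=(vocab_size + 1, vocab_size))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    return probs / probs.sum(axis=1, keepdims=True)


def estimate_kl(p, q, seq_len, num_samples, rng):
    """Return per-sample KL(p || q) estimates: (standard MC, Rao-Blackwellized)."""
    vocab_size = p.shape[1]
    mc = np.zeros(num_samples)
    rb = np.zeros(num_samples)
    for i in range(num_samples):
        prev = vocab_size  # start-of-sequence row
        for _ in range(seq_len):
            p_next, q_next = p[prev], q[prev]
            # Rao-Blackwellized term: exact KL over the next token.
            rb[i] += np.sum(p_next * (np.log(p_next) - np.log(q_next)))
            # Standard MC term: log-ratio at the token actually sampled from p.
            tok = rng.choice(vocab_size, p=p_next)
            mc[i] += np.log(p_next[tok]) - np.log(q_next[tok])
            prev = tok
    return mc, rb


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = make_bigram_lm(5, rng)
    q = make_bigram_lm(5, rng)
    mc, rb = estimate_kl(p, q, seq_len=8, num_samples=2000, rng=rng)
    print("standard MC:       mean", mc.mean(), "variance", mc.var())
    print("Rao-Blackwellized: mean", rb.mean(), "variance", rb.var())
```

Both estimators use the same sampled sequences, so their means should agree closely while the Rao-Blackwellized variance should be smaller. Each Rao-Blackwellized sample is also non-negative, because it is a sum of exact per-step KL terms, whereas individual standard Monte Carlo samples can be negative.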