Tag: Large Language Model
All the articles with the tag "Large Language Model".
-
CRANE: Reasoning with constrained LLM generation
This paper introduces CRANE, a reasoning-augmented constrained decoding algorithm that alternates between unconstrained and constrained generation to preserve LLM reasoning capabilities while ensuring syntactic correctness, achieving up to 10% accuracy improvement on symbolic reasoning benchmarks like GSM-Symbolic and FOLIO.
-
Adversarial Attacks on LLM-as-a-Judge Systems: Insights from Prompt Injections
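This paper proposes an attack framework and, through experimental evaluation, exposes the prompt injection vulnerabilities of LLM-as-a-judge systems, recommending strategies such as multi-model committees to improve robustness.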
-
Splitwiser: Efficient LM inference with constrained resources
Splitwiser introduces a method that splits the LLM inference phases on a single GPU using multiprocessing and NVIDIA MPS, achieving modest latency reductions (up to 18.2%) and throughput improvements (up to 1.42x) on Hugging Face and vLLM pipelines, though the gains are limited by overheads and scalability issues.
-
Dynamic Parametric Retrieval Augmented Generation for Test-time Knowledge Enhancement
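This paper proposes DyPRAG, a dynamic parametric RAG framework that trains a lightweight parameter translator to convert documents into parametric knowledge at test time, significantly reducing cost, improving generalization, and mitigating RAG hallucination.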
-
Block Circulant Adapter for Large Language Models
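This paper proposes a block circulant adapter that uses block circulant matrices and the FFT to optimize LLM fine-tuning, significantly reducing storage and computation costs, while a learning-rate adjustment keeps training stable.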