Tag: Multimodality
All the articles with the tag "Multimodality".
-
Deformable Beta Splatting
Deformable Beta Splatting (DBS) enhances real-time radiance field rendering by introducing deformable Beta Kernels for superior geometric fidelity, Spherical Beta for efficient color encoding, and kernel-agnostic MCMC optimization, achieving state-of-the-art visual quality with 45% fewer parameters and 1.5x faster rendering than 3DGS-MCMC.
-
Activation Space Interventions Can Be Transferred Between Large Language Models
This paper demonstrates that activation space interventions for AI safety, such as backdoor removal and refusal behavior, can be transferred between large language models using autoencoder mappings, enabling smaller models to align larger ones, though challenges remain in cross-architecture transfers and complex tasks like corrupted capabilities.
-
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
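This paper systematically surveys progress in reinforcement-learning-based reasoning methods for multimodal large language models (MLLMs), analyzing algorithm design, reward mechanisms, and applications, identifying challenges such as cross-modal reasoning and reward sparsity, and proposing future directions such as hierarchical rewards and interactive RL.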
-
CCSK: Cognitive Convection of Self-Knowledge Based Retrieval Augmentation for Large Language Models
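This paper proposes the CCSK framework, which uses a Siamese Network and a Response Quality Model to dynamically fuse query similarity and response quality, optimizing retrieval decisions for large language models and significantly improving F1 score and accuracy on multiple question-answering datasets.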
-
MegaScale-Infer: Serving Mixture-of-Experts at Scale with Disaggregated Expert Parallelism
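This paper proposes MegaScale-Infer, a system that improves inference efficiency for large-scale MoE models by disaggregating the attention and FFN modules with separate parallelism strategies and by using an efficient M2N communication library, achieving up to 1.90x higher throughput.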