Tag: Diffusion Model
All the articles with the tag "Diffusion Model".
-
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations
Video Prediction Policy (VPP) is a generalist robot policy that uses predictive visual representations from fine-tuned video diffusion models to learn implicit inverse dynamics, improving on state-of-the-art baselines by 41.5% on the Calvin ABC→D benchmark and by 31.6% in real-world dexterous manipulation tasks.
-
Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision
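This paper proposes Diff-Prompt, a method that uses a diffusion model to generate fine-grained prompt information under mask supervision, significantly improving the performance of pre-trained multimodal models on complex referring expression comprehension tasks while keeping fine-tuning efficient.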