Tag: Image Generation

All the articles with the tag "Image Generation".

PICD: Versatile Perceptual Image Compression with Diffusion Rendering

Published: 15 May, 2025 at 11:10 AM

95.81 🤔

PICD introduces a versatile perceptual image compression codec using diffusion rendering with three-tiered conditioning to achieve high text accuracy and visual quality for both screen and natural images, outperforming existing methods in key metrics like FID and text accuracy.
Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning

Published: 16 May, 2025 at 11:36 AM

93.91 🤔

Selftok introduces a non-spatial autoregressive visual tokenizer using diffusion timesteps, unifying vision-language models and enabling effective reinforcement learning for superior text-to-image generation, as demonstrated on GenEval and DPG-Bench benchmarks.
X-Fusion: Introducing New Modality to Frozen Large Language Models

Published: 4 May, 2025 at 04:31 PM

66.52 🤔

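This paper proposes the X-Fusion framework, which freezes the LLM's parameters and adds a dual-tower structure to efficiently achieve multimodal understanding and generation while preserving the original language capabilities.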
Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Published: 4 May, 2025 at 04:31 PM

59.95 🤔

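This paper proposes Token-Shuffle, a method that exploits dimensional redundancy in the visual vocabulary to dynamically merge and restore image tokens, enabling efficient high-resolution text-to-image generation while maintaining strong performance within a unified autoregressive framework.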