Tag: Generative AI

All the articles with the tag "Generative AI".

AI in Money Matters

Published: 15 May, 2025 at 11:03 AM

99.13 🤔

This paper investigates the cautious adoption of Large Language Models like ChatGPT in the Fintech industry through qualitative interviews, highlighting professionals' optimism for routine task automation, concerns over regulatory inadequacies, and interest in bespoke models to ensure compliance and data control.

PICD: Versatile Perceptual Image Compression with Diffusion Rendering

Published: 15 May, 2025 at 11:10 AM

95.81 🤔

PICD introduces a versatile perceptual image compression codec using diffusion rendering with three-tiered conditioning to achieve high text accuracy and visual quality for both screen and natural images, outperforming existing methods in key metrics like FID and text accuracy.

VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation

Published: 16 May, 2025 at 11:10 AM

94.40 🤔

This paper introduces VideoUFO, a dataset of 1.09 million video clips spanning 1,291 user-focused topics for text-to-video generation. The clips are curated from YouTube and overlap minimally with existing datasets, and training a simple model such as MVDiT on them improves performance on the worst-performing topics.

Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling

Published: 23 May, 2025 at 11:14 AM

91.73 🤔

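Token Recycling proposes a training-free speculative decoding method that recycles candidate tokens and uses an adjacency matrix to build a draft tree, achieving roughly 2x faster inference for large language models, an improvement of more than 30% over other training-free methods.

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations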

Published: 8 May, 2025 at 10:22 AM

89.20 🤔

The Video Prediction Policy (VPP) introduces a novel generalist robot policy that leverages predictive visual representations from fine-tuned video diffusion models to learn implicit inverse dynamics, achieving significant improvements of 41.5% on the Calvin ABC→D benchmark and 31.6% in real-world dexterous manipulation tasks over state-of-the-art baselines.
