CVPR 2023 - MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
AI Breakdown

2023-05-18
In this episode, we discuss MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation by Ludan Ruan, Yiyang Ma, Huan Yang, Huiguo He, Bei Liu, Jianlong Fu, Nicholas Jing Yuan, Qin Jin, and Baining Guo. The paper proposes a joint audio-video generation framework called Multi-Modal Diffusion (MM-Diffusion) that generates high-quality, realistic videos with aligned audio. The model consists of two coupled denoising autoencoders and a sequential multi-modal U-Net. A...