Computer Vision - Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm
PaperLedge

2025-11-08
Alright, learning crew, Ernis here, ready to dive into some seriously cool research that's pushing the boundaries of AI! We're talking about how we can make AI models, like the ones powering chatbots and image generators, actually understand the world around them. For a while now, the big thing has been "Thinking with Text" and "Thinking with Images." Basically, we feed these AI models tons of text and pictures, hoping they'll learn to reason and solve problems. Think of it like showing a student flashcards...