Today we’re joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, “Video as the New Language for Real-World Decision Making,” which explores how generative video models can play a role similar to that of language models in solving real-world tasks. Sherry draws an analogy: natural language serves as a unified representation of information and text prediction as a common task interface, and she demonstrates that video as a medium and video generation as a task exhibit similar properties. This formulation enables video generation models to play a variety of real-world roles as planners, agents, compute engines, and environment simulators. Finally, we explore UniSim, an interactive demo of Sherry’s work and a preview of her vision for interacting with AI-generated environments.
The complete show notes for this episode can be found at twimlai.com/go/676.