Today we’re joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, "Video as the New Language for Real-World Decision Making,” which explores how generative video models can play a role similar to language models as a way to solve tasks in the real world. Sherry draws the analogy between natural language as a unified representation of information and text prediction as a common task interface and demonstrates how video as a medium and generative video as a task exhibit similar properties. This formulation enables video generation models to play a variety of real-world roles as planners, agents, compute engines, and environment simulators. Finally, We explore UniSim, an interactive demo of Sherry's work and a preview of her vision for interacting with AI-generated environments.
The complete show notes for this episode can be found at twimlai.com/go/676.
Advancing Hands-On Machine Learning Education with Sebastian Raschka - #565
Big Science and Embodied Learning at Hugging Face 🤗 with Thomas Wolf - #564
Full-Stack AI Systems Development with Murali Akula - #563
100x Improvements in Deep Learning Performance with Sparsity, w/ Subutai Ahmad - #562
Scaling BERT and GPT for Financial Services with Jennifer Glore - #561
Trends in Deep Reinforcement Learning with Kamyar Azizzadenesheli - #560
Deep Reinforcement Learning at the Edge of the Statistical Precipice with Rishabh Agarwal - #559
Designing New Energy Materials with Machine Learning with Rafael Gomez-Bombarelli - #558
Differentiable Programming for Oceanography with Patrick Heimbach - #557
Trends in Machine Learning & Deep Learning with Zachary Lipton - #556
Solving the Cocktail Party Problem with Machine Learning, w/ Jonathan Le Roux - #555
Machine Learning for Earthquake Seismology with Karianne Bergen - #554
The New DBfication of ML/AI with Arun Kumar - #553
Building Public Interest Technology with Meredith Broussard - #552
A Universal Law of Robustness via Isoperimetry with Sebastien Bubeck - #551
Trends in NLP with John Bohannon - #550
Trends in Computer Vision with Georgia Gkioxari - #549
Kids Run the Darndest Experiments: Causal Learning in Children with Alison Gopnik - #548
Hypergraphs, Simplicial Complexes and Graph Representations of Complex Systems with Tina Eliassi-Rad - #547
Deep Learning, Transformers, and the Consequences of Scale with Oriol Vinyals - #546
Create your
podcast in
minutes
It is Free
20/20
The Dropout
Ten Percent Happier with Dan Harris
World News Tonight with David Muir
NEJM This Week