Today we’re joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, “Video as the New Language for Real-World Decision Making,” which explores how generative video models can play a role similar to language models as a way to solve tasks in the real world. Sherry draws an analogy between natural language as a unified representation of information and text prediction as a common task interface, and demonstrates how video as a medium and generative video as a task exhibit similar properties. This formulation enables video generation models to play a variety of real-world roles as planners, agents, compute engines, and environment simulators. Finally, we explore UniSim, an interactive demo of Sherry’s work and a preview of her vision for interacting with AI-generated environments.
The complete show notes for this episode can be found at twimlai.com/go/676.
Are LLMs Overhyped or Underappreciated? with Marti Hearst - #626
Are Large Language Models a Path to AGI? with Ben Goertzel - #625
Open Source Generative AI at Hugging Face with Jeff Boudier - #624
Generative AI at the Edge with Vinesh Sukumar - #623
Runway Gen-2: Generative AI for Video Creation with Anastasis Germanidis - #622
Watermarking Large Language Models to Fight Plagiarism with Tom Goldstein - #621
Does ChatGPT “Think”? A Cognitive Neuroscience Perspective with Anna Ivanova - #620
Robotic Dexterity and Collaboration with Monroe Kennedy III - #619
Privacy and Security for Stable Diffusion and LLMs with Nicholas Carlini - #618
Understanding AI’s Impact on Social Disparities with Vinodkumar Prabhakaran - #617
AI Trends 2023: Causality and the Impact on Large Language Models with Robert Osazuwa Ness - #616
Data-Centric Zero-Shot Learning for Precision Agriculture with Dimitris Zermas - #615
How LLMs and Generative AI are Revolutionizing AI for Science with Anima Anandkumar - #614
AI Trends 2023: Natural Language Proc - ChatGPT, GPT-4 and Cutting Edge Research with Sameer Singh - #613
AI Trends 2023: Reinforcement Learning - RLHF, Robotic Pre-Training, and Offline RL with Sergey Levine - #612
Supporting Food Security in Africa Using ML with Catherine Nakalembe - #611
Service Cards and ML Governance with Michael Kearns - #610
Reinforcement Learning for Personalization at Spotify with Tony Jebara - #609
Will ChatGPT take my job? - #608
Geospatial Machine Learning at AWS with Kumar Chellapilla - #607