Today we’re joined by Sanmi Koyejo, assistant professor at Stanford University, to continue our NeurIPS 2024 series. In our conversation, Sanmi discusses his two recent award-winning papers. First, we dive into his paper, “Are Emergent Abilities of Large Language Models a Mirage?” We discuss the different ways LLMs are evaluated and the excitement surrounding their“emergent abilities” such as the ability to perform arithmetic Sanmi describes how evaluating model performance using nonlinear metrics can lead to the illusion that the model is rapidly gaining new capabilities, whereas linear metrics show smooth improvement as expected, casting doubt on the significance of emergence. We continue on to his next paper, “DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models,” discussing the methodology it describes for evaluating concerns such as the toxicity, privacy, fairness, and robustness of LLMs.
The complete show notes for this episode can be found at twimlai.com/go/671.
Optimization, Machine Learning and Intelligent Experimentation with Michael McCourt - #545
Jupyter and the Evolution of ML Tooling with Brian Granger - #544
Creating a Data-Driven Culture at ADP with Jack Berkowitz - #543
re:Invent Roundup 2021 with Bratin Saha - #542
Multi-modal Deep Learning for Complex Document Understanding with Doug Burdick - #541
Predictive Maintenance Using Deep Learning and Reliability Engineering with Shayan Mortazavi - #540
Building a Deep Tech Startup in NLP with Nasrin Mostafazadeh - #539
Models for Human-Robot Collaboration with Julie Shah - #538
Four Key Tools for Robust Enterprise NLP with Yunyao Li - #537
Machine Learning at GSK with Kim Branson - #536
The Benefit of Bottlenecks in Evolving Artificial Intelligence with David Ha - #535
Facebook Abandons Facial Recognition. Should Everyone Else Follow Suit? With Luke Stark - #534
Building Blocks of Machine Learning at LEGO with Francesc Joan Riera - #533
Exploring the FastAI Tooling Ecosystem with Hamel Husain - #532
Multi-task Learning for Melanoma Detection with Julianna Ianni - #531
House Hunters: Machine Learning at Redfin with Akshat Kaul - #530
Attacking Malware with Adversarial Machine Learning, w/ Edward Raff - #529
Learning to Ponder: Memory in Deep Neural Networks with Andrea Banino - #528
Advancing Deep Reinforcement Learning with NetHack, w/ Tim Rocktäschel - #527
Building Technical Communities at Stack Overflow with Prashanth Chandrasekar - #526
Create your
podcast in
minutes
It is Free
20/20
The Dropout
Ten Percent Happier with Dan Harris
World News Tonight with David Muir
NEJM This Week