Download - 791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

Super Data Science: ML & AI Podcast with Jon Krohn

Technology

791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

2024-06-11

Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins. He also walks through other ways to fine-tune LLMs and explains how he believes generative AI might democratize education.

This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), and Crawlbase (crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

In this episode you will learn:

• Why it is important that AI is open [03:13]

• The efficacy and scalability of direct preference optimization [07:32]

• Robotics and LLMs [14:32]

• The challenges to aligning reward models with human preferences [23:00]

• How to make sure AI’s decision-making on preferences reflects desirable behavior [28:52]

• Why Nathan believes AI is closer to alchemy than science [37:38]

Additional materials: www.superdatascience.com/791