Podcasting
Advertisers
Enterprise
Pricing
Resources
Discover Discover

Log in
Sign up free

Papers Read on AI

Can Wikipedia Help Offline Reinforcement Learning?

2022-08-13

Download 43

Fine-tuning reinforcement learning (RL) models has been challenging because of a lack of large scale off-the-shelf datasets as well as high variance in transferability among different environments. Recent work has looked at tackling offline RL from the perspective of sequence modeling with improved results as result of the introduction of the Transformer architecture. However, when the model is trained from scratch, it suffers from slow convergence speeds. In this paper, we look to take...

Fine-tuning reinforcement learning (RL) models has been challenging because of a lack of large scale off-the-shelf datasets as well as high variance in transferability among different environments. Recent work has looked at tackling offline RL from the perspective of sequence modeling with improved results as result of the introduction of the Transformer architecture. However, when the model is trained from scratch, it suffers from slow convergence speeds. In this paper, we look to take advantage of this formulation of reinforcement learning as sequence modeling and investigate the transferability of pre-trained sequence models on other domains (vision, language) when finetuned on offline RL tasks (control, games). To this end, we also propose techniques to improve transfer between these domains. 2022: Machel Reid, Yutaro Yamada, S. Gu https://arxiv.org/pdf/2201.12122v1.pdf

View more

Comments (3)

More Episodes

You may also like

The 404 Media Podcast

Big Technology Podcast

Cybersecurity Headlines

Noticias de la mañana

Software Engineering Daily

YPO Technology Network AI Brief

Get this podcast on your phone, Free

Create Your Podcast In Minutes

Full-featured podcast site
Unlimited storage and bandwidth
Comprehensive podcast stats
Distribute to Apple Podcasts, Spotify, and more
Make money with your podcast

It is Free

Podcast Services
MONETIZATION & MORE
KNOWLEDGE BASE
Support
Podbean

Privacy Policy
Cookie Policy
Terms of Use
Consent Preferences
Copyright © 2015-2026 Podbean.com