arxiv preprint - Retentive Network: A Successor to Transformer for Large Language Models
AI Breakdown

2023-07-25
In this episode we discuss Retentive Network: A Successor to Transformer for Large Language Models by Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei. The paper introduces RETNET as a successor to the Transformer architecture for language models. RETNET utilizes a retention mechanism that supports parallel, recurrent, and chunkwise recurrent computation paradigms for efficient training and inference. Experimental results show that RETNET...
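The retention mechanism discussed in the episode can be expressed in two mathematically equivalent forms: a parallel form used for training and a recurrent form used for constant-cost-per-token inference. Below is a minimal single-head NumPy sketch of that equivalence; the function names are illustrative, and the paper's position rotations, multi-head decays, and normalization are deliberately omitted, so this is a simplified sketch rather than the authors' implementation.

```python
import numpy as np

def recurrent_retention(Q, K, V, gamma):
    """Recurrent form: maintain a d x d state of decayed key-value outer products.

    Q, K, V have shape (seq_len, d); gamma is a scalar decay in (0, 1).
    Each step costs O(d^2) regardless of sequence length, which is what
    makes inference memory and compute independent of context length.
    """
    seq_len, d = Q.shape
    S = np.zeros((d, d))                      # S_n = gamma * S_{n-1} + k_n^T v_n
    outputs = np.zeros((seq_len, d))
    for n in range(seq_len):
        S = gamma * S + np.outer(K[n], V[n])
        outputs[n] = Q[n] @ S                 # o_n = q_n S_n
    return outputs

def parallel_retention(Q, K, V, gamma):
    """Parallel form: a causal attention-like product with a decay mask D."""
    seq_len = Q.shape[0]
    idx = np.arange(seq_len)
    D = np.tril(gamma ** (idx[:, None] - idx[None, :]))  # D[n, m] = gamma^(n-m), m <= n
    return (Q @ K.T * D) @ V

# The two forms produce the same outputs (up to floating-point error):
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((8, 4)) for _ in range(3))
assert np.allclose(recurrent_retention(Q, K, V, 0.9),
                   parallel_retention(Q, K, V, 0.9))
```

The chunkwise recurrent paradigm mentioned in the summary combines these two: the parallel form is applied within each chunk, while a recurrent state like S above carries information across chunk boundaries.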