arxiv preprint - Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
AI Breakdown

2024-05-29
In this episode, we discuss Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum by Hadi Pouransari, Chun-Liang Li, Jen-Hao Rick Chang, Pavan Kumar Anasosalu Vasu, Cem Koc, Vaishaal Shankar, and Oncel Tuzel. The paper introduces a variable-sequence-length training technique called dataset decomposition to address inefficiencies in training large language models (LLMs) on fixed-length token sequences. It divides the dataset into buckets of sequences of the same length, so that each training batch is drawn from a single bucket rather than from concatenated, fixed-length chunks.
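The bucketing idea from the description can be sketched as follows. This is a minimal illustration, not the paper's implementation: the choice to split each document into power-of-two chunks (`decompose`) and the `max_pow` cap are assumptions made here for the example.

```python
from collections import defaultdict

def decompose(token_ids, max_pow=13):
    """Split one tokenized document into chunks whose lengths are
    powers of two (an illustrative splitting rule, assumed here)."""
    chunks = []
    i, n = 0, len(token_ids)
    while i < n:
        remaining = n - i
        # Largest power-of-two chunk that fits, capped at 2**max_pow.
        size = 1 << min(remaining.bit_length() - 1, max_pow)
        chunks.append(token_ids[i:i + size])
        i += size
    return chunks

def build_buckets(dataset):
    """Group chunks from all documents into buckets keyed by length,
    so every batch can be drawn from a single same-length bucket."""
    buckets = defaultdict(list)
    for doc in dataset:
        for chunk in decompose(doc):
            buckets[len(chunk)].append(chunk)
    return buckets
```

Because every sequence in a bucket has the same length, batches need no padding and no cross-document attention masking; a curriculum can then order which bucket each step samples from.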