arXiv Preprint - EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding
AI Breakdown

2023-08-27
In this episode, we discuss EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding by Karttikeya Mangalam, Raiymbek Akshulakov, and Jitendra Malik. The paper presents EgoSchema, a benchmark dataset and evaluation metric for assessing the long-form video language understanding capabilities of vision and language systems. The dataset consists of over 5000 multiple-choice question-answer pairs based on 250 hours of real video data, and the questions require selecting the...