AHs 2025 GazeLLM: Multimodal LLMs incorporating Human Visual Attention
HCI Deep Dives

2025-12-27
Processing high-resolution video with AI requires massive computational resources. GazeLLM offers an elegant solution inspired by human vision: use eye-tracking to focus only on what matters. By cropping first-person video to a small region around the user's gaze point, the system reduces pixel input to just one-tenth of the original while achieving task comprehension equal to or better than full-resolution video. User evaluations across six real-world activities: cooking, bike repair, first aid, and s...
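As a rough illustration of the gaze-centered cropping described above, here is a minimal sketch in Python. It is not the GazeLLM implementation: the function names `gaze_crop_box` and `crop_frame`, the choice to keep the frame's aspect ratio, and the clamping at the frame edges are all assumptions made for this example. The only figure taken from the episode description is the one-tenth pixel budget. To keep one-tenth of the pixels while preserving aspect ratio, each side is scaled by the square root of 0.1, roughly 0.316.

```python
import math


def gaze_crop_box(width, height, gaze_x, gaze_y, area_fraction=0.1):
    """Return (left, top, right, bottom) for a crop centered on the gaze point.

    Hypothetical helper: the crop keeps the frame's aspect ratio and covers
    roughly `area_fraction` of the original pixels.
    """
    if not 0 < area_fraction <= 1:
        raise ValueError("area_fraction must be in (0, 1]")

    # Scaling both sides by sqrt(f) scales the area by f.
    scale = math.sqrt(area_fraction)
    crop_w = max(1, round(width * scale))
    crop_h = max(1, round(height * scale))

    # Center the box on the gaze point, then shift it so that it stays
    # inside the frame when the gaze is near an edge (an assumption).
    left = min(max(int(gaze_x) - crop_w // 2, 0), width - crop_w)
    top = min(max(int(gaze_y) - crop_h // 2, 0), height - crop_h)
    return left, top, left + crop_w, top + crop_h


def crop_frame(frame, box):
    """Crop a frame stored as a list of pixel rows to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


if __name__ == "__main__":
    # A 1920x1080 frame with the gaze at the center of the image.
    box = gaze_crop_box(1920, 1080, 960, 540)
    kept = (box[2] - box[0]) * (box[3] - box[1])
    print(box, round(kept / (1920 * 1080), 3))
```

A real pipeline would read the gaze coordinates from the eye tracker for every frame and pass the cropped frames to the multimodal model. Whether the crop is clamped, padded, or resized afterwards is a design choice this sketch does not settle.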