Scaling Vision Transformers
Papers Read on AI

Scaling Vision Transformers

2021-07-23
Attention-based neural networks such as the Vision Transformer (ViT) have recently attained state-of-the-art results on many computer vision benchmarks. We successfully train a ViT model with two billion parameters, which attains a new state-of-the-art on ImageNet of 90.45% top-1 accuracy. The model also performs well on few-shot learning, for example, reaching 84.86% top-1 accuracy on ImageNet with only 10 examples per class. 2021: Xiaohua Zhai, Alexander Kolesnikov, N. Houlsby, L....
View more
Comments (3)

More Episodes

All Episodes>>

Get this podcast on your phone, Free

Creat Yourt Podcast In Minutes

  • Full-featured podcast site
  • Unlimited storage and bandwidth
  • Comprehensive podcast stats
  • Distribute to Apple Podcasts, Spotify, and more
  • Make money with your podcast
Get Started
It is Free