Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models
Papers Read on AI

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

2022-09-20
Adaptive gradient algorithms [1–4] borrow the moving average idea of heavy ball acceleration to estimate accurate first- and second-order moments of gradient for accelerating convergence. However, Nesterov acceleration which converges faster than heavy ball acceleration in theory [5] and also in many empirical cases [6] is much less investigated under the adaptive gradient setting. In this work, we propose the ADAptive Nesterov momentum algorithm, Adan for short, to effec-tively speedup the tra...
View more
Comments (3)

More Episodes

All Episodes>>

Get this podcast on your phone, Free

Creat Yourt Podcast In Minutes

  • Full-featured podcast site
  • Unlimited storage and bandwidth
  • Comprehensive podcast stats
  • Distribute to Apple Podcasts, Spotify, and more
  • Make money with your podcast
Get Started
It is Free