More powerful deep learning with transformers (Ep. 84) (Rebroadcast)
Data Science at Home

2019-11-27
Some of the most powerful NLP models, like BERT and GPT-2, have one thing in common: they are all built on the transformer architecture. That architecture, in turn, rests on another concept already well known to the community: self-attention. In this episode I explain what these mechanisms are, how they work, and why they are so powerful.

Don't forget to subscribe to our newsletter or join the discussion on our Discord server.

References
Attention Is All You Need: https://arxiv.org/abs/1706.03762
The Illustrated Transformer
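The episode itself is audio, but as a quick companion reference, here is a minimal NumPy sketch of the scaled dot-product self-attention described in "Attention Is All You Need". The function and variable names, toy dimensions, and random projections are illustrative assumptions, not code from the episode:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
        # Q, K: (seq_len, d_k); V: (seq_len, d_v). Illustrative sketch only.
        d_k = Q.shape[-1]
        # Similarity of every query with every key, scaled for softmax stability.
        scores = Q @ K.T / np.sqrt(d_k)
        # Row-wise softmax turns scores into attention weights summing to 1.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        # Each output position is a weighted mix of all value vectors.
        return weights @ V

    # Toy self-attention: Q, K, V are all projections of the same sequence.
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8))                # 4 tokens, model dimension 8
    W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
    out = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
    print(out.shape)                           # (4, 8)

In "self"-attention the queries, keys, and values all come from the same input sequence, so every token can directly attend to every other token; this is what lets transformers model long-range dependencies without recurrence.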