Some of the most powerful NLP models, like BERT and GPT-2, have one thing in common: they all use the transformer architecture. This architecture is built on top of another concept already well known to the community: self-attention. In this episode I explain what these mechanisms are, how they work, and why they are so powerful.
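For listeners who want a concrete picture before pressing play, here is a minimal NumPy sketch of scaled dot-product self-attention, the building block of the transformer. The input X and the projection matrices Wq, Wk, Wv (and their sizes) are illustrative assumptions, not taken from BERT or GPT-2.

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # Project the input sequence into queries, keys and values.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Scaled dot-product: how strongly each position attends to every other.
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        # Softmax over each row turns scores into attention weights.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights = weights / weights.sum(axis=-1, keepdims=True)
        # Each output vector is a weighted mixture of all value vectors.
        return weights @ V

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))                          # 4 tokens, embedding width 8
    Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)           # (4, 8)

Each output row mixes information from every position in the sequence, which is what lets transformers capture long-range dependencies without recurrence.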
Don't forget to subscribe to our newsletter or join the discussion on our Discord server.