Some of the most powerful NLP models, such as BERT and GPT-2, have one thing in common: they all use the transformer architecture.
This architecture is built on top of another important concept already known to the community: self-attention.
In this episode I explain what these mechanisms are, how they work and why they are so powerful.
Don't forget to subscribe to our newsletter or join the discussion on our Discord server.