In this episode I explain how a community detection algorithm known as Markov clustering can be constructed by combining simple concepts like random walks, graphs, similarity matrix. Moreover, I highlight how one can build a similarity graph and then run a community detection algorithm on such graph to find clusters in tabular data.
You can find a simple hands-on code snippet to play with on the Amethix Blog
Enjoy the show!
References
[1] S. Fortunato, “Community detection in graphs”, Physics Reports, volume 486, issues 3-5, pages 75-174, February 2010.
[2] Z. Yang, et al., “A Comparative Analysis of Community Detection Algorithms on Artificial Networks”, Scientific Reports volume 6, Article number: 30750 (2016)
[3] S. Dongen, “A cluster algorithm for graphs”, Technical Report, CWI (Centre for Mathematics and Computer Science) Amsterdam, The Netherlands, 2000.
[4] A. J. Enright, et al., “An efficient algorithm for large-scale detection of protein families”, Nucleic Acids Research, volume 30, issue 7, pages 1575-1584, 2002.
Is Apple M1 good for machine learning? (Ep.136)
Rust and deep learning with Daniel McKenna (Ep. 135)
Scaling machine learning with clusters and GPUs (Ep. 134)
What is data ethics? (Ep. 133)
A Standard for the Python Array API (Ep. 132)
What happens to data transfer after Schrems II? (Ep. 131)
Test-First Machine Learning [RB] (Ep. 130)
Similarity in Machine Learning (Ep. 129)
Distill data and train faster, better, cheaper (Ep. 128)
Machine Learning in Rust: Amadeus with Alec Mocatta [RB] (ep. 127)
Top-3 ways to put machine learning models into production (Ep. 126)
Remove noise from data with deep learning (Ep.125)
What is contrastive learning and why it is so powerful? (Ep. 124)
Neural search (Ep. 123)
Let's talk about federated learning (Ep. 122)
How to test machine learning in production (Ep. 121)
Why synthetic data cannot boost machine learning (Ep. 120)
Machine learning in production: best practices [LIVE from twitch.tv] (Ep. 119)
Testing in machine learning: checking deeplearning models (Ep. 118)
Testing in machine learning: generating tests and data (Ep. 117)
Create your
podcast in
minutes
It is Free
Insight Story: Tech Trends Unpacked
Zero-Shot
Fast Forward by Tomorrow Unlocked: Tech past, tech future
The Unbelivable Truth - Series 1 - 26 including specials and pilot
Lex Fridman Podcast