In this 77th episode of www.learningmachines101.com, we explain the proper semantic interpretation of the Bayesian Information Criterion (BIC) and emphasize how this interpretation differs fundamentally from that of the Akaike Information Criterion (AIC) model selection methods. Briefly, BIC estimates the probability of the training data given the probability model, while AIC estimates out-of-sample prediction error. The probability of the training data given the model is called the “marginal likelihood”. Using the marginal likelihood, one can calculate the probability of a model given the training data, which in turn supports selecting the most probable model, selecting a model that minimizes expected risk, and Bayesian model averaging. We also discuss the assumptions required for BIC to be a valid approximation to the probability of the training data given the probability model.
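As a rough illustration of the relationship described above, the following Python sketch computes the standard BIC score for a fitted model and converts a set of BIC scores into approximate posterior model probabilities. The function names (bic_score, model_posteriors) and the example log-likelihoods are hypothetical, and the sketch assumes equal prior probabilities over the candidate models together with the usual large-sample approximation BIC ≈ -2 log(marginal likelihood).

```python
import numpy as np

def bic_score(log_likelihood, num_params, num_examples):
    """Standard BIC: -2 * (maximized log-likelihood) + k * log(n).
    Under large-sample regularity assumptions, this approximates
    -2 * log(marginal likelihood of the training data)."""
    return -2.0 * log_likelihood + num_params * np.log(num_examples)

def model_posteriors(bic_scores):
    """Approximate P(model | data) from BIC scores, assuming equal
    prior probabilities over models:
    P(M_i | data) ~ exp(-BIC_i / 2) / sum_j exp(-BIC_j / 2)."""
    scores = np.asarray(bic_scores, dtype=float)
    # Subtract the minimum score before exponentiating for numerical stability.
    weights = np.exp(-0.5 * (scores - scores.min()))
    return weights / weights.sum()

# Hypothetical example: compare a 1-parameter and a 3-parameter model
# fit to n = 100 training examples (the log-likelihoods are made up).
n = 100
bics = [bic_score(-150.2, num_params=1, num_examples=n),
        bic_score(-148.9, num_params=3, num_examples=n)]
print(model_posteriors(bics))  # approximate posterior model probabilities
```

Because BIC approximates -2 log(marginal likelihood), differences in BIC scores translate into approximate Bayes factors; this is what makes it possible to select the most probable model or to weight models for Bayesian model averaging, as mentioned above.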
LM101-086: Ch8: How to Learn the Probability of Infinitely Many Outcomes
LM101-085: Ch7: How to Guarantee your Batch Learning Algorithm Converges
LM101-084: Ch6: How to Analyze the Behavior of Smart Dynamical Systems
LM101-083: Ch5: How to Use Calculus to Design Learning Machines
LM101-082: Ch4: How to Analyze and Design Linear Machines
LM101-081: Ch3: How to Define Machine Learning (or at Least Try)
LM101-080: Ch2: How to Represent Knowledge using Set Theory
LM101-079: Ch1: How to View Learning as Risk Minimization
LM101-078: Ch0: How to Become a Machine Learning Expert
LM101-076: How to Choose the Best Model using AIC and GAIC
LM101-075: Can computers think? A Mathematician's Response (remix)
LM101-074: How to Represent Knowledge using Logical Rules (remix)
LM101-073: How to Build a Machine that Learns to Play Checkers (remix)
LM101-072: Welcome to the Big Artificial Intelligence Magic Show! (Remix of LM101-001 and LM101-002)
LM101-071: How to Model Common Sense Knowledge using First-Order Logic and Markov Logic Nets
LM101-070: How to Identify Facial Emotion Expressions in Images Using Stochastic Neighborhood Embedding
LM101-069: What Happened at the 2017 Neural Information Processing Systems Conference?
LM101-068: How to Design Automatic Learning Rate Selection for Gradient Descent Type Machine Learning Algorithms
LM101-067: How to use Expectation Maximization to Learn Constraint Satisfaction Solutions (Rerun)