This 85th episode of Learning Machines 101 discusses formal convergence guarantees for a broad class of machine learning algorithms designed to minimize smooth non-convex objective functions using batch learning methods. In particular, it considers unsupervised, supervised, and reinforcement machine learning algorithms that iteratively update their parameter vector by adding a perturbation computed from all of the training data. This process is repeated until a parameter vector is generated that exhibits improved predictive performance. The magnitude of the perturbation at each learning iteration is called the “stepsize” or “learning rate”, and the direction of the perturbation is called the “search direction”. Simple mathematical formulas are presented, based upon research from the late 1960s by Philip Wolfe and G. Zoutendijk, that ensure convergence of the generated sequence of parameter vectors. These formulas may be used as the basis for the design of smart automatic learning rate selection algorithms. The material in this podcast provides an overview of Chapter 7 of my new book “Statistical Machine Learning” and is based upon material originally presented in Episode 68 of Learning Machines 101! Check out: www.learningmachines101.com for the show notes!!!
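To make the stepsize/search-direction decomposition concrete, below is a minimal Python sketch, written for these show notes rather than taken from the episode or the book, of batch gradient descent with a bisection line search that enforces the Wolfe conditions. The function names, the constants c1 and c2, and the Rosenbrock test objective are illustrative choices, not the episode's own formulas.

import numpy as np

def wolfe_stepsize(f, grad, x, d, c1=1e-4, c2=0.9, max_iter=50):
    # Bisection search for a stepsize satisfying both Wolfe conditions.
    f0 = f(x)
    g0_dot_d = grad(x) @ d             # directional derivative (negative for a descent direction)
    lo, hi, alpha = 0.0, np.inf, 1.0
    for _ in range(max_iter):
        if f(x + alpha * d) > f0 + c1 * alpha * g0_dot_d:
            hi = alpha                 # sufficient-decrease (Armijo) condition failed: step too long
            alpha = 0.5 * (lo + hi)
        elif grad(x + alpha * d) @ d < c2 * g0_dot_d:
            lo = alpha                 # curvature condition failed: step too short
            alpha = 2.0 * alpha if np.isinf(hi) else 0.5 * (lo + hi)
        else:
            break                      # both Wolfe conditions hold
    return alpha

def batch_gradient_descent(f, grad, x0, tol=1e-6, max_epochs=10000):
    # Each iteration perturbs the parameter vector using all of the training
    # data (implicit in f and grad); the search direction is the negative gradient.
    x = x0.astype(float)
    for _ in range(max_epochs):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        d = -g
        x = x + wolfe_stepsize(f, grad, x, d) * d
    return x

# Toy smooth non-convex objective: the Rosenbrock function.
f = lambda x: (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2.0 * (1.0 - x[0]) - 400.0 * x[0] * (x[1] - x[0]**2),
                           200.0 * (x[1] - x[0]**2)])
print(batch_gradient_descent(f, grad, np.array([-1.2, 1.0])))  # moves toward the minimizer [1, 1]

The constants c1 and c2 (with 0 < c1 < c2 < 1) are the standard sufficient-decrease and curvature parameters. Under these conditions, Zoutendijk's argument guarantees that the gradient norms of the generated parameter sequence converge to zero for smooth objectives that are bounded below and have Lipschitz continuous gradients, provided the search directions stay sufficiently aligned with the negative gradient, as steepest-descent directions are.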
LM101-046: How to Optimize Student Learning using Recurrent Neural Networks (Educational Technology)
LM101-045: How to Build a Deep Learning Machine for Answering Questions about Images
LM101-044: What happened at the Deep Reinforcement Learning Tutorial at the 2015 Neural Information Processing Systems Conference?
LM101-043: How to Learn a Monte Carlo Markov Chain to Solve Constraint Satisfaction Problems (Rerun of Episode 22)
LM101-042: What happened at the Monte Carlo Markov Chain (MCMC) Inference Methods Tutorial at the 2015 Neural Information Processing Systems Conference?
LM101-041: What happened at the 2015 Neural Information Processing Systems Deep Learning Tutorial?
LM101-040: How to Build a Search Engine, Automatically Grade Essays, and Identify Synonyms using Latent Semantic Analysis
LM101-039: How to Solve Large Complex Constraint Satisfaction Problems (Monte Carlo Markov Chain and Markov Fields)[Rerun]
LM101-038: How to Model Knowledge Skill Growth Over Time using Bayesian Nets
LM101-037: How to Build a Smart Computerized Adaptive Testing Machine using Item Response Theory
LM101-036: How to Predict the Future from the Distant Past using Recurrent Neural Networks
LM101-035: What is a Neural Network and What is a Hot Dog?
LM101-034: How to Use Nonlinear Machine Learning Software to Make Predictions (Feedforward Perceptrons with Radial Basis Functions)[Rerun]
LM101-033: How to Use Linear Machine Learning Software to Make Predictions (Linear Regression Software)[RERUN]
LM101-032: How To Build a Support Vector Machine to Classify Patterns
LM101-031: How to Analyze and Design Learning Rules using Gradient Descent Methods (RERUN)
LM101-030: How to Improve Deep Learning Performance with Artificial Brain Damage (Dropout and Model Averaging)
LM101-029: How to Modernize Deep Learning with Rectilinear units, Convolutional Nets, and Max-Pooling
LM101-028: How to Evaluate the Ability to Generalize from Experience (Cross-Validation Methods)[RERUN]
LM101-027: How to Learn About Rare and Unseen Events (Smoothing Probabilistic Laws)[RERUN]