📘 Mathematics for Machine Learning
by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong
https://t.co/zSpp67kJSg Note: this is probably the place you want to start. Start slowly and work on some examples. Pay close attention to the notation and get comfortable with it.
📘 Pattern Recognition and Machine Learning
by Christopher Bishop
Note: Prior to the book above, this is the book that I used to recommend to get familiar with math-related concepts used in machine learning. A very solid book in my view and it's heavily referenced in academia.
📘 The Elements of Statistical Learning
by Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman
Note: Machine learning deals with data and, in turn, uncertainty, which is what statistics teaches. Get comfortable with topics like estimators and statistical significance.
📘 Probability Theory: The Logic of Science
by E. T. Jaynes
Note: In machine learning, we are interested in building probabilistic models, so you will come across concepts from probability theory like conditional probability and different probability distributions.
📺 Multivariate Calculus by Imperial College London
by Dr. Sam Cooper & Dr. David Dye
https://t.co/OYaqzlXmJG Note: Backpropagation is a key algorithm for training deep neural nets, and it relies on calculus. Get familiar with concepts like the chain rule, the Jacobian, and gradient descent.
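To make the chain rule and gradient descent concrete, here is a minimal sketch for a one-parameter model. The squared-error loss, the function names, the learning rate, and the data point are all illustrative assumptions, not taken from the course.

```python
def loss(w, x, y):
    """Squared error of a one-parameter linear model (illustrative choice)."""
    return (w * x - y) ** 2


def grad(w, x, y):
    """dL/dw via the chain rule: L = r**2 with r = w*x - y."""
    r = w * x - y            # inner function
    dL_dr = 2 * r            # derivative of the outer function
    dr_dw = x                # derivative of the inner function
    return dL_dr * dr_dw


def gradient_descent(w, x, y, lr=0.1, steps=50):
    """Repeatedly step against the gradient (lr and steps are assumed values)."""
    for _ in range(steps):
        w -= lr * grad(w, x, y)
    return w


# One hypothetical data point: we want w * 2.0 to equal 6.0, so w should approach 3.
w_fit = gradient_descent(0.0, x=2.0, y=6.0)
print(w_fit)
```

Backpropagation applies this same chain-rule bookkeeping layer by layer, with Jacobians taking the place of the scalar derivatives.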
📜 The Matrix Calculus You Need For Deep Learning
by Terence Parr & Jeremy Howard
https://t.co/Gk96dRsX5t Note: In deep learning, you need to understand a bunch of fundamental matrix operations. If you want to dive deep into the math of matrix calculus this is your guide.
📺 Mathematics for Machine Learning - Linear Algebra
by Dr. Sam Cooper & Dr. David Dye
https://t.co/lNYLiMKLma Note: a great companion to the previous video lectures. Neural networks perform transformations on data and you need linear algebra to get better intuitions.
📘 Information Theory, Inference and Learning Algorithms
by David J. C. MacKay
Note: When you apply machine learning, you are dealing with information processing, which in essence relies on ideas from information theory such as entropy and KL divergence.
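For a quick feel for entropy and KL divergence, here is a minimal sketch for discrete distributions, measured in bits. The function names and the two coin distributions are illustrative assumptions, not taken from the book.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


fair = [0.5, 0.5]      # hypothetical fair coin
biased = [0.9, 0.1]    # hypothetical biased coin

print(entropy(fair))                 # a fair coin carries exactly 1 bit
print(entropy(biased))               # a biased coin carries less
print(kl_divergence(biased, fair))   # mismatch cost of assuming the coin is fair
```

Note that KL divergence is not symmetric: swapping the two arguments generally gives a different value.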