Comparison of machine learning textbooks
This page compares machine learning textbooks, especially those at the so-called introductory level. It includes books that present multiple learning methods and excludes books that focus solely on a single area, such as reinforcement learning.
The page count excludes any appendixes.
Not sure what other columns would be useful. Level of mathematical rigor? The approach taken (e.g. probably approximately correct framework)? Topics covered? Code samples (e.g. code for plots provided, or code for implementations provided, and in which language)? How amenable the book is to self-study? Ultimately what I care about is how easily I can understand the book/how much "fit" I have with the book, but this is difficult to generalize to others (who have different backgrounds and preferences).
|Title||Author(s)||Page count||Intended audience and prerequisites|
|Machine Learning: A Probabilistic Perspective||Kevin P. Murphy||1008||"This book is suitable for upper-level undergraduate students and beginning graduate students in computer science, statistics, electrical engineering, econometrics, or any one else who has the appropriate mathematical background. Specifically, the reader is assumed to already be familiar with basic multivariate calculus, probability, linear algebra, and computer programming. Prior exposure to statistics is helpful but not necessary."|||
|Introduction to Machine Learning||Alex Smola and S.V.N. Vishwanathan||196||?|
|Understanding Machine Learning: From Theory to Algorithms||Shai Shalev-Shwartz and Shai Ben-David||368||"We made an attempt to keep the book as self-contained as possible. However, the reader is assumed to be comfortable with basic notions of probability, linear algebra, analysis, and algorithms. The first three parts of the book are intended for first year graduate students in computer science, engineering, mathematics, or statistics. It can also be accessible to undergraduate students with the adequate background. The more advanced chapters can be used by researchers intending to gather a deeper theoretical understanding."|||
|Pattern Recognition and Machine Learning||Christopher M. Bishop||676||"It is aimed at advanced undergraduates or first year PhD students, as well as researchers and practitioners, and assumes no previous knowledge of pattern recognition or machine learning concepts. Knowledge of multivariate calculus and basic linear algebra is required, and some familiarity with probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory."|||
|Introduction to Machine Learning (second edition)||Ethem Alpaydin||516||"This is an introductory textbook, intended for senior undergraduate and graduate-level courses on machine learning, as well as engineers working in the industry who are interested in the application of these methods. The prerequisites are courses on computer programming, probability, calculus, and linear algebra. The aim is to have all learning algorithms sufficiently explained so it will be a small step from the equations given in the book to a computer program. For some cases, pseudocode of algorithms are also included to make this task easier."|
|The Elements of Statistical Learning: Data Mining, Inference, and Prediction (second edition)||Trevor Hastie, Robert Tibshirani, and Jerome Friedman||698||"This book is designed for researchers and students in a broad variety of fields: statistics, artificial intelligence, engineering, finance and others. We expect that the reader will have had at least one elementary course in statistics, covering basic topics including linear regression."|||
|An Introduction to Statistical Learning with Applications in R||Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani||418|| "One of the first books in this area—The Elements of Statistical Learning (ESL) (Hastie, Tibshirani, and Friedman)—was published in 2001, with a second edition in 2009. ESL has become a popular text not only in statistics but also in related fields. One of the reasons for ESL's popularity is its relatively accessible style. But ESL is intended for individuals with advanced training in the mathematical sciences. An Introduction to Statistical Learning (ISL) arose from the perceived need for a broader and less technical treatment of these topics. In this new book, we cover many of the same topics as ESL, but we concentrate more on the applications of the methods and less on the mathematical details. We have created labs illustrating how to implement each of the statistical learning methods using the popular statistical software package R. These labs provide the reader with valuable hands-on experience.
This book is appropriate for advanced undergraduates or master's students in statistics or related quantitative fields or for individuals in other disciplines who wish to use statistical learning tools to analyze their data."|
|Foundations of Machine Learning||Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar||340||"The book is intended for students and researchers in machine learning, statistics and other related areas. It can be used as a textbook for both graduate and advanced undergraduate classes in machine learning or as a reference text for a research seminar. […] The reader is assumed to be familiar with basic concepts in linear algebra, probability, and analysis of algorithms. However, to further help him, we present in the appendix a concise linear algebra and a probability review, and a short introduction to convex optimization. We have also collected in the appendix a number of useful tools for concentration bounds used in this book."|||
|Machine Learning: A Bayesian and Optimization Perspective||Sergios Theodoridis||1012||"The book addresses the needs of advanced graduate, postgraduate, and research students as well as of practicing scientists and engineers whose interests lie beyond black-box solutions."|
|Bayesian Reasoning and Machine Learning||David Barber||618||"The book is designed to appeal to students with only a modest mathematical background in undergraduate calculus and linear algebra. No formal computer science or statistical background is required to follow the book, although a basic familiarity with probability, calculus and linear algebra would be useful. The book should appeal to students from a variety of backgrounds, including Computer Science, Engineering, applied Statistics, Physics, and Bioinformatics that wish to gain an entry to probabilistic approaches in Machine Learning. In order to engage with students, the book introduces fundamental concepts in inference using only minimal reference to algebra and calculus. More mathematical techniques are postponed until as and when required, always with the concept as primary and the mathematics secondary."|
|Information Theory, Inference, and Learning Algorithms||David J.C. MacKay||596||"This book is aimed at senior undergraduates and graduate students in Engineering, Science, Mathematics, and Computing. It expects familiarity with calculus, probability theory, and linear algebra as taught in a first- or second-year undergraduate course on mathematics for scientists and engineers."|
|A Probabilistic Theory of Pattern Recognition||Luc Devroye, László Györfi, and Gábor Lugosi||574||?|
|Learning From Data: A Short Course||Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin||182||?|
|Machine Learning: The Art and Science of Algorithms that Make Sense of Data||Peter Flach||366||?|
|Machine Learning||Tom M. Mitchell||390||"Because of the interdisciplinary nature of the material, this book makes few assumptions about the background of the reader. Instead, it introduces basic concepts from statistics, artificial intelligence, information theory, and other disciplines as the need arises, focusing on just those concepts most relevant to machine learning. The book is intended for both undergraduate and graduate students in fields such as computer science, engineering, statistics, and the social sciences, and as a reference for software professionals and practitioners. Two principles that guided the writing of the book were that it should be accessible to undergraduate students and that it should contain the material I would want my own Ph.D. students to learn before beginning their doctoral research in machine learning."|||
|Statistical Learning Theory (course notes for CS229T/Stats231 at Stanford)||Percy Liang||210||Understanding of machine learning, linear algebra, and probability. Knowledge of convex optimization also helpful. |||