Probability and statistics for data science : math + R + data /
Norman Matloff
- xxxii, 412 pages : illustrations ; 24 cm
- Chapman & Hall/CRC data science series .
- Series in computer science and data analysis .
Contains bibliographical references (pages 391-394) and index
Basic probability models -- Monte Carlo simulation -- Discrete random variables: expected value -- Discrete random variables: variance -- Discrete parametric distribution families -- Continuous probability models -- Statistics: prologue -- Fitting continuous models -- The family of normal distributions -- Introduction to statistical inference -- Multivariate distributions -- The multivariate normal family of distributions -- Mixture distributions -- Multivariate description and dimension reduction -- Predictive modeling -- Model parsimony and overfitting -- Introduction to discrete time Markov chains -- Appendices: A. R Quick Start -- B. Matrix algebra
"Probability and Statistics for Data Science: Math + R + Data covers "math stat"--distributions, expected value, estimation etc.--but takes the phrase "Data Science" in the title quite seriously:
* Real datasets are used extensively.
* All data analysis is supported by R coding.
* Includes many Data Science applications, such as PCA, mixture distributions, random graph models, Hidden Markov models, linear and logistic regression, and neural networks.
* Leads the student to think critically about the "how" and "why" of statistics, and to "see the big picture."
* Not "theorem/proof"-oriented, but concepts and models are stated in a mathematically precise manner.
Prerequisites are calculus, some matrix algebra, and some experience in programming." --Amazon.com