Applications of data mining and ‘big data’ increasingly take center stage in our modern, knowledge-driven society, supported by advances in computing power, automated data acquisition, social media development and interactive, linkable internet software. This book presents a coherent, technical introduction to modern statistical learning and analytics, starting from the core foundations of statistics and probability. It includes an overview of probability and statistical distributions, basics of data manipulation and visualization, and the central components of standard statistical inference. The majority of the text extends beyond these introductory topics, however, to supervised learning in linear regression, generalized linear models, and classification analytics. Finally, unsupervised learning is introduced via dimension reduction, cluster analysis, and market basket analysis.
Extensive examples using actual data (with sample R programming code) are provided, illustrating diverse informatic sources in genomics, biomedicine, ecological remote sensing, astronomy, socioeconomics, marketing, advertising and finance, among many others.
This book will appeal as a classroom or training text to intermediate and advanced undergraduates, and to beginning graduate students, with sufficient background in calculus and matrix algebra. It will also serve as a source-book on the foundations of statistical informatics and data analytics to practitioners who regularly apply statistical learning to their modern data.
Pages: 488
Edition: 1st
Year: 2015
Format: Hardcover, two-color printing
ISBN: 9781118619650
Part I Background: Introductory Statistical Analytics
1 Data analytics and data mining
2 Basic probability and statistical distributions
3 Data manipulation
4 Data visualization and statistical graphics
5 Statistical inference
Part II Statistical Learning and Data Analytics
6 Techniques for supervised learning: simple linear regression
7 Techniques for supervised learning: multiple linear regression
8 Supervised learning: generalized linear models
9 Supervised learning: classification
10 Techniques for unsupervised learning: dimension reduction
11 Techniques for unsupervised learning: clustering and association
A Matrix manipulation
B Brief introduction to R