Teaching Stats for Data Science

Authors
Kaplan Daniel
Citation
Kaplan Daniel, Teaching Stats for Data Science, American statistician, 72(1), 2018, pp. 89-96
Journal title
American statistician
ISSN journal
00031305
Volume
72
Issue
1
Year of publication
2018
Pages
89 - 96
Database
ACNP
SICI code
Abstract
"Data science" is a useful catchword for methods and concepts original to the field of statistics, but typically being applied to large, multivariate, observational records. Such datasets call for techniques not often part of an introduction to statistics: modeling, consideration of covariates, sophisticated visualization, and causal reasoning. This article re-imagines introductory statistics as an introduction to data science and proposes a sequence of 10 blocks that together compose a suitable course for extracting information from contemporary data. Recent extensions to the mosaic packages for R together with tools from the "tidyverse" provide a concise and readable notation for wrangling, visualization, model-building, and model interpretation: the fundamental computational tasks of data science.