# Author: Rafael A. Irizarry

If you have no experience with R, this book is an excellent place to start. Rafael’s explanations of R are friendly and thought-provoking, which keeps the reader engaged. The book covers the core topics and skills that a data scientist needs.

“This book is meant to be a textbook for a first course in Data Science. No previous knowledge of R is necessary, although some experience with programming may be helpful. The statistical concepts used to answer the case study questions are only briefly introduced, so a Probability and Statistics textbook is highly recommended for in-depth understanding of these concepts. If you read and understand all the chapters and complete all the exercises, you will be well-positioned to perform basic data analysis tasks and you will be prepared to learn the more advanced concepts and skills needed to become an expert.”

Lectures |
---|
Part 1: Basics of R and the tidyverse |
Learn R throughout the book |
Building blocks needed to keep learning |
Part 2: Data visualization with ggplot2 |
Use ggplot2 to generate graphs |
Describe important data visualization principles |
Part 3: Statistics with R |
Answer case study questions using probability, inference, and regression |
Demonstrate the importance of statistics in data analysis |
Part 4: Data wrangling with tidyverse |
Familiarize the reader with data wrangling |
Specific skills include web scraping, using regular expressions, and joining and reshaping data tables |
Part 5: Machine learning with caret |
Introduce machine learning through challenges |
Use the caret package to build prediction algorithms including K-nearest neighbors and random forests |
Part 6: Productivity tools for data science |
Brief introduction to productivity tools used in data science projects |
Tools include RStudio, UNIX/Linux shell, Git and GitHub, and knitr and R Markdown |