A Packages used in this book

A.1 The mdsr package

The mdsr package contains most of the small data sets used in this book that are not available in other packages. To install the released version from CRAN, use install.packages(). To get the latest development version from GitHub, use the install_github() function from the remotes package. (See Section B.4.1 for more comprehensive information about R package maintenance.)

# this command only needs to be run once
install.packages("mdsr")
# if you want the development version
remotes::install_github("mdsr-book/mdsr")

The list of data sets provided can be retrieved using the data() function.

data(package = "mdsr")
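The object returned by data() can also be inspected programmatically, which is handy when you want the data set names and titles as a data frame rather than in a viewer window. The following is a minimal sketch; the helper name list_datasets() is hypothetical and not part of mdsr.

```r
# Hypothetical helper: return the data sets in a package as a data frame.
# data(package = ...) returns an object whose `results` component is a
# character matrix with columns Package, LibPath, Item, and Title.
list_datasets <- function(pkg) {
  listing <- data(package = pkg)
  as.data.frame(
    listing$results[, c("Item", "Title"), drop = FALSE],
    stringsAsFactors = FALSE
  )
}

# Works for any installed package; "datasets" ships with R.
print(head(list_datasets("datasets")))

# The same call for mdsr, guarded in case it is not installed.
if (requireNamespace("mdsr", quietly = TRUE)) {
  print(head(list_datasets("mdsr")))
}
```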

The mdsr package includes some functions that simplify a number of tasks. In particular, the dbConnect_scidb() function provides a shorthand for connecting to the public SQL server hosted by Amazon Web Services. We use this function extensively in Chapter 15 and in our classes and projects.
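As an illustration, a connection made with dbConnect_scidb() can be used like any other DBI connection. The sketch below assumes that mdsr, DBI, and a MySQL driver such as RMariaDB are installed, that the public server is reachable from your network, and that a database named "airlines" exists on it. The wrapper name list_scidb_tables() is hypothetical.

```r
# Sketch only: requires network access to the public server.
# Assumptions: the database name "airlines", and the hypothetical
# wrapper name list_scidb_tables().
list_scidb_tables <- function(dbname = "airlines") {
  db <- mdsr::dbConnect_scidb(dbname)
  # Always close the connection, even if the listing fails.
  on.exit(DBI::dbDisconnect(db), add = TRUE)
  DBI::dbListTables(db)
}

# Example call (uncomment when the server is reachable):
# list_scidb_tables("airlines")
```

Wrapping the connection in a function with on.exit() keeps stray connections from accumulating during an interactive session.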

In keeping with best practices, mdsr no longer loads any other packages. In every chapter in this book, a call to library(tidyverse) precedes a call to library(mdsr). These two steps will set up an R session to replicate the code in the book.
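The two-step setup can be made a little more defensive by checking that both packages are installed before attaching them. This is a minimal sketch, not something the mdsr package provides; the helper name missing_packages() is hypothetical.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("tidyverse", "mdsr")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
} else {
  # Same order as in each chapter: tidyverse first, then mdsr.
  library(tidyverse)
  library(mdsr)
}
```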

A.2 Other packages

As we discuss in Chapters 1 and 21, this book is not explicitly about “big data”—it is about mastering data science techniques for small and medium data with an eye towards big data. To that end, we need medium-sized data sets to work with. We have introduced several such data sets in this book, namely airlines, fec12, and fec16.

The airlines package, which was inspired by the nycflights13 package, gives R users the ability to download the full 33 years (and counting) of flight data from the United States Bureau of Transportation Statistics and bring it seamlessly into SQL without actually having to write any SQL code. Like airlines, the macleish package uses the etl framework, in its case to provide hourly-updated weather data from the MacLeish field station.

The full list of packages used in this book appears below in Tables A.1 and A.2.

Table A.1: List of CRAN packages used in this book.

Table A.2: List of GitHub packages used in this book.
Columns: Package, GitHub User, Citation, Title

A.3 Further resources

More information on the mdsr package can be found at http://www.github.com/mdsr-book/mdsr.