References

Aden-Buie, Garrick. 2022. Xaringanthemer: Custom Xaringan CSS Themes.
Alberani, D. 2014. “IMDbPY.”
Albert, James, Max Marchi, and Benjamin S. Baumer. 2018. Analyzing Baseball Data with R. 2nd ed. CRC Press: Boca Raton, FL. https://www.crcpress.com/Analyzing-Baseball-Data-with-R-Second-Edition/Marchi-Albert-Baumer/p/book/9780815353515.
Albert, Jim. 2003. Teaching Statistics Using Baseball. Mathematical Association of America: Washington, DC.
Albert, Jim, and Jingchen Hu. 2019. Probability and Bayesian Modeling. CRC Press.
Allaire, J. J., J. Horner, V. Marti, and N. Porte. 2014. Markdown: Markdown Rendering for R. http://CRAN.R-project.org/package=markdown.
Allaire, JJ, Joe Cheng, Yihui Xie, Jonathan McPherson, Winston Chang, Jeff Allen, Hadley Wickham, Aron Atkins, and Rob Hyndman. 2020. Rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.
Allaire, JJ, Yihui Xie, Christophe Dervieux, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, et al. 2023. Rmarkdown: Dynamic Documents for R.
American Statistical Association Undergraduate Guidelines Workgroup. 2014. 2014 Curriculum Guidelines for Undergraduate Programs in Statistical Science. American Statistical Association.
Angwin, Julia, Jeff Larson, Surya Mattu, and Lauren Kirchner. 2016. “Machine Bias.” ProPublica.
Arnold, Jeffrey B. 2019. ggthemes: Extra Themes, Scales and Geoms for ggplot2. http://CRAN.R-project.org/package=ggthemes.
———. 2021. Ggthemes: Extra Themes, Scales and Geoms for Ggplot2. https://github.com/jrnold/ggthemes.
Bache, Stefan Milton, and Hadley Wickham. 2022. Magrittr: A Forward-Pipe Operator for R.
Ball, Richard, and Norm Medeiros. 2012. “Teaching Integrity in Empirical Research: A Protocol for Documenting Data Management and Analysis.” The Journal of Economic Education 43 (2): 182–89.
Barabási, A.-L., and R. Albert. 1999. “Emergence of Scaling in Random Networks.” Science 286 (5439): 509–12.
Barabási, A.-L., and J. Frangos. 2014. Linked: The New Science of Networks. Basic Books: New York.
Basu, Prithwish, Ben S. Baumer, Amotz Bar-Noy, Chi-Kin Chau, and Masdar City. 2015. “Social-Communication Composite Networks.” In Opportunistic Mobile Social Networks, edited by Jie Wu and Yunsheng Wang, 1–36. CRC Press: Boca Raton.
Baumer, B. S. 2015. “A Data Science Course for Undergraduates: Thinking with Data.” The American Statistician 69 (4): 334–42. https://doi.org/10.1080/00031305.2015.1081105.
Baumer, B. S., M. Çetinkaya-Rundel, A. Bray, L. Loi, and N. J. Horton. 2014. “R Markdown: Integrating a Reproducible Analysis Tool into Introductory Statistics.” Technology Innovations in Statistics Education 8 (1).
Baumer, B. S., and A. Zimbalist. 2014. The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball. University of Pennsylvania Press: Philadelphia, PA.
Baumer, Ben. 2015. “In a Moneyball World, a Number of Teams Remain Slow to Buy into Sabermetrics.” In The Great Analytics Rankings, edited by Royce Webb. ESPN.com.
Baumer, Ben S., George Rabanca, Amotz Bar-Noy, and Prithwish Basu. 2015. “Star Search: Effective Subgroups in Collaborative Social Networks.” In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015, Paris, France, August 25–28, 2015, edited by Jian Pei, Fabrizio Silvestri, and Jie Tang, 729–36. ACM. https://doi.org/10.1145/2808797.2810062.
Baumer, Ben, Prithwish Basu, and Amotz Bar-Noy. 2011. “Modeling and Analysis of Composite Network Embeddings.” In MSWiM, edited by Ahmed Helmy, Björn Landfeldt, and Luciano Bononi, 341–50. ACM.
Baumer, Benjamin S. 2021. Etl: Extract-Transform-Load Framework for Medium Data. https://github.com/beanumber/etl.
Baumer, Benjamin S., Randi L. Garcia, Albert Y. Kim, Katherine M. Kinnaird, and Miles Q. Ott. 2022. “Integrating Data Science Ethics Into an Undergraduate Major: A Case Study.” Journal of Statistics and Data Science Education 30 (1): 15–28. https://doi.org/10.1080/26939169.2022.2038041.
Baumer, Benjamin S., and Eva Gjekmarkaj. 2017. Fec: Campaign Finance for Federal Elections. http://github.com/beanumber/fec.
Baumer, Benjamin S., Rose Goueth, Wencong Li, Dominique Kelly, and Albert Y. Kim. 2022. Macleish: Retrieve Data from MacLeish Field Station. https://github.com/beanumber/macleish.
Baumer, Benjamin S., Nicholas Horton, and Daniel Kaplan. 2023. Mdsr: Complement to Modern Data Science with R. https://github.com/mdsr-book/mdsr.
Baumer, Benjamin S., Yijin Wei, and Gary S. Bloom. 2016. “The Smallest Non-Autograph.” Discussiones Mathematicae Graph Theory 36 (3): 577–602. https://doi.org/10.7151/dmgt.1881.
Beck, Marcus W. 2022. NeuralNetTools: Visualization and Analysis Tools for Neural Networks.
Beckman, Matthew D., Mine Çetinkaya-Rundel, Nicholas J. Horton, Colin W. Rundel, Adam J. Sullivan, and Maria Tackett. 2021. “Implementing Version Control with Git and GitHub as a Learning Objective in Statistics and Data Science Courses.” Journal of Statistics and Data Science Education 29 (sup1): S132–44. https://doi.org/10.1080/10691898.2020.1848485.
Belle, G. van. 2008. Statistical Rules of Thumb (Second Edition). John Wiley & Sons: Hoboken, NJ.
Bengtsson, Henrik. 2023. Future: Unified Parallel and Distributed Processing in R for Everyone.
Benoit, Kenneth, David Muhr, and Kohei Watanabe. 2021. Stopwords: Multilingual Stopword Lists. https://github.com/quanteda/stopwords.
Bivand, Roger S, Edzer Pebesma, and Virgilio Gómez-Rubio. 2013. Applied Spatial Data Analysis with R (Second Edition). Springer Verlag: New York, NY. http://www.asdar-book.org/.
Bivand, Roger, Tim Keitt, and Barry Rowlingson. 2023. Rgdal: Bindings for the Geospatial Data Abstraction Library.
Bogdanov, Petko, Ben S. Baumer, Prithwish Basu, Amotz Bar-Noy, and Ambuj K. Singh. 2013. “As Strong as the Weakest Link: Mining Diverse Cliques in Weighted Graphs.” In Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Prague, Czech Republic, September 23–27, 2013, Proceedings, Part I, edited by Hendrik Blockeel, Kristian Kersting, Siegfried Nijssen, and Filip Zelezný, 8188:525–40. Lecture Notes in Computer Science. Springer Verlag: New York, NY. https://doi.org/10.1007/978-3-642-40988-2_34.
Bombardier, C., L. Laine, A. Reicin, D. Shapiro, R. Burgos-Vargas, B. Davis, R. Day, et al. 2000. “Comparison of Upper Gastrointestinal Toxicity of Rofecoxib and Naproxen in Patients with Rheumatoid Arthritis.” New England Journal of Medicine 343: 1520–28.
Breiman, Leo. 2001. “Statistical Modeling: The Two Cultures.” Statistical Science 16 (3): 199–215.
Breiman, Leo, Adele Cutler, Andy Liaw, and Matthew Wiener. 2022. randomForest: Breiman and Cutler’s Random Forests for Classification and Regression. https://www.stat.berkeley.edu/~breiman/RandomForests/.
Brewer, C. A. 1994. “Color Use Guidelines for Mapping and Visualization.” Visualization in Modern Cartography 2: 123–48.
———. 1999. “Color Use Guidelines for Data Representation.” In Proceedings of the Section on Statistical Graphics, American Statistical Association, 55–60.
Bridgeford, Lydell C. 2014. “Q&A: Statistical Proof of Discrimination Isn’t Static.”
Broman, Karl W., and Kara H. Woo. 2018. “Data Organization in Spreadsheets.” The American Statistician 72 (1): 2–10. https://doi.org/10.1080/00031305.2017.1375989.
Brownrigg, Ray. 2022. Maps: Draw Geographical Maps.
Bryan, Jennifer. 2023. Googlesheets4: Access Google Sheets Using the Sheets API V4.
Bryan, Jennifer, the STAT 545 TAs, and Jim Hester. 2018. Happy Git and GitHub for the useR. GitHub. https://happygitwithr.com/.
Cambon, Jesse, Diego Hernangómez, Christopher Belanger, and Daniel Possenriede. 2021. Tidygeocoder: Geocoding Made Easy.
Cannon, Ann R, George W Cobb, Bradley A Hartlaub, Julie M Legler, Robin H Lock, Thomas L Moore, Allan J Rossman, and Jeffrey Witmer. 2019. STAT2: Building Models for a World of Data (Second Edition). W. H. Freeman and Company: New York, NY.
CERN. 2008. “LHC Guide: A Collection of Facts and Figures about the Large Hadron Collider (LHC) in the Form of Questions and Answers.” CERN.
Çetinkaya-Rundel, M., and Johanna Hardin. 2021. Introduction to Modern Statistics. OpenIntro.org.
Çetinkaya-Rundel, Mine, Johanna Hardin, Benjamin S. Baumer, Amelia McNamara, Nicholas J. Horton, and Colin W. Rundel. 2022. “An Educator’s Perspective of the Tidyverse.” Technology Innovations in Statistics Education. https://escholarship.org/uc/item/7kk4d922.
Chamandy, N., O. Muraldharan, and S. Wager. 2015. “Teaching Statistics at Google-Scale.” The American Statistician 69 (4): 283–91.
Chang, Winston. 2023a. Extrafont: Tools for Using Fonts. https://github.com/wch/extrafont.
———. 2023b. Webshot: Take Screenshots of Web Pages.
Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke, Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara Borges. 2023. Shiny: Web Application Framework for R.
Cheng, Joe, Barret Schloerke, Bhaskar Karambelkar, and Yihui Xie. 2023. Leaflet: Create Interactive Web Maps with the JavaScript Leaflet Library.
Cleveland, W. S. 2001. “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.” International Statistical Review 69 (1): 21–26.
Cleveland, W. S., and R. McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association 79 (387): 531–54.
Cobb, George W. 2007. “The Introductory Statistics Course: A Ptolemaic Curriculum?” Technology Innovations in Statistics Education (TISE) 1 (1).
———. 2015. “Mere Renovation Is Too Little Too Late: We Need to Rethink Our Undergraduate Curriculum from the Ground Up.” The American Statistician 69 (4): 266–82.
Columbus, Louis. 2019. “Data Scientist Leads 50 Best Jobs in America for 2019 According to Glassdoor.” Forbes.com.
Committee on Professional Ethics. 1999. Ethical Guidelines for Statistical Practice. American Statistical Association.
Cook, R. D. 1982. Residuals and Influence in Regression. Chapman & Hall: London.
Cressie, N. 1993. Statistics for Spatial Data. John Wiley & Sons: Hoboken, NJ.
Csárdi, Gábor, Jim Hester, Hadley Wickham, Winston Chang, Martin Morgan, and Dan Tenenbaum. 2023. Remotes: R Package Installation from Remote Repositories, Including GitHub.
Csárdi, Gábor, Tamás Nepusz, Vincent Traag, Szabolcs Horvát, Fabio Zanini, Daniel Noom, and Kirill Müller. 2023. Igraph: Network Analysis and Visualization.
D’Ignazio, Catherine, and Lauren F. Klein. 2020. Data Feminism. MIT Press: Cambridge, MA. https://datafeminism.io.
De Veaux, Richard D, Mahesh Agarwal, Maia Averett, Benjamin S Baumer, Andrew Bray, Thomas C Bressoud, Lance Bryant, et al. 2017. “Curriculum Guidelines for Undergraduate Programs in Data Science.” Annual Review of Statistics and Its Application 4: 15–30. https://doi.org/10.1146/annurev-statistics-060116-053930.
Diez, D. M., C. D. Barr, and M. Çetinkaya-Rundel. 2019. OpenIntro Statistics (Fourth Edition). OpenIntro.org.
Donoho, David. 2017. “50 Years of Data Science.” Journal of Computational and Graphical Statistics 26 (4): 745–66. https://doi.org/10.1080/10618600.2017.1384734.
Dunnington, Dewey. 2023. Ggspatial: Spatial Data Framework for Ggplot2.
Easley, David, and Jon Kleinberg. 2010. Networks, Crowds, and Markets: Reasoning about a Highly Connected World. Cambridge University Press: Cambridge, UK.
Eddelbuettel, Dirk, Romain Francois, JJ Allaire, Kevin Ushey, Qiang Kou, Nathan Russell, Inaki Ucar, Douglas Bates, and John Chambers. 2023. Rcpp: Seamless R and C++ Integration.
Editorial. 2013. “Announcement: Reducing Our Irreproducibility.” Nature 496. https://doi.org/10.1038/496398a.
Efron, Bradley. 2020. “Prediction, Estimation, and Attribution.” Journal of the American Statistical Association 115 (530): 636–55.
Efron, Bradley, and Trevor Hastie. 2016. Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. Cambridge University Press: Cambridge, UK.
Efron, B., and R. J. Tibshirani. 1993. An Introduction to the Bootstrap. Chapman & Hall: London.
Ellenberg, Jonas H. 1983. “Ethical Guidelines for Statistical Practice: A Historical Perspective.” The American Statistician 37 (1): 1–4.
Engel, Claudia A. 2019. R for Geospatial Analysis and Mapping. The Geographic Information Science & Technology Body of Knowledge. https://doi.org/10.22224/gistbok/2019.1.3.
Engstrom, Richard L, and John K Wildgen. 1977. “Pruning Thorns from the Thicket: An Empirical Test of the Existence of Racial Gerrymandering.” Legislative Studies Quarterly, 465–79.
Erdős, P., and A. Rényi. 1959. “On Random Graphs.” Publicationes Mathematicae Debrecen 6: 290–97.
Euler, Leonhard. 1953. “Leonhard Euler and the Königsberg Bridges.” Scientific American 189 (1): 66–70.
Feinerer, Ingo, and Kurt Hornik. 2023. Tm: Text Mining Package. https://tm.r-forge.r-project.org/.
Fellows, Ian. 2018. Wordcloud: Word Clouds.
Finzer, W. 2013. “The Data Science Education Dilemma.” Technology Innovations in Statistics Education 7 (2).
Firke, Sam. 2023. Janitor: Simple Tools for Examining and Cleaning Dirty Data.
Fox, J. 2009. “Aspects of the Social Organization and Trajectory of the R Project.” The R Journal 1 (2): 5–13. http://journal.r-project.org/archive/2009-2/RJournal_2009-2_Fox.pdf.
Fraley, Chris, Adrian E. Raftery, and Luca Scrucca. 2022. Mclust: Gaussian Mixture Modelling for Model-Based Clustering, Classification, and Density Estimation. https://mclust-org.github.io/mclust/.
Friedman, Jerome, Trevor Hastie, Rob Tibshirani, Balasubramanian Narasimhan, Kenneth Tay, Noah Simon, and James Yang. 2023. Glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models. https://glmnet.stanford.edu.
Friendly, Michael, Chris Dalzell, Martin Monkman, and Dennis Murphy. 2023. Lahman: Sean Lahman Baseball Database. https://CRAN.R-project.org/package=Lahman.
Futschek, Gerald. 2006. “Algorithmic Thinking: The Key for Understanding Computer Science.” In International Conference on Informatics in Secondary Schools-Evolution and Perspectives, 159–68. Springer.
Gandrud, C. 2014. Reproducible Research with R and RStudio. CRC Press: Boca Raton, FL.
Ganz, Carl, Gábor Csárdi, Jim Hester, Molly Lewis, and Rachael Tatman. 2022. Available: Check If the Title of a Package Is Available, Appropriate and Interesting. https://github.com/r-lib/available.
Garey, Michael R, and David S Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company: New York, NY.
Garnier, Simon. 2023a. Viridis: Colorblind-Friendly Color Maps for R.
———. 2023b. viridisLite: Colorblind-Friendly Color Maps (Lite Version).
Gelman, Andrew. 2011. “Ethics and Statistics: Open Data and Open Methods.” Chance 24 (4): 51–53.
———. 2012. “Ethics and Statistics: Ethics and the Statistical Use of Prior Information.” Chance 25 (4): 52–54.
———. 2020. “Ethics and Statistics: Statistics as Squid Ink.” Chance 33 (2): 25–27.
Gelman, Andrew, Jeffrey Fagan, and Alex Kiss. 2007. “An Analysis of the New York City Police Department’s ‘Stop-and-Frisk’ Policy in the Context of Claims of Racial Bias.” Journal of the American Statistical Association 102 (479): 813–23. https://doi.org/10.1198/016214506000001040.
Gelman, Andrew, and Eric Loken. 2012. “Ethics and Statistics: Statisticians: When We Teach, We Don’t Practice What We Preach.” Chance 25 (1): 47–48.
Gelman, Andrew, and Aki Vehtari. 2021. “What Are the Most Important Statistical Ideas of the Past 50 Years?” Journal of the American Statistical Association 116 (536): 2087–97. https://doi.org/10.1080/01621459.2021.1938081.
Gelman, A., C. Pasarica, and R. Dodhia. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician 56 (2): 121–30.
Glickman, Mark, Jason Brown, and Ryan Song. 2019. “(A) Data in the Life: Authorship Attribution in Lennon-McCartney Songs.” Harvard Data Science Review 1 (1). https://doi.org/10.1162/99608f92.130f856e.
Good, Phillip I, and James W Hardin. 2012. Common Errors in Statistics (and How to Avoid Them). John Wiley & Sons: Hoboken, NJ.
Graphodatsky, Alexander S, Vladimir A Trifonov, and Roscoe Stanyon. 2011. “The Genome Diversity and Karyotype Evolution of Mammals.” Molecular Cytogenetics 4 (1): 1.
Green, J. L., and E. E. Blankenship. 2015. “Fostering Conceptual Understanding in Mathematical Statistics.” The American Statistician 69 (4): 315–25.
Hardin, J., R. Hoerl, N. J. Horton, D. Nolan, B. S. Baumer, O. Hall-Holt, P. Murrell, et al. 2015. “Data Science in Statistics Curricula: Preparing Students to ‘Think with Data’.” The American Statistician 69 (4): 343–53.
Harrell, Frank E, Jr. 2023. Hmisc: Harrell Miscellaneous. https://hbiostat.org/R/Hmisc/.
Hastie, Trevor, Robert Tibshirani, and Jerome H. Friedman. 2009. The Elements of Statistical Learning. 2nd ed. Springer Verlag: New York, NY.
Henry, Lionel. 2020. “Interactivity and Programming in the Tidyverse.” RStudio::conf 2020. https://rstudio.com/resources/rstudioconf-2020/interactivity-and-programming-in-the-tidyverse/.
Henry, Lionel, and Hadley Wickham. 2023. Rlang: Functions for Base Types and Core R and Tidyverse Features.
Herndon, Thomas, Michael Ash, and Robert Pollin. 2014. “Does High Public Debt Consistently Stifle Economic Growth? A Critique of Reinhart and Rogoff.” Cambridge Journal of Economics 38 (2): 257–79. https://umassmed.edu/uploadedFiles/QHS/Camb.%20J.%20Econ.-2013-Herndon-cje-bet075.pdf.
Hester, Jim, and Davis Vaughan. 2023. Bench: High Precision Timing of R Expressions.
Hester, Jim, Hadley Wickham, and Gábor Csárdi. 2023. Fs: Cross-Platform File System Operations Based on Libuv.
Hesterberg, T. 2015. “What Teachers Should Know about the Bootstrap: Resampling in the Undergraduate Statistics Curriculum.” The American Statistician 69 (4): 371–86.
Hesterberg, T. C., D. S. Moore, S. Monaghan, A. Clipson, and R. Epstein. 2005. Bootstrap Methods and Permutation Tests. W. H. Freeman and Company: New York, NY. http://bcs.whfreeman.com/ips5e/content/cat_080/pdf/moore14.pdf.
Hoaglin, David C. 2016. “Regressions Are Commonly Misinterpreted.” Stata Journal 16 (1): 5–22.
Hodge, Jonathan K, Emily Marshall, and Geoff Patterson. 2010. “Gerrymandering and Convexity.” The College Mathematics Journal 41 (4): 312–24.
Horton, N. J. 2013. “I Hear, I Forget. I Do, I Understand: A Modified Moore-Method Mathematical Statistics Course.” The American Statistician 67 (3): 219–28.
———. 2015. “Challenges and Opportunities for Statistics and Statistical Education: Looking Back, Looking Forward.” The American Statistician 69 (2): 138–45.
Horton, N. J., B. S. Baumer, and H. Wickham. 2015. “Setting the Stage for Data Science: Integration of Data Management Skills in Introductory and Second Courses in Statistics.” Chance 28 (2).
Horton, N. J., E. R. Brown, and L. Qian. 2004. “Use of R as a Toolbox for Mathematical Statistics Exploration.” The American Statistician 58 (4): 343–57.
Horton, N. J., and J. S. Hardin. 2015. “Teaching the Next Generation of Statistics Students to ‘Think with Data’: Special Issue on Statistics and the Undergraduate Curriculum.” The American Statistician 69 (4): 259–65.
Horton, N. J., and K. P. Kleinman. 2007. “Much Ado about Nothing: A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models.” The American Statistician 61: 79–90.
Horton, Nicholas J., Rohan Alexander, Micaela Parker, Aneta Piekut, and Colin Rundel. 2022. “The Growing Importance of Reproducibility and Responsible Workflow in the Data Science and Statistics Curriculum.” Journal of Statistics and Data Science Education 30 (3): 207–8. https://doi.org/10.1080/26939169.2022.2141001.
Hothorn, Torsten, and Achim Zeileis. 2023. Partykit: A Toolkit for Recursive Partytioning. http://partykit.r-forge.r-project.org/partykit/.
Howe, Bill. 2014. “Data Manipulation at Scale: Systems and Algorithms.” University of Washington; Coursera. https://www.coursera.org/learn/data-manipulation.
Hubert, Lawrence, and Howard Wainer. 2012. A Statistical Guide for the Ethically Perplexed. CRC Press: Boca Raton, FL.
Huff, Darrell. 1954. How to Lie with Statistics. W.W. Norton & Company: New York, NY.
Hvitfeldt, Emil. 2022. Textdata: Download and Load Various Text Datasets. https://github.com/EmilHvitfeldt/textdata.
Hvitfeldt, Emil, and Max Kuhn. 2023. Discrim: Model Wrappers for Discriminant Analysis.
Hyafil, Laurent, and Ronald L Rivest. 1976. “Constructing Optimal Binary Decision Trees Is NP-Complete.” Information Processing Letters 5 (1): 15–17.
Ihaka, R., and R. Gentleman. 1996. “R: A Language for Data Analysis and Graphics.” Journal of Computational and Graphical Statistics 5 (3): 299–314.
Imai, Kosuke, and Kabir Khanna. 2016. “Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records.” Political Analysis, 263–72. https://doi.org/10.1093/pan/mpw001.
IMDB.com. 2013. “Internet Movie Database.”
Ioannidis, John PA. 2005. “Why Most Published Research Findings Are False.” Chance 18 (4): 40–47.
James, B. 1986. The Bill James Historical Baseball Abstract. Random House: New York, NY.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. Springer Verlag: New York, NY.
Jeppson, Haley, Heike Hofmann, and Di Cook. 2021. Ggmosaic: Mosaic Plots in the Ggplot2 Framework. https://github.com/haleyjeppson/ggmosaic.
Kern, Silke, Ingmar Skoog, Anne Börjesson-Hanson, Kaj Blennow, Henrik Zetterberg, Svante Östling, Jürgen Kern, et al. 2014. “Higher CSF Interleukin-6 and CSF Interleukin-8 in Current Depression in Older Women. Results from a Population-Based Sample.” Brain, Behavior, and Immunity 41: 55–58.
Kern, Silke, Ingmar Skoog, Anne Börjesson-Hanson, S Östling, J Kern, P Gudmundsson, T Marlow, et al. 2013. “Retraction Notice to ‘Lower CSF Interleukin-6 Predicts Future Depression in a Population-Based Sample of Older Women Followed for 17 Years’.” Brain, Behavior, and Immunity 32: 153–58.
Khanna, Kabir, Brandon Bertelsen, Santiago Olivella, Evan Rosenman, and Kosuke Imai. 2022. Wru: Who Are You? Bayesian Prediction of Racial Category Using Surname, First Name, Middle Name, and Geolocation. https://github.com/kosukeimai/wru.
Kim, Albert Y, and Adriana Escobedo-Land. 2015. “OkCupid Data for Introductory Statistics and Data Science Courses.” Journal of Statistics Education 23 (2). https://doi.org/10.1080/10691898.2015.11889737.
Kirkegaard, Emil OW, and Julius D Bjerrekær. 2016. “The OKCupid Dataset: A Very Large Public Dataset of Dating Site Users.” Open Differential Psychology 46: 1–10.
Kline, K. E, D. Kline, B. Hunt, and D. Heymann-Reder. 2008. SQL in a Nutshell. O’Reilly Media: Sebastopol, CA.
Knuth, D. 1992. “Literate Programming.” CSLI Lecture Notes, Stanford University 27.
Kuhn, Max. 2020. “CRAN Task View: Reproducible Research.” https://CRAN.R-project.org/view=ReproducibleResearch.
Kuhn, Max, and Davis Vaughan. 2023. Parsnip: A Common API to Modeling and Analysis Functions.
Kuhn, Max, Davis Vaughan, and Emil Hvitfeldt. 2023. Yardstick: Tidy Characterizations of Model Performance.
Kuhn, Max, and Hadley Wickham. 2023. Tidymodels: Easily Install and Load the Tidymodels Packages.
Kuiper, Shonda, and Jeff Sklar. 2012. Practicing Statistics: Guided Investigations for the Second Course. Pearson Education: New York, NY.
Laney, D. 2001. “3D Data Management: Controlling Data Volume, Velocity and Variety.” META Group Research Note 6: 70.
Lewis, M. 2003. Moneyball: The Art of Winning an Unfair Game. W.W. Norton & Company: New York, NY.
Little, R J A, and D B Rubin. 2002. Statistical Analysis with Missing Data (Second Edition). John Wiley & Sons: Hoboken, NJ.
Lovelace, Robin, Jakub Nowosad, and Jannes Muenchow. 2019. Geocomputation with R. CRC Press: Boca Raton, FL. https://geocompr.robinlovelace.net/.
Ludwig, Lew. 2012. “Technically Speaking.”
Lumley, Thomas. 2020. Biglm: Bounded Memory Linear and Generalized Linear Models.
Lunzer, Aran, and Amelia McNamara. 2017. “Introduction to the ‘Exploring Histograms’ Online Essay.” Viewpoints Research Institute. http://www.vpri.org/pdf/rn2017003_histogram-intro.pdf.
Luraschi, Javier, Kevin Kuo, Kevin Ushey, JJ Allaire, Hossein Falaki, Lu Wang, Andy Zhang, Yitao Li, Edgar Ruiz, and The Apache Software Foundation. 2023. Sparklyr: R Interface to Apache Spark. https://spark.rstudio.com/.
Mackenzie, John. 2009. “Gerrymandering and Legislator Efficiency.” University of Delaware.
Marchi, M., and J. Albert. 2013. Analyzing Baseball Data with R. CRC Press: Boca Raton, FL.
McCallum, E., and S. Weston. 2011. Parallel R. O’Reilly Media: Sebastopol, CA.
McIlroy, Doug, Ray Brownrigg, Thomas P Minka, and Roger Bivand. 2023. Mapproj: Map Projections.
Mosteller, F. 1987. Fifty Challenging Problems in Probability with Solutions. Dover Publications: Mineola, NY.
Mosteller, Frederick, and David L Wallace. 1963. “Inference in an Authorship Problem: A Comparative Study of Discrimination Methods Applied to the Authorship of the Disputed Federalist Papers.” Journal of the American Statistical Association 58 (302): 275–309.
Müller, Kirill. 2020. Here: A Simpler Way to Find Your Files.
Müller, Kirill, and Lorenz Walthert. 2023. Styler: Non-Invasive Pretty Printing of R Code.
National Academies of Science, Engineering, and Medicine. 2018. “Data Science for Undergraduates: Opportunities and Options.” National Academies.
Neuwirth, Erich. 2022. RColorBrewer: ColorBrewer Palettes.
Nielsen, Finn Årup. 2011. “A New ANEW: Evaluation of a Word List for Sentiment Analysis in Microblogs.” CoRR abs/1103.2903. http://arxiv.org/abs/1103.2903.
Niemi, Richard G, Bernard Grofman, Carl Carlucci, and Thomas Hofeller. 1990. “Measuring Compactness and the Role of a Compactness Standard in a Test for Partisan and Racial Gerrymandering.” The Journal of Politics 52 (4): 1155–81.
Nolan, Deborah, and Duncan Temple Lang. 2010. “Computing in the Statistics Curricula.” The American Statistician 64 (2): 97–107. https://doi.org/10.1198/tast.2010.09132.
Nolan, D., and J. Perrett. 2016. “Teaching and Learning Data Visualization: Ideas and Assignments.” The American Statistician 70 (3): 260–69.
Nolan, D., and T. P. Speed. 1999. “Teaching Statistics Theory Through Applications.” The American Statistician 53: 370–75.
Nuzzo, R. 2014. “Scientific Method: Statistical Errors.” Nature 506: 150–52.
O’Neil, C. 2016. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown Publishing: New York, NY.
Ohm, Paul. 2010. “Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization.” UCLA Law Review 57: 1701. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1450006.
Olford, R. W., and W. H. Cherry. 2003. “Picturing Probability: The Poverty of Venn Diagrams, the Richness of Eikosograms.” unpublished manuscript.
Ooms, Jeroen. 2023. Jsonlite: A Simple and Robust JSON Parser and Generator for r. https://jeroen.r-universe.dev/jsonlite https://arxiv.org/abs/1403.2805.
Ooms, Jeroen, David James, Saikat DebRoy, Hadley Wickham, and Jeffrey Horner. 2022. RMySQL: Database Interface and MySQL Driver for R.
Page, Lawrence, Sergey Brin, Rajeev Motwani, and Terry Winograd. 1999. “The PageRank Citation Ranking: Bringing Order to the Web.” Stanford University InfoLab.
Paradis, Emmanuel, Simon Blomberg, Ben Bolker, Joseph Brown, Santiago Claramunt, Julien Claude, Hoa Sien Cuong, et al. 2023. Ape: Analyses of Phylogenetics and Evolution.
Pebesma, Edzer. 2023. Sf: Simple Features for R.
Pebesma, Edzer, and Roger Bivand. 2023. Sp: Classes and Methods for Spatial Data.
Pebesma, Edzer, Thomas Mailund, Tomasz Kalinowski, and Iñaki Ucar. 2023. Units: Measurement Units for R Vectors.
Pedersen, Thomas Lin. 2022a. Ggraph: An Implementation of Grammar of Graphics for Graphs and Networks.
———. 2022b. Transformr: Polygon and Path Transformations. https://github.com/thomasp85/transformr.
———. 2023a. Patchwork: The Composer of Plots.
———. 2023b. Tidygraph: A Tidy API for Graph Manipulation.
Pedersen, Thomas Lin, and David Robinson. 2022. Gganimate: A Grammar of Animated Graphics.
Pierson, S. 2016. “Jordan Urges Both Computational and Inferential Thinking in Data Science.” Amstat News.
Provost, F., and T. Fawcett. 2013. “Data Science and Its Relationship to Big Data and Data-Driven Decision Making.” Big Data 1 (1): 51–59.
Pruim, Randall. 2015. NHANES: Data from the US National Health and Nutrition Examination Study.
Pruim, Randall, Maria-Cristiana Gîrjău, and Nicholas Jon Horton. 2023. “Fostering Better Coding Practices for Data Scientists.” Harvard Data Science Review 5 (3). https://doi.org/10.1162/99608f92.97c9f60f.
Pruim, Randall, Daniel T. Kaplan, and Nicholas J. Horton. 2022a. Mosaic: Project MOSAIC Statistics and Mathematics Teaching Utilities.
Pruim, Randall, Daniel Kaplan, and Nicholas Horton. 2022b. mosaicData: Project MOSAIC Data Sets. https://github.com/ProjectMOSAIC/mosaicData.
R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
R Special Interest Group on Databases (R-SIG-DB), Hadley Wickham, and Kirill Müller. 2022. DBI: R Database Interface.
Raghunathan, T. E. 2004. “What Do We Do with Missing Data? Some Options for Analysis of Incomplete Data.” Annual Review of Public Health 25: 99–117.
Ram, Karthik, and Karl Broman. 2021. aRxiv: Interface to the arXiv API.
Ram, K., and H. Wickham. 2018. Wesanderson: A Wes Anderson Palette Generator. http://CRAN.R-project.org/package=wesanderson.
Rice, J. A. 2006. Mathematical Statistics and Data Analysis (Third Edition). Cengage Learning: Boston, MA.
Rizzo, M. L. 2019. Statistical Computing with R (Second Edition). CRC Press: Boca Raton, FL.
Robinson, David, Alex Hayes, and Simon Couch. 2023. Broom: Convert Statistical Objects into Tidy Tibbles.
Robinson, David, and Julia Silge. 2023. Tidytext: Text Mining Using Dplyr, Ggplot2, and Other Tidy Tools. https://github.com/juliasilge/tidytext.
Rogoff, Kenneth, and Carmen Reinhart. 2010. “Growth in a Time of Debt.” American Economic Review 100 (2): 573–78. http://www.nber.org/papers/w15639.
Romano, J. P., and A. F. Siegel. 1986. Counterexamples in Probability and Statistics. Cengage Learning: Boston, MA.
Roose, Kevin. 2013. “Meet the 28-Year-Old Grad Student Who Just Shook the Global Austerity Movement.” New York Magazine. http://nymag.com/daily/intelligencer/2013/04/grad-student-who-shook-global-austerity-movement.html.
Rosling, O., A. R. Rönnlund, and H. Rosling. 2005. “Gapminder.” Gapminder.org.
RStudio, PBC. 2020. RStudio: Integrated Development Environment for R. 250 Northern Ave, Boston, MA, 02210: RStudio, PBC. http://www.rstudio.com.
Ruppert, David, M. P. Wand, and R. J Carroll. 2003. Semiparametric Regression. Cambridge University Press: Cambridge, UK.
Samet, J. H., M. J. Larson, N. J. Horton, K. Doyle, M. Winter, and R. Saitz. 2003. “Linking Alcohol- and Drug-Dependent Adults to Primary Medical Care: A Randomized Controlled Trial of a Multi-Disciplinary Health Intervention in a Detoxification Unit.” Addiction 98 (4): 509–16. https://doi.org/10.1046/j.1360-0443.2003.00328.x.
Sarkar, Deepayan. 2023. Lattice: Trellis Graphics for R. https://lattice.r-forge.r-project.org/.
Schliep, Klaus, and Klaus Hechenbichler. 2016. Kknn: Weighted k-Nearest Neighbors. https://github.com/KlausVigo/kknn.
Schwarz, A. 2005. The Numbers Game: Baseball’s Lifelong Fascination With Statistics. St. Martin’s Press: New York, NY.
Sievert, Carson, Chris Parmer, Toby Hocking, Scott Chamberlain, Karthik Ram, Marianne Corvellec, and Pedro Despouy. 2023. Plotly: Create Interactive Web Graphics via Plotly.js.
Silge, Julia, and David Robinson. 2016. “Tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” Journal of Open Source Software 1 (3). https://doi.org/10.21105/joss.00037.
———. 2017. Text Mining with R: A Tidy Approach. O’Reilly Media: Sebastopol, CA. https://www.tidytextmining.com.
Slowikowski, Kamil. 2023. Ggrepel: Automatically Position Non-Overlapping Text Labels with Ggplot2. https://github.com/slowkow/ggrepel.
Spinu, Vitalie, Garrett Grolemund, and Hadley Wickham. 2023. Lubridate: Make Dealing with Dates a Little Easier.
Stonebraker, M., D. Abadi, D. J. DeWitt, S. Madden, E. Paulson, A. Pavlo, and A. Rasin. 2010. “MapReduce and Parallel DBMSs: Friends or Foes?” Communications of the ACM 53 (1): 64–71.
Tan, Pang-Ning, Michael Steinbach, and Vipin Kumar. 2006. Introduction to Data Mining. 1st ed. Pearson Education: New York, NY.
Tapal, Marium, Rana Gahwagy, and Irene Ryan. 2023. Fec12: Data Package for 2012 Federal Elections. http://github.com/baumer-lab/fec12.
Temple Lang, Duncan. 2023. RCurl: General Network (HTTP/FTP/...) Client Interface for R.
Therneau, Terry, and Beth Atkinson. 2022. Rpart: Recursive Partitioning and Regression Trees. https://CRAN.R-project.org/package=rpart.
Torres-Manzanera, Emilio. 2018. Xkcd: Plotting Ggplot2 Graphics in an XKCD Style.
Travers, Jeffrey, and Stanley Milgram. 1969. “An Experimental Study of the Small World Problem.” Sociometry, 425–43.
Tufte, E. R. 1990. Envisioning Information. Graphics Press: Cheshire, CT.
———. 1997. Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press: Cheshire, CT.
———. 2001. Visual Display of Quantitative Information (Second Edition). Graphics Press: Cheshire, CT.
———. 2003. The Cognitive Style of PowerPoint. Graphics Press: Cheshire, CT.
———. 2006. Beautiful Evidence. Graphics Press: Cheshire, CT.
Tukey, J. W. 1990. “Data-Based Graphics: Visual Display in the Decades to Come.” Statistical Science 5 (3): 327–39.
Ushey, Kevin, JJ Allaire, and Yuan Tang. 2023. Reticulate: Interface to Python.
Ushey, Kevin, and Hadley Wickham. 2023. Renv: Project Environments.
Vaidyanathan, Ramnath, Yihui Xie, JJ Allaire, Joe Cheng, Carson Sievert, and Kenton Russell. 2023. Htmlwidgets: HTML Widgets for R. https://github.com/ramnathv/htmlwidgets.
Vanderkam, Dan, JJ Allaire, Jonathan Owen, Daniel Gromer, and Benoit Thieurmel. 2018. Dygraphs: Interface to Dygraphs Interactive Time Series Charting Library. https://github.com/rstudio/dygraphs.
Vaughan, Davis, and Matt Dancho. 2022. Furrr: Apply Mapping Functions in Parallel Using Futures.
Vinten-Johansen, Peter, Howard Brody, Nigel Paneth, Stephen Rachman, Michael Rip, and David Zuck. 2003. Cholera, Chloroform, and the Science of Medicine: A Life of John Snow. Oxford University Press.
Walker, Kyle. 2020. Tigris: Load Census TIGER/Line Shapefiles into R. https://CRAN.R-project.org/package=tigris.
———. 2023. Tigris: Load Census TIGER/Line Shapefiles. https://github.com/walkerke/tigris.
Walker, Kyle, and Matt Herman. 2023. Tidycensus: Load US Census Boundary and Attribute Data as Tidyverse and Sf-Ready Data Frames. https://walker-data.com/tidycensus/.
Wang, Victor. 2006. “The OBP/SLG Ratio: What Does History Say?” By the Numbers 16 (3): 3.
Wang, Yilun, and Michal Kosinski. 2018. “Deep Neural Networks Are More Accurate Than Humans at Detecting Sexual Orientation from Facial Images.” Journal of Personality and Social Psychology 114 (2): 246. https://doi.org/10.1037/pspa0000098.
Wasserstein, Ronald L., and Nicole A. Lazar. 2016. “The ASA’s Statement on p-Values: Context, Process, and Purpose.” The American Statistician 70 (2): 129–33.
Wasserstein, Ronald L., Allen L. Schirm, and Nicole A. Lazar. 2019. “Moving to a World Beyond p < 0.05.” The American Statistician 73 (sup1): 1–19.
Watts, Duncan J., and Steven H. Strogatz. 1998. “Collective Dynamics of ‘Small-World’ Networks.” Nature 393 (6684): 440–42.
Weisberg, Sanford. 2018. Alr4: Data to Accompany Applied Linear Regression 4th Edition. http://www.z.umn.edu/alr4ed.
Wickham, H. 2011. “ASA 2009 Data Expo.” Journal of Computational and Graphical Statistics 20 (2): 281–83.
———. 2014. “Tidy Data.” The Journal of Statistical Software 59 (10).
———. 2019. Advanced R (Second Edition). CRC Press: Boca Raton, FL.
Wickham, Hadley. 2015. R Packages: Organize, Test, Document, and Share your Code. O’Reilly Media: Sebastopol, CA. http://r-pkgs.had.co.nz/.
———. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer Verlag: New York, NY. https://ggplot2.tidyverse.org.
———. 2019a. Assertthat: Easy Pre and Post Assertions.
———. 2019b. Babynames: US Baby Names 1880-2014. https://CRAN.R-project.org/package=babynames.
———. 2020a. Mastering Shiny. O’Reilly Media: Sebastopol, CA. https://mastering-shiny.org/.
———. 2020b. Tidyr: Easily Tidy Data with ‘spread()’ and ‘gather()’ Functions. https://CRAN.R-project.org/package=tidyr.
———. 2021a. Babynames: US Baby Names 1880-2017. https://github.com/hadley/babynames.
———. 2021b. Nycflights13: Flights That Departed NYC in 2013. https://github.com/hadley/nycflights13.
———. 2022a. Rvest: Easily Harvest (Scrape) Web Pages.
———. 2022b. Stringr: Simple, Consistent Wrappers for Common String Operations.
———. 2023a. Forcats: Tools for Working with Categorical Variables (Factors).
———. 2023b. Modelr: Modelling Functions That Work with the Pipe.
———. 2023c. Tidyverse: Easily Install and Load the Tidyverse.
Wickham, Hadley, and Jennifer Bryan. 2023a. Bigrquery: An Interface to Google’s BigQuery ‘API’.
———. 2023b. Readxl: Read Excel Files.
Wickham, Hadley, Jennifer Bryan, Malcolm Barrett, and Andy Teucher. 2023. Usethis: Automate Package and Project Setup.
Wickham, Hadley, Winston Chang, Robert Flight, Kirill Müller, and Jim Hester. 2021. Sessioninfo: R Session Information.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, Hiroaki Yutani, and Dewey Dunnington. 2023. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics.
Wickham, Hadley, Dianne Cook, and Heike Hofmann. 2015. “Visualizing Statistical Models: Removing the Blindfold.” Statistical Analysis and Data Mining: The ASA Data Science Journal 8 (4): 203–25.
Wickham, Hadley, Romain François, Lionel Henry, Kirill Müller, and Davis Vaughan. 2023. Dplyr: A Grammar of Data Manipulation.
Wickham, Hadley, Maximilian Girlich, and Edgar Ruiz. 2023. Dbplyr: A Dplyr Back End for Databases.
Wickham, Hadley, and Lionel Henry. 2023. Purrr: Functional Programming Tools.
Wickham, Hadley, Jim Hester, and Jennifer Bryan. 2023. Readr: Read Rectangular Text Data.
Wickham, Hadley, Evan Miller, and Danny Smith. 2023. Haven: Import and Export SPSS, Stata and SAS Files.
Wickham, Hadley, and Dana Seidel. 2022. Scales: Scale Functions for Visualization.
Wickham, Hadley, Davis Vaughan, and Maximilian Girlich. 2023. Tidyr: Tidy Messy Data.
Wickham, H., and R. Francois. 2020. Dplyr: A Grammar of Data Manipulation. https://github.com/hadley/dplyr.
Wikipedia. 2016. “Hippocratic Oath.” Wikimedia Foundation.
Wilkinson, L., D. Wills, D. Rope, A. Norton, and R. Dubbs. 2005. The Grammar of Graphics (Second Edition). Springer Verlag: New York, NY.
Xie, Y. 2014. Dynamic Documents with R and Knitr. CRC Press: Boca Raton, FL.
Xie, Yihui. 2023a. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://yihui.org/knitr/.
———. 2023b. Xfun: Supporting Functions for Packages Maintained by Yihui Xie. https://github.com/yihui/xfun.
Xie, Yihui, Joe Cheng, and Xianying Tan. 2023. DT: A Wrapper of the JavaScript Library DataTables. https://github.com/rstudio/DT.
Yau, N. 2011. Visualize This: The Flowing Data Guide to Design, Visualization, and Statistics. John Wiley & Sons: Hoboken, NJ.
———. 2013. Data Points: Visualization That Means Something. John Wiley & Sons: Hoboken, NJ.
Zaslavsky, A. M., and N. J. Horton. 1998. “Balancing Disclosure Risk Against the Loss of Nonpublication.” Journal of Official Statistics 14 (4): 411–19. http://www.jos.nu/Articles/abstract.asp?article=144411.
Zhu, Hao. 2021. kableExtra: Construct Complex Table with Kable and Pipe Syntax.