Advanced Data Science

Land Acknowledgment

This class at the University of Richmond respectfully acknowledges the traditional custodians of the land we are on today, the Powhatan people, and pays respect to their elders past, present, and emerging.

To learn more about the land on which the University of Richmond exists, I recommend students read the report “Knowledge of This Cannot Be Hidden” by Shelby M. Driskill and Dr. Lauranett L. Lee, which discusses both the University’s geographic connection to the Powhatan people and the presence of a burying ground for enslaved laborers on campus.

Accessibility

I strive to make this course accessible. If you encounter barriers to accessibility, please let me know as soon as possible.

Description

This course introduces advanced approaches to the analysis of data. It emphasizes what Hadley Wickham calls the “whole game” of data science: creating, importing, tidying, transforming, visualizing, and communicating about data. In this class, we will focus on data about and derived from texts.

See the course catalog.

Learning Goals

By the end of this course, students will be able to…

Materials

Inclusivity

I expect you to…

Help

Grades

Assignment       Percentage
Participation    10%
Code Interview   10%
Group Projects   60%
Final Project    20%

Letter Grade   Range
A              93-100
A-             90-92
B+             87-89
B              83-86
B-             80-82
C+             77-79
C              73-76
C-             70-72
D+             67-69
D              63-66
D-             60-62
F              0-59
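
As a rough illustration of how the two tables above combine, the sketch below computes a weighted final score and looks up the matching letter grade. It is written in Python for brevity and is not an official grade calculator. The weights and range boundaries come from the tables; everything else is an assumption made for the example: the function names, the sample scores, treating all group projects as a single 0-100 score, and leaving fractional scores unrounded (the syllabus does not say how a score such as 92.5 is handled).

```python
# Weights from the Grades table (percent of the final grade).
WEIGHTS = {
    "Participation": 10,
    "Code Interview": 10,
    "Group Projects": 60,  # assumption: one combined score for all group projects
    "Final Project": 20,
}

# Lower bound of each letter-grade range from the table, highest first.
LETTER_FLOORS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
    (0, "F"),
]


def final_score(scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a 0-100 score to a letter using the lower bound of each range.

    Assumption: fractional scores are not rounded, so 92.9 maps to A-.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, letter in LETTER_FLOORS:
        if score >= floor:
            return letter


# Hypothetical component scores, for illustration only.
example = {
    "Participation": 95,
    "Code Interview": 80,
    "Group Projects": 88,
    "Final Project": 92,
}
total = final_score(example)
print(total, letter_grade(total))
```

Because Group Projects carry 60% of the weight, that component dominates the result: in the hypothetical example, a low Code Interview score moves the final score far less than the same change in the Group Projects score would.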

Participation

Code Interview

Near the end of the semester, you will complete a code interview. I will ask you to explain your approach and solutions to a few randomly selected problems from the notebooks assigned throughout the semester.

If you do not complete the notebooks, you cannot succeed on the code interview.

Group Projects

Final Project

Unlike earlier projects in the semester, final projects are individual. Your final project will result in a formal report including original code, data visualizations, explanatory and interpretive writing, and proper citations.

Late work

Honor

This course is taught in accordance with the University of Richmond Honor Code, which can be accessed via the Honor Councils website.

If you are found to have violated the Honor Code, you will fail this course.

If you ever have any questions about whether an action would be an honor violation, re-read the syllabus. If it is still unclear, please ask me.

Generative Artificial Intelligence

Generative artificial intelligence (GenAI) programs, especially large language models (LLMs), are useful tools for coding. However, overreliance on GenAI impedes learning.

Moreover, LLMs often answer questions incorrectly or incompletely. To use these technologies effectively, you must be able to recognize when that happens and know how to respond.

Prohibited Uses of GenAI

Permitted Uses of GenAI

I ask you to do these things in this order when you can’t figure something out:

  1. Review the course notes.
  2. Talk to your classmates.
  3. Search for credible information online (e.g., Stack Overflow).
  4. Talk to the Custom GPT I created for this class.
  5. If you use information from the Custom GPT, cite your interactions with it.

This page explains how to share a link to a ChatGPT interaction.

Schedule

The schedule outlines major topics to be covered in the course. I reserve the right to change the schedule as the semester progresses. If I do change the schedule, I will inform you as far in advance as possible.

Meeting  Date   Topic
1        01-13  Introduction
2        01-15  Quarto and Markdown
-        01-22  NO CLASS
3        01-27  tidyverse review
4        01-29  Tidy texts
5        02-03  Sampling and corpora
6        02-05  Relative frequencies
7        02-10  Review
8        02-12  Group Project 1 workshop
9        02-17  Sentiment analysis
10       02-19  tf-idf
11       02-24  K-nearest neighbors
12       02-26  Text classification
13       03-03  Group Project 2 workshop
14       03-05  Elastic nets
-        03-10  Spring Break
-        03-12  Spring Break
15       03-17  Gradient boosted trees
16       03-19  Logistic regression
17       03-24  Word embeddings
18       03-26  Modeling text data
19       03-31  Text classification
20       04-02  Topic modeling
21       04-07  Review
22       04-09  Group Project 3 workshop
23       04-14  Code Interviews
24       04-16  Workshop
25       04-21  Workshop
26       04-23  Final Project presentations

Communication

Wellness

Health

Counseling and Psychological Services

Mental health is crucial for academic success. Counseling and Psychological Services at the University of Richmond supports student success and enhances student well-being by providing comprehensive clinical services to currently enrolled, full-time, degree-seeking students.

Title IX

The University of Richmond and its faculty are committed to ensuring a safe and supportive learning environment. If you disclose to me or another mandatory reporter an incident of sexual misconduct (including sexual harassment or sexual violence), I am obligated by law to share that information with the University’s Title IX Coordinator. For more information on our sexual misconduct policy, how to report, and confidential resources available to you, please visit the University’s Title IX help page.

Religious Observance

Any student may be excused from class or other assignments because of religious observance. I will make reasonable accommodations when students’ religious practices conflict with their academic responsibilities. If you will miss an academic obligation because of religious observance, you are responsible for contacting me within the first two weeks of the semester. You are also responsible for completing missed work in a timely manner.

For more information, see the University’s religious observance policy.

Resources

The University of Richmond has many resources on campus that may help you succeed.

Disability Services

The University of Richmond’s office of Disability Services strives to ensure that students with disabilities and/or temporary conditions (e.g., concussions and injuries) are provided the opportunity for full participation and equal access. Students who are experiencing a barrier to access due to a disability and/or temporary condition are encouraged to apply for accommodations by visiting disability.richmond.edu. Disability Services can be reached by phone at 804-662-5001.

Once accommodations have been approved, students must (1) submit their Disability Accommodation Notice (DAN) to each of their professors via the Disability Services Student Portal, available at sl.richmond.edu/be, and (2) request a meeting with each professor to create an accommodation implementation plan. It is important to complete these steps as soon as possible because accommodations are never retroactive, and professors are permitted a reasonable amount of time for implementation. Disability Services is available to assist as needed.

Weinstein Learning Center

The Weinstein Learning Center is your go-to destination for academic support. Its services are tailored to help you achieve your academic goals throughout your time at the University of Richmond. To learn more and view service schedules and appointment times, visit https://wlc.richmond.edu. Available services include:

Academic Skills Coaching

Meet with a professional staff member who will collaborate with you to assess and develop your academic and life skills (e.g., critical reading and thinking, information conceptualization, concentration, test preparation, time management, stress management, and more).

Content Tutoring

Peer consultants offer assistance in specific courses and subject areas. They are available for appointments (in-person and virtual) and drop-in sessions. See schedules at https://wlc.richmond.edu for supported courses and drop-in times.

English Language Learning

Attend one-on-one or group consultations, workshops, and other services focused on English, academic, and/or intercultural skills.

Quantitative and Programming Resources

Peer consultants and professional staff offer workshops or one-on-one appointments to build quantitative and programming skills and provide statistical assistance for research projects.

Speech and Communication

Prepare and practice for academic presentations, speaking engagements, and other occasions of public expression. Peer consultants offer recording, playback, and coaching for both individual and group presentations. Students can expect recommendations regarding clarity, organization, style, and delivery.

Technology Learning

Visit our student lab dedicated to supporting digital media projects. Services include camera checkout, video/audio recording assistance, use of virtual reality equipment, poster printing, 3D printing and modeling, and consultation services on a variety of software.

Writing

This service assists student writers at all levels of experience, across all majors. Meet with peer consultants who can offer feedback on written work and suggest pre-writing, drafting, and revision strategies.

Boatwright Library

Students may consult librarians to assist with their research, which may be especially useful for the final project. Use the Ask a Librarian service to reach librarians by email, phone, chat, text, or in person.

Acknowledgments

This course builds on previous iterations of DSST389 taught by Lilla Orr and Taylor Arnold.


  1. Humanities Data in R: Exploring Networks, Geospatial Data, Images, and Text, 2nd ed. 2024, Quantitative Methods in the Humanities and Social Sciences (Springer International Publishing, 2024), https://doi.org/10.1007/978-3-031-62566-4.

  2. Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python, Second edition (O’Reilly Media, Inc, 2020).

  3. Data Visualization: A Practical Introduction (Princeton University Press, 2019).

  4. Statistical Inference via Data Science: A Modern Dive into R and the Tidyverse, Second edition (CRC Press, 2025).

  5. An Introduction to Statistical Learning: With Applications in R, Springer Texts in Statistics (Springer US, 2021), https://doi.org/10.1007/978-1-0716-1418-1.

  6. Text Analysis with R: For Students of Literature, 2nd edition, Quantitative Methods in the Humanities and Social Sciences (Springer, 2020).

  7. Tidy Modeling with R: A Framework for Modeling in the Tidyverse (O’Reilly Media, 2022).

  8. Text Mining with R: A Tidy Approach, First edition (O’Reilly, 2017).

  9. The Visual Display of Quantitative Information, 2nd ed (Graphics Press, 2001).

  10. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, Second edition (O’Reilly, 2023).

  11. R Packages: Organize, Test, Document, and Share Your Code, Second edition (O’Reilly, 2023).