Introduction to Data Science

Land Acknowledgment

This class at the University of Richmond respectfully acknowledges the traditional custodians of the land we are on today, the Powhatan people, and pays respect to their elders past, present, and emerging.

To learn more about the land on which the University of Richmond exists, I recommend students read the report “Knowledge of This Cannot Be Hidden” by Shelby M. Driskill and Dr. Lauranett L. Lee, which discusses both the University’s geographic connection to the Powhatan people and the presence of a burying ground for enslaved laborers on campus.

Accessibility

I strive to make this course accessible. If you encounter barriers to accessibility, please let me know as soon as possible.

Course Description

Topics will include techniques for collecting, organizing, analyzing, modeling, and presenting data. Applications to a variety of fields will be emphasized. Includes an extensive introduction to statistical programming. (See course catalog.)

Learning Goals

By the end of this course, students will be able to…

Prerequisites

Structure

This course has three units, each of which focuses on an aspect of data science:

Unit Aspect
1 Visualization
2 Collection & Manipulation
3 Application

Materials

Inclusivity

I expect you to…

If we spend time reviewing material that you already know, remember that it may be the first time some of your peers are learning this information. Students come to this class with different levels of knowledge, and everyone learns in different ways.

Help

Grades

The tables below show the weight of each assignment category and the score range for each letter grade. I do not offer extra credit. I round fractional grades.

Assignment Percentage
Participation 10%
Homework 10%
Exam 1 20%
Exam 2 20%
Exam 3 20%
Final Project 20%

Letter Grade Range
A 93-100
A- 90-92
B+ 87-89
B 83-86
B- 80-82
C+ 77-79
C 73-76
C- 70-72
D+ 67-69
D 63-66
D- 60-62
F 0-59
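
As an illustration of how the two tables combine, here is a minimal Python sketch. It assumes each category is scored from 0 to 100 and that fractional grades are rounded half up to the nearest whole number; the syllabus does not spell out the rounding rule, so treat that as an assumption. The function names and the example scores are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Category weights in percent, copied from the first table (they sum to 100).
WEIGHTS = {
    "Participation": 10,
    "Homework": 10,
    "Exam 1": 20,
    "Exam 2": 20,
    "Exam 3": 20,
    "Final Project": 20,
}

# Lower bound of each letter-grade range from the second table, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"),
    (60, "D-"), (0, "F"),
]


def course_score(scores):
    """Weighted average of the category scores, rounded to a whole number.

    Assumption: halves round up (e.g., 92.5 becomes 93).
    """
    total = sum(
        Decimal(str(scores[name])) * weight / 100
        for name, weight in WEIGHTS.items()
    )
    return int(total.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def letter_grade(points):
    """Map a whole-number course score to its letter grade."""
    if points < 0:
        raise ValueError("score cannot be negative")
    for cutoff, letter in CUTOFFS:
        if points >= cutoff:
            return letter


# Made-up scores for illustration only.
example = {
    "Participation": 100,
    "Homework": 90,
    "Exam 1": 85,
    "Exam 2": 78,
    "Exam 3": 92,
    "Final Project": 88,
}
points = course_score(example)
print(points, letter_grade(points))  # prints "88 B+"
```

In this example the weighted average is 87.6, which rounds to 88 and falls in the B+ range.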

Schedule

No. Date Topic
00 08-26 Introduction
01 08-28 Tabular data
02 09-02 Grammar of graphics
03 09-04 Aesthetics and scales
04 09-09 Organizing data
05 09-11 Summarizing data
06 09-16 Creating features
09-18 Exam 1
07 09-23 Creating data
08 09-25 Data feminism
09 09-30 Table joins
10 10-02 Table pivots
11 10-07 Review 1
10-09 No class
12 10-16 Review 2
13 10-21 Tidy models
14 10-23 Exam 2 review
10-28 Exam 2
15 10-30 Project assignments
16 11-04 Dates and times
17 11-06 Spatial data
18 11-11 Spatial joins
19 11-13 Time zones
11-18 Exam 3
11-20 Workshop 1
11-25 Workshop 2
12-02 Workshop 3
12-04 Final projects

Assignments

Late assignments are penalized one letter grade (i.e., -10 points) per day.
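
The arithmetic of this penalty can be sketched in Python as follows. The function name is hypothetical, and two details are assumptions rather than stated policy: lateness is counted in whole days, and a score never drops below zero.

```python
def late_score(raw_score, days_late, penalty_per_day=10):
    """Score after subtracting 10 points for each day late.

    Assumptions: days_late is a whole number of days, and the
    result is floored at 0.
    """
    if days_late < 0:
        raise ValueError("days_late cannot be negative")
    return max(0, raw_score - penalty_per_day * days_late)


# An assignment that earned 95 but arrived two days late.
print(late_score(95, 2))  # prints 75
```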

Homework

Most class meetings will have a reading posted on our website. Each reading ends with a few questions, which must be completed before class. Please bring your written responses with you to class; you will also upload them to the course website.

You will be given an opportunity to begin working on homework during class. Your homework grade will be based on the proportion of assignments that you complete on time. If you put in a good-faith effort on the homework, it will be marked complete.

Participation

Exams

Each of the three exams has two halves: a take-home open-book part and an in-class closed-book part.

The take-home will be distributed in advance of the in-class exam. Answers to the take-home should be submitted by the beginning of the in-class exam. A list of topics for the in-class exam will be posted on the course website.

Final Project

The final project will be due during the last week of class. The project will ask you to find or create a new data set, and apply techniques learned throughout the class to analyze it. The project will take the form of a digital poster session and a one-page reflection. Detailed instructions will be provided later in the course.

Honor

This course is taught in accordance with the University of Richmond Honor Code, which can be accessed via the Honor Councils website. You are encouraged to collaborate on homework, but each student must contribute work to the group. On exams, cheating includes, but is not limited to, viewing another’s work with or without their consent, or duplicating any portion of it. If you are found to have violated the Honor Code, you will fail this course. If you ever have any questions about whether an action would be an honor violation, please ask.

Generative Artificial Intelligence

Generative artificial intelligence (GenAI) programs, especially large language models (LLMs), can be useful tools for coding. However, overreliance on LLMs or other resources where you can copy answers to programming problems directly (e.g., Stack Overflow) will impede your learning.

Moreover, LLMs sometimes answer questions incorrectly or use different methodologies than those we study in class. When LLMs make stuff up (e.g., libraries that don’t exist), this is referred to as “hallucination.” In order to be an effective user of these technologies, it is crucial for you to recognize when that happens and to know how to respond.

Prohibited uses of GenAI

  1. Submitting any model output, in part or in whole, as your own work. This includes code and writing.
  2. Uploading any data used in this course (e.g., .csv files) to multimodal GenAI tools like ChatGPT.
  3. Using GenAI tools on the take-home portion of exams.

Any of the above uses will be treated as a violation of the Honor Code.

Permitted uses of GenAI

If you get stuck on a homework problem, I ask you to try these things in this order:

  1. Review the course notes.
  2. Talk to your classmates.
  3. Search for credible information online.
  4. After trying all of the above, talk to the Custom GPT generated for this class.
  5. If you use information from the Custom GPT, you must cite your interactions with it.

Communication

Resources

The University of Richmond has many resources on campus that may help you succeed.

Weinstein Learning Center

The Weinstein Learning Center is your go-to destination for academic support. Our services are tailored to help you achieve your academic goals throughout your time at the University of Richmond. To learn more and view service schedules and appointment times, visit https://wlc.richmond.edu. Available services include:

Academic Skills Coaching

Meet with a professional staff member who will collaborate with you to assess and develop your academic and life skills (e.g., critical reading and thinking, information conceptualization, concentration, test preparation, time management, stress management, and more).

Content Tutoring

Peer consultants offer assistance in specific courses and subject areas. They are available for appointments (in-person and virtual) and drop-in sessions. See schedules at https://wlc.richmond.edu for supported courses and drop-in times.

English Language Learning

Attend one-on-one or group consultations, workshops, and other services focused on English, academic, and/or intercultural skills.

Quantitative and Programming Resources

Peer consultants and professional staff offer workshops or one-on-one appointments to build quantitative and programming skills and provide statistical assistance for research projects.

Speech and Communication

Prepare and practice for academic presentations, speaking engagements, and other occasions of public expression. Peer consultants offer recording, playback, and coaching for both individual and group presentations. Students can expect recommendations regarding clarity, organization, style, and delivery.

Technology Learning

Visit our student lab dedicated to supporting digital media projects. Services include camera checkout, video/audio recording assistance, use of virtual reality equipment, poster printing, 3D printing and modeling, and consultation services on a variety of software.

Writing

This service assists student writers at all levels of experience, across all majors. Meet with peer consultants who can offer feedback on written work and suggest pre-writing, drafting, and revision strategies.

Boatwright Library

Students may consult librarians to assist with their research, which may be especially useful for the final project. Use the Ask a Librarian service to reach librarians by email, phone, chat, text, or in person.

Counseling and Psychological Services

Mental health is crucial for academic success. Counseling and Psychological Services at the University of Richmond supports student success and enhances student well-being by providing comprehensive clinical services to currently enrolled, full-time, degree-seeking students.

Acknowledgments

This course builds on previous iterations taught by Taylor Arnold and Lilla Orr.

