Introduction to Data Science
Land Acknowledgment
This class at the University of Richmond respectfully acknowledges the traditional custodians of the land we are on today, the Powhatan people, and pays respect to their elders past, present, and emerging.
To learn more about the land on which the University of Richmond exists, I recommend students read the report “Knowledge of This Cannot Be Hidden” by Shelby M. Driskill and Dr. Lauranett L. Lee, which discusses both the University’s geographic connection to the Powhatan people and the presence of a burying ground for enslaved laborers on campus.
Accessibility
I strive to make this course accessible. If you encounter barriers to accessibility, please let me know as soon as possible.
Course Description
Topics will include techniques for collecting, organizing, analyzing, modeling, and presenting data. Applications to a variety of fields will be emphasized. Includes an extensive introduction to statistical programming. (See course catalog.)
Learning Goals
By the end of this course, students will be able to…
- Collect, manipulate, visualize, and explore data.
- Describe best practices for structuring tabular data.
- Articulate relationships between a data set and analyses thereof.
- Use programming language documentation and cookbooks to solve problems.
- Understand key aspects of the R programming language and the tidyverse.
- Use the RStudio integrated development environment.
Prerequisites
- This course neither requires nor expects any prior experience with computer programming.
- While we discuss some statistical concepts, students only need experience with algebra.
Structure
This course has three units, each of which focuses on an aspect of data science:
Unit | Aspect |
---|---|
1 | Visualization |
2 | Collection & Manipulation |
3 | Application |
Materials
- Please bring a computer, pencil, and something to write on to each class.
- Class materials can be found on the course Blackboard site.
- You will submit assignments via Blackboard.
Inclusivity
I expect you to…
- Treat your classmates with respect.
- Support each other in your learning.
- Be patient.
If we spend time reviewing material that you already know, remember that it may be the first time some of your peers are learning this information. Students come to this class with different levels of knowledge. And everyone learns best in different ways.
Help
- We should have a lot of time during class to answer
questions about course material.
- I will also encourage you to collaborate with peers during class time.
- I encourage you to study with your classmates, though I expect all work that you submit to be your own.
- Come to office hours or schedule a meeting for any questions
or personal concerns that cannot be addressed in class.
- To schedule, email me with your availability at least one day before you’d like to meet.
- Note that I generally do not answer conceptual questions about homework before it is due.
- Homework is graded on timely completion and effort, not accuracy.
Grades
The tables below show the weight of each assignment and the range of scores associated with each letter grade. I do not offer extra credit. I round fractional grades.
Assignment | Percentage |
---|---|
Participation | 10% |
Homework | 10% |
Exam 1 | 20% |
Exam 2 | 20% |
Exam 3 | 20% |
Final Project | 20% |
Letter Grade | Range |
---|---|
A | 93-100 |
A- | 90-92 |
B+ | 87-89 |
B | 83-86 |
B- | 80-82 |
C+ | 77-79 |
C | 73-76 |
C- | 70-72 |
D+ | 67-69 |
D | 63-66 |
D- | 60-62 |
F | 0-59 |
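For anyone who wants to estimate a course grade, the arithmetic implied by the two tables above can be sketched in Python. This is only an illustration, not the official grade calculation: the function names and the sample scores are hypothetical, and rounding halves up to the nearest whole number before assigning a letter is an assumption about how fractional grades are rounded.

```python
import math

# Weights from the assignment table, in percentage points.
WEIGHTS = {
    "Participation": 10,
    "Homework": 10,
    "Exam 1": 20,
    "Exam 2": 20,
    "Exam 3": 20,
    "Final Project": 20,
}

# Lower bound of each letter-grade range, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (63, "D"),
    (60, "D-"), (0, "F"),
]


def weighted_grade(scores):
    """Return the weighted average of the scores on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(grade):
    """Round half up (an assumption), then look up the letter grade."""
    rounded = math.floor(grade + 0.5)
    for lower_bound, letter in CUTOFFS:
        if rounded >= lower_bound:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example_scores = {
    "Participation": 100,
    "Homework": 90,
    "Exam 1": 85,
    "Exam 2": 80,
    "Exam 3": 90,
    "Final Project": 95,
}

total = weighted_grade(example_scores)
print(total, letter_grade(total))  # prints: 89.0 B+
```

Under these assumptions, a score of 92.5 would round up to 93 and earn an A, while a score of 92.4 would round down to 92 and earn an A-.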
Schedule
- The schedule below outlines the major topics of the course.
- I reserve the right to change the syllabus as needed.
- I will inform you of any changes as far in advance as possible.
- Bold items are worth a large percentage of your grade.
Bb No. | Date | Topic |
---|---|---|
00 | 08-26 | Introduction |
01 | 08-28 | Tabular data |
02 | 09-02 | Grammar of graphics |
03 | 09-04 | Aesthetics and scales |
04 | 09-09 | Organizing data |
05 | 09-11 | Summarizing data |
06 | 09-16 | Creating features |
– | 09-18 | Exam 1 |
07 | 09-23 | Creating data |
08 | 09-25 | Data feminism |
09 | 09-30 | Table joins |
10 | 10-02 | Table pivots |
11 | 10-07 | Review 1 |
– | 10-09 | No class |
12 | 10-16 | Review 2 |
13 | 10-21 | Tidy models |
14 | 10-23 | Exam 2 review |
– | 10-28 | Exam 2 |
15 | 10-30 | Project assignments |
16 | 11-04 | Dates and times |
17 | 11-06 | Spatial data |
18 | 11-11 | Spatial joins |
19 | 11-13 | Time zones |
– | 11-18 | Exam 3 |
– | 11-20 | Workshop 1 |
– | 11-25 | Workshop 2 |
– | 12-02 | Workshop 3 |
– | 12-04 | Final projects |
Assignments
Late assignments are penalized one letter grade (i.e., -10 points) per day.
Homework
Most class meetings will have a reading posted on our website. A few questions are included at the end of each reading. These must be completed before class. Please bring the written responses with you to class. You will upload completed questions to the course website.
You will be given an opportunity to begin working on homework during class. Your homework grade will be based on the proportion of assignments that you complete on time. If you put in a good-faith effort on the homework, it will be marked complete.
Participation
- After a brief lecture reviewing material from the reading, you will work on practice notebooks during class in small groups.
- Everyone may have up to two unexcused absences.
- Additional unexcused absences will harm your participation grade.
- You may email me to request an excused absence.
- Your request will be approved if it is for illness, hospitalization, death in the family, important religious holidays, or university activities (e.g., field trips, University-sponsored athletic events).
- You do not need to provide details if you request an absence for one of the reasons above.
- Simply say, “I’m sick.” You don’t need to tell me, e.g., “I have strep throat.”
- Students who receive excused absences may submit homework late.
- Please request an excused absence before missing class if
possible.
- If you request an excused absence after missing class, you must do so as soon as possible.
- If you cannot attend a scheduled exam or the final project presentation, please let me know as far in advance as possible.
Exams
Each of the three exams has two halves: a take-home open-book part and an in-class closed-book part.
The take-home will be distributed in advance of the in-class exam. Answers to the take-home should be submitted by the beginning of the in-class exam. A list of topics for the in-class exam will be posted on the course website.
Final Project
The final project will be due during the last week of class. The project will ask you to find or create a new data set, and apply techniques learned throughout the class to analyze it. The project will take the form of a digital poster session and a one-page reflection. Detailed instructions will be provided later in the course.
Honor
This course is taught in accordance with the University of Richmond Honor Code, which can be accessed via the Honor Councils website. You are encouraged to collaborate on homework, but each student must contribute work to the group. On exams, cheating includes, but is not limited to, viewing another’s work with or without their consent, or duplicating any portion of it. If you are found to have violated the Honor Code, you will fail this course. If you ever have any questions about whether an action would be an honor violation, please ask.
Generative Artificial Intelligence
Generative artificial intelligence (GenAI) programs, especially large language models (LLMs), can be useful tools for coding. However, overreliance on LLMs or other resources where you can copy answers to programming problems directly (e.g., StackOverflow) will impede your learning.
Moreover, LLMs sometimes answer questions incorrectly and/or using different methodologies than those we study in class. When LLMs make stuff up (e.g., libraries that don’t exist), this is referred to as “hallucination.” In order to be an effective user of these technologies, it is crucial for you to recognize when that happens and how to respond.
Prohibited uses of GenAI
- Submitting any model output, in part or in whole, as your own work. This includes code and writing.
- Uploading any data used in this course (e.g., .csv files) to multimodal GenAI tools like ChatGPT.
- Using GenAI tools on the take-home portion of exams.
Any of the above uses will be treated as a violation of the Honor Code.
Permitted uses of GenAI
If you get stuck on a homework problem, I ask you to try these things in this order:
- Review the course notes.
- Talk to your classmates.
- Search for credible information online.
- After trying all of the above, talk to the Custom GPT generated for this class.
- If you use information from the Custom GPT, you must cite your interactions with it.
- This page explains how to share a link to a ChatGPT interaction.
- An adequate citation would be: “I got this help from the Custom GPT to solve this problem.”
Communication
- I respond to email within 1 to 2 business days.
- Do not expect responses over the weekend.
- If you have not received a response after 2 business days, feel free to write me again.
- For the most prompt response, schedule your email to arrive early in the morning (e.g., 7 or 8 AM).
Resources
The University of Richmond has many resources on campus that may help you succeed.
Weinstein Learning Center
The Weinstein Learning Center is your go-to destination for
academic support. Our services are tailored to help you achieve
your academic goals throughout your time at University of
Richmond. To learn more and view service schedules and
appointment times, visit https://wlc.richmond.edu. Available services
include:
Academic Skills Coaching
Meet with a professional staff member who will collaborate
with you to assess and develop your academic and life skills
(e.g., critical reading and thinking, information
conceptualization, concentration, test preparation, time
management, stress management, and more).
Content Tutoring
Peer consultants offer assistance in specific courses and
subject areas. They are available for appointments (in-person
and virtual) and drop-in sessions. See schedules at https://wlc.richmond.edu for supported courses
and drop-in times.
English Language Learning
Attend one-on-one or group consultations, workshops, and
other services focused on English, academic, and/or
intercultural skills.
Quantitative and Programming Resources
Peer consultants and professional staff offer workshops or
one-on-one appointments to build quantitative and programming
skills and provide statistical assistance for research
projects.
Speech and Communication
Prepare and practice for academic presentations, speaking
engagements, and other occasions of public expression. Peer
consultants offer recording, playback, and coaching for both
individual and group presentations. Students can expect
recommendations regarding clarity, organization, style, and
delivery.
Technology Learning
Visit our student lab dedicated to supporting digital media
projects. Services include camera checkout, video/audio
recording assistance, use of virtual reality equipment, poster
printing, 3D printing and modeling, and consultation services on
a variety of software.
Writing
Assists student writers at all levels of experience, across all majors. Meet with peer consultants who can offer feedback on written work and suggest pre-writing, drafting, and revision strategies.
Boatwright Library
Students may consult librarians to assist with their research, which may be especially useful for the final project. Use the Ask a Librarian service to reach librarians by email, phone, chat, text, or in person.
Counseling and Psychological Services
Mental health is crucial for academic success. Counseling and Psychological Services at the University of Richmond supports student success and enhances student well-being by providing comprehensive clinical services to currently enrolled, full-time, degree-seeking students.
Acknowledgments
This course builds on previous iterations taught by Taylor Arnold and Lilla Orr.
Recommended Reading
- Taylor Arnold and Lauren Tilton, Humanities Data in R: Exploring Networks, Geospatial Data, Images, and Text, 1st ed., Quantitative Methods in the Humanities and Social Sciences (Springer International Publishing, 2015), https://doi.org/10.1007/978-3-319-20702-5.
- Peter C. Bruce, Andrew Bruce, and Peter Gedeck, Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python, 2nd ed. (O’Reilly Media, 2020).
- Matthew Lee Jockers and Rosamond Thalken, Text Analysis with R: For Students of Literature, 2nd ed., Quantitative Methods in the Humanities and Social Sciences (Springer, 2020).
- Max Kuhn and Julia Silge, Tidy Modeling with R: A Framework for Modeling in the Tidyverse (O’Reilly Media, 2022).
- Julia Silge and David Robinson, Text Mining with R: A Tidy Approach, 1st ed. (O’Reilly, 2017).
- Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund, R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, 2nd ed. (O’Reilly, 2023).