Literary Text Mining
Description
Computational methods have made it possible to analyze literature in new ways and at new scales. This course trains students in theories and methods of computational literary studies. It requires no background in computer programming or literary criticism. We begin with fundamentals of the Python programming language before moving on to computational analyses of literary texts. Our analyses will be informed by critical readings. While programming distinguishes this course from many other humanities classes, this course is fundamentally about literature, and what we can learn about it at the new scales afforded by computation.
Objectives
Students will be able to:
- Plan and justify a program of computational literary analysis.
- Create a corpus and metadata to study a research question.
- Use all basic elements of Python proficiently.
- Use several data science and text mining packages.
- Visualize data.
- Integrate data analysis into literary criticism.
Technology
All students will need a laptop running macOS, Linux, or Windows that they can bring to class. (macOS is preferred.) Chromebooks and iPads will not work for this course. If you need access to an appropriate laptop, please let me know ASAP.
Readings
All readings for this course will be posted on Canvas. Please note that many readings will be excerpted. If you need paper copies, please let me know ASAP.
Calendar
I give abbreviated titles for some readings in the calendar. Full citations are available in the bibliography at the end of the syllabus.
Date | Reading |
---|---|
Sep 23 | Kirschenbaum, “What is ‘digital humanities’?” [1] |
Sep 25 | Liu, “The Meaning of the Digital Humanities” [2] |
Sep 30 | Algee-Hewitt and McGurl, “Between Canon and Corpus” [3]; Porter, “Popularity and Prestige” [4] |
Oct 2 | Burrows, Computation into Criticism [5] |
Oct 7 | Piper, Enumerations [6] |
Oct 9 | Arnold and Tilton, “Role of Statistics in DH” [7] |
Oct 14 | No class |
Oct 16 | Ramsay, “An Algorithmic Criticism” [8] |
Oct 21 | Bode, “Equivalence of ‘Close’ and ‘Distant’” [9] |
Oct 23 | Guldi, “Critical Search” [10] |
Oct 28 | Mandell, “Gender and Cultural Analytics” [11]; Underwood et al., “The Transformation of Gender” [12] |
Oct 30 | Heuser and Le-Khac, “A Quantitative Literary History” [13] |
Nov 4 | Fields, Racecraft (GIGO) [14]; McPherson, “Why are the digital humanities so white?” [15] |
Nov 6 | Ruberg et al., “Toward a Queer Digital Humanities” [16] |
Nov 11 | Risam, “The Stakes of Postcolonial Digital Humanities” [17] |
Nov 13 | English, “Now, Not Now” [18]; Blei, “Probabilistic Topic Models” [19] |
Nov 18 | Underwood, Distant Horizons [20] |
Nov 20 | D’Ignazio and Klein, “Feminist Data Visualization” [21]; Clement, “Text Analysis” [22]; Drucker, “Graphical Display” [23] |
Nov 25 | Recess |
Nov 27 | Recess |
Dec 2 | Presentations |
Dec 4 | Presentations |
Dec 13 | Final paper due |
Literary Lab meetings
Students are encouraged to attend meetings of the Stanford Literary Lab, a research group that focuses on computational text analysis: https://litlab.stanford.edu/
Grading
I round grades to the nearest integer.
Letter | Range |
---|---|
A | 93 or more |
A- | 90-92 |
B+ | 87-89 |
B | 83-86 |
B- | 80-82 |
C+ | 77-79 |
C | 73-76 |
C- | 70-72 |
D+ | 67-69 |
D | 60-66 |
F | 59 or less |
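
For students who want to check the arithmetic, the rounding rule and the table above can be sketched in Python. This is an illustration only, not an official grade calculator. The function name `letter_grade` is hypothetical, and the sketch assumes that a score ending in exactly .5 rounds up, which the syllabus does not specify.

```python
import math

# Lower bound of each letter range, copied from the table above, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
]


def letter_grade(score: float) -> str:
    """Round a numeric score to the nearest integer and look up its letter."""
    # Assumption: halves round up (92.5 -> 93). Python's built-in round()
    # rounds halves to the nearest even integer, so it is not used here.
    rounded = math.floor(score + 0.5)
    for cutoff, letter in CUTOFFS:
        if rounded >= cutoff:
            return letter
    return "F"  # 59 or less


if __name__ == "__main__":
    for example in (92.5, 89.4, 59.5):
        print(example, letter_grade(example))
```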
Evaluation
Item | Percentage | Description |
---|---|---|
Homework | 40% | Weekly assignments that apply material from class. |
Participation | 20% | Attendance, discussion, and lab work. |
Presentation | 10% | One in-class presentation of original research. |
Final Essay | 30% | Uses evidence from computational analysis to advance a literary critical argument. |
Homework will include programming practice, new problems, and written reflections on readings or results.
Attendance is crucial in this course since computational methods build on each other. If you must miss a class, I expect you to catch up on the material you missed, either in office hours or with classmates.
Further details on the presentation and final essay will be circulated later in the quarter.
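The percentages in the evaluation table combine as a weighted average. The sketch below shows that calculation in Python under two assumptions that the syllabus does not state: each item is scored out of 100, and the names `WEIGHTS` and `course_score` are hypothetical, chosen for illustration.

```python
# Weights copied from the evaluation table; they sum to 1.0.
WEIGHTS = {
    "homework": 0.40,
    "participation": 0.20,
    "presentation": 0.10,
    "final_essay": 0.30,
}


def course_score(scores: dict) -> float:
    """Return the weighted average of the four item scores (each out of 100)."""
    return sum(WEIGHTS[item] * scores[item] for item in WEIGHTS)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example = {
        "homework": 90,
        "participation": 100,
        "presentation": 85,
        "final_essay": 88,
    }
    print(round(course_score(example), 1))
```

With these made-up scores, the weighted terms are 36 + 20 + 8.5 + 26.4, so the total is 90.9 before rounding to a letter grade.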
Miscellaneous Items
Accommodations
- Please let me know about any accommodations you require ASAP.
- Contact the Office of Accessible Education if you need an updated letter.
Absences
- Students are allowed one unexcused absence.
- Email me at least 24 hours prior to class to secure an excused absence.
- If you know you will be absent on specific days this quarter (e.g. away games), please let me know as far in advance as possible.
- I usually reply to email within a day. If you have not heard from me after that time, please write again.
Office hours
- If you cannot make my regular office hours, I would be happy to schedule a meeting at another time that works for both of us.
- To discuss writing in progress, I ask that you send me your text two days before we meet.
Late work
- I apply a 10% penalty to late work for each day that passes after the original deadline.
- I do not accept work that is more than four days late.
- (Yes, weekends count.)
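
The late-work rules above can be sketched as a short Python function. This is a rough illustration, not an official policy statement. It assumes that the penalty is 10 percentage points per calendar day on an assignment scored out of 100, and that days late are counted as whole days. The name `late_score` is hypothetical.

```python
def late_score(raw_score: float, days_late: int) -> float:
    """Apply the late penalty to a score out of 100."""
    if days_late <= 0:
        return raw_score  # on time: no penalty
    if days_late > 4:
        return 0.0  # more than four days late: not accepted
    # Assumption: 10 points off per calendar day, weekends included,
    # and the result never drops below zero.
    return max(0.0, raw_score - 10 * days_late)


if __name__ == "__main__":
    for days in range(0, 6):
        print(days, late_score(90, days))
```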
Other
- Take advantage of the Hume Center for Writing and Speaking while drafting and revising your final essay.
- If you get stuck on a programming problem, I encourage you to reach out to your classmates to solve it together.
- Mental and physical health are essential for you to do your best work. If you are feeling unwell at any point during the quarter, please talk to me.
Appendix
Honor Code
The Honor Code is an undertaking of the students, individually and collectively:
that they will not give or receive aid in examinations; that they will not give or receive unpermitted aid in class work, in the preparation of reports, or in any other work that is to be used by the instructor as the basis of grading;
that they will do their share and take an active part in seeing to it that others as well as themselves uphold the spirit and letter of the Honor Code.
The faculty on its part manifests its confidence in the honor of its students by refraining from proctoring examinations and from taking unusual and unreasonable precautions to prevent the forms of dishonesty mentioned above. The faculty will also avoid, as far as practicable, academic procedures that create temptations to violate the Honor Code.
While the faculty alone has the right and obligation to set academic requirements, the students and faculty will work together to establish optimal conditions for honorable academic work.
Fundamental Standard
Students are expected to show both within and without the University such respect for order, morality, personal honor, and the rights of others as is demanded of good citizens. Failure to do this will be sufficient cause for removal from the University.
Students with Documented Disabilities
Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is being made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations.
The Office of Accessible Education
563 Salvatierra Walk
650-723-1066
https://studentaffairs.stanford.edu/oae
Campus Resources
Hume Center for Writing and Speaking
https://sites.stanford.edu/undergrad/tutoring-support/hume-center
Office of Sexual Assault & Relationship Abuse Education & Response (SARA)
https://sara.stanford.edu
Counseling and Psychological Services (CAPS)
https://vaden.stanford.edu/caps
The Bridge Peer Counseling
https://haas.stanford.edu/students/cardinal-commitment/bridge
English for Foreign Students
https://language.stanford.edu/programs/efs/languages/english-foreign-students
Academic Skills Coaching
http://learningconnection.stanford.edu/academic-skills-coaching
Undergraduate Advising
https://undergrad.stanford.edu/advising
Community Center Resources
https://undergrad.stanford.edu/tutoring-support/community-center-resources
Acknowledgments
This syllabus extends past iterations of this course taught by Mark Algee-Hewitt and Ryan Heuser. It has also benefitted, directly and indirectly, from syllabi by David Bamman, Andrew Goldstone, Laura McGrath, and J. D. Porter.
Bibliography
1. Matthew Kirschenbaum, “What Is Digital Humanities and What’s It Doing in English Departments?” in Debates in the Digital Humanities, ed. Matthew K. Gold (University of Minnesota Press, 2012).
2. Alan Liu, “The Meaning of the Digital Humanities,” PMLA/Publications of the Modern Language Association of America 128, no. 2 (2013): 409–23, https://doi.org/10.1632/pmla.2013.128.2.409.
3. Mark Algee-Hewitt and Mark McGurl, “Between Canon and Corpus: Six Perspectives on 20th-Century Novels,” Pamphlets of the Stanford Literary Lab, no. 8 (January 2015).
4. J. D. Porter, “Popularity/Prestige,” Pamphlets of the Stanford Literary Lab, no. 17 (September 2018).
5. J. F. Burrows, Computation into Criticism: A Study of Jane Austen’s Novels and an Experiment in Method (Clarendon Press; Oxford University Press, 1987).
6. Andrew Piper, Enumerations: Data and Literary Study (The University of Chicago Press, 2018).
7. Taylor Arnold and Lauren Tilton, “New Data? The Role of Statistics in DH,” in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein, Debates in the Digital Humanities (University of Minnesota Press, 2019).
8. Stephen Ramsay, Reading Machines: Toward an Algorithmic Criticism, Topics in the Digital Humanities (University of Illinois Press, 2011).
9. Katherine Bode, “The Equivalence of ‘Close’ and ‘Distant’ Reading; or, Toward a New Object for Data-Rich Literary History,” Modern Language Quarterly: A Journal of Literary History 78, no. 1 (2017): 77–106, https://doi.org/10.1215/00267929-3699787.
10. Jo Guldi, “Critical Search: A Procedure for Guided Reading in Large-Scale Textual Corpora,” Journal of Cultural Analytics 3, no. 1 (2018), https://doi.org/10.22148/16.030.
11. Laura Mandell, “Gender and Cultural Analytics: Finding or Making Stereotypes?” in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein, Debates in the Digital Humanities 5 (University of Minnesota Press, 2019).
12. Ted Underwood, David Bamman, and Sabrina Lee, “The Transformation of Gender in English-Language Fiction,” Journal of Cultural Analytics, ahead of print, 2018, https://doi.org/10.22148/16.019.
13. Ryan Heuser and Long Le-Khac, “A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method,” Pamphlets of the Stanford Literary Lab, no. 4 (May 2012).
14. Karen E. Fields and Barbara Jeanne Fields, Racecraft: The Soul of Inequality in American Life (Verso, 2012).
15. Tara McPherson, “Why Are the Digital Humanities So White? Or Thinking the Histories of Race and Computation,” in Debates in the Digital Humanities, ed. Matthew K. Gold (University of Minnesota Press, 2012).
16. Bonnie Ruberg, Jason Boyd, and James Howe, “Toward a Queer Digital Humanities,” in Bodies of Information, ed. Elizabeth Losh and Jacqueline Wernimont, Intersectional Feminism and the Digital Humanities (University of Minnesota Press, 2018), https://doi.org/10.5749/j.ctv9hj9r9.11.
17. Roopika Risam, New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis, and Pedagogy (Northwestern University Press, 2019).
18. James F. English, “Now, Not Now: Counting Time in Contemporary Fiction Studies,” Modern Language Quarterly 77, no. 3 (2016): 395–418, https://doi.org/10.1215/00267929-3570667.
19. David M. Blei, “Probabilistic Topic Models,” Communications of the ACM 55, no. 4 (2012): 77, https://doi.org/10.1145/2133806.2133826.
20. Ted Underwood, Distant Horizons: Digital Evidence and Literary Change (The University of Chicago Press, 2019).
21. Catherine D’Ignazio and Lauren F. Klein, Data Feminism, Strong Ideas Series (The MIT Press, 2020).
22. Tanya Clement, “Text Analysis, Data Mining, and Visualizations in Literary Scholarship,” in Literary Studies in the Digital Age, 2013.
23. Johanna Drucker, “Humanities Approaches to Graphical Display,” Digital Humanities Quarterly 5, no. 1 (2011).