Literary Text Mining

Description

Computational methods have made it possible to analyze literature in new ways and at new scales. This course trains students in theories and methods of computational literary studies. It requires no background in computer programming or literary criticism. We begin with fundamentals of the Python programming language before moving on to computational analyses of literary texts. Our analyses will be informed by critical readings. While programming distinguishes this course from many other humanities classes, it is fundamentally about literature and what we can learn about it at the new scales afforded by computation.

Objectives

Students will be able to:

Technology

All students will need a laptop running macOS, Linux, or Windows that they can bring to class. (macOS is preferred.) Chromebooks and iPads will not work for this course. If you need access to an appropriate laptop, please let me know ASAP.

Readings

All readings for this course will be posted on Canvas. Please note that many readings will be excerpted. If you need paper copies, please let me know ASAP.

Calendar

I give abbreviated titles for some readings in the calendar. Full citations are available in the bibliography at the end of the syllabus.

Date     Reading
Sep 23   Kirschenbaum, “What is ‘digital humanities’?” [1]
Sep 25   Liu, “The Meaning of the Digital Humanities” [2]
Sep 30   Algee-Hewitt and McGurl, “Between Canon and Corpus” [3]
         Porter, “Popularity and Prestige” [4]
Oct 2    Burrows, Computation into Criticism [5]
Oct 7    Piper, Enumerations [6]
Oct 9    Arnold and Tilton, “Role of Statistics in DH” [7]
Oct 14   No class
Oct 16   Ramsay, “An Algorithmic Criticism” [8]
Oct 21   Bode, “Equivalence of ‘Close’ and ‘Distant’” [9]
Oct 23   Guldi, “Critical Search” [10]
Oct 28   Mandell, “Gender and Cultural Analytics” [11]
         Underwood et al., “The Transformation of Gender” [12]
Oct 30   Heuser and Le-Khac, “A Quantitative Literary History” [13]
Nov 4    Fields, Racecraft (GIGO) [14]
         McPherson, “Why are the digital humanities so white?” [15]
Nov 6    Ruberg et al., “Toward a Queer Digital Humanities” [16]
Nov 11   Risam, “The Stakes of Postcolonial Digital Humanities” [17]
Nov 13   English, “Now, Not Now” [18]
         Blei, “Probabilistic Topic Models” [19]
Nov 18   Underwood, Distant Horizons [20]
Nov 20   D’Ignazio and Klein, “Feminist Data Visualization” [21]
         Clement, “Text Analysis” [22]
         Drucker, “Graphical Display” [23]
Nov 25   Recess
Nov 27   Recess
Dec 2    Presentations
Dec 4    Presentations
Dec 13   Final paper due

Literary Lab meetings

Students are encouraged to attend meetings of the Stanford Literary Lab, a research group that focuses on computational text analysis: https://litlab.stanford.edu/

Grading

I round grades to the nearest integer.

Letter   Range
A        93 or more
A-       90–92
B+       87–89
B        83–86
B-       80–82
C+       77–79
C        73–76
C-       70–72
D+       67–69
D        60–66
F        59 or less

Evaluation

Item            Percentage   Description
Homework        40%          Weekly assignments that apply material from class.
Participation   20%          Attendance, discussion, and lab work.
Presentation    10%          One in-class presentation of original research.
Final Essay     30%          Uses evidence from computational analysis to advance a literary-critical argument.

Homework will include programming practice, new problems, and written reflections on readings or results.

Attendance is crucial in this course since computational methods build on each other. If you must miss a class, I expect you to cover the missed material in office hours or with classmates.

Further details on the presentation and final essay will be circulated later in the quarter.
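
As an illustration of how the percentages and letter ranges above combine, here is a minimal Python sketch. It is not an official grade calculator. It assumes each item is scored on a 0–100 scale and that halves round up (the syllabus says only "nearest integer"); the names `WEIGHTS`, `CUTOFFS`, `course_score`, and `letter_grade` are made up for this example.

```python
import math

# Weights from the Evaluation table, in percent.
WEIGHTS = {
    "homework": 40,
    "participation": 20,
    "presentation": 10,
    "final_essay": 30,
}

# Lower bound of each letter range from the Grading table, highest first.
CUTOFFS = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (60, "D"),
]


def course_score(scores):
    """Weighted average of item scores (assumed 0-100), rounded to an integer."""
    total = sum(WEIGHTS[item] * scores[item] for item in WEIGHTS)
    # Assumption: halves round up. Python's built-in round() would instead
    # round halves to the nearest even integer.
    return math.floor(total / 100 + 0.5)


def letter_grade(score):
    """Map a rounded course score to a letter using the Grading table."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example = {
    "homework": 90,
    "participation": 100,
    "presentation": 85,
    "final_essay": 88,
}
rounded = course_score(example)
print(rounded, letter_grade(rounded))  # prints: 91 A-
```

In this example the weighted average is 90.9, which rounds to 91 and falls in the A- range.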

Miscellaneous Items

Accommodations

Absences

Email

Office hours

Late work

Other

Appendix

Honor Code

The Honor Code is an undertaking of the students, individually and collectively:

  1. that they will not give or receive aid in examinations; that they will not give or receive unpermitted aid in class work, in the preparation of reports, or in any other work that is to be used by the instructor as the basis of grading;

  2. that they will do their share and take an active part in seeing to it that others as well as themselves uphold the spirit and letter of the Honor Code.

The faculty on its part manifests its confidence in the honor of its students by refraining from proctoring examinations and from taking unusual and unreasonable precautions to prevent the forms of dishonesty mentioned above. The faculty will also avoid, as far as practicable, academic procedures that create temptations to violate the Honor Code.

While the faculty alone has the right and obligation to set academic requirements, the students and faculty will work together to establish optimal conditions for honorable academic work.

Fundamental Standard

Students are expected to show both within and without the University such respect for order, morality, personal honor, and the rights of others as is demanded of good citizens. Failure to do this will be sufficient cause for removal from the University.

Students with Documented Disabilities

Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is being made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations.

The Office of Accessible Education
563 Salvatierra Walk
650-723-1066
https://studentaffairs.stanford.edu/oae

Campus Resources

Hume Center for Speaking and Writing
https://sites.stanford.edu/undergrad/tutoring-support/hume-center

Office of Sexual Assault & Relationship Abuse Education & Response (SARA)
https://sara.stanford.edu

Counseling and Psychological Services (CAPS)
https://vaden.stanford.edu/caps

The Bridge Peer Counseling
https://haas.stanford.edu/students/cardinal-commitment/bridge

English for Foreign Students
https://language.stanford.edu/programs/efs/languages/english-foreign-students

Academic Skills Coaching
http://learningconnection.stanford.edu/academic-skills-coaching

Undergraduate Advising
https://undergrad.stanford.edu/advising

Community Center Resources
https://undergrad.stanford.edu/tutoring-support/community-center-resources

Acknowledgments

This syllabus extends past iterations of this course taught by Mark Algee-Hewitt and Ryan Heuser. It has also benefitted, directly and indirectly, from syllabi by David Bamman, Andrew Goldstone, Laura McGrath, and J. D. Porter.


  1. Matthew Kirschenbaum, “What Is Digital Humanities and What’s It Doing in English Departments?” in Debates in the Digital Humanities, ed. Matthew K. Gold (University of Minnesota Press, 2012).

  2. Alan Liu, “The Meaning of the Digital Humanities,” PMLA/Publications of the Modern Language Association of America 128, no. 2 (2013): 409–23, https://doi.org/10.1632/pmla.2013.128.2.409.

  3. Mark Algee-Hewitt and Mark McGurl, “Between Canon and Corpus: Six Perspectives on 20th-Century Novels,” Pamphlets of the Stanford Literary Lab, no. 8 (January 2015).

  4. J. D. Porter, “Popularity/Prestige,” Pamphlets of the Stanford Literary Lab, no. 17 (September 2018).

  5. J. F. Burrows, Computation into Criticism: A Study of Jane Austen’s Novels and an Experiment in Method (Clarendon Press; Oxford University Press, 1987).

  6. Andrew Piper, Enumerations: Data and Literary Study (The University of Chicago Press, 2018).

  7. Taylor Arnold and Lauren Tilton, “New Data? The Role of Statistics in DH,” in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein, Debates in the Digital Humanities (University of Minnesota Press, 2019).

  8. Stephen Ramsay, Reading Machines: Toward an Algorithmic Criticism, Topics in the Digital Humanities (University of Illinois Press, 2011).

  9. Katherine Bode, “The Equivalence of Close and Distant Reading; or, Toward a New Object for Data-Rich Literary History,” Modern Language Quarterly: A Journal of Literary History 78, no. 1 (2017): 77–106, https://doi.org/10.1215/00267929-3699787.

  10. Jo Guldi, “Critical Search: A Procedure for Guided Reading in Large-Scale Textual Corpora,” Journal of Cultural Analytics 3, no. 1 (2018), https://doi.org/10.22148/16.030.

  11. Laura Mandell, “Gender and Cultural Analytics: Finding or Making Stereotypes?” in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein, Debates in the Digital Humanities 5 (University of Minnesota Press, 2019).

  12. Ted Underwood, David Bamman, and Sabrina Lee, “The Transformation of Gender in English-Language Fiction,” Journal of Cultural Analytics, ahead of print, 2018, https://doi.org/10.22148/16.019.

  13. Ryan Heuser and Long Le-Khac, “A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method,” Pamphlets of the Stanford Literary Lab, no. 4 (May 2012).

  14. Karen E. Fields and Barbara Jeanne Fields, Racecraft: The Soul of Inequality in American Life (Verso, 2012).

  15. Tara McPherson, “Why Are the Digital Humanities So White? Or Thinking the Histories of Race and Computation,” in Debates in the Digital Humanities, ed. Matthew K. Gold (University of Minnesota Press, 2012).

  16. Bonnie Ruberg, Jason Boyd, and James Howe, “Toward a Queer Digital Humanities,” in Bodies of Information, ed. Elizabeth Losh and Jacqueline Wernimont, Intersectional Feminism and the Digital Humanities (University of Minnesota Press, 2018), https://doi.org/10.5749/j.ctv9hj9r9.11.

  17. Roopika Risam, New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis, and Pedagogy (Northwestern University Press, 2019).

  18. James F. English, “Now, Not Now: Counting Time in Contemporary Fiction Studies,” Modern Language Quarterly 77, no. 3 (2016): 395–418, https://doi.org/10.1215/00267929-3570667.

  19. David M. Blei, “Probabilistic Topic Models,” Communications of the ACM 55, no. 4 (2012): 77, https://doi.org/10.1145/2133806.2133826.

  20. Ted Underwood, Distant Horizons: Digital Evidence and Literary Change (The University of Chicago Press, 2019).

  21. Catherine D’Ignazio and Lauren F. Klein, Data Feminism, Strong Ideas Series (The MIT Press, 2020).

  22. Tanya Clement, “Text Analysis, Data Mining, and Visualizations in Literary Scholarship,” in Literary Studies in the Digital Age, 2013.

  23. Johanna Drucker, “Humanities Approaches to Graphical Display,” Digital Humanities Quarterly 5, no. 1 (2011).