Literary Text Mining

Description

Computational methods have made it possible to analyze literature in new ways and at new scales. This course trains students in theories and methods of computational literary studies. It requires no background in computer programming or literary criticism. We begin with fundamentals of the Python programming language before moving on to computational analyses of literary texts. Our analyses will be informed by critical readings. While programming distinguishes this course from many other humanities classes, this course is fundamentally about literature, and what we can learn about it at the new scales afforded by computation.

Objectives

Students will be able to:

  • Plan and justify a program of computational literary analysis.
  • Create a corpus and metadata to study a research question.
  • Use all basic elements of Python proficiently.
  • Use several data science and text mining packages.
  • Visualize data.
  • Integrate data analysis into literary criticism.

Technology

All students will need a laptop running macOS, Linux, or Windows that they can bring to class. (macOS is preferred.) Chromebooks and iPads will not work for this course. If you need access to an appropriate laptop, please let me know ASAP.

Readings

All readings for this course will be posted on Canvas. Please note that many readings will be excerpted. If you need paper copies, please let me know ASAP.

Calendar

I give abbreviated titles for some readings in the calendar. The full citation follows each abbreviated title.

Date Reading
Sep 23 Kirschenbaum, “What is ‘digital humanities’?”
Matthew Kirschenbaum, “What Is Digital Humanities and What’s It Doing in English Departments?” in Debates in the Digital Humanities, ed. Matthew K. Gold (University of Minnesota Press, 2012).

Sep 25 Liu, “The Meaning of the Digital Humanities”
Alan Liu, “The Meaning of the Digital Humanities,” PMLA/Publications of the Modern Language Association of America 128, no. 2 (2013): 409–23, https://doi.org/10.1632/pmla.2013.128.2.409.

Sep 30 Algee-Hewitt and McGurl, “Between Canon and Corpus”
Mark Algee-Hewitt and Mark McGurl, “Between Canon and Corpus: Six Perspectives on 20th-Century Novels,” Pamphlets of the Stanford Literary Lab, no. 8 (January 2015).


Porter, “Popularity and Prestige”
J. D. Porter, “Popularity/Prestige,” Pamphlets of the Stanford Literary Lab, no. 17 (September 2018).

Oct 2 Burrows, Computation into Criticism
J. F. Burrows, Computation into Criticism: A Study of Jane Austen’s Novels and an Experiment in Method (Clarendon Press; Oxford University Press, 1987).

Oct 7 Piper, Enumerations
Andrew Piper, Enumerations: Data and Literary Study (The University of Chicago Press, 2018).

Oct 9 Arnold and Tilton, “Role of Statistics in DH”
Taylor Arnold and Lauren Tilton, “New Data? The Role of Statistics in DH,” in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein, Debates in the Digital Humanities (University of Minnesota Press, 2019).

Oct 14 No class
Oct 16 Ramsay, “An Algorithmic Criticism”
Stephen Ramsay, Reading Machines: Toward an Algorithmic Criticism, Topics in the Digital Humanities (University of Illinois Press, 2011).

Oct 21 Bode, “Equivalence of ‘Close’ and ‘Distant’”
Katherine Bode, “The Equivalence of Close and Distant Reading; or, Toward a New Object for Data-Rich Literary History,” Modern Language Quarterly: A Journal of Literary History 78, no. 1 (2017): 77–106, https://doi.org/10.1215/00267929-3699787.

Oct 23 Guldi, “Critical Search”
Jo Guldi, “Critical Search: A Procedure for Guided Reading in Large-Scale Textual Corpora,” Journal of Cultural Analytics 3, no. 1 (2018), https://doi.org/10.22148/16.030.

Oct 28 Mandell, “Gender and Cultural Analytics”
Laura Mandell, “Gender and Cultural Analytics: Finding or Making Stereotypes?” in Debates in the Digital Humanities 2019, ed. Matthew K. Gold and Lauren F. Klein, Debates in the Digital Humanities 5 (University of Minnesota Press, 2019).


Underwood et al., “The Transformation of Gender”
Ted Underwood, David Bamman, and Sabrina Lee, “The Transformation of Gender in English-Language Fiction,” Journal of Cultural Analytics, ahead of print, 2018, https://doi.org/10.22148/16.019.

Oct 30 Heuser and Le-Khac, “A Quantitative Literary History”
Ryan Heuser and Long Le-Khac, “A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method,” Pamphlets of the Stanford Literary Lab, no. 4 (May 2012).

Nov 4 Fields, Racecraft (GIGO)
Karen E. Fields and Barbara Jeanne Fields, Racecraft: The Soul of Inequality in American Life (Verso, 2012).


McPherson, “Why are the digital humanities so white?”
Tara McPherson, “Why Are the Digital Humanities So White? Or Thinking the Histories of Race and Computation,” in Debates in the Digital Humanities, ed. Matthew K. Gold (University of Minnesota Press, 2012).

Nov 6 Ruberg et al., “Toward a Queer Digital Humanities”
Bonnie Ruberg, Jason Boyd, and James Howe, “Toward a Queer Digital Humanities,” in Bodies of Information, ed. Elizabeth Losh and Jacqueline Wernimont, Intersectional Feminism and the Digital Humanities (University of Minnesota Press, 2018), https://doi.org/10.5749/j.ctv9hj9r9.11.

Nov 11 Risam, “The Stakes of Postcolonial Digital Humanities”
Roopika Risam, New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis, and Pedagogy (Northwestern University Press, 2019).

Nov 13 English, “Now, Not Now”
James F. English, “Now, Not Now: Counting Time in Contemporary Fiction Studies,” Modern Language Quarterly 77, no. 3 (2016): 395–418, https://doi.org/10.1215/00267929-3570667.


Blei, “Probabilistic Topic Models”
David M. Blei, “Probabilistic Topic Models,” Communications of the ACM 55, no. 4 (2012): 77, https://doi.org/10.1145/2133806.2133826.

Nov 18 Underwood, Distant Horizons
Ted Underwood, Distant Horizons: Digital Evidence and Literary Change (The University of Chicago Press, 2019).

Nov 20 D’Ignazio and Klein, “Feminist Data Visualization”
Catherine D’Ignazio and Lauren F. Klein, Data Feminism, Strong Ideas Series (The MIT Press, 2020).


Clement, “Text Analysis”
Tanya Clement, “Text Analysis, Data Mining, and Visualizations in Literary Scholarship,” in Literary Studies in the Digital Age, 2013.


Drucker, “Graphical Display”
Johanna Drucker, “Humanities Approaches to Graphical Display,” Digital Humanities Quarterly 5, no. 1 (2011).

Nov 25 Recess
Nov 27 Recess
Dec 2 Presentations
Dec 4 Presentations
Dec 13 Final paper due

Literary Lab meetings

Students are encouraged to attend meetings of the Stanford Literary Lab, a research group that focuses on computational text analysis: https://litlab.stanford.edu/

Grading

I round grades to the nearest integer.

Letter Range
A 93 or more
A- 90-92
B+ 87-89
B 83-86
B- 80-82
C+ 77-79
C 73-76
C- 70-72
D+ 67-69
D 60-66
F 59 or less

Evaluation

Item Percentage Description
Homework 40% Weekly assignments that apply material from class.
Participation 20% Attendance, discussion, and lab work.
Presentation 10% One in-class presentation of original research.
Final Essay 30% Uses evidence from computational analysis to advance a literary critical argument.
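
Since this is a Python course, the arithmetic behind a final grade can be sketched in a few lines. This is only an illustration of the weights and letter ranges given in this syllabus, not an official calculator. The names `WEIGHTS`, `CUTOFFS`, `course_score`, and `letter_grade` are hypothetical, the example scores are made up, and the sketch assumes that every item is scored out of 100 and that halves round up.

```python
import math

# Weights from the Evaluation table (percent of the final grade).
WEIGHTS = {"homework": 40, "participation": 20, "presentation": 10, "final_essay": 30}

# Lower bounds from the Letter/Range table, highest first.
CUTOFFS = [(93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
           (77, "C+"), (73, "C"), (70, "C-"), (67, "D+"), (60, "D")]


def course_score(scores):
    """Return the weighted average of the item scores, rounded to an integer."""
    total = sum(WEIGHTS[item] * scores[item] for item in WEIGHTS) / 100
    # Assumption: halves round up. Python's built-in round() rounds halves
    # to the nearest even integer, which may not be what is intended here.
    return math.floor(total + 0.5)


def letter_grade(score):
    """Map a rounded score to a letter using the ranges in the grading table."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "F"


# Made-up scores for illustration only.
example = {"homework": 92, "participation": 100, "presentation": 85, "final_essay": 88}
final = course_score(example)
print(final, letter_grade(final))
```

With these made-up scores the weighted average is 91.7, which rounds to 92 and falls in the A- range.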

Homework will include programming practice, new problems, and written reflections on readings or results.

Attendance is crucial in this course since computational methods build on each other. If you must miss a class, I expect you to catch up on the material you missed, either in office hours or with classmates.

Further details on the presentation and final essay will be circulated later in the quarter.

Miscellaneous Items

Accommodations

  • Please let me know about any accommodations you require ASAP.
  • Contact the Office of Accessible Education if you need an updated letter.

Absences

  • Students are allowed one unexcused absence.
  • Email me at least 24 hours prior to class to secure an excused absence.
  • If you know you will be absent on specific days this quarter (e.g. away games), please let me know as far in advance as possible.

Email

  • I usually reply to email within a day. If you have not heard from me after that time, please write again.

Office hours

  • If you cannot make my regular office hours, I would be happy to schedule a meeting at another time that works for both of us.
  • To discuss writing in progress, I ask that you send me your text two days before we meet.

Late work

  • I apply a 10% penalty to late work for each day that passes after the original deadline.
  • I do not accept work that is more than four days late.
  • (Yes, weekends count.)
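
The late-work rules above can be sketched in Python as follows. This is one possible reading rather than the official rule: the function name `late_score` is hypothetical, and the sketch assumes that scores are out of 100, that each day deducts 10 points, and that any part of a day counts as a full day.

```python
import math
from datetime import datetime


def late_score(raw_score, deadline, submitted):
    """Return the score out of 100 after the late penalty (0 if not accepted)."""
    seconds_late = (submitted - deadline).total_seconds()
    if seconds_late <= 0:
        return raw_score  # on time: no penalty
    # Assumption: any part of a day counts as a full day. Weekends count.
    days_late = math.ceil(seconds_late / 86400)
    if days_late > 4:
        return 0  # work more than four days late is not accepted
    # Assumption: each day late deducts 10 points from the score.
    return max(raw_score - 10 * days_late, 0)


# Made-up dates for illustration: submitted 40 hours after the deadline.
deadline = datetime(2030, 1, 7, 17, 0)
submitted = datetime(2030, 1, 9, 9, 0)
print(late_score(90, deadline, submitted))
```

Under these assumptions, work submitted 40 hours late counts as two days late, so a raw score of 90 becomes 70.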

Other

  • Take advantage of the Hume Center for Writing and Speaking while drafting and revising your final essay.
  • If you get stuck on a programming problem, I encourage you to reach out to your classmates to solve it together.
  • Mental and physical health are essential for you to do your best work. If you are feeling unwell at any point during the quarter, please talk to me.

Appendix

Honor Code

The Honor Code is an undertaking of the students, individually and collectively:

  1. that they will not give or receive aid in examinations; that they will not give or receive unpermitted aid in class work, in the preparation of reports, or in any other work that is to be used by the instructor as the basis of grading;

  2. that they will do their share and take an active part in seeing to it that others as well as themselves uphold the spirit and letter of the Honor Code.

The faculty on its part manifests its confidence in the honor of its students by refraining from proctoring examinations and from taking unusual and unreasonable precautions to prevent the forms of dishonesty mentioned above. The faculty will also avoid, as far as practicable, academic procedures that create temptations to violate the Honor Code.

While the faculty alone has the right and obligation to set academic requirements, the students and faculty will work together to establish optimal conditions for honorable academic work.

Fundamental Standard

Students are expected to show both within and without the University such respect for order, morality, personal honor, and the rights of others as is demanded of good citizens. Failure to do this will be sufficient cause for removal from the University.

Students with Documented Disabilities

Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is being made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations.

The Office of Accessible Education
563 Salvatierra Walk
650-723-1066
https://studentaffairs.stanford.edu/oae

Campus Resources

Hume Center for Writing and Speaking
https://sites.stanford.edu/undergrad/tutoring-support/hume-center

Office of Sexual Assault & Relationship Abuse Education & Response (SARA)
https://sara.stanford.edu

Counseling and Psychological Services (CAPS)
https://vaden.stanford.edu/caps

The Bridge Peer Counseling
https://haas.stanford.edu/students/cardinal-commitment/bridge

English for Foreign Students
https://language.stanford.edu/programs/efs/languages/english-foreign-students

Academic Skills Coaching
http://learningconnection.stanford.edu/academic-skills-coaching

Undergraduate Advising
https://undergrad.stanford.edu/advising

Community Center Resources
https://undergrad.stanford.edu/tutoring-support/community-center-resources

Acknowledgments

This syllabus extends past iterations of this course taught by Mark Algee-Hewitt and Ryan Heuser. It has also benefitted, directly and indirectly, from syllabi by David Bamman, Andrew Goldstone, Laura McGrath, and J. D. Porter.