Averaging Americans: Literature, Statistics, and Inequality discovers a significant trend in US literary history by computationally analyzing more than 18,000 works of US fiction across the long nineteenth century. American authors increasingly make generalizations about individuals and groups of people using statistical reasoning, and become increasingly confident in asserting fixed characteristics as central tendencies of groups. I argue that this trend reflects an increasing awareness of and anxiety about questions of population, polity, and representativeness, one that manifests in the attempt to shore up qualitative judgments through quantification. Statistical reasoning is central to issues of race, class, and power that have preoccupied Americanists for decades, yet its influence on American literature has scarcely been studied. This is not because writers of the period were unacquainted with the power of statistics to (mis)represent the world; Mark Twain famously warned his readers about “three kinds of lies: lies, damned lies, and statistics.”
In the data, I find a cohort of words, including average, center, typical, commonplace, someone, anyone, and everyone, that rise in frequency faster than almost every other term across the long nineteenth century. They often outpace words that we would expect to increase rapidly, such as telegraph or railway. These terms quantitatively distinguish postbellum literature from antebellum literature better than almost any others, and they undergo particular growth after Reconstruction. What makes words like average and typical interesting for literary criticism is precisely that, at first, they appear to decrease rather than increase particularity, literature’s usual claim to power as compared to the sciences. Readers may have no reason to imagine an “average American” differently from an otherwise uncharacterized American unless, of course, that assertion of qualitative majoritarianism serves an end in itself by reinforcing quantitatively false notions about who represents America. The rising authority of quantification coincides with its abuse.
Critics have long argued that literary realism and naturalism invest in the question of their own representativeness, a point that has usually been made with reference to the influence of the natural sciences on these modes of writing. Early theorists and practitioners of American literary realism argue for the need to turn away from the idealism of the prewar period and instead represent the world “as it is,” which, for some, depends upon a scientific worldview committed to quantitative exactitude and proportionality. Aesthetically, this attempt to avoid misrepresentation reaches its apogee in what is often described as the pessimistic determinism of literary naturalism. I extend this longstanding critical discussion about the role of the sciences in the literary representation of the world to the increasing authority of statistics, a mode of knowledge that undergirds multiple scientific discourses and that rapidly achieves both prestige and popular authority during this period as a way to better know the natural world and society. The statistical imagination offers authors and other intellectuals new ways of thinking about the relationship between individuals and society.