Or, “The Stuff of Fame: Material and Printed Constructions of Renown in America.”

A collaboration between the Smithsonian Institution and the Stanford Literary Lab.

With Kenneth Cohen, Laura McGrath, Mark Algee-Hewitt, Robyn Asleson, J.D. Porter, and Charlotte Lindemann.

I am the lead data scientist on this project.

Our point of departure is not the history of singular stars but the evolution of the firmament. To study the history of celebrity in the United States, we have compiled a corpus of millions of newspaper and magazine articles published between 1860 and 1960. Computational methods enable us to trace and compare the usage and prevalence of the vocabulary of renown, evident in words such as “famous,” “infamous,” “notorious,” “celebrity,” “star,” and related terms. We also employ Named Entity Recognition to extract individual names, to which we can apply quantitative metrics of celebrity: how often is a name mentioned in a national context over time, and how wide is the geographical spread across which it appears? This approach to names and references has already uncovered diverse famous people who have not received biographical study, and it has revealed how unexpected objects, places, and images have been famous in their own right and central to constructing individuals’ renown over time. Together, these methods and further experiments allow us for the first time to understand the development of fame in all its complexity, attuned to demographic, geographic, temporal, material, and conceptual evolutions that the existing scholarship has largely missed or underestimated.
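To make the two name-level metrics concrete, here is a minimal Python sketch of how mention frequency and geographical spread could be tallied once names have been extracted. It is an illustration under stated assumptions, not the project's actual pipeline: the record layout (name, year, place of publication), the placeholder names, and the `fame_metrics` helper are all hypothetical, and spread is approximated simply as the number of distinct places in which a name appears in a given year.

```python
from collections import defaultdict

# Hypothetical mention records, as might result from running NER over articles:
# (extracted name, publication year, place of publication).
# The names and places are placeholders, not data from the corpus.
mentions = [
    ("Person A", 1890, "New York, NY"),
    ("Person A", 1890, "Chicago, IL"),
    ("Person A", 1891, "New York, NY"),
    ("Person B", 1890, "Boston, MA"),
    ("Person B", 1890, "Boston, MA"),
]


def fame_metrics(records):
    """Tally, per (name, year), the mention count and the number of distinct places."""
    counts = defaultdict(int)   # (name, year) -> number of mentions
    places = defaultdict(set)   # (name, year) -> set of places where the name appears
    for name, year, place in records:
        counts[(name, year)] += 1
        places[(name, year)].add(place)
    return {
        key: {"mentions": counts[key], "spread": len(places[key])}
        for key in counts
    }


metrics = fame_metrics(mentions)
for (name, year), values in sorted(metrics.items()):
    print(name, year, values)
```

In this toy example both placeholder names are mentioned twice in 1890, but only one of them appears in more than one place, which is the kind of distinction between frequency and reach that the two metrics are meant to capture.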

This graph shows the number of articles per year in our corpus.