Social Science / Data Science

Jo Guldi: Towards a Practice of Text Mining to Understand Change Over Historical Time

Part of the Social Science / Data Science event series

Recorded on March 8, 2023, this video features a lecture by Jo Guldi, Professor of History and Practicing Data Scientist at Southern Methodist University. Professor Guldi’s lecture was entitled “Towards a Practice of Text-Mining to Understand Change Over Historical Time: The Persistence of Memory in British Parliamentary Debates in the Nineteenth Century.”

Co-sponsored by Social Science Matrix, the UC Berkeley Department of History, and D-Lab, this talk was presented as part of the Social Science / Data Science event series, a collaboration between Social Science Matrix and D-Lab.


A world awash in text requires interpretive tools that traditional quantitative science cannot provide. Text mining is dangerous because analysts trained in quantification often lack a sense of what could go wrong when archives are biased or incomplete. Professor Guldi’s talk reviewed a brief catalogue of disasters created by data science experts who voyage into humanistic study. It found a solution in “hybrid knowledge,” or the application of historical methods to algorithm and analysis.

Case studies engaged recent work from the philosophy of history (including Koselleck, Erll, Assmann, Tanaka, Chakrabarty, Jay, Sewell, and others) and investigated the “fit” of algorithms with each historical frame of reference on the past. The talk profiled recent research into the status of “memory” in British politics: using NLP analysis, it traced the persistence of references to previous eras in British history, to historical conditions per se, and to futures hoped for and planned. It presented the promise and limits of text-mining strategies such as Named Entity Recognition and Part-of-Speech Analysis for modeling temporal experience as a whole, suggesting how these methods might support students of social science and the humanities, and revealing how traditional topics in these subjects offer a new research frontier for students of data science and informatics.

About Jo Guldi

Jo Guldi, Professor of History and Practicing Data Scientist at Southern Methodist University, is the author of four books: Roads to Power: Britain Invents the Infrastructure State (Harvard 2012), The History Manifesto (Cambridge 2014), The Long Land War: The Global Struggle for Occupancy Rights (Yale 2022), and The Dangerous Art of Text Mining (Cambridge forthcoming). Her historical work spans archival studies of nation-building, state formation, and the use of technology by experts. She has also been a pioneer in the field of text mining for historical research, where statistical and machine-learning approaches are hybridized with historical modes of inquiry to produce new knowledge. Her publications on digital methods include “The Distinctiveness of Different Eras,” American Historical Review (August 2022) and “The Official Mind’s View of Empire, in Miniature: Quantifying World Geography in Hansard’s Parliamentary Debates,” Journal of World History 32, no. 2 (June 2021): 345–70. She is a former junior fellow at the Harvard Society of Fellows.

Listen to the lecture below, or on Google Podcasts or Apple Podcasts.
