Developing the Data Set of Nineteenth-Century Knowledge


A project to study the structure and transformation of nineteenth-century knowledge via computational analysis of several editions of the Encyclopedia Britannica from 1788 to 1911.

This project draws on historic editions of the Encyclopedia Britannica, a vital resource of knowledge, to build one of the most extensive open digital collections available today for studying the structure of nineteenth-century knowledge and its transformation. These editions are the most comprehensive extant representation of what constituted official knowledge at the time, and they also demonstrate changes in the nature of knowledge in the English-speaking world. The project creates the first accurate textual data for this corpus and extends its usability by applying innovative methods to automatically generate metadata for each of the 100,000 entries. Each entry will be tagged with both current and historical subject categories. At the end of the grant period, all of the data will be made freely available, and a series of experiments will be conducted to assess the feasibility of tracking concept drift across time within the corpus.

Read more about the 19th Century Knowledge Project.

Project Team

  • Sam Grabus, Lead Research Assistant
  • Peter Logan, Project PI
  • Jane Greenberg, Project PI


Grabus, S. (In Press, 2020). Evaluating the Impact of the Long S upon 18th-Century Encyclopedia Britannica Automatic Subject Metadata Generation Results. Information Technology and Libraries.
*Winner of 2020 LITA/Ex Libris Student Writing Award 
[ALA Press Release]

Grabus, S., & Pascua, S. (2020). 19th-Century knowledge representation: Indexing the data set of 19th-century knowledge (2018) & SKOS of the 1910 LCSH (2019). Poster presented at the LEADS Forum, Philadelphia, PA.

Logan, P. M., Greenberg, J., & Grabus, S. (2019). Knowledge Representation: Old, New, and Automated Indexing. In Proceedings of the Digital Humanities Conference 2019, Utrecht, The Netherlands. [Abstract]

Grabus, S., Greenberg, J., Logan, P., & Boone, J. (2019). Representing aboutness: Automatically indexing 19th-Century Encyclopedia Britannica entries. NASKO, Vol. 7. pp. 138-148. [Paper].