Event


Creating Data and Tools for Open Cultural Analysis Activities: TORCHLITE and Beyond

Event title: Creating Data and Tools for Open Cultural Analysis Activities: TORCHLITE and Beyond
Organizer: Chair of Scientific Information Analytics (Prof. Dr. Bela Gipp)
Speaker: Prof. J. Stephen Downie
Speaker's institution: University of Illinois at Urbana-Champaign
Event type: Talk
Category: Research
Registration required: No
Description: How can researchers explore millions of digitized texts when much of the data remains under copyright?
In this upcoming talk, Prof. J. Stephen Downie (University of Illinois Urbana-Champaign) will present new approaches to open cultural analytics through the HathiTrust Research Center’s TORCHLITE project.

The HathiTrust Research Center (HTRC) provides analytic access to the 19 million volumes in the HathiTrust Digital Library (HTDL). Roughly 10 million of these volumes are under copyright restrictions and cannot be freely shared with scholars. To provide more open access to HathiTrust’s materials, the HTRC has released its Extracted Features (EF) 2.5 Dataset, which covers over 3 trillion unigram tokens across the 6 billion pages in the corpus.

Prof. Downie will provide an update on the HTRC’s ongoing “Tools for Open Research and Computation with HathiTrust: Leveraging Intelligent Text Extraction” (TORCHLITE) project. Funded by the National Endowment for the Humanities (NEH), TORCHLITE strives to create easy-to-use text analysis tools, dashboards, and application programming interfaces (APIs) to facilitate open cultural analytics research using the uniquely valuable HTDL data. The talk will highlight the motivations, challenges, and accomplishments of TORCHLITE to date, along with its next steps, which envision the creation of an international consortium of similar groups, tentatively called the “International Consortium for Open Cultural Analytics,” designed to encourage Extracted Features access to otherwise closed collections. The talk will conclude with a conversation about the upcoming sunsetting of the HTRC and the role that EF will play in continuing the work of the HTRC team and other digital humanities scholars.

Speaker:
J. Stephen Downie is a professor and the Executive Associate Dean at the School of Information Sciences, University of Illinois at Urbana-Champaign. He is also the Illinois Co-Director of the HathiTrust Research Center. Professor Downie conducts research in digital libraries, digital humanities, and music information retrieval. He holds degrees from the University of Western Ontario, including a BA (music theory and composition), a Master of Library and Information Science (MLIS), and a PhD in Library and Information Science.
Time: Start: 30.04.2026, 16:30
End: 30.04.2026, 17:30
Location: Historical Building of the Niedersächsische Staats- und Universitätsbibliothek Göttingen (Papendiek 14)
Lecture Room 1.207
Contact: 05513925833
meuschke@uni-goettingen.de