Automatic topic detection and collaborative topic tagging in Archives Portal Europe’s multilingual environment – by Kerstin Arnold

The European Digital Treasures team would like to share the presentations given at the workshop “New Digital Exponential Technologies Towards The Generation Of Business Models”, held on 2 and 3 September 2021 at the Provincial Historical Archive of Alicante (Spain). We will post about each of the presentations over the coming weeks. Stay tuned!

The first talk was given by Kerstin Arnold, who has been working in the archives domain for more than 15 years! Having been part of various projects creating and establishing Archives Portal Europe, Kerstin is now the initiative’s acting COO in the role of the APEF Manager. She holds Master’s degrees in both Communication Science and Library and Information Management, and is also a member of the Technical Subcommittee on Encoded Archival Standards (TS-EAS) of the Society of American Archivists.

Abstract. Archives Portal Europe is a comprehensive and open resource on archives from and about Europe that currently holds archival descriptions from more than 30 countries and in more than 20 languages. Following traditional approaches of archival description, the portal allows users to access the documents via the contextual entities of the records creators and the holding repositories, in addition to a general keyword search. To evaluate options for subject- or topic-based access points, Archives Portal Europe is working on an automated cross-lingual topic detection tool that aims to enable users to identify documents relevant to a topic well beyond the narrow limits of direct keyword matching. Combining different approaches for concept-based and entity-based topics, the tool is also meant to allow for active topic tagging in order to improve coverage of topic-based relations between the heterogeneous and multilingual documents present in Archives Portal Europe. Building on the status quo in the portal, this paper presents the tool’s set-up, initial results from the proof-of-concept phase, and the next steps envisaged during alpha and beta development of the tool, which will be made available as Open Source so that it can also benefit other, similar initiatives in the cultural heritage sector.

You can watch the whole session on YouTube here and read the manuscript paper here!

Written by Kerstin Arnold & the European Digital Treasures Team.