PhD Defence in Digital Media: “Integration of models for linked data in cultural heritage and contributions to the FAIR principles”

Candidate:
Inês Dias Koch

Date, Time and Location
1st of July 2025, 14:30, Sala de Atos da Faculdade de Engenharia da Universidade do Porto

Title:
“Integration of models for linked data in cultural heritage and contributions to the FAIR principles”

President of the Jury:
João Carlos Pascoal Faria (PhD), Full Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto.

Members:
Maja Žumer (PhD), Full Professor, Department of Library and Information Science of the University of Ljubljana, Slovenia;

María Poveda Villalón (PhD), Associate Professor, Department of Artificial Intelligence of the Technical University of Madrid, Spain;

José Luís Brinquete Borbinha (PhD), Full Professor, Department of Computer Science and Engineering, Instituto Superior Técnico da Universidade de Lisboa;

Pedro Manuel Rangel Santos Henriques (PhD), Full Professor, Department of Informatics, Escola de Engenharia da Universidade do Minho;

Carla Alexandra Teixeira Lopes (PhD), Associate Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto (Supervisor);

Mariana Curado Malta (PhD), Assistant Professor, Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto.

The thesis was co-supervised by Maria Cristina de Carvalho Alves Ribeiro (PhD), Retired Associate Professor in the Department of Informatics Engineering, Faculdade de Engenharia da Universidade do Porto.

Abstract:

The various areas of Cultural Heritage, such as archives, museums, and libraries, have invested in technological development and innovation to make their resources available to users more efficiently and completely. To this end, the description of these resources is essential so that they are explained in
terms of their context and content, as well as to facilitate their intelligibility and accessibility. In this
sense, each area has begun to develop its own models and standards for describing the cultural objects
it deals with. This has made these standards sector-specific and only able to fulfil the information needs
within the area of knowledge in which they were developed, exploring only the information described within their
domain. As a result, linking resources from different information sources is challenging.
With the need to make the standards and models more interoperable, linked data models emerged in
Cultural Heritage. These models make it possible to link the various concepts from the different heritage
areas efficiently and effectively, considering the Semantic Web’s characteristics.
In Portugal, the Portuguese National Archives felt the need to develop a linked data model to describe
their cultural objects, which led to the creation of the EPISA Project, the project from which this research
emerged. Thus, this work aims to develop a linked data model to describe archival records, as well as to
connect them with other heritage domains, integrating them with existing linked data models, promoting
the access and reuse of data from heritage institutions based on the specialised description associated
with the cultural objects of these institutions. Additionally, it aims to link existing data models to data from other sources available on the Web, such as Wikidata and DBpedia.
We carry out a study that includes existing data models in Cultural Heritage, such as CIDOC CRM
in museums, RiC-CM in archives, and LRMoo in libraries, along with models that have emerged within
Web projects, such as DBpedia and Wikidata. By describing archival objects, as well as creating and
exploring relationships between other data models, this study identifies common characteristics and
principles, as well as the distinctive aspects of each area. Furthermore, it identifies the possibility of linking elements of the various models, ensuring that the models can be adapted to applications without losing the richness of the conceptualisation carried out in each of the domains.
In a context in which the Web promotes the explicitness of data semantics through the Semantic
Web and provides tools to represent it, it is necessary, on the one hand, to create links between models from different communities and, on the other, to adjust the complexity of each model to each application according to its specific requirements. The FAIR Principles (Findable, Accessible, Interoperable, Reusable) were therefore used as one of the sources for the requirements that data and metadata must fulfil to have a modular structure. We bring together a collection of use cases linked to archive users, including profiles ranging from collection managers to heritage promoters and informal users. In addition, we compile and evaluate a set of data modelling experiences using different models.
This work resulted in ArchOnto, a modular ontology that describes archive records. It was developed
considering existing archive standards and validated by experts in the field, specifically archivists from the Portuguese National Archives. ArchOnto is based on CIDOC CRM, combined with four other
specific ontologies also developed in this work.
The development of ArchOnto led to the creation of a prototype platform designed to explore and
manipulate archive records. Additionally, it offers the potential to apply this ontology to other domains, specifically to the representation of cinematographic records.

Keywords: Cultural Heritage; Linked Open Data; Data Integration; Semantic Web; FAIR Principles; Digital Humanities.
