Agile quality assurance of metadata on cultural objects in the context of data integration processes

Funded by the German Research Foundation (2023-2025).

The quality of metadata on cultural objects is critical to their accessibility and further use. This applies in principle to all data offerings, but especially to shared platforms such as the German Digital Library (DDB) and the Graphikportal, and to the growing data offerings of NFDI consortia such as NFDI4Culture and Text+. The quality of the data to be integrated very often differs from what the target systems expect. Before integration into a target system, the data must therefore be analysed and, if necessary, adapted, usually in a time-consuming process carried out in dialogue between data providers and data recipients. If software is used for quality assurance, software developers must adapt it whenever the quality requirements change. So far, domain experts have had hardly any opportunity to carry out quality assurance on their own with existing tools, let alone to adapt existing quality requirements.

The aim of the project is to enable domain experts with little technical knowledge to define and carry out the subject-specific quality assurance of metadata themselves. To this end, a process and web-based software will be developed that allow domain experts to define the quality requirements semi-automatically, prior to the actual quality assurance. The input required by various quality assurance techniques, e.g. Schematron, or by existing software and frameworks, such as the Metadata Quality Assessment Framework, will then be generated automatically. These existing components are embedded in the quality assurance process and can be used without the technical expertise previously required. The quality analysis and any associated quality improvements will be defined separately and can be carried out independently of each other.

The software developed in the project will be the core of a self-contained process for agile quality assurance, which can be integrated into existing data integration processes but can also be used independently of them. The process will be more automated and easier to adapt than previous quality assurance mechanisms.

The concrete use cases to be addressed are the quality assurance of LIDO data for integration into the DDB and the Graphikportal, and of TEI header data in the TextGrid repository. The evaluation of the approach is embedded in the NFDI consortia NFDI4Culture and Text+. Because the approach is independent of concrete data formats and technologies, and thus generic, it can be transferred to the quality assurance of metadata in other application areas.
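To illustrate what automatically generating Schematron input from a declaratively stated quality requirement could look like, here is a minimal sketch in Python. It is not the project's software: the function name `requirement_to_schematron`, the dictionary layout, and the example requirement (including its XPath expression over LIDO title elements) are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of ISO Schematron.
SCH_NS = "http://purl.oclc.org/dsdl/schematron"

# Hypothetical requirement as a domain expert might state it through a form.
# The keys and the XPath expression are illustrative assumptions.
EXAMPLE_REQUIREMENT = {
    "id": "title-present",
    "namespaces": {"lido": "http://www.lido-schema.org"},
    "context": "lido:lido",
    "test": ".//lido:titleSet/lido:appellationValue[normalize-space()]",
    "message": "Each record should carry at least one non-empty title.",
}


def requirement_to_schematron(requirement: dict) -> str:
    """Turn one declarative quality requirement into a Schematron schema string."""
    ET.register_namespace("sch", SCH_NS)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    # Declare the prefixes used inside the XPath expressions.
    for prefix, uri in requirement["namespaces"].items():
        ET.SubElement(schema, f"{{{SCH_NS}}}ns", {"prefix": prefix, "uri": uri})
    pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern", {"id": requirement["id"]})
    rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule", {"context": requirement["context"]})
    # The assertion fails, and its message is reported, when the test is false.
    check = ET.SubElement(rule, f"{{{SCH_NS}}}assert", {"test": requirement["test"]})
    check.text = requirement["message"]
    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    print(requirement_to_schematron(EXAMPLE_REQUIREMENT))
```

The resulting schema could be handed to any Schematron processor. Keeping the requirement as plain data is what would let a domain expert change it without touching the generating code.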

Cooperation partners

Péter Király, PhD, Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen (GWDG)
Regine Stein, Niedersächsische Staats- und Universitätsbibliothek Göttingen (SUB)