It’s all about the message to be finally heard – Looking at FAIR Digital Objects (FDOs) from a PID perspective

On October 28th, 2022, the Leiden Declaration on FAIR Digital Objects was signed by more than 130 representatives, most of them from European infrastructure and research institutions. The declaration calls for optimal conditions for digital, open science. This demand is operationalized by the concept of FAIR Digital Objects.

The presentation of the declaration formed the finale of the 1st International Conference on FAIR Digital Objects (FDO2022), which took place in Leiden, Netherlands, from October 26th to October 28th. The goal of the conference was to illuminate the concept of FAIR Digital Objects (FDOs) from research, technology and policy perspectives. Numerous speakers representing key players in digital research infrastructures, especially European ones and including the European Open Science Cloud (EOSC), presented their work on FDOs and their thoughts on how to interpret the concept.

The idea of FDOs is to provide a framework that helps to optimize the conditions of digital research data management on a global scale by finding a way to apply the FAIR principles to digital objects such as research data and to all components involved in their production, storage and distribution. The principles describe technical requirements to improve the Findability, Accessibility, Interoperability and Reusability of research resources and emphasize the role that standardized metadata, persistent identifiers and protocols play in meeting these goals. Explanations of the principles are provided in our TIB Blog post.

The noble claim of supporting the implementation of the FAIR principles by providing a way to operationalize them was preceded by two years of work on the FAIR Digital Object Framework (FDOF). Its result is an abstract model of a digital object that incorporates the different aspects of the principles. The FDOF takes a technical approach and defines a digital object as a bit sequence that can be located in a specific digital environment (e.g. a repository) and can be of different types (e.g. data, datasets, documents, software, metadata etc.). According to the framework, such a digital object is FAIR in the sense of the FAIR principles if it is bound to a persistent identifier (PID), is described by metadata, and is classified by the FDOF typing system (see Figure 1). The typing system contains information about the type of an object, its encoding format, the kind of entities the object represents, and the operations that are applicable to and allowed on the object.

Figure 1: FDO Model
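
The model described above can be illustrated with a minimal Python sketch. It is not part of the FDOF specification: all class, field and function names (`TypeInfo`, `DigitalObject`, `is_fdo`) are assumptions chosen for illustration, and the check only tests whether the three components named in the text are present, not whether they are of good quality.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TypeInfo:
    """Hypothetical stand-in for an entry in the FDOF typing system."""
    object_type: str          # e.g. "dataset", "software", "document"
    encoding_format: str      # e.g. "text/csv"
    represents: str           # kind of entity the object represents
    allowed_operations: list[str] = field(default_factory=list)


@dataclass
class DigitalObject:
    """A bit sequence located in a digital environment such as a repository."""
    bits: bytes
    location: str
    pid: str | None = None                       # persistent identifier
    metadata: dict[str, str] = field(default_factory=dict)
    type_info: TypeInfo | None = None


def is_fdo(obj: DigitalObject) -> bool:
    """Return True if the object has a PID, metadata and type information."""
    return bool(obj.pid) and bool(obj.metadata) and obj.type_info is not None


# Illustrative values only; the identifier and URL are placeholders.
obj = DigitalObject(bits=b"a,b\n1,2\n", location="https://repository.example/files/1")
print(is_fdo(obj))  # a bare bit sequence without PID, metadata and type

obj.pid = "10.1234/example"
obj.metadata = {"title": "Example measurement series"}
obj.type_info = TypeInfo("dataset", "text/csv", "measurement series", ["download"])
print(is_fdo(obj))  # all three components are now attached
```

The sketch makes one point of the model explicit: the bit sequence itself stays unchanged, and what turns it into an FDO is the combination of identifier, metadata and type information attached to it.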

Although PIDs are a fundamental component of FDOs, the framework draws a distinction between the FDO approach and existing PID approaches to FAIR: The former implies that a digital object can be found directly by machines and needs only a small subset of metadata, containing information about, for example, the type of object, where to find more metadata (bibliographic and content metadata), which format it has and which licenses apply to it. According to the concept, FDOs are meant to overcome the deficiencies of landing pages as used by current PID infrastructures. However, it overlooks that the information about an object that should be available within the FDO realm is already addressed by PID infrastructures such as the DOI system, provided that they follow best practice guidelines (see e.g. DataCite Best Practices for DOI Landing Pages): Digital objects should ideally be findable via a human- AND machine-readable landing page containing a specific set of metadata and linked entities that answer precisely these questions about access, interoperability and further use.
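
To make the "small subset of metadata" more tangible, the following Python sketch derives such a subset from a fuller metadata record and serializes it in a machine-readable form. The field names (`type`, `format`, `license`, `metadata_location`) and both helper functions are assumptions for illustration; they are taken neither from the FDOF nor from the DataCite schema.

```python
import json

# Hypothetical field names for the small subset mentioned in the text:
# type of object, format, license, and where to find more metadata.
SUBSET_FIELDS = ("type", "format", "license", "metadata_location")


def minimal_record(pid: str, full_metadata: dict) -> dict:
    """Keep only the small subset of fields, keyed by the object's PID."""
    record = {"pid": pid}
    for name in SUBSET_FIELDS:
        if name in full_metadata:
            record[name] = full_metadata[name]
    return record


def missing_fields(record: dict) -> list:
    """List the subset fields that a record does not yet provide."""
    return [name for name in SUBSET_FIELDS if name not in record]


# Illustrative values only; identifier and URL are placeholders.
full = {
    "title": "Example measurement series",
    "creator": "Jane Doe",
    "type": "dataset",
    "format": "text/csv",
    "license": "CC-BY-4.0",
    "metadata_location": "https://repository.example/records/1",
}

record = minimal_record("10.1234/example", full)
print(json.dumps(record, indent=2))
print(missing_fields(record))
```

The same information could equally be embedded in a landing page that follows best practices, which is the point made above: the difference lies less in which fields are offered than in how a machine reaches them.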

Landing pages are an essential component for the long-term discoverability of digital objects, because they do not aim at direct access to the object, whose location can change again and again, but provide information about the object and its location in the long term, as long as standards and best practices in metadata maintenance are complied with. Metadata are important sources for the context of a digital object and provide, among other things, information about its provenance, its links to other objects (e.g. via the relation research data – publications – data curator – author – research project – etc.), and its licenses, storage locations, or storage formats. The FDO concept criticizes the technical hurdles that this indirect access creates for machine clients, as well as its dependence on maintenance routines and on compliance with standards and best practices. Nevertheless, the system is currently still one of the most efficient ways to achieve real persistence of digital objects. The development of and adherence to standards, orientation towards best practices, specifications for the maintenance of PID metadata, and sustainable financing models are social agreements that are negotiated, built, and carried forward over a long process within the PID community. From a technical perspective, they may have weaknesses, but from a social perspective, they provide a stable network that aims for persistence.

What the FDO approach ultimately wants to enable is more independence from the respective access conditions of the repositories or other storage locations of digital objects, in order to allow more efficient handling of, for example, research data. However, the hurdles on this path are not only technical, but also legal, organizational, and discipline-specific.

The discussions at the FDO2022 conference revealed that the expectations of what FDOs should accomplish offer as much room for interpretation as the FAIR principles themselves. They range from a purely technical view of what FDOs are and must achieve, namely machine-actionability, through FDOs as a manifestation of converged standards across disciplines and sectors in data management, to FDOs as a political idea intended to break down technical, social, organizational, and legal barriers in digital research data production and consumption. The statements about FDOs accordingly varied from a concept that will revolutionize the flow of knowledge in research and industry to the acknowledgement that many actors have long been trying to make their research resources as FAIR as possible.

Common to all interpretations is that the goal should be to design infrastructures in such a way that they support researchers as much as possible in their day-to-day work and burden them as little as possible with the requirements of FAIR data management. This is in any case a genuine task of scientific infrastructure services. There is common agreement that the technical prerequisites for FAIR data management are the easiest to meet. More complex are social agreement processes, for example on standards. The greatest and still persistent challenge, however, is the reliable provision of financial and human resources to make research objects available in a persistent manner.

