Using an ensemble of FAIR assessment approaches to inform the design of
future FAIRness testing: a case study evaluating World Data Center for
Climate (WDCC)-preserved (meta)data
Abstract
From a research data repository's perspective, offering data management
services in line with the FAIR principles is becoming an increasingly
important selling point in a competitive market. To substantiate such
claims, the services offered must be evaluated and credited following
transparent and credible procedures. Several FAIRness evaluation methods
are openly available for application to archived (meta)data; to date,
however, no standardized and globally accepted FAIRness testing procedure
exists. Here, we apply an ensemble of five FAIRness evaluation approaches
to selected datasets archived in the WDCC. The selection represents the
majority of WDCC-archived datasets (by volume) and reflects the entire
spectrum of data curation levels. Two of the tests are purely automatic,
two are purely manual, and one applies a hybrid method that combines
manual and automatic evaluation. Our evaluation yields a mean FAIR score
of 0.67 out of 1. Manual approaches score higher than automated ones, and
the hybrid approach scores highest. Computed
statistics show agreement between the tests at the data collection
level. None of the five evaluation approaches is fully fit for purpose
for evaluating (discipline-specific) FAIRness, but each has its merits.
Manual testing captures domain- and repository-specific aspects of FAIR,
but the machine-actionability of the archived (meta)data is left to the
judgement of the human evaluator.
Automatic approaches evaluate only those features of archived (meta)data
that are machine-actionable, i.e. accessible to an automated agent and
compliant with globally established standards; an evaluation of
contextual metadata, which is essential for reusability, is not possible.
Correspondingly, the hybrid method combines the advantages and
eliminates the deficiencies of manual and automatic evaluation. We
therefore recommend that future operational FAIRness evaluation be based
on a mature hybrid approach: the automatic part would retrieve and
evaluate as much machine-actionable, discipline-specific (meta)data
content as possible and would then be complemented by a manual evaluation
focusing on the contextual aspects of FAIR. Designing and adopting the
discipline-specific aspects will require concerted community efforts. We
illustrate a possible structure for this process with an example from
climate research.
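
To make the recommended hybrid workflow concrete, the sketch below is a
purely illustrative example (not the tooling or tests used in this study):
it combines an automatic check of a few machine-actionable metadata fields
retrieved from the DataCite REST API with a manually assigned score for
contextual (meta)data quality. The selected fields, the equal weighting and
the example DOI are assumptions made only for this illustration.

```python
# Illustrative sketch only: a minimal hybrid FAIRness check, assuming the
# dataset has a DataCite DOI. The selected fields, the 50/50 weighting and
# the example DOI are hypothetical choices, not the tests used in this study.
import requests


def automatic_score(doi: str) -> float:
    """Score the presence of a few machine-actionable DataCite metadata fields."""
    response = requests.get(f"https://api.datacite.org/dois/{doi}", timeout=30)
    response.raise_for_status()
    attributes = response.json()["data"]["attributes"]
    checks = [
        bool(attributes.get("titles")),           # findability: descriptive title
        bool(attributes.get("creators")),         # reusability: provenance of authorship
        bool(attributes.get("publicationYear")),  # reusability: temporal context
        bool(attributes.get("rightsList")),       # reusability: licence information
        bool(attributes.get("url")),              # accessibility: resolvable landing page
    ]
    return sum(checks) / len(checks)


def hybrid_score(doi: str, manual_contextual_score: float, weight: float = 0.5) -> float:
    """Combine the automatic score with a manually assigned contextual score in [0, 1]."""
    return weight * automatic_score(doi) + (1.0 - weight) * manual_contextual_score


# Hypothetical usage: an example DOI and a manual contextual score of 0.8.
# print(hybrid_score("10.26050/WDCC/EXAMPLE", 0.8))
```

In an operational setting, the automatic part would draw on
community-endorsed, discipline-specific metadata standards rather than the
generic fields checked in this sketch.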