The open data challenge: An analysis of 124,000 data availability statements, and an ironic lesson about data management plans
Abstract
Data availability statements can provide useful information about how researchers actually share research data. We used unsupervised machine learning to analyse 124,000 data availability statements submitted by research authors to 176 Wiley journals between 2013 and 2019. We categorised the statements and examined trends over time. We found the expected increase in the number of data availability statements submitted over time, along with marked increases that correlate with policy changes made by journals. Our open data challenge is now to use what we have learned to present researchers with relevant and easy options that help them share, and make an impact with, new research data.