2.2 Open science and metadata education to address data needs
To address the data-related challenges outlined above, we support the ongoing shift towards open science within the Canadian hydrology community. Open science is a movement to make scientific publications, data, and software publicly accessible. The movement already has a strong following. For example, funding agencies in Europe are mandating open-access publications (Schiltz et al., 2018), publishing datasets in data journals is becoming increasingly popular (Carlson and Oda, 2018), and “negative” results are being discussed and published more often (van Emmerink et al., 2018). Open science is also popular within the global hydrological community: in a survey of 336 hydrologists, 97% of participants felt all data should be shared, though no consensus emerged on exactly how to share data and how to acknowledge the person or group who collected them (Blume et al., 2017).
In a Canadian context, we suggest the hydrology community could benefit from greater use of data-sharing platforms (or from developing Canada-focused communities on existing platforms) to help combat the fragmented state of many datasets. Communal databases or online repositories (e.g. Zenodo) that allow for responsible and consistent storage of datasets and models would help ensure that data are visible and accessible, contain sufficient metadata, and are properly quality-controlled. The adoption of such communal databases could reduce research redundancy, facilitate integrated research efforts and comparative studies, and lead to more broadly applicable findings and higher-impact publications from the Canadian hydrologic community.
Beyond simply making data accessible, including appropriate metadata is essential to effective data sharing. Because ECRs often produce and archive datasets, we would benefit from greater integration of data management practices into graduate training curricula. Furthermore, data stewardship efforts could be enhanced by adopting standardized procedures and templates within individual research groups, an approach that has been shown to increase model sharing (Weiler and Beven, 2015). These templates could cover naming conventions, file formats, metadata structure, and collection techniques during fieldwork. Templates could be shared with incoming ECRs, enhancing learning, promoting institutional memory, and allowing ECRs to focus on new findings. Considering the short residence time of some ECR positions, longer-term members of the research team, such as laboratory managers, field technicians, and professional research associates, could play a key role in developing and maintaining standardized datasets. Data management and protocol development require a time investment, but we argue this initial cost is repaid by easier data sharing and the subsequent advance in scientific understanding.
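As an illustration of what a research-group template of this kind might look like in practice, the following minimal Python sketch defines a metadata record and checks it against a file naming convention. Everything in it is an assumption made for illustration: the field names, the `<site>_<variable>_<YYYYMMDD>.csv` convention, and the names `DatasetMetadata` and `check_record` are hypothetical and do not come from any existing standard or from the studies cited above.

```python
import re
from dataclasses import dataclass, fields
from typing import List

# Hypothetical naming convention: <site>_<variable>_<YYYYMMDD>.csv
FILENAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9]+_\d{8}\.csv$")


@dataclass
class DatasetMetadata:
    """Hypothetical minimal metadata record for one data file."""
    filename: str
    site: str
    variable: str
    units: str
    collected_by: str
    method: str


def check_record(record: DatasetMetadata) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Every field in the template must be filled in.
    for f in fields(record):
        if not str(getattr(record, f.name)).strip():
            problems.append(f"missing {f.name}")
    # The file name must follow the group's (assumed) convention.
    if not FILENAME_PATTERN.match(record.filename):
        problems.append(
            "filename does not follow <site>_<variable>_<YYYYMMDD>.csv"
        )
    return problems
```

A group could run such a check before archiving a dataset, for example `check_record(DatasetMetadata("creek1_discharge_20190715.csv", "creek1", "discharge", "m3/s", "A. Student", "salt dilution"))`, and adapt the required fields and the naming pattern to its own protocols.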