Seamless Long-Tail and Big Data Access via the EarthCube Brokering
Cyberinfrastructure BALTO
Abstract
The EarthCube BALTO broker (Brokered Alignment of Long-Tail
Observations) provides streamlined access to both long-tail and big data
using web services through several distinct mechanisms. First, we
updated the OPeNDAP framework Hyrax, software that serves big data from
USGS, NASA, and other sources, with a BALTO extension that automatically
tags dataset landing pages with JSON-LD markup. As a result, the big
data made available through Hyrax are now searchable via EarthCube
GeoCODES (formerly P418) and Google Dataset Search. The BALTO broker
extension to Hyrax makes thousands of datasets easily searchable and
accessible. Second, we focused our efforts on a geodynamics use-case
aimed at advancing our understanding of continental rifting processes
through the use of an NSF mantle convection code called ASPECT. To
address this use-case, we implemented a web services brokering
capability in ASPECT that allows remote access to datasets via a
URL defined in an ASPECT parameter file. Third, through another use-case
in ASPECT aimed at testing hypotheses involving global mantle flow, we
developed a brokering mechanism for a “plug-in” that accesses NetCDF
seismic tomography data from the NSF seismology facility IRIS, then
transforms it into the format needed by ASPECT to run global mantle flow
models constrained by seismic tomography. Fourth, we demonstrate methods
that allow any scientist or citizen scientist to make data collected with
in-situ, IoT-based sensors available to the world. Finally, we
are developing a Jupyter Notebook with a GUI that allows users to
search Hyrax servers for big datasets and long-tail data. Together, these
cyberinfrastructure developments constitute the EarthCube BALTO
brokering capabilities.
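
As a rough illustration of the first mechanism, the sketch below builds a
schema.org Dataset description in JSON-LD and wraps it in the script
element that dataset crawlers read from a landing page. It is not the
BALTO extension's code: the function names, the example.org URLs, and the
choice of fields are assumptions made for illustration, and the markup
that Hyrax actually emits may differ.

```python
import json


def dataset_jsonld(name, description, landing_url, data_url):
    """Build a minimal schema.org Dataset record as a Python dict.

    The selection of fields is an assumption made for illustration; the
    markup actually emitted by Hyrax may differ.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": landing_url,
        "distribution": [
            {"@type": "DataDownload", "contentUrl": data_url}
        ],
    }


def jsonld_script_tag(record):
    """Serialize the record into a JSON-LD script element for an HTML page."""
    body = json.dumps(record, indent=2)
    # Escape "</" so the JSON cannot terminate the script element early.
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    # Hypothetical dataset and server, not a real Hyrax endpoint.
    record = dataset_jsonld(
        name="Example sea surface temperature grid",
        description="Placeholder description of a dataset served over OPeNDAP.",
        landing_url="https://example.org/opendap/sst.nc.html",
        data_url="https://example.org/opendap/sst.nc",
    )
    print(jsonld_script_tag(record))
```

Embedding the record in the landing page itself, rather than in a
separate catalog, means a search crawler needs nothing beyond the page
HTML to index the dataset.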