Abstract
Continuous collection and analysis of high-resolution phenotype data are
critical to developing crops resilient to the consequences of climate
change. Though web-accessible tools for parallel, reproducible
scientific workflows render big data increasingly tractable, software
for plant science remains inadequate for large-scale precision
agriculture. Cyberinfrastructure must present minimal barriers to entry,
accommodate rapidly changing dependencies, support a wide variety of use
cases, and weave together sensors at the edge, laptops, clusters, and
cloud storage into a coherent virtual workspace. PlantIT is a web portal
intended as such an environment. Platforms like PlantIT and its
precursor DIRT [1] permit efficient phenotyping and equip
geographically distributed researchers with a code-optional interface.
Workflows are published in Docker images, deployed as Singularity
containers to public or private computing resources, and monitored in
real time. Data are stored automatically in the CyVerse Data Store and
can be annotated according to the MIAPPE [2] standard. GitHub
integration provides versioning, and a repository can be activated with a
single configuration file, as with Travis or GitHub Actions. Containers
allow for a range of use cases, including image-based trait
measurements, 3D reconstructions, morphological growth simulations, and
crop modeling. Pseudo-batch/stream processing is also necessary: as data
volumes grow, manual batch jobs rapidly become infeasible, and (re-)analysis
must occur in near real time as data arrive. We suggest web-accessible
phenotyping automation software may address bottlenecks and help reveal
undiscovered relationships between genes, traits, and the environment.