Abstract
Machine learning (ML) models that classify a sample as non-indicative or
indicative of life can play an important role in planning life-detection
missions. They are based on clearly defined and consistent algorithms,
regardless of sample type or origin, and make their predictions from
weighted combinations of multiple features rather than from any single
feature. These weighted combinations can reveal the most informative
measurements within the operational constraints of a life-detection
mission. The Ladder of Life Detection (Neveu 2018) identifies the need
for an understanding of how combinations of multiple biosignatures
affect overall confidence. The present work provides a starting point for
addressing this need, and future work will expand the data types to obtain
even more predictive combinations of features. Elemental composition and
isotope fractionation were chosen as the data types, as they are
available for both biogenic and abiogenic systems and not unique to
Earth biochemistry. Measurements of these data types across a wide range
of unambiguously non-indicative or indicative samples were gathered from
published literature. The varied sample measurements were then
integrated into twenty-one representative samples. The ML models made
only binary classifications: non-indicative or indicative of life.
Nonetheless, the indicative samples broadly fell into three categories:
mixed, non-alive, and alive. Four classification algorithms were trained
and tested with Monte Carlo simulations using a 70:30 train-to-validation
ratio. Across the models, around 75% of the held-out samples were
correctly classified, with sensitivity and specificity varying from
model to model. As predictors of a sample non-indicative of life, all
models ranked Ti and Si as strong and Fe, Al, Mn, and Mg as medium. As
predictors of a sample indicative of life, all models ranked C, N, and
carbon-13 as strong and K, H, P, and Ca as medium. A weighted
combination of multiple biosignatures is shown
to be a more effective approach to classifying sample data than relying
on any individual biosignature or on an unweighted group of
biosignatures. Different models also made different recurring
misclassifications, suggesting that combining the outputs of multiple
models may be more effective than relying on the output of a single
model. Which type of model to use may depend on the application, e.g.,
higher sensitivity models might be preferred in first-pass situations
where false negatives are more costly than false positives. Lastly, the
weighted combination of measurements in a model suggests how to combine
biosignatures to affect the overall confidence of the classification.
These results provide evidence of elemental biosignatures beyond the
CHNOPS of Earth-based life and serve as a proof of concept for
algorithmic biosignature classification.
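
The evaluation described above (repeated random 70:30 splits, sensitivity and specificity, and combining the outputs of several models) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names are hypothetical, and the number of Monte Carlo rounds, the random seed, the unstratified shuffling, and the tie-breaking rule in the vote are all assumptions that the abstract does not specify. Labels are encoded as 1 for indicative of life and 0 for non-indicative.

```python
import random


def monte_carlo_splits(n_samples, n_rounds, train_fraction=0.7, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated random splits.

    Assumption: plain (unstratified) shuffling; the abstract does not say
    whether the splits were stratified by class.
    """
    rng = random.Random(seed)
    n_train = round(train_fraction * n_samples)
    for _ in range(n_rounds):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        yield sorted(idx[:n_train]), sorted(idx[n_train:])


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = indicative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def majority_vote(predictions):
    """Combine per-model 0/1 predictions, one inner list per model.

    Assumption: ties are resolved as 1 (indicative), which favors
    sensitivity over specificity.
    """
    n_models = len(predictions)
    return [1 if 2 * sum(votes) >= n_models else 0 for votes in zip(*predictions)]
```

With twenty-one samples and a 0.7 training fraction, each round of this sketch trains on 15 samples and holds out 6. Resolving ties toward "indicative" is one way to obtain the higher-sensitivity behavior suited to first-pass screening; the opposite rule would favor specificity.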