New deep learning-based methods for visualizing ecosystem properties using environmental DNA metabarcoding data
  • Letizia Lamperti,
  • Théophile Sanchez,
  • Sara Si Moussi,
  • David Mouillot,
  • Camille Albouy,
  • Benjamin Flück,
  • Morgane Bruno,
  • Alice Valentini,
  • Loïc Pellissier,
  • Stéphanie Manel
Letizia Lamperti
EPHE PSL
Corresponding Author: [email protected]

Théophile Sanchez
ETH Zurich

Sara Si Moussi
TIMC-IMAG

David Mouillot
MARBEC

Camille Albouy
ETH Zurich

Benjamin Flück
ETH Zurich

Morgane Bruno
CEFE

Alice Valentini
SPYGEN

Loïc Pellissier
ETH Zürich

Stéphanie Manel
CNRS

Abstract

1. Metabarcoding of environmental DNA (eDNA) has recently improved our understanding of biodiversity patterns in marine and terrestrial ecosystems. However, the complexity of these data prevents current methods from extracting and analyzing all the relevant ecological information they contain. Ecological modeling could therefore benefit greatly from new methods that provide better dimensionality reduction and clustering.

2. Here we present two new deep learning-based methods that combine different types of neural networks to ordinate eDNA samples and visualize ecosystem properties in a two-dimensional space: the first is based on variational autoencoders (VAEs) and the second on deep metric learning (DML). The strength of our new methods lies in the combination of several inputs: the number of sequences found for each molecular operational taxonomic unit (MOTU), together with the genetic sequence information of each detected MOTU within an eDNA sample.

3. Using three different datasets, we show that our methods represent three ecological indicators well in a two-dimensional latent space: MOTU richness per sample, sequence α-diversity per sample, and sequence β-diversity between samples. We show that our nonlinear methods are better at extracting features from eDNA datasets while avoiding the major biases associated with eDNA. Our methods outperform traditional dimensionality reduction methods such as Principal Component Analysis (PCA), t-distributed Stochastic Neighbour Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP).

4. Our results suggest that neural networks provide a more efficient way of extracting structure from eDNA metabarcoding data, thereby improving their ecological interpretation and thus biodiversity monitoring.
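
To make two of the terms in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It computes MOTU richness per sample from a read-count table and a two-dimensional PCA ordination of the kind the abstract names as a traditional baseline. The count table is invented toy data, the function names `motu_richness` and `pca_2d` are hypothetical, and the log transform is an assumption made here for illustration.

```python
import numpy as np

# Toy read-count table (invented data): rows are eDNA samples, columns are MOTUs.
counts = np.array([
    [120,   0,  35,   0,   8],
    [  0,  50,   0,   0,   0],
    [ 10,  12,  40,  22,   5],
    [  0,   0,   0,  90,  15],
])

def motu_richness(counts):
    """Number of MOTUs with at least one read in each sample."""
    return (counts > 0).sum(axis=1)

def pca_2d(counts):
    """Project samples onto the first two principal components via SVD."""
    x = np.log1p(counts.astype(float))  # assumption: damp very abundant MOTUs
    x = x - x.mean(axis=0)              # center each MOTU column
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :2] * s[:2]             # sample coordinates in two dimensions

richness = motu_richness(counts)
coords = pca_2d(counts)
```

Here `richness` holds one integer per sample and `coords` holds one (x, y) pair per sample. Such a linear projection uses only the read counts, whereas the methods described in the abstract also take the genetic sequence of each MOTU as input.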
12 Apr 2023: Submitted to Molecular Ecology Resources
15 Apr 2023: Submission Checks Completed
15 Apr 2023: Assigned to Editor
15 Apr 2023: Review(s) Completed, Editorial Evaluation Pending
20 Apr 2023: Reviewer(s) Assigned
22 Jun 2023: Editorial Decision: Revise Minor
29 Jul 2023: 1st Revision Received
31 Jul 2023: Submission Checks Completed
31 Jul 2023: Assigned to Editor
31 Jul 2023: Review(s) Completed, Editorial Evaluation Pending
14 Aug 2023: Editorial Decision: Accept