
An empirical study for revealing dramatic influence of maxSH in PYRAD/IPYRAD
  • Jie Zhang,
  • Francisco Pina-Martins,
  • Zushi Jin,
  • Yongpeng Cha,
  • Zuyao Liu,
  • Jianli Zhao,
  • Qingjun Li
Jie Zhang
Yunnan University

Corresponding Author: [email protected]

Francisco Pina-Martins
Universidade de Lisboa Faculdade de Ciencias
Zushi Jin
Tibet Academy of Agricultural and Animal Husbandry Sciences
Yongpeng Cha
Yunnan University
Zuyao Liu
University of Bern
Jianli Zhao
Yunnan University
Qingjun Li
Yunnan University

Abstract

Reduced-representation sequencing (RRS) techniques have revolutionized ecological and evolutionary genomics, especially for species without a reference genome. However, precisely establishing homologous loci from RRS data remains a great challenge, and it strongly affects the accuracy of downstream analyses and the reliability of biological inferences. maxSH, a parameter of PYRAD/IPYRAD (a prevailing pipeline for genotyping RADseq and GBS data), is used to detect paralogs but is often overlooked. Using GBS data from two primroses (Primula alpicola Stapf and P. florindae Ward) and their putative hybrids as an empirical study, we explore the efficiency of maxSH in filtering paralogs and its impact on downstream analyses. We also assess whether the putative hybrids truly arose through hybrid speciation. Our study sheds light on the efficiency of maxSH in filtering paralogs and shows that maxSH, together with clustering threshold and missing data, significantly affects downstream analyses of outlier detection, population assignment, and demographic modelling, which emphasizes the importance of handling the bioinformatics process carefully. Although the putative hybrids exhibit a genetic mixture of P. alpicola and P. florindae in most STRUCTURE and PCA results, we cannot draw a clear conclusion on their origin, because the nine chosen datasets yield conflicting demographic scenarios, mainly as a result of altering the maxSH value. Nevertheless, the gene flow patterns of most of the optimal models obtained under multiple maxSH values collectively indicate incomplete reproductive isolation between the putative hybrids and the two primroses, as well as indirect introgression between P. alpicola and P. florindae.