Genome scan reveals candidate genes under selection with high FST values between populations
Consistent with expectations, principal component analysis (PCA) based on the nuclear gene dataset revealed that the tetraploid USnat lineage is genetically distinct from the other lineages, a split that accounts for the largest proportion of variation (Fig. 2a). The octoploid AU lineage is separated from the remaining tetraploid lineages (Supplementary Figure S1). We selected individuals representing pure CN and AU lineages and estimated FST between these two populations. We also estimated FST between the CN brackish-water and freshwater populations. To identify regions potentially under selection, we applied a threshold corresponding to the top 1% of FST values across the genome. This analysis identified a total of 811 regions with high FST values between the AU and CN lineages, predominantly located on chromosome 1 (79 windows) and chromosome 18 (74 windows) (Fig. 2b). Among these, 111 windows overlapped with high-FST regions identified between the CN brackish-water and freshwater populations (Fig. 2d). Most of these overlapping windows were found on chromosomes 18, 10, and 1, suggesting strong adaptive introgression from the AU lineage into the CN lineage (Fig. 2c). Within these overlapping regions, we identified 46 genes, which we regard as candidate genes under selection. Of these, nine genes were enriched in Gene Ontology (GO) terms related to the response to salicylic acid (Supplementary Table S3). These genes have been shown to be involved in responses to environmental cues, regulation of gene expression, plant defense, and seed longevity (Supplementary Table S4). Interestingly, five of these nine genes are tandem duplicates.
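The outlier scan described above (keep the top 1% of windows by FST in each comparison, then intersect the two outlier sets) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the window coordinates, the 10 kb window size, the FST values, and the function name `top_fraction_windows` are all hypothetical and serve only to show the logic.

```python
from collections import Counter


def top_fraction_windows(fst_by_window, fraction=0.01):
    """Return the window keys whose FST ranks in the top `fraction` of all windows."""
    ranked = sorted(fst_by_window.items(), key=lambda kv: kv[1], reverse=True)
    n_keep = max(1, round(len(ranked) * fraction))
    return {window for window, _ in ranked[:n_keep]}


# Hypothetical windows keyed by (chromosome, start position); 300 windows in total.
windows = [(chrom, start)
           for chrom in (1, 10, 18)
           for start in range(0, 1_000_000, 10_000)]

# Toy FST values: a flat background plus a few hand-placed outliers.
fst_lineage = {w: 0.05 for w in windows}   # stands in for the AU vs CN comparison
fst_lineage.update({(18, 500_000): 0.9, (1, 200_000): 0.8, (10, 300_000): 0.7})

fst_habitat = {w: 0.02 for w in windows}   # stands in for brackish vs freshwater
fst_habitat.update({(18, 500_000): 0.6, (1, 200_000): 0.5, (18, 700_000): 0.4})

# Top 1% of windows in each comparison (3 of 300 windows here).
top_lineage = top_fraction_windows(fst_lineage, fraction=0.01)
top_habitat = top_fraction_windows(fst_habitat, fraction=0.01)

# Windows that are outliers in both comparisons, tallied per chromosome.
shared = top_lineage & top_habitat
shared_per_chrom = Counter(chrom for chrom, _ in shared)

print(sorted(shared))
print(dict(shared_per_chrom))
```

In this toy input, two windows are outliers in both comparisons, one on chromosome 1 and one on chromosome 18. With real data, the two dictionaries would be filled from the per-window output of whichever FST estimator is used, and the shared windows would then be intersected with gene annotations to obtain the candidate gene list.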