2. Tissue-based sequencing and primer design
As a reference, partial control region sequences of approximately 16 individuals per population were obtained from the tissue-derived genomic DNA used in previous studies. Although Nakajima et al. (2021, 2023) did not include the distant populations Pop 20 and 21, tissue samples from some individuals of these populations had been collected in 2020, and these samples were used as outgroups in the present study (Table 1). First, the entire control region was amplified with the primers L-Thr (5′-AGC TCA GCG YCA GAG CGC CGG TCT TGT AA-3′) and H12Sr5 (5′-TGA TAA TAA AGT CAG GAC CAA G-3′) (Yokoyama and Goto 2002) using TaKaRa Ex Taq Hot Start Version (Takara Bio, Shiga, Japan). Each 10 µL reaction contained 1.0 µL of 10× Ex Taq Buffer, 0.8 µL of dNTPs (2.5 mM each), 0.5 µL of each 10 µM primer, 0.05 µL of TaKaRa Ex Taq HS, and 1.0 µL of genomic DNA. The PCR conditions were initial denaturation at 94°C for 2 min, followed by 30 cycles of denaturation at 98°C for 10 sec, annealing at 58°C for 30 sec, and extension at 72°C for 1 min. After the PCR products were purified using the ExoSAP-IT Express PCR Product Cleanup Reagent (Thermo Fisher Scientific), the 5′-end of the control region, which is the most commonly used region in studies of Japanese sculpin (Yamamoto 2019), was sequenced by Eurofins Genomics (Tokyo, Japan) using the reverse internal primer H16498m (5′-CCT GAA RTA GGA ACC AAA TG-3′) (Yokoyama and Goto 2002). Sequence data were aligned using Clustal W (Thompson et al. 1994) implemented in BioEdit (Hall 1999), and unique haplotypes were identified with the aid of DnaSP 6 (Rozas et al. 2017). Primers for eDNA were designed to amplify both the haplotypes from this study and those reported in Yokoyama and Goto (2002) (Table S2), and to yield an amplification product of around 400 bp, a length that can be sequenced on the Illumina MiSeq:
CNdloopS1_F: 5′-ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT NNN NNN GCT CAA AGA AAG GAG ATT YTA ACT C-3′
CNdloopS1_R: 5′-GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC TNN NNN NCC GTT GGC ATT AAG AAA TCA ACT G-3′
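The structure of the two eDNA primers above can be inspected with a short script. The following is a minimal sketch, not part of the original workflow: it assumes that each primer consists of a 5′ tail, a run of N bases, and a 3′ locus-specific portion, and that splitting at the N run recovers these three segments (the source does not label them). The function names `split_tailed_primer` and `expand_degenerate` are hypothetical and introduced only for illustration. The second function expands IUPAC degenerate bases (such as the Y in CNdloopS1_F) into the explicit sequences they represent.

```python
import re
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as printed in the text (5' to 3'), spaces included.
CNDLOOPS1_F = ("ACA CTC TTT CCC TAC ACG ACG CTC TTC CGA TCT NNN NNN "
               "GCT CAA AGA AAG GAG ATT YTA ACT C")
CNDLOOPS1_R = ("GTG ACT GGA GTT CAG ACG TGT GCT CTT CCG ATC TNN NNN "
               "NCC GTT GGC ATT AAG AAA TCA ACT G")


def split_tailed_primer(primer):
    """Split a primer into (5' tail, N run, 3' remainder).

    Assumption: the primer has exactly the layout tail + N run + remainder,
    with no N inside the tail.
    """
    seq = primer.replace(" ", "").upper()
    match = re.fullmatch(r"([ACGT]+)(N+)([A-Z]+)", seq)
    if match is None:
        raise ValueError("primer does not match the assumed tail/N/remainder layout")
    return match.groups()


def expand_degenerate(seq):
    """Return every explicit sequence encoded by an IUPAC-degenerate sequence."""
    choices = [IUPAC[base] for base in seq]
    return ["".join(bases) for bases in product(*choices)]


if __name__ == "__main__":
    for name, primer in (("CNdloopS1_F", CNDLOOPS1_F), ("CNdloopS1_R", CNDLOOPS1_R)):
        tail, n_run, specific = split_tailed_primer(primer)
        print(name, len(tail), len(n_run), specific, expand_degenerate(specific))
```

Only the 3′ portion is expanded here, because expanding the six-base N run as well would multiply the number of sequences by 4^6 = 4096 without adding information about template matching.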
Furthermore, in addition to tissue-based data from the same D-loop region, genome-wide SNP data, which are considered more informative, were used as a partial reference. Data from multiplexed ISSR genotyping by sequencing (MIG-seq; Suyama and Matsuki 2015) for the 21 study sites were downloaded from the DDBJ Sequence Read Archive (DRA) under accession number DRA017315, and SNPs were identified from this dataset to examine population structure and genetic differentiation (details in Text S1).