2.6 Quality control and read mapping
Sequencing reads were quality-trimmed and adapters were removed using
Trimmomatic 0.39 (sliding window 5:20) (Bolger, Lohse & Usadel, 2014).
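
As a rough illustration of what a 5:20 sliding-window setting means, the sketch below scans a read from its 5' end with a 5-base window and cuts the read at the start of the first window whose mean Phred quality falls below 20. This is a simplified teaching example, not Trimmomatic's implementation: the exact cut position and the handling of short reads in Trimmomatic may differ. The function names and the assumption of Phred+33 quality encoding are illustrative and do not come from the study.

```python
from typing import List, Sequence


def phred33(quality_string: str) -> List[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_keep_length(
    quals: Sequence[int], window: int = 5, min_mean: float = 20.0
) -> int:
    """Return how many leading bases to keep.

    The read is cut at the start of the first window whose mean quality
    is below min_mean. Reads shorter than the window are treated as a
    single window (a simplification chosen for this sketch).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    n = len(quals)
    if n == 0:
        return 0
    size = min(window, n)
    total = sum(quals[:size])  # running sum of the current window
    start = 0
    while True:
        if total < min_mean * size:
            return start  # cut where the first failing window begins
        if start + size >= n:
            return n  # every window passed, keep the whole read
        # Slide the window one base to the right.
        total += quals[start + size] - quals[start]
        start += 1


if __name__ == "__main__":
    quals = [30] * 10 + [10] * 5
    keep = sliding_window_keep_length(quals)
    print(f"keep the first {keep} of {len(quals)} bases")
```

A running sum is used so that each base is visited a constant number of times, which keeps the scan linear in read length.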
The quality of the trimmed reads was evaluated using FastQC (Simon,
2010). Genomes had to meet the following quality criteria to be included
in the study: GC content of ~65% without multiple or
anomalous peaks, coverage of at least 15X, median read length Reads were
mapped using Burrows-Wheeler Aligner 0.7.17 (BWA-MEM) (Li, 2010) against
Mycobacterium tuberculosis H37Rv for the identification of regions of
difference (RDs) and to check that genomes had a sequencing
coverage of at least 95%. The resulting BAM files were
evaluated for the absence of RD4 and the presence of RD1 (M. bovis-specific
and to rule out BCG strains, respectively) using a previously described
algorithm (Zimpel, 2020). Quality-approved reads were then
mapped against M. bovis AF2122/97 using BWA-MEM, and Picard v2.18.23
(https://github.com/broadinstitute/picard) was used to remove duplicates
from the resulting files.
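
The inclusion criteria described above can be summarised as a simple per-genome check. The following is a minimal sketch under stated assumptions, not the code used in the study: the `SampleQC` record, its field names, and the `gc_tolerance` value are hypothetical (the text gives only "~65%" for GC content), and the median read length criterion is left out because its threshold is not stated. RD presence or absence is assumed to have been determined beforehand, for example with the algorithm of Zimpel (2020).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampleQC:
    """Hypothetical per-genome QC summary; field names are illustrative."""
    gc_percent: float            # GC content of trimmed reads
    single_gc_peak: bool         # no multiple or anomalous GC peaks
    mean_depth: float            # sequencing depth (X)
    h37rv_breadth_percent: float # percent of H37Rv covered by mapped reads
    rd4_present: bool
    rd1_present: bool


def failed_criteria(
    sample: SampleQC,
    gc_target: float = 65.0,
    gc_tolerance: float = 5.0,   # assumption: tolerance is not given in the text
    min_depth: float = 15.0,
    min_breadth: float = 95.0,
) -> List[str]:
    """Return the list of criteria the sample fails (empty if it passes)."""
    failures: List[str] = []
    if abs(sample.gc_percent - gc_target) > gc_tolerance:
        failures.append("GC content outside expected range")
    if not sample.single_gc_peak:
        failures.append("multiple or anomalous GC peaks")
    if sample.mean_depth < min_depth:
        failures.append("coverage below 15X")
    if sample.h37rv_breadth_percent < min_breadth:
        failures.append("less than 95% of H37Rv covered")
    if sample.rd4_present:
        failures.append("RD4 present (not M. bovis)")
    if not sample.rd1_present:
        failures.append("RD1 absent (BCG not ruled out)")
    return failures


if __name__ == "__main__":
    example = SampleQC(65.4, True, 42.0, 98.7, rd4_present=False, rd1_present=True)
    problems = failed_criteria(example)
    print("PASS" if not problems else "FAIL: " + "; ".join(problems))
```

Returning the list of failed criteria, rather than a single boolean, makes it easier to report why a genome was excluded.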