Structural and Functional Annotation
We performed the structural annotation with BRAKER 2.1.2 (https://github.com/Gaius-Augustus/BRAKER). We first ran it (--etpmode) using both the Cory’s shearwater proteome (Feng et al., 2020) and the RNA-Seq data generated in this work. Because including the RNA-Seq data appeared detrimental, we excluded it and produced the final annotation from the soft-masked genome with BRAKER 2.1.2 (--prg=gth --trainFromGth).
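For orientation, the final protein-based BRAKER run could be assembled roughly as in the sketch below. Only --prg=gth and --trainFromGth come from the text above. The file names, the species label, and the other options (--genome, --prot_seq, --species, --softmasking) are assumptions based on typical BRAKER 2 usage, and should be checked against the documentation of the installed version.

```python
import shlex
from typing import List


def braker_gth_command(genome: str, proteins: str, species: str) -> List[str]:
    """Build an argument list for a protein-only BRAKER run with GenomeThreader.

    All paths and the species label are hypothetical placeholders.
    """
    return [
        "braker.pl",
        f"--genome={genome}",       # soft-masked genome assembly (assumed file)
        f"--prot_seq={proteins}",   # protein evidence (assumed file)
        f"--species={species}",     # label for the trained parameters (assumed)
        "--softmasking",            # assumed option: genome is soft-masked
        "--prg=gth",                # from the text: align proteins with GenomeThreader
        "--trainFromGth",           # from the text: train from those alignments
    ]


cmd = braker_gth_command(
    "genome.softmasked.fa", "reference_proteome.fa", "shearwater_example"
)
# Print the command for inspection; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell-quoting problems if a path contains spaces.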
We functionally annotated the predicted genes using a similarity-based approach. We identified protein domains with InterProScan 5.31-70.0 (Jones et al., 2014) and ran BLASTP searches (Altschul, Gish, Miller, Myers, & Lipman, 1990; Camacho et al., 2009) (-evalue 1e-5; -max_target_seqs 10) against the Swiss-Prot database (Boutet et al., 2016), the Cory’s shearwater proteome, and the zebra finch reference proteome (UP000007754). Transcripts were annotated in the same manner. We also annotated ncRNAs using cmscan from INFERNAL 1.1.2 (Nawrocki, Kolbe, & Eddy, 2009) with the covariance models (CMs) from the Rfam 14.1 database, and tRNA genes using tRNAscan-SE 2.0.5 (Chan & Lowe, 2019).
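As an illustration of the similarity-based step, the sketch below shows one possible way to reduce BLASTP results to a single best hit per query. It assumes the searches were written in BLAST's 12-column tabular format (-outfmt 6) and that the best hit is the one with the highest bit score. Neither choice is stated in the text, and the sequence identifiers are invented for the example.

```python
import csv
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    subject: str
    evalue: float
    bitscore: float


def best_hits(rows: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Return the highest-scoring hit per query from tabular BLAST lines.

    Default -outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    best: Dict[str, Hit] = {}
    for fields in csv.reader(rows, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # same threshold as -evalue 1e-5
        current = best.get(query)
        if current is None or bitscore > current.bitscore:
            best[query] = Hit(subject, evalue, bitscore)
    return best


# Hypothetical example rows (identifiers are made up).
example = [
    "g1.t1\tsubjA\t90.0\t100\t10\t0\t1\t100\t1\t100\t2e-50\t180",
    "g1.t1\tsubjB\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-20\t95.5",
    "g2.t1\tsubjC\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25.0",
]
hits = best_hits(example)
for query, hit in hits.items():
    print(query, hit.subject, hit.evalue, hit.bitscore)
```

Here g1.t1 keeps subjA, the higher-scoring of its two hits, and g2.t1 is left unannotated because its only hit exceeds the E-value threshold.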