Gene structure and functional annotation
To annotate the draft genome of the black-faced spoonbill, we extracted
total RNA with an RNeasy Mini Kit, following the manufacturer’s
instructions, from a blood sample of the same individual used for
reference genome sequencing. RNA was sequenced by Genomics BioSci &
Tech (Taipei, Taiwan) using a TruSeq Stranded mRNA Preparation Kit on an
Illumina NextSeq 500 platform with an average insert size of 180 bp. We
used Trinity v2.4.0 (Grabherr et al., 2011) to assemble 5.2
GB of RNA-seq reads with both de novo and genome-guided (using
the draft genome assembly as a reference) methods, which generated 273 Mb and
171 Mb of RNA assembly, respectively. Then, the two assemblies were
merged with PASA (Haas et al., 2003). TransDecoder in PASA was used to identify likely coding regions (CDS), with the
CDS of the crested ibis (Nipponia nippon) genome (GCF_000708225.1_ASM70822v1; S. Li et al., 2014) as the reference.
Finally, we obtained 190 Mb of spoonbill RNA assembly as the gene model
to train Augustus v3.2.3 (Stanke & Morgenstern, 2005) for gene
prediction. Augustus predicted 11,733 genes totaling 411 Mb (35%
of the total genome) and 25 Mb of CDS in the spoonbill genome. We
extracted the coding regions of all annotated genes and searched these
CDS against Gallus_gallus-5.0 (GCF_000002315.4) with BLASTP
(v2.6.0; e-value cut-off = 10⁻⁶). Only 10,558
genes met the e-value cut-off criterion.
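
The e-value filtering step can be illustrated with a minimal Python sketch. It is not the script used in this study. It assumes the BLASTP results were written in the standard 12-column tabular format (`-outfmt 6`), where column 1 is the query ID and column 11 is the e-value, and that a gene is counted when at least one hit has an e-value at or below the cut-off. The function name `genes_with_hits` and the toy gene and protein IDs are hypothetical.

```python
EVALUE_CUTOFF = 1e-6  # cut-off stated in the text


def genes_with_hits(tabular_lines, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at or below `cutoff`.

    Assumes BLAST tabular output (-outfmt 6): 12 tab-separated columns,
    query ID in column 1 and e-value in column 11.
    """
    kept = set()
    for line in tabular_lines:
        line = line.strip()
        # Skip blank lines and comment lines (present with -outfmt 7).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= cutoff:
            kept.add(query_id)
    return kept


# Toy input with hypothetical IDs; real input would be read from the BLAST output file.
toy_hits = [
    "g1.t1\tXP_001\t78.5\t310\t60\t2\t1\t310\t5\t312\t3e-120\t410",
    "g1.t1\tXP_002\t40.0\t90\t50\t3\t10\t99\t1\t88\t2e-04\t45.1",
    "g2.t1\tXP_003\t35.2\t70\t40\t2\t5\t74\t20\t88\t0.5\t30.0",
    "g3.t1\tXP_004\t99.0\t500\t5\t0\t1\t500\t1\t500\t0.0\t1020",
]

supported = genes_with_hits(toy_hits)
print(sorted(supported))  # ['g1.t1', 'g3.t1']
```

Applied to the counts reported above, 10,558 of the 11,733 predicted genes (about 90%) passed this criterion.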