Reference genome assembly
We assembled a low-cost draft genome for Pogoniulus p. pusillus. A single male individual (sample no. AR93139) was selected for assembly, and a short-insert library was prepared by Novogene, Inc., using the NEBNext Ultra II DNA kit (New England Biolabs). The library was sequenced to a depth of approximately 50x on the Illumina HiSeq X platform at Novogene, Inc., with 150 bp paired-end reads. Prior to assembly, overlapping read pairs were collapsed and adapter and low-quality sequence removed using PEAR v0.9.10 (Zhang et al., 2014) with minimum overlap size 20, minimum read length 30, quality score threshold 20, and maximum proportion of uncalled bases 0.02. We assembled the resulting reads with SOAPdenovo v2.04 (Luo et al., 2012) for each odd-numbered value of k between 41 and 111, with default values for all other parameters. The assembly for k = 93 was chosen because it combined a higher scaffold N50 with an assembly length closer to the expected genome size for birds.
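The two statistics used to compare assemblies across k can be illustrated with a minimal Python sketch. The function names `scaffold_n50` and `summarise` are hypothetical and are not part of SOAPdenovo or of our pipeline; the sketch assumes the scaffold lengths of each assembly are already available as a list of integers.

```python
def scaffold_n50(lengths):
    """Return the N50: the length of the scaffold at which the running
    total, taken from longest to shortest, first reaches half of the
    total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths supplied")


# Every odd k-mer size from 41 to 111 inclusive, as described in the text.
K_VALUES = list(range(41, 112, 2))


def summarise(lengths):
    """Collect the two statistics compared across assemblies
    (hypothetical helper for illustration only)."""
    return {"n50": scaffold_n50(lengths), "total_length": sum(lengths)}
```

How the two statistics are weighed against each other is a judgement call, so the sketch reports them and does not rank assemblies automatically.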
We aligned these scaffolds to the zebra finch (Taeniopygia guttata) genome using the Nucmer command in MUMmer 4.0 (Kurtz et al., 2004). Tinkerbird scaffolds that aligned to zebra finch chromosomes were ordered and oriented according to these alignments; scaffolds that did not align to the zebra finch genome were ordered by scaffold size.
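The ordering logic can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure as run: the `Scaffold` record and `order_scaffolds` function are hypothetical, each scaffold is assumed to have already been reduced to a single best alignment, unaligned scaffolds are assumed to be ordered from largest to smallest, and chromosome names are sorted as plain strings.

```python
from typing import NamedTuple, Optional


class Scaffold(NamedTuple):
    name: str
    length: int
    ref_chrom: Optional[str] = None  # reference chromosome of the best alignment
    ref_start: Optional[int] = None  # start of that alignment on the reference
    reverse: bool = False            # True if aligned on the reverse strand


def order_scaffolds(scaffolds):
    """Return (name, orientation) pairs: aligned scaffolds first, by
    reference chromosome and position, then unaligned scaffolds by size."""
    aligned = [s for s in scaffolds if s.ref_chrom is not None]
    unaligned = [s for s in scaffolds if s.ref_chrom is None]
    # Assumption: chromosome names sort lexically, which is a simplification.
    aligned.sort(key=lambda s: (s.ref_chrom, s.ref_start))
    # Assumption: largest unaligned scaffold first.
    unaligned.sort(key=lambda s: -s.length)
    placed = [(s.name, "-" if s.reverse else "+") for s in aligned]
    return placed + [(s.name, "+") for s in unaligned]
```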
Single nucleotide polymorphism (SNP) calling
Adapter sequences were removed and overlapping paired-end reads merged in PEAR v0.9.10 (Zhang et al., 2014), and reads were aligned to the P. p. pusillus draft genome assembly in BWA MEM v0.7.17 (Li, 2013) using default parameters. Variants were called using bcftools mpileup v1.8 (Li et al., 2009) with mapping quality > 20 and default values for other parameters. We filtered the resulting variants using VCFtools v0.1.13 (Danecek et al., 2011), retaining genotypes with depth 4 or greater, and loci with minor allele frequency > 0.05 that were genotyped in at least 80% of individuals.
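The combined effect of the three filters on a single locus can be sketched in Python. The filtering itself was performed in VCFtools; the function `filter_locus` and the constant names below are hypothetical. The sketch assumes biallelic sites in diploid individuals, with genotypes coded as the count of alternate alleles (0, 1 or 2) or `None` when missing, and it assumes that genotypes failing the depth threshold are treated as missing before the locus-level filters are evaluated.

```python
MIN_DEPTH = 4        # genotypes with depth below this are treated as missing
MIN_MAF = 0.05       # minor allele frequency must be strictly greater
MIN_CALL_RATE = 0.8  # proportion of individuals that must be genotyped


def filter_locus(alt_counts, depths):
    """Return True if a locus passes the depth, call-rate and MAF filters.

    alt_counts: per-individual alternate allele count (0, 1, 2) or None.
    depths: per-individual read depth at the locus.
    """
    kept = [g for g, d in zip(alt_counts, depths)
            if g is not None and d >= MIN_DEPTH]
    n_individuals = len(alt_counts)
    if n_individuals == 0 or len(kept) / n_individuals < MIN_CALL_RATE:
        return False
    # Two alleles per retained diploid genotype.
    alt_freq = sum(kept) / (2 * len(kept))
    maf = min(alt_freq, 1 - alt_freq)
    return maf > MIN_MAF
```

For example, a locus at which two of five individuals have depth below 4 is genotyped in only 60% of individuals and is therefore dropped, regardless of its allele frequency.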
The SNPs of interest in this study were those most likely to explain differences between red and yellow forecrown colour. Samtools has been shown to perform relatively poorly in calling indels (Hwang et al., 2015), so we manually inspected the genotypes at the SNPs most strongly associated with forecrown colour traits for any indels that may have been incorrectly called.