Genome-wide association analysis
We tested for an association between each genetic marker and each of the five colour traits using a linear mixed model implemented in GEMMA 0.94 (Zhou & Stephens, 2012), which uses a relatedness matrix to control for population structure. Because GEMMA requires loci with no missing data, we first imputed missing genotypes from genotype likelihoods using Beagle v5.1 (Browning et al., 2018). We assessed significance against a threshold of -log10(α) = 6, which, given the 104,933 variants tested for each trait, corresponds to approximately 0.1 expected false positives per trait under the null hypothesis. Following Brelsford et al. (2017), we applied this single threshold for consistency across all traits rather than using false discovery rates (FDR) (Benjamini & Hochberg, 1995), which would yield a different significance threshold for each trait. This threshold was approximately in line with the most conservative threshold obtained using FDR, whereas FDR would have produced thresholds as low as -log10(α) = 4 for certain traits. Scaffolds significantly associated with forecrown colour were then compared, using BLAST, with sequences of the candidate genes CYP2J19 from the zebra finch genome and BCO2 from the downy woodpecker (Dryobates pubescens) genome.
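To illustrate why an FDR-based cutoff differs among traits while a fixed cutoff does not, the following minimal Python sketch applies the Benjamini & Hochberg (1995) step-up procedure to two sets of p-values. The p-values, the FDR level q = 0.05, and the function names are hypothetical and chosen only for illustration; they are not data or code from this study.

```python
import math

FIXED_ALPHA = 1e-6  # fixed threshold, i.e. -log10(alpha) = 6


def bh_threshold(pvalues, q=0.05):
    """Return the largest p-value declared significant by the
    Benjamini-Hochberg step-up procedure, or None if none passes."""
    m = len(pvalues)
    threshold = None
    for rank, p in enumerate(sorted(pvalues), start=1):
        # Step-up rule: keep the largest p with p <= q * rank / m.
        if p <= q * rank / m:
            threshold = p
    return threshold


def count_fixed(pvalues, alpha=FIXED_ALPHA):
    """Count p-values passing the fixed threshold."""
    return sum(1 for p in pvalues if p <= alpha)


# Hypothetical p-values for two traits (illustration only).
trait_a = [1e-9, 2e-8, 5e-7, 3e-5, 0.02, 0.3, 0.5, 0.7, 0.8, 0.9]
trait_b = [1e-3, 4e-3, 0.01, 0.2, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

for name, pvals in (("trait_a", trait_a), ("trait_b", trait_b)):
    t = bh_threshold(pvals)
    t_text = "none" if t is None else f"-log10 = {-math.log10(t):.2f}"
    print(f"{name}: BH threshold {t_text}; "
          f"fixed-threshold hits = {count_fixed(pvals)}")
```

In this toy example the FDR-derived cutoff depends on the distribution of p-values within each trait, whereas the fixed cutoff is identical for both, which is the property that motivated the single threshold used here.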