The evolution of the ZZB-TE, Cyp6p4-236M and Cyp6aap-Dup1 haplotypes
Based upon Ag1000g data and a time series of collections from Central and East Africa, we were able to trace the sequence of mutational events (ZZB-TE, Cyp6p4-236M and Cyp6aap-Dup1) and reconstruct the evolutionary history of the swept haplotype. Among the Ag1000g data [19], the Cyp6p4-236M mutation was observed only in collections from eastern Uganda (collected in 2012), suggesting that this mutation originated in the eastern Ugandan/western Kenyan region. In a screen of collections from Uganda and Kenya predating the Ag1000g collections by eight years (2004) (Figure 3), only the ZZB-TE insertion was detected, although the sample size was too small (n = 4) to conclude that the Cyp6p4-236M allele was absent. The Cyp6p4-236M mutation was first observed in this region in 2005 (Cyp6p4-236M frequency = 0.10) in individuals carrying the ZZB-TE mutation, whilst the Cyp6aap-Dup1 CNV was first recorded in 2008 (proportion of individuals with Dup1 = 0.8%). This inferred sequence of events may explain why the ZZB-TE and Cyp6p4-236M mutations are in tighter statistical linkage with each other than with Cyp6aap-Dup1 (Figure 2), despite the closer proximity of ZZB-TE and Cyp6aap-Dup1 (Figure 1). Given the very tight association between the ZZB-TE insertion and the Cyp6p4-236M SNP, we will henceforth refer to the haplotype carrying Cyp6p4-236M as the double mutant and to the haplotype that additionally carries Cyp6aap-Dup1 as the triple mutant. The double mutant haplotype shows a steady increase in frequency between 2004 and 2011 in Kenya (Figure 3), possibly in response to the introduction and subsequent intensification of bednet distribution programmes [29, 33, 34]. Following its appearance in 2008, the triple mutant haplotype rapidly increased towards fixation in collections from both Uganda and Kenya, replacing the double mutant.
This haplotype replacement, together with the observation that the triple mutant is the only non-wildtype haplotype observed outside Kenya/Uganda (such as in Tanzania and DRC; Figures 1 and 3), strongly implies an additional selective advantage to the triple mutant. The time series data from across DRC are particularly striking, both for the speed of increase of the triple mutant and for the north-south heterogeneity, with very low frequencies in the more southerly provinces (Figure 3).