Figure 4: Battey, Ralph and Kern (2020) developed machine learning software, Locator, to estimate the geographic origin of a genetic sample. A) True and predicted sample locations for 153 Anopheles gambiae/coluzzii samples in sub-Saharan Africa, using a total of 612 training samples and a 2 Mb window size. Each inferred sample location (the geographic centroid of the per-window estimates) is shown as a black point connected by a line to the true location of the sample. Points marking true sample localities are scaled by the number of training samples used for the estimate and coloured by average test error. See Battey, Ralph and Kern (2020) for further details.