The genetic diversity present in cultivated
plant varieties generally represents only a small fraction of the total
diversity present in the species from which the cultivar was derived (
Kovach 2008). This
reduction in diversity is dramatically exemplified by soybean, wherein ~85% of
North American breeding germplasm is derived from 18 landraces (
Cornelious 2002). Because of this lack of variation in cultivars,
the global diversity of a crop is commonly mined for beneficial alleles, such
as novel alleles for pest resistance.
These alleles can then be introgressed into cultivars via
marker-assisted selection. Historically, the
introgression of traits from wild or semi-wild germplasm has generally been
limited to simple, Mendelian traits.
Such traits are easier to identify with confidence, they are less
dependent on genetic background, and they are simpler to track during
introgression. Though wild yield alleles
have been successfully identified using near-isogenic lines, these methods are
highly resource-intensive and often miss relevant alleles ().
PI416937
is a Japanese landrace found in the pedigree of many major cultivars in the
Southeastern US, most notably Woodruff (
Boerma 2012). Woodruff is 25% PI416937 by pedigree and was found to yield % of elite checks in Southern USDA Regional trials in year. PI416937 has many traits that distinguish it from more common elite
soybean varieties, including slow-wilting (
Fletcher 2007;
King 2009;
Abdel-Haleem 2012), expansive fibrous roots (
Hudak 1996;
Pantalone 1996;
Busscher 2000;
Purcell 2007;
Abdel-Haleem 2010), aluminum tolerance (
Goldman 1989;
Bianchi-Hall 2000), large leaf
surface area, and overall drought stress tolerance (
Goldman 1989;
Sloane 1990). Many other lines derived from PI416937 exhibited substantial vigor relative to checks and had a significant yield advantage (). Thus, unlike the
more common case in which exotic germplasm is used as a donor of a specific,
Mendelian trait, PI416937 and its derived lines are a model for the effective
use of exotic germplasm in producing immediate yield increases and providing
diversity for long-term genetic gain.
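The "percent by pedigree" figure quoted above for Woodruff is an expected contribution: each parent is assumed to pass on half of its genome, so an ancestor's share halves with every cross. The following is a minimal sketch of that arithmetic. The pedigree shown is hypothetical (the names LineB, LineC, Elite1, and Elite2 are placeholders, not Woodruff's actual pedigree), and the realized genomic contribution in any one line can deviate from this expectation.

```python
from typing import Dict, Tuple

# Hypothetical pedigree: each line maps to its two parents.
# Lines absent from the dictionary are treated as founders.
Pedigree = Dict[str, Tuple[str, str]]

def ancestor_fraction(line: str, ancestor: str, pedigree: Pedigree) -> float:
    """Expected fraction of `line`'s genome that traces back to `ancestor`."""
    if line == ancestor:
        return 1.0
    parents = pedigree.get(line)
    if parents is None:
        # A founder other than the ancestor contributes nothing.
        return 0.0
    # Each parent is expected to contribute half of the progeny genome.
    return sum(ancestor_fraction(p, ancestor, pedigree) for p in parents) / 2

example_pedigree: Pedigree = {
    "LineB": ("PI416937", "Elite1"),  # first cross: 50% expected
    "LineC": ("LineB", "Elite2"),     # second cross: 25% expected
}

print(ancestor_fraction("LineC", "PI416937", example_pedigree))
```

The gap between this expectation and the observed share of PI416937 alleles at a given locus is what a pedigree-based test of selection looks for.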
In
this study, we aimed to use genome-wide marker data and the known pedigree
information related to PI416937 in order to track exotic regions that were
selected for and against over the course of the last 30 years. The idea of exploiting breeding pedigrees to
detect selected loci has been used previously in attempts to detect agronomically
important loci in soybean (
Shoemaker 1992;
Lorenzen 1995;
Sebastian 1995;
Grainger 2013). The
approach is analogous to transmission disequilibrium tests pioneered in animal
genetics. Released varieties or breeds
are assumed to be the product of selection and thus alleles conferring superior
fitness are expected to deviate from random (50%) transmission (
Bink 2000). While original
versions of the test depend on a heterozygous parent, the test is easily
adapted to selfing crops in which the entire F1 population of an inbred cross
is assumed to be heterozygous for all segregating alleles (
Jannink 2001). Though the approach is theoretically very powerful, previous
studies employing it suffered from low marker density (
Shoemaker 1992;
Lorenzen 1995) or gapped pedigrees that made rigorous
statistical inference problematic (
Grainger 2013). Higher marker density allows for the
confident inference of shared haplotypes and thus the ability to accurately
define and count the number of crosses that truly test a locus for the
influence of selection.
Haplotype
consolidation based on identity-by-descent (IBD) should theoretically benefit
any genotyping strategy that does not result in complete knowledge of the
allelic states of all polymorphisms in the population under study (
Jordan 2005). This conclusion stems
from the fact that regions that are IBD will share not only the identical typed
alleles but intervening untyped alleles as well (excluding novel mutations that
have occurred since divergence from the last common ancestor). Because the causal mutation underlying a phenotype is generally more likely to be
an untyped allele than a typed marker, IBD
more effectively reflects the phenotypic impact of possessing that genomic
region.
In
this study we used a two-step process that first infers which genomic regions
of each progeny line are derived from each of its two parents and then which
regions in the parents are derived from PI416937. For all markers in the
study, any cross which contained a PI416937 allele in one parent and a
non-PI416937 allele in the other parent was considered a single test of that
locus. If the PI416937 allele was
inherited in such tests significantly more or less frequently than a binomial
model would predict, we considered this evidence of selection. Identified alleles were independently tested
across a range of environments and recombinant-inbred populations for which at
least one parent had PI416937 in its pedigree.
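The counting and testing logic described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code used in the study: the record layout for a cross is hypothetical, the null transmission probability is taken to be 0.5 (the random transmission rate mentioned earlier), and the exact two-sided p-value sums the probabilities of all outcomes no more likely than the one observed.

```python
from math import comb
from typing import Iterable, Tuple

# Hypothetical record for one cross at one marker:
# (parent1 carries PI416937 allele, parent2 carries it, progeny carries it).
Cross = Tuple[bool, bool, bool]

def count_informative(crosses: Iterable[Cross]) -> Tuple[int, int]:
    """Return (PI416937 transmissions, number of informative tests)."""
    transmitted = tests = 0
    for parent1, parent2, progeny in crosses:
        # A cross tests the locus only if exactly one parent carries the allele.
        if parent1 != parent2:
            tests += 1
            transmitted += int(progeny)
    return transmitted, tests

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k successes in n trials."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    # Sum every outcome that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    threshold = pmf[k] * (1 + 1e-9)
    return min(1.0, sum(x for x in pmf if x <= threshold))

example_crosses = [
    (True, False, True),
    (True, False, True),
    (False, True, True),
    (True, True, True),     # both parents carry the allele: not a test
    (False, False, False),  # neither parent carries it: not a test
    (True, False, False),
]
k, n = count_informative(example_crosses)
print(k, n, binomial_two_sided_p(k, n))
```

With so few informative crosses the test has little power, which is why dense marker data and complete pedigrees matter for accumulating enough tests per locus.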
As potential validation for regions that appeared to be under positive selection from PI416937, we searched the literature for quantitative trait loci (QTL) previously mapped in studies involving PI416937. We also examined the peak regions under selection for candidate genes that may confer a yield advantage. Regions found to be under negative selection were investigated for QTL conferring traits considered agronomically undesirable for soybean production. We examined five breeding populations composed of F5-derived recombinant inbred lines (RILs) with PI416937 in their pedigrees to assess any relationship between regions under selection within the
RIL populations and our pedigree-based analysis. These RIL populations had undergone phenotypic selection based on visual agronomic traits (i.e., lodging, height, and visual estimation of yield). Selection was applied to individual F5 plants as well as to F5-derived plant rows from selected plants. We also examined the highest-yielding PI416937-derived lines in our analysis to determine whether there was a relationship between their phenotypic superiority and the presence of regions under positive selection from PI416937, together with the absence of regions under negative selection from PI416937.
After identifying these regions from PI416937 under breeding selection, we examined the potential to introgress these exotic alleles into genomic regions that are low in diversity in North American germplasm. Regions have been identified throughout the soybean genome that have either lost diversity due to selection or, over the last several decades of North American breeding, never had diversity to test against (). We examined whether the regions we found under selection from PI416937 overlapped with these regions of low diversity. Such overlap would suggest that breeders could explore targeted introgression of beneficial alleles from PI416937 into regions of low diversity, especially regions that have historically had little to no known diversity.
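The overlap check described above amounts to intersecting two sets of genomic intervals. A minimal sketch is given below, assuming each region is stored as a chromosome label with closed start and end coordinates in base pairs. The chromosome names and coordinates are invented for illustration and do not correspond to any region reported in this study.

```python
from typing import List, NamedTuple

class Region(NamedTuple):
    chrom: str
    start: int  # inclusive, base pairs
    end: int    # inclusive, base pairs

def overlapping_segments(selected: List[Region],
                         low_diversity: List[Region]) -> List[Region]:
    """Return the shared segment for every overlapping pair of regions."""
    hits = []
    for s in selected:
        for d in low_diversity:
            # Two closed intervals overlap when each starts before the other ends.
            if s.chrom == d.chrom and s.start <= d.end and d.start <= s.end:
                hits.append(Region(s.chrom, max(s.start, d.start),
                                   min(s.end, d.end)))
    return hits

# Hypothetical example regions.
selected_regions = [Region("Gm01", 100, 500), Region("Gm02", 1000, 2000)]
low_diversity_regions = [Region("Gm01", 400, 900), Region("Gm02", 3000, 4000)]

print(overlapping_segments(selected_regions, low_diversity_regions))
```

The pairwise loop is adequate for the tens or hundreds of regions typical of such an analysis; sorting the intervals first would be preferable for much larger sets.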