Population Genetics
Robert Page
Doctoral Student in Dr. Voss’ Lab
E-mail: robert.page@uky.edu
Why study population genetic structure?
• In general, provides perspective on adaptation and speciation
• Can reveal the recent demographic history of a population and the role of:
- Gene flow
- Genetic drift
- Inbreeding
- Natural selection
- Population size
• Can reveal the history of population structuring over deeper time (e.g., phylogeography)
Why do we expect population genetic structures to vary within and among organisms?
1) Differences in mobility/dispersal ability
2) Differences in reproductive attributes/system
3) Differences in life history attributes
4) Differences in behavioral attributes
5) Differences in geographic distribution
6) Habitat patchiness or variability
7) Historical reasons
e.g., see Tables 6.3 and 6.4
One of the first idealized models of a population
From J. Hey, 2003, Nature Reviews Genetics, 4:535-544.
Models of population structure that allow for migration (gene flow)
Idealized Population Models
a. Island model
b. Stepping stone
c. Isolation by distance
d. Metapopulation
F-Statistics: Classical Descriptors of Hierarchical Population Genetic Structure
What Do We Mean by Hierarchy?
[Figure: hierarchy diagram with labels: Total Pop, Subpop, W/in Subpop, B/wn Subpops, B/wn Populations (2nd pop not shown)]
FIS Is the Inbreeding Coefficient
• FIS measures the deviation of genotype frequencies within a subpopulation from those expected under Hardy-Weinberg equilibrium (HWE)
• FIS is conceptualized in terms of heterozygote deficiency or heterozygote excess
• FIS = (HS - HI) / HS
- HS = Mean expected heterozygosity within panmictic subpopulations
- HI = Mean observed heterozygosity per individual within subpopulations
• FIS varies between -1 and 1
- An FIS of 0 implies that the observed values agree perfectly with the expected values
- An FIS of 1 implies strong inbreeding (heterozygote deficiency)
- An FIS of -1 implies strong outbreeding (heterozygote excess)
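The FIS formula above can be sketched for the simplest case: one biallelic locus in a single subpopulation, so that HS and HI are not averaged over subpopulations. The function name and the genotype counts are hypothetical and chosen only for illustration.

```python
def f_is(n_AA, n_Aa, n_aa):
    """FIS = (HS - HI) / HS for one biallelic locus in one subpopulation.

    Assumes the locus is polymorphic (0 < p < 1); otherwise HS is 0
    and the ratio is undefined.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    h_s = 2 * p * (1 - p)            # expected heterozygosity under HWE
    h_i = n_Aa / n                   # observed heterozygosity
    return (h_s - h_i) / h_s


# Hypothetical genotype counts (AA, Aa, aa)
print(f_is(25, 50, 25))   # exact HWE proportions
print(f_is(50, 0, 50))    # no heterozygotes at all
print(f_is(0, 100, 0))    # only heterozygotes
```

The three calls correspond to the three reference points on the slide: FIS of 0, 1, and -1.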
FST Measures Population Subdivision
• FST can be conceptualized as the discrepancy between randomly drawing alleles from a subpopulation and randomly drawing alleles from the entire population
• At the subpopulation level: FST = Vp / [p(1 - p)]
- Vp = Variance in allele frequency among subpopulations
- p = Mean allele frequency across subpopulations
• This is a measure of the observed variation in allele frequencies among subpopulations (regardless of how the variation arose)
Another way, from Avise: FST = (HT - HS) / HT
- HS = Mean expected heterozygosity at a locus within subpopulations under HWE
- HT = Overall expected heterozygosity in the total population (given allele frequencies and HWE)
FST ranges from 0.0 to 1.0
- FST = 0.0: subpopulations have the same allele frequencies (“not structured”)
- FST = 1.0: subpopulations fixed for alternate alleles (“structured”)
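As a rough illustration of the heterozygosity-based formula, the sketch below computes FST for one biallelic locus from a list of subpopulation allele frequencies. It assumes equally weighted subpopulations, which the slide does not specify, and the function name and input frequencies are hypothetical.

```python
def fst_from_freqs(freqs):
    """FST = (HT - HS) / HT for one biallelic locus.

    freqs: frequency of one allele in each subpopulation.
    Assumes equal weighting of subpopulations and a polymorphic
    total population (HT > 0).
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_s = sum(2 * p * (1 - p) for p in freqs) / k  # mean within-subpop HWE heterozygosity
    h_t = 2 * p_bar * (1 - p_bar)                  # HWE heterozygosity of the pooled population
    return (h_t - h_s) / h_t


print(fst_from_freqs([0.5, 0.5]))  # identical frequencies: not structured
print(fst_from_freqs([0.0, 1.0]))  # fixed for alternate alleles: fully structured
print(fst_from_freqs([0.2, 0.8]))  # intermediate case
```

Under equal weighting this gives the same value as the variance form Vp / [p(1 - p)], because HT - HS equals twice the among-subpopulation variance in allele frequency.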
From Selander (1970): An analysis of mouse population structure within and among barns in Texas.

Estimated         Number of      Mean Allele Frequency    Variance of Allele Frequency
Population Size   Pops Sampled   Es-3b      Hbb           Es-3b      Hbb
Small (~10)       29             0.418      0.849         0.0506     0.1883
Large (~200)      13             0.372      0.843         0.0125     0.0083

FST = Vp / [p(1 - p)]
FST = 0.0506 / [(0.418)(0.582)] = 0.208 for small pops (Es-3b)
FST = 0.0125 / [(0.372)(0.628)] = 0.054 for large pops (Es-3b)
* Note that this method requires calculating FST separately for each locus (Es-3b & Hbb)
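The two worked calculations above can be reproduced with a short sketch. Only the Es-3b values from the table are used; the function name is hypothetical.

```python
def fst_from_variance(mean_p, var_p):
    """FST = Vp / [p(1 - p)] for a single locus."""
    return var_p / (mean_p * (1 - mean_p))


# Es-3b values from the Selander (1970) table
fst_small = fst_from_variance(0.418, 0.0506)
fst_large = fst_from_variance(0.372, 0.0125)

print(f"small pops: {fst_small:.3f}")  # 0.208
print(f"large pops: {fst_large:.3f}")  # 0.054
```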
Consider the joint effects of genetic drift and gene flow on population structure
• In the absence of migration, finite populations become more inbred and diverge from one another at random (with respect to allele frequencies) as a result of drift
• The smaller the population, the faster the probability of autozygosity (that an individual carries alleles that are identical by descent, IBD, at a locus) increases
• FST provides a measure of divergence under drift
• As populations drift toward FST = 1, the increase in autozygosity is eventually balanced by migration (and, in reality, mutation as well), and an equilibrium is struck
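The dependence of autozygosity on population size can be illustrated with the standard expectation for an idealized diploid population under drift alone, F_t = 1 - (1 - 1/(2N))^t. This formula is not on the slide; it is brought in here as an assumption (constant size N, random mating, no migration or mutation), and the population sizes simply echo the small and large barn populations.

```python
def autozygosity(n, t):
    """Expected autozygosity after t generations of drift alone.

    Assumes an idealized diploid population of constant size n,
    starting from F = 0, with no migration or mutation.
    """
    return 1 - (1 - 1 / (2 * n)) ** t


for generations in (1, 10, 50, 100):
    small = autozygosity(10, generations)
    large = autozygosity(200, generations)
    print(f"t={generations:>3}  N=10: {small:.3f}  N=200: {large:.3f}")
```

At every generation the N = 10 population has the higher expected autozygosity, which is the pattern the slide describes.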
Migration rates (gene flow) can be estimated assuming an equilibrium FST has been reached
• For neutral alleles in an island model, the equilibrium value of FST is approximately: FST ≈ 1 / (4Nm + 1)
• or, equivalently: Nm ≈ [(1 / FST) - 1] / 4
• Nm is interpreted as the absolute number of individuals exchanged between populations per generation
• As Nm increases, FST decreases
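A minimal sketch of the two island-model conversions, assuming neutral alleles and equilibrium as stated above. The function names and the list of Nm values are hypothetical.

```python
def fst_equilibrium(nm):
    """Approximate equilibrium FST in an island model: 1 / (4Nm + 1)."""
    return 1 / (4 * nm + 1)


def nm_from_fst(fst):
    """Approximate number of migrants per generation: [(1 / FST) - 1] / 4.

    Undefined for FST = 0 (no detectable structure).
    """
    return (1 / fst - 1) / 4


# FST falls quickly as Nm rises
for nm in (0.1, 0.5, 1, 5, 10):
    print(f"Nm = {nm:>4}: FST = {fst_equilibrium(nm):.3f}")
```

The two functions are inverses of each other, so an FST estimate can be turned into an Nm estimate and back. The result is only as good as the island-model and equilibrium assumptions.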
Gene Flow Is a Powerful Thing
• If Nm = 1, FST = 0.20
• Subpopulations are 20% more structured (inbred) than if all subpopulations essentially comprised a single, randomly mating population
However, FST is not a very precise measure. At best, it can provide only a qualitative perspective.
[Figure: FST (y-axis, 0.0 to 1.0) plotted against Nm (x-axis)]
FIT Measures Inbreeding Relative to the Entire Population
• FIT captures the effects of mating between close relatives within a subpopulation and the accumulated inbreeding resulting from mating between remote relatives at all levels of the population hierarchy
• FIT = (HT - HI) / HT
- HT = Expected heterozygosity in the panmictic total population
- HI = Mean observed heterozygosity per individual within subpopulations
Summary of F-statistics
I = Individual, S = Subpopulation, T = Total
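The three statistics can be computed side by side from the same three heterozygosities. Because 1 - FIS = HI / HS, 1 - FST = HS / HT, and 1 - FIT = HI / HT, the definitions imply (1 - FIT) = (1 - FIS)(1 - FST). The sketch below checks that relationship; the function name and the heterozygosity values are hypothetical.

```python
import math


def f_statistics(h_i, h_s, h_t):
    """Return (FIS, FST, FIT) from observed and expected heterozygosities.

    h_i: mean observed heterozygosity per individual within subpopulations
    h_s: mean expected heterozygosity within subpopulations
    h_t: expected heterozygosity in the total population
    """
    f_is = (h_s - h_i) / h_s
    f_st = (h_t - h_s) / h_t
    f_it = (h_t - h_i) / h_t
    return f_is, f_st, f_it


# Hypothetical heterozygosities, for illustration only
f_is, f_st, f_it = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
print(f"FIS = {f_is:.2f}, FST = {f_st:.2f}, FIT = {f_it:.2f}")

# The hierarchy ties the three together
print(math.isclose(1 - f_it, (1 - f_is) * (1 - f_st)))
```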
Analysis of Molecular Variance (AMOVA)
• Another way to describe where the variation within a dataset lies within the hierarchy
• Allows for partitioning this variance into components associated with the following hierarchical levels:
(1) Between populations
(2) Between subpopulations within populations
(3) Within subpopulations
More recently, approaches based on DNA sequence variation have been developed to characterize population structure. However, all summary statistic approaches live and die by the assumed demographic model. The model gives meaning to the parameters and specifies the assumptions that underlie them. A summary statistic on its own doesn’t necessarily provide insight.
Clustering & Assignment Approaches
• Different algorithms can be used... we will not belabor the differences but rather will focus on the similarities
• Likelihood and maximum likelihood methods are most common
• There are two main approaches:
(1) Assigning individuals to subpopulations based on how likely it is that they belong to each subpopulation given the available data on each subpopulation... The agreement between these post-hoc assignments and the subpopulation that each individual was sampled from yields insight into population structure
(2) Clustering individuals into arbitrary groups by maximizing fit with expected Hardy-Weinberg proportions... If the groups recovered reflect the sampling scheme, then there is evidence of population genetic structure
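Approach (1) can be sketched in a few lines. This is a simplified illustration, not any particular published algorithm: it assumes Hardy-Weinberg proportions within each subpopulation, independent loci, and known, nonzero allele frequencies. The function names, subpopulation names, frequencies, and genotype are all hypothetical.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs):
    """Log-likelihood of a multilocus genotype in one subpopulation.

    genotype: list of (allele, allele) pairs, one per locus
    allele_freqs: list of {allele: frequency} dicts, one per locus
    Assumes HWE, independent loci, and nonzero frequencies
    (a zero frequency would make the logarithm undefined).
    """
    total = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p1, p2 = freqs[a1], freqs[a2]
        prob = p1 * p1 if a1 == a2 else 2 * p1 * p2  # HWE genotype probability
        total += math.log(prob)
    return total


def assign(genotype, subpops):
    """Return the name of the subpopulation with the highest likelihood."""
    return max(subpops, key=lambda name: genotype_log_likelihood(genotype, subpops[name]))


# Hypothetical allele frequencies at two loci in two subpopulations
subpops = {
    "north": [{"A": 0.9, "a": 0.1}, {"B": 0.8, "b": 0.2}],
    "south": [{"A": 0.2, "a": 0.8}, {"B": 0.3, "b": 0.7}],
}
individual = [("A", "A"), ("B", "b")]

print(assign(individual, subpops))
```

Repeating this for every sampled individual and comparing the assignments with the sampling locations gives the agreement measure described in approach (1).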
Applications of Population Genetics Theory
• Conservation
• Human Ancestry
• Wildlife Management
• Agriculture
This list is far from exhaustive
Next Class Assigned Reading: Cabe et al. Fine-scale Population Differentiation and Gene Flow in a Terrestrial Salamander (Plethodon cinereus) Living in Continuous Habitat