Is there a difference between admixture and introgression? I think there is. Or have always assumed there is. But of late I’m wondering if a distinction is widely accepted, and what sort of distinctions people make. That is, in some cases it seems clear that admixture and introgression are used interchangeably as meaning the same thing. I’ve seen this in scientific papers, and often just do a mental substitution. But in other cases I’m wondering if people are using the terms in a different sense than I am. Probably the latter is more worrisome.
The figure to the left was generated by Admixture, a software package which takes population genetics assumptions (models) and data, and shows you the best fit of the data to a particular model. In this case the bar plot shows you the admixture of a given individual when you posit them to be a combination of K ancestral populations. The individuals are clustered by population, so you see population-wide profiles. The details of the model, and whether the model accurately captures reality (i.e., were there actually K populations at any time in the past?), are less important for this post than the fact that Admixture is reflecting admixture on a genome-wide scale between two or more populations. The input data are hundreds of thousands of single nucleotide polymorphisms distributed across the whole genome. The question of interest is whether a population can be represented as a pulse mixing event between two hypothetical groups, which were at some point phylogenetically distinct.
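To make the "combination of K ancestral populations" idea concrete, here is a minimal sketch of the mixture calculation at a single SNP: the individual's expected allele frequency is the ancestry-weighted average of the frequencies in the hypothetical source populations. The function name and the numbers are my own illustration, not Admixture's code or output, and the real software fits these quantities across all SNPs at once rather than taking them as given.

```python
def expected_allele_frequency(ancestry_fractions, population_frequencies):
    """Expected frequency of one allele in an individual under a K-population mixture.

    ancestry_fractions: the individual's ancestry proportions, one per
        hypothetical ancestral population (must sum to 1).
    population_frequencies: frequency of the allele at a single SNP in each
        of those populations.
    """
    if len(ancestry_fractions) != len(population_frequencies):
        raise ValueError("need one allele frequency per ancestral population")
    if abs(sum(ancestry_fractions) - 1.0) > 1e-9:
        raise ValueError("ancestry fractions must sum to 1")
    # Weighted average: each population contributes in proportion to ancestry.
    return sum(q * f for q, f in zip(ancestry_fractions, population_frequencies))


# Made-up numbers: an individual who is 75% population A and 25% population B,
# at a SNP where the allele is at 20% in A and 60% in B.
print(expected_allele_frequency([0.75, 0.25], [0.2, 0.6]))  # roughly 0.30
```

The point for this post is that the ancestry fractions are shared across every SNP in the genome, which is what makes admixture a genome-wide statement.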
Introgression, in contrast, focuses on the question of genetic variants which are penetrating one population from another, and becoming common in the target population. A classical method of generating introgression in plant genetics was to engage in extensive backcrosses of mixed lineages with a trait of interest against a parental population. If one continued to select for a particular trait among the progeny one could introgress the trait and allele into a daughter population which was almost identical to one of the parent populations on a genome-wide scale, but identical to the other at one gene of interest. The practical reason for this is obvious. Imagine you have a variety of cold adapted rice which is susceptible to a particular type of fungal infection. Then, you have a heat adapted rice which is resistant to the fungal infection. All you want is fungal infection resistance, maintaining all the other characteristics that keep the cold adapted rice optimal for its climate. So you cross the two, and continue to cross progeny against cold adapted rice while selecting for the resistance phenotype. Eventually you’ll get the allele you want introgressed while maintaining the genetic background you want. In contrast, if you just allowed for admixture between the two lineages, you might get a population which was in between on a whole host of phenotypes, making it suboptimal for any climatic regime.
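The arithmetic behind the backcrossing scheme is simple enough to sketch. Assuming the first cross gives a 50/50 hybrid and every later generation is crossed back to the cold adapted parent, the expected share of the genome from the heat adapted donor halves each generation. This ignores the selected locus itself and the linked donor DNA that rides along with it, so treat it as a rough expectation for the unlinked background. The function name is mine.

```python
def expected_donor_fraction(backcross_generations):
    """Expected genome-wide fraction from the donor parent.

    backcross_generations = 0 is the initial hybrid (50% donor); each
    backcross to the recurrent parent halves the expected donor share.
    """
    if backcross_generations < 0:
        raise ValueError("number of backcross generations cannot be negative")
    return 0.5 ** (backcross_generations + 1)


# Expected donor background after 0 to 6 backcrosses.
for n in range(7):
    print(n, expected_donor_fraction(n))
```

After five backcrosses the expected donor background is already under 2%, while the selected resistance allele is still there because you kept choosing for it. That gap between the genome-wide fraction and the one locus is the whole point of introgression as I use the word.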
An example from human population genomics can be found in the paper "Altitude adaptation in Tibetans caused by introgression of Denisovan-like DNA." What occurred here is that a very common variant in Tibetans implicated in altitude tolerance and adaptation seems to be phylogenetically closer to the variants you find in the Denisovan hominins than to those in other human populations. This, despite the fact that Denisovan ancestry is nearly nonexistent in Tibetans (the latest work suggests admixture in East and Southeast Asia on the order of 0.1 to 0.5%, with the highest fractions being among certain Southeast Asian and South Asian groups).
The network plot to the right illustrates the issue. On a genome-wide admixture plot Tibetans look like any East Asian population. They seem to be a mix of farmers related to the Han to the east and indigenous groups long resident at these high altitudes. But in the region around EPAS1 their genetic variation matches not modern humans, but the Denisovan hominin, which diverged ~500,000 years ago from the population that gave rise to 90 to 99% of the ancestry of our own lineage.
So what happened? We know that there were low levels of hybridization between very diverged human lineages in the past. Because of genetic incompatibilities it seems that in fact there was some selection against distinctive alleles from archaic lineages in our own genome. That is, the percentage of Neanderthal ancestry on the genomic level is probably lower than you’d get from doing a genealogical analysis of all lines of ancestry back to 100,000 years ago, because there has been selection against Neanderthal variants in the dominant human genetic background. But not in all cases. In a minority of instances the Neanderthal and Denisovan variants were not less fit, nor were they neutral, but rather, they were favored!
So, imagine a scenario where in the initial generation admixture between a large human population and a small Neanderthal population leads to admixture on the order of ~5% in the descendants. Over the generations, due to selection against Neanderthal alleles, the genomic ancestry from this group converges upon ~2.5%. But on a subset of loci the Neanderthal alleles will have increased in frequency, and in some cases introgressed to high levels. This could be due to randomness; in a genome with billions of base pairs and tens of millions of nucleotide polymorphisms some alleles will drift up to higher frequencies randomly. But it is in the set of high frequency alleles from Neanderthals that you might find variants that have become common due to adaptive introgression. See this paper in AJHG, "Introgression of Neandertal- and Denisovan-like Haplotypes Contributes to Adaptive Variation in Human Toll-like Receptors." Immunological variation is always an excellent candidate because genetic diversity at these loci is highly favored, and long resident populations often have local adaptations.
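A toy calculation shows how the same starting frequency can go in opposite directions at different loci. This is a minimal sketch using the textbook deterministic recursion for a single haploid locus under selection, p' = p(1 + s) / (1 + sp). The selection coefficients are invented, drift and linkage are ignored entirely, and it is not the method used in any of the papers mentioned here.

```python
def next_frequency(p, s):
    """One generation of deterministic haploid selection.

    p: current frequency of the archaic allele.
    s: its selection coefficient (negative = disfavored, positive = favored).
    """
    return p * (1.0 + s) / (1.0 + s * p)


def trajectory(p0, s, generations):
    """Allele frequency in every generation, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


# Hypothetical coefficients: both alleles start at the 5% admixture fraction.
disfavored = trajectory(0.05, -0.01, 100)  # mildly deleterious archaic allele
favored = trajectory(0.05, 0.01, 1000)     # mildly advantageous archaic allele
print(disfavored[-1], favored[-1])
```

With these made-up numbers the disfavored allele sinks below its starting 5% while the favored one approaches fixation. The genome-wide ancestry fraction is then something like an average over a great many loci of the first kind, with a handful of the second kind standing out.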
Because my focus is generally on microevolutionary process, the sort of thing population geneticists are interested in, I’ve really not been talking about species-level dynamics (though the hominins are arguably distinct species). Much of the work on admixture and introgression is done by biologists focused on inter-specific differences, but the general framework holds, I believe (in fact, questions of admixture and introgression are more clear and distinct across diverged lineages). In plants in particular hybridization and introgression are common in wild and domestic lineages.
I’m not putting this post up as definitive. When I read papers where there is talk about “introgression of ancestry” it is clear that today people are merging and bleeding the definitions. I actually checked for definitions of introgression and admixture in Principles of Population Genetics and Elements of Evolutionary Genetics. There wasn’t anything, because debate on this issue isn’t/wasn’t very live in these fields. At this point I’m really curious what other biologists think. I still find the distinction important, and more critically, useful. If one doesn’t, I’d like to hear opinions. If one has different definitions, I’d like to hear opinions.