December 13, 2012

MixMapper: wrong again!

Some of you already know that I'm reluctant to take the results produced by TreeMix seriously: they mostly make very little sense and tend to sow confusion rather than provide any useful, reliable information on ancient admixture episodes.

Now a new tool in the same vein, with allegedly greater power and the ability to detect two-source admixture events, is being presented, but I hold the same caveats: the results are not consistent with what we already know, so the product can only cause confusion if trusted as anything more than an experimental toy with no reliability whatsoever.

Mark Lipson et al., Efficient moment-based inference of admixture parameters and sources of gene flow. arXiv 2012 (pre-pub). Freely accessible [ref: arXiv:1212.2555 [q-bio.PE]]

Abstract

The recent explosion in available genetic data has led to significant advances in understanding the demographic histories of and relationships among human populations. It is still a challenge, however, to infer reliable parameter values for complicated models involving many populations. Here we present MixMapper, an efficient, interactive method for constructing phylogenetic trees including admixture events using single nucleotide polymorphism (SNP) genotype data. MixMapper implements a novel two-phase approach to admixture inference using moment statistics, first building an unadmixed scaffold tree and then adding admixed populations by solving systems of equations that express allele frequency divergences in terms of mixture parameters. Importantly, all features of the tree, including topology, sources of gene flow, branch lengths, and mixture proportions, are optimized automatically from the data and include estimates of statistical uncertainty. MixMapper also uses a new method to express branch lengths in easily interpretable drift units. We apply MixMapper to recently published data for HGDP individuals genotyped on a SNP array designed especially for use in population genetics studies, obtaining confident results for 30 populations, 20 of them admixed. Notably, we confirm a signal of ancient admixture in European populations---including previously undetected admixture in Sardinians and Basques---involving a proportion of 20-40% ancient northern Eurasian ancestry.  
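For readers unfamiliar with the term, here is a toy sketch of what an allele-frequency "moment statistic" can look like. It is my own illustration, not MixMapper's code: the function names `f2`, `f3` and `mix`, the four made-up allele frequencies and the mixture proportion `alpha` are all assumptions chosen for the example, and the model ignores genetic drift and sampling noise entirely. Under those simplifications, a population that is an exact mixture of two sources yields a negative three-population statistic, which is the kind of signal that moment-based methods try to exploit.

```python
# Toy illustration only: hypothetical frequencies, no drift, no sampling error.

def f2(a, b):
    """Mean squared allele-frequency difference between populations a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def f3(c, a, b):
    """Mean of (c - a) * (c - b) across SNPs; negative values suggest c is mixed."""
    return sum((z - x) * (z - y) for x, y, z in zip(a, b, c)) / len(c)

def mix(a, b, alpha):
    """Frequencies of an idealized population with a fraction alpha from a, the rest from b."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]

# Made-up allele frequencies at four SNPs for two source populations.
pop_a = [0.10, 0.80, 0.35, 0.60]
pop_b = [0.50, 0.20, 0.75, 0.10]
alpha = 0.3  # assumed mixture proportion

pop_c = mix(pop_a, pop_b, alpha)

print("f2(A, B)    =", f2(pop_a, pop_b))
print("f3(C; A, B) =", f3(pop_c, pop_a, pop_b))
```

In this idealized setting f3(C; A, B) equals -alpha * (1 - alpha) * f2(A, B), so it is always negative for a genuinely mixed population. Real data add drift on every branch, which is why the direction and sources of an inferred admixture are so much harder to pin down than in this sketch.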


The relevant graph is this one:

Figure 4. Inferred ancient admixture in Europe. (A) Detail of the inferred ancestral admixture for Sardinians (other European populations are similar). One mixing population splits from the unadmixed tree along the common ancestor branch of Americans (“Ancient Northern Eurasian”) and the other along the common ancestor branch of all non-Africans (“Ancient Western Eurasian”). Median parameter values are shown; 95% bootstrap confidence intervals can be found in Table 1. The branch lengths a, b, and c are confounded, so we show a plausible combination.

What is said here of Sardinians applies equally to any other European or Highland West Asian population (represented only by the Adygei in fig. 3; we know from other studies that North Caucasians cluster rather tightly with Highland West Asians like Turks, Kurds or Iranians, as well as with Western Jews). There are alleged (and expected) minor differences between Russians, Basques and Sardinians, but nothing essential.

However, we know for a fact that it is Native Americans who display obvious ancient admixture between a West Eurasian source (represented by Y-DNA Q and mtDNA X) and a more dominant East Asian one (represented notably by mtDNA A, B, C and D, and also by Y-DNA C3).

Native Americans are a clear case of ancient admixture between Western and Eastern Eurasians, and this MixMapper algorithm is unable to detect that well-known, obvious admixture. Instead (and I guess it could be worse) it detects a false admixture in reverse, maybe by conflating these ancient dual origins of Native Americans (and of some Siberians) with the recent, also well-known, inflow into Europe from Siberia (Uralic peoples and such).

The conclusion can only be that, like TreeMix before it, MixMapper is a mere experimental toy with no practical applications other than laughs.

Keep trying, guys. 


______________________________________________________________

Note: in preliminary email exchanges on this matter, I was asked why I state so confidently that mtDNA X and Y-DNA Q are West Eurasian by origin. The matter is clear as soon as you look at their phylogenetically structured or basal haplogroup diversity.

For X, this diversity is concentrated in West Asia: the X1 clade (also known as X1'3), which is probably of Egyptian coalescence, even shows marked penetration into Africa. The same holds for X2, although with a more Central Asian tendency and scattered rare clades in Altai and Central Siberia.

Almost the same is true for Y-DNA Q, the main Amerindian patrilineage, whose basal diversity seems centered around Iran, at least down to the Q1 and Q1b nodes. Q1a may have a Central Asian center of spread, but it is not until the Q1a3 level that we can really speak of specifically Native American lineages.

Nobody has ever expressed to me any doubts about mtDNA A, B, C and D, or Y-DNA C3, being of East Asian origin, so I won't discuss them here. 

12 comments:

  1. "Note: in preliminary email exchanges on this matter, I was asked why I state so confidently that mtDNA X and Y-DNA Q are West Eurasian by origin. The matter is clear as soon as you look at their phylogenetically structured or basal haplogroup diversity".

    Something I agree with you completely on.

  2. We know where these haplogroups are found today. We don't know what these X / Q lineage bearers were like in terms of autosomal genetics.

    There is also something else that is rarely discussed: http://en.wikipedia.org/wiki/Haplogroup_R1_%28Y-DNA%29

    "In Indigenous Americans groups, R-M173 is the most common haplogroup after the various Q-M242, especially in North America in Ojibwe people at 79%, Chipewyan 62%, Seminole 50%, Cherokee 47%, Dogrib 40% and Papago 38%. The decreasing gradient of haplogroup R-M207 from Northeastern to Southwestern North America is evidence that this results from European admixture.(Malhi 2008)"

    Supposedly. We need a more current phylogeny of these R lineages and to corroborate this inference (i.e. that R=recent European admixture) with autosomal data. "Reconstructing Native American population history" by David Reich et al. collected relevant data for digging into these questions, but has not made the data available.

    1. From the haplogroups we can track the founder populations through the geography very reasonably (especially as it seems to agree with archaeology where known). Hence a "Western" population with "mode 4" technology migrated to Altai (where the archaeology supports this narrative) and later through NE Asia (where archaeology indicates a flow or influence through Mongolia and North China since c. 30 Ka or so). Patrilocality explains the rest.

      As for what you say about R1*, I do recall a paper from years ago suggesting, very imprecisely, a Mongolian connection... but the matter still seems to await more research, especially because many of those unspecified R1 are probably European-originated R1b and R1a, unless the authors tested for the defining markers of these lineages, of course. I cannot find the mentioned Malhi 2008 paper, so I can't comment further.

      Whatever the case, even if some R1* had traveled through Siberia and Beringia with Q, it would not change anything in relation to what I'm saying here. After all, R1 is from either West Asia or South Asia by origin, hence as "Western" as Q, its close relative.

  3. "but I hold the same caveats: the results are not consistent with what we already know"

    I can understand why you don't like it, especially where the results are actually consistent with what we already know. From the link you provided at your blog:

    http://linearpopulationmodel.blogspot.co.nz/2012/12/efficient-moment-based-inference-of.html?showComment=1355443804853#c7236276812806489637

    Quote from the paper:

    "we found that Han Chinese have an optimal placement as an approximately equal mixture of two ancestral East Asian populations, one related to modern Dai (likely more southerly) and one related to modern Japanese (likely more northerly), corroborating a previous finding of admixture in Han populations between northern and southern clusters in a large-scale analysis of East Asia"

    I've been trying to tell you that for ages.

    1. You seem to be in the wrong entry.

  4. "You seem to be in the wrong entry".

    You obviously didn't bother to read the article. Scroll down until you reach the heading, 'Two-way admixtures outside of Europe', and READ.

    1. First of all, nobody present in this discussion is the owner of Linear Population Model (that's Marnie, and she has not participated yet).

      Second, if you don't descend from the heavens of your mental confusion and bother explaining readers (including myself) what do you mean, I will have to consider you a hostile troll.

  5. "Second, if you don't descend from the heavens of your mental confusion and bother explaining readers (including myself) what do you mean, I will have to consider you a hostile troll".

    Maju, the paragraph I quoted was from Marnie's blog, but it is easily accessible from the link you provided. The paper is freely accessible, so I find it amazing that you were not aware of the comment concerning the Chinese.

    1. This comment has been removed by the author.

    2. Uhmmm... whatever. I'm trying to show how unreliable MixMapper is and you are trying to "demonstrate" something using MixMapper... we will not agree.

  6. "Uhmmm... whatever. I'm trying to show how unreliable MixMapper is and you are trying to 'demonstrate' something using MixMapper... we will not agree".

    Changing direction again. I would assume that MixMapper is not entirely unreliable.

