Some of you already know that I'm reluctant to take the results produced by TreeMix seriously, because they mostly seem to make very little sense and tend to induce the weirdest of confusions rather than provide any useful, reliable information on ancient admixture episodes.
Now a new tool of the same kind, with allegedly greater power and the ability to detect two-source admixture events, is being presented, but I maintain the same reservations: the results are not consistent with what we already know, so the product can only cause the weirdest of confusions if trusted beyond the level of an experimental toy with no reliability whatsoever.
Mark Lipson et al., Efficient moment-based inference of admixture parameters and sources of gene flow. arXiv 2012 (pre-pub). Freely accessible: arXiv:1212.2555 [q-bio.PE]
The recent explosion in available genetic data has led to significant advances in understanding the demographic histories of and relationships among human populations. It is still a challenge, however, to infer reliable parameter values for complicated models involving many populations. Here we present MixMapper, an efficient, interactive method for constructing phylogenetic trees including admixture events using single nucleotide polymorphism (SNP) genotype data. MixMapper implements a novel two-phase approach to admixture inference using moment statistics, first building an unadmixed scaffold tree and then adding admixed populations by solving systems of equations that express allele frequency divergences in terms of mixture parameters. Importantly, all features of the tree, including topology, sources of gene flow, branch lengths, and mixture proportions, are optimized automatically from the data and include estimates of statistical uncertainty. MixMapper also uses a new method to express branch lengths in easily interpretable drift units. We apply MixMapper to recently published data for HGDP individuals genotyped on a SNP array designed especially for use in population genetics studies, obtaining confident results for 30 populations, 20 of them admixed. Notably, we confirm a signal of ancient admixture in European populations---including previously undetected admixture in Sardinians and Basques---involving a proportion of 20-40% ancient northern Eurasian ancestry.
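For readers wondering what "solving systems of equations that express allele frequency divergences in terms of mixture parameters" can look like in practice, here is a toy sketch. It is not MixMapper's actual code or algorithm: the populations, the allele frequencies (random numbers), the helper `f2` and the grid search are all my own assumptions for illustration, and the sketch ignores drift after admixture, sampling noise and the fitting of branch positions. It only shows that, if population C is a mix of A and B in proportions alpha and 1 - alpha, the squared frequency divergence between C and any reference population R satisfies f2(C,R) = alpha·f2(A,R) + (1 - alpha)·f2(B,R) - alpha·(1 - alpha)·f2(A,B), and that alpha can be recovered from such equations.

```python
import random

def f2(p, q):
    """Mean squared allele-frequency difference across SNPs (no sample-size correction)."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) / len(p)

random.seed(0)
n_snps = 1000

# Hypothetical allele frequencies: two source populations (a, b)
# and two unadmixed reference populations (x, y).
a = [random.random() for _ in range(n_snps)]
b = [random.random() for _ in range(n_snps)]
x = [random.random() for _ in range(n_snps)]
y = [random.random() for _ in range(n_snps)]

# Admixed population c: fraction alpha from a, the rest from b (no later drift assumed).
true_alpha = 0.3
c = [true_alpha * ai + (1 - true_alpha) * bi for ai, bi in zip(a, b)]

# Check the moment identity against reference x.
lhs = f2(c, x)
rhs = (true_alpha * f2(a, x) + (1 - true_alpha) * f2(b, x)
       - true_alpha * (1 - true_alpha) * f2(a, b))

def residual(alpha):
    """Sum of squared mismatches of the identity over both reference populations."""
    total = 0.0
    for ref in (x, y):
        predicted = (alpha * f2(a, ref) + (1 - alpha) * f2(b, ref)
                     - alpha * (1 - alpha) * f2(a, b))
        total += (f2(c, ref) - predicted) ** 2
    return total

# Crude grid search over alpha in [0, 1]; a real tool would use a proper optimizer.
best_alpha = min((i / 100 for i in range(101)), key=residual)
print(lhs, rhs, best_alpha)
```

Using two reference populations rather than one matters here: each equation is quadratic in alpha and on its own can have two solutions, so a single reference may not pin down the proportion.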
The relevant graph is this one:
What is said of Sardinians here applies equally to any other European or Highland West Asian population (the latter represented only by the Adygei in fig. 3; we know from other studies that North Caucasians cluster rather tightly with Highland West Asians like Turks, Kurds or Iranians, as well as with Western Jews). There are alleged (and expected) minor differences between Russians, Basques and Sardinians, but nothing essential.
However, we know for a fact that it is Native Americans who display obvious ancient admixture between a West Eurasian source (represented by Y-DNA Q and mtDNA X) and a more dominant East Asian one (represented notably by mtDNA A, B, C and D and also by Y-DNA C3).
Native Americans are a clear case of ancient admixture between Western and Eastern Eurasians, and this MixMapper algorithm is unable to detect that well-known, obvious admixture. Instead (and I guess it could be worse) it detects a false admixture in reverse, maybe by conflating the dual origins of these ancient Native Americans (and of some Siberians) with the recent, also well-known, inflow into Europe from Siberia (Uralic peoples and such).
The conclusion can only be that, like its predecessor TreeMix, MixMapper is a mere experimental toy with no practical applications other than laughs.
Keep trying, guys.
Note: in preliminary email exchanges on this matter, I was asked why I state so confidently that mtDNA X and Y-DNA Q are West Eurasian by origin. The matter is clear as soon as you look at their phylogenetically structured or basal haplogroup diversity.
For X, this basal diversity is concentrated in West Asia. That is the case for X1 (also known as X1'3), which is probably of Egyptian coalescence and even shows marked penetration into Africa, and also for X2, which has a more Central Asian tendency and scattered rare clades in Altai and Central Siberia.
Almost the same is true for Y-DNA Q, the main Amerindian patrilineage, whose basal diversity seems centered around Iran, at least down to the Q1 and Q1b nodes. Q1a may have a Central Asian center of spread, but it is not until the Q1a3 level that we can really speak of specifically Native American lineages.
Nobody has ever expressed to me any doubts about mtDNA A, B, C and D, or Y-DNA C3, being of East Asian origin, so I won't discuss them here.