For what they were... we are: Ancient Thracians, Ötzi and the origins of modern Europeans (another point of view)

May 11, 2014

Ancient Thracians, Ötzi and the origins of modern Europeans (another point of view)

A recent study has sequenced the DNA of an ancient Thracian woman but, for some reason, instead of comparing her with modern Bulgarians and the like, the authors have written a study that is mostly about Ötzi "the Iceman" and does not include a single Bulgarian sample.

Martin Sikora et al., Population Genomic Analysis of Ancient and Modern Genomes Yields New Insights into the Genetic Ancestry of the Tyrolean Iceman and the Genetic Structure of Europe. PLoS Genetics 2014. Open access [doi:10.1371/journal.pgen.1004353]

Abstract

Genome sequencing of the 5,300-year-old mummy of the Tyrolean Iceman, found in 1991 on a glacier near the border of Italy and Austria, has yielded new insights into his origin and relationship to modern European populations. A key finding of that study was an apparent recent common ancestry with individuals from Sardinia, based largely on the Y chromosome haplogroup and common autosomal SNP variation. Here, we compiled and analyzed genomic datasets from both modern and ancient Europeans, including genome sequence data from over 400 Sardinians and two ancient Thracians from Bulgaria, to investigate this result in greater detail and determine its implications for the genetic structure of Neolithic Europe. Using whole-genome sequencing data, we confirm that the Iceman is, indeed, most closely related to Sardinians. Furthermore, we show that this relationship extends to other individuals from cultural contexts associated with the spread of agriculture during the Neolithic transition, in contrast to individuals from a hunter-gatherer context. We hypothesize that this genetic affinity of ancient samples from different parts of Europe with Sardinians represents a common genetic component that was geographically widespread across Europe during the Neolithic, likely related to migrations and population expansions associated with the spread of agriculture.

Please notice that, as the authors acknowledge, the DNA of the second Thracian individual, K8, may be contaminated:

the DNA damage pattern of this individual does not appear to be typical of ancient samples (Table S4 in [15]), indicating a potentially higher level of modern DNA contamination.

This does not seem to dissuade them from using it in the analyses.

Figure 1. Geographic origin of ancient samples and ADMIXTURE results.
(A) Map of Europe indicating the discovery sites for each of the ancient samples used in this study. (B) Ancestral population clusters inferred using ADMIXTURE on the HGDP dataset, for k = 6 ancestral clusters. The width of the bars of the ancient samples was increased to aid visualization.

Notice that the researchers did not attempt to model moderns on ancients, which would seem logical from the viewpoint of purported ancestry (but would be incomplete for lack of a sufficiently large ancient sample), nor did they allow the ancient samples to "float freely" in the analysis. Instead they decided to force them into modern parameters, which is still valid because it indicates greater or lesser affinity to the various studied modern populations (among which there is not a single Balkan sample, oddly enough).
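To make the idea of "forcing" an ancient sample into modern parameters more concrete, here is a minimal sketch, assuming the cluster allele frequencies have already been learned from modern samples and are held fixed while only the ancient individual's mixing proportions are estimated. The function name, the toy frequencies and the toy genotypes are all hypothetical, and this is a generic EM update for ancestry proportions, not necessarily the exact procedure the authors ran.

```python
def estimate_proportions(genotype, freqs, iters=50):
    """EM estimate of ancestry proportions with cluster frequencies held fixed.

    genotype: per-SNP count (0, 1 or 2) of the reference allele, None if missing
    freqs:    freqs[k][j] = reference-allele frequency of cluster k at SNP j
    (hypothetical helper, for illustration only)
    """
    n_clusters = len(freqs)
    q = [1.0 / n_clusters] * n_clusters  # start from equal proportions
    for _ in range(iters):
        counts = [0.0] * n_clusters
        total = 0
        for j, g in enumerate(genotype):
            if g is None:
                continue  # SNP not covered in the ancient sample
            # Probability of a reference / alternate allele under the current q
            p_ref = sum(q[k] * freqs[k][j] for k in range(n_clusters))
            p_alt = 1.0 - p_ref
            for k in range(n_clusters):
                # Expected number of allele copies assigned to cluster k
                if g > 0:
                    counts[k] += g * q[k] * freqs[k][j] / p_ref
                if g < 2:
                    counts[k] += (2 - g) * q[k] * (1.0 - freqs[k][j]) / p_alt
            total += 2
        q = [c / total for c in counts]
    return q


# Toy data (made up): two modern clusters with opposite allele frequencies
FREQS = [[0.9] * 50 + [0.1] * 50,
         [0.1] * 50 + [0.9] * 50]
ANCIENT_LIKE_FIRST = [2] * 50 + [0] * 50   # looks like cluster 0
ANCIENT_INTERMEDIATE = [1] * 100           # heterozygous everywhere

print(estimate_proportions(ANCIENT_LIKE_FIRST, FREQS))
print(estimate_proportions(ANCIENT_INTERMEDIATE, FREQS))
```

Because the cluster frequencies never change, the ancient sample can only be described in terms of the modern clusters; anything in it that has no modern counterpart is necessarily spread across the available components.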

We can see that:

Epipaleolithic Iberian Braña 1 approximates the French structure but is somewhat "more Basque" than these.
Neolithic Pitted Ware semi-forager Ajv70 (Gotland) approximates the Orcadians very well.
Neolithic Megalithic/Funnelbeaker Gok4 (Southern Sweden) approximates North Italians.
Chalcolithic North Italian Ötzi (Iceman) is close to Sardinians but not quite the same ("more Basque" again).
Iron Age Thracian commoner P 192-1 approximates Tuscans.
I would ignore princely Thracian K8 because of the aforementioned contamination issues.

For completeness, I'm also including here fig. S1, which shows the ADMIXTURE runs for K=2 to K=8:

Fig S1- ADMIXTURE results for HGDP. Panels show the results for ADMIXTURE runs for k = 2 to k = 8 ancestral clusters on the HGDP individuals, and the corresponding cluster proportions inferred for the ancient samples.

Notice (see fig. S7) that K=3-5 are quite poor fits and should therefore all be ignored as meaningless. From K=6 onwards the scores improve slightly for all the ancient samples; however, it must be said that K=2 is in general the best fit for most European populations, with most of the improvement in error score being due to a better approximation to West Asian samples.

In most cases Basques have the lowest or one of the lowest error scores (except at K=5, where Basques are portrayed as a Russian-Sardinian mix, which is clearly a confounding artifact). Sardinians also have very low error scores but only from K=5 onwards, when the Sardinian component becomes apparent. The Iceman has very low error scores for all K values, while the Thracian samples have the highest ones, maybe owing to the lack of Balkan samples.

For me these error results suggest that ancients are fine being just "unspecific Europeans" (K=2 blue), while the low error scores for Basques and Sardinians surely underline that these are about the only modern populations that can be explained as a simple Paleolithic-Neolithic mix, without the need for a third, Indo-European, extra ancestry.

They also projected the ancient samples onto PCA plots of modern European populations:

Fig S2 - PCA results for HGDP. Panels show the results for PCA on the HGDP individuals for subsets of SNPs with data in the respective ancient sample. Each point represents an individual, with plot symbol and color indicating population of origin. The position of the ancient samples was inferred by projecting onto the PC space calculated using the modern samples only.

For some odd reason the PCAs are different in each case, even if the samples are the same (only moderns are used; ancients are "projected" and should not affect the result). I have no explanation for this issue and I am tempted to write to the authors to ask about this unexpected complexity, which seems to be a product of the projection itself altering the graph.
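Going by the caption of fig. S2, each panel's PCA is computed only on the SNPs that have data in the respective ancient sample, so the modern-only axes are recomputed on a different SNP subset each time. The sketch below, with made-up genotypes and hypothetical function names, computes just the first component (by power iteration) for two different coverage patterns and then projects the ancient sample. Whether this accounts for the differences seen in the figure is an assumption on my part, not something the paper states.

```python
import math


def first_pc(rows, iters=200):
    """Return (column means, unit-length first principal axis) via power iteration."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    axis = [1.0 / math.sqrt(k)] * k
    for _ in range(iters):
        # Multiply by the scatter matrix X^T X without building it explicitly
        scores = [sum(c[j] * axis[j] for j in range(k)) for c in centered]
        new = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(k)]
        norm = math.sqrt(sum(x * x for x in new))
        axis = [x / norm for x in new]
    return means, axis


def project(row, means, axis):
    """Coordinate of one individual on the axis (centered with the modern means)."""
    return sum((g - m) * a for g, m, a in zip(row, means, axis))


def pca_with_projection(moderns, ancient):
    """PC1 from moderns only, restricted to SNPs covered in the ancient sample."""
    covered = [j for j, g in enumerate(ancient) if g is not None]
    sub = [[row[j] for j in covered] for row in moderns]
    means, axis = first_pc(sub)
    modern_scores = [project(r, means, axis) for r in sub]
    ancient_score = project([ancient[j] for j in covered], means, axis)
    return modern_scores, ancient_score


# Made-up genotypes: 6 modern individuals x 6 SNPs
MODERNS = [
    [2, 2, 2, 0, 1, 0],
    [2, 1, 2, 0, 0, 1],
    [1, 2, 2, 1, 0, 0],
    [0, 0, 1, 2, 2, 1],
    [0, 1, 0, 2, 1, 2],
    [1, 0, 0, 1, 2, 2],
]
# Two hypothetical ancient samples with different SNP coverage (None = no data)
ANCIENT_A = [2, 2, 1, None, None, None]
ANCIENT_B = [None, None, 1, 2, 1, 2]

scores_a, ancient_a = pca_with_projection(MODERNS, ANCIENT_A)
scores_b, ancient_b = pca_with_projection(MODERNS, ANCIENT_B)
print(scores_a, ancient_a)
print(scores_b, ancient_b)
```

The six modern individuals are the same in both runs, yet their coordinates differ because the axis is derived from a different set of SNPs; the ancient sample itself never enters the axis calculation.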

Whatever the case, the projection of the ancient samples follows, in general terms, the patterns noted above for the ADMIXTURE graph:

La Braña 1 projects between French and Basques.
Ajv70 projects onto Orcadians, tending also towards France.
Ötzi projects between Sardinians and Italians.
Gok4 projects with North Italians but not far from Basques.
P 192-1 doesn't seem too akin to any specific modern population, although some French, Basques and Tuscans do approximate her.

These results may be frustrating for those already too accustomed to previous analyses of ancient autosomal DNA, but we must not forget that, because of its huge size and complexity, autosomal DNA requires statistical analysis, which is highly susceptible to variations in sampling strategy in particular, as well as to other, not always well understood, factors. Hence different points of view are generally complementary rather than outright contradictory.

Of some interest is also this TreeMix graph of modern populations and Ötzi:

Figure 3. Results of TreeMix analysis of the Iceman with 1000G/Sardinia.
Shown are maximum-likelihood trees and the matrices of pairwise residuals (inset) for a model allowing (A) m = 0 and (B) m = 3 mixture events. Large positive values in the residual matrix indicate a poor fit for the respective pair of populations. Edges representing mixture events are colored according to weight of the inferred edge.

Notable are the low-level African admixture arrow at the root of the Euro-Mediterranean branch (the so-called "Basal Eurasian" element in Lazaridis 2013) and the East Asian component in Finns. Sizable admixture from the West Eurasian root is also apparent among Tuscans. Once these admixture events are allowed for, the topology of the European tree changes significantly, showing a main split between Eastern Europeans (Finns) and Western/Southern ones.

Other similar trees are available in fig. S6.

No extra Neanderthal admixture in Ötzi

Contrary to some previous rough estimates, Ötzi does not appear to differ from other Europeans in Neanderthal ancestry at all. See figs. S9 and S10.

20 comments:

Davidski, May 12, 2014 at 3:50 AM
Nah, forget it, La Brana doesn't cluster there.

I just spoke to someone at the Reich lab about this issue, because it shows up in PCA done with Eigenstrat as well.

They call this problem "shrinkage", because the PCA space is shrunk for the projected samples relative to the reference samples, and there's no automatic fix for it yet, but there might be soon.

You'll find all the technical details here...

http://arxiv.org/abs/1211.2970
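
The shrinkage effect described in this comment can be reproduced in a toy simulation. The sketch below is only an illustration under stated assumptions: simulated Gaussian data rather than genotypes, far more variables than reference samples, and a first component found by power iteration on the sample-by-sample Gram matrix. All names and parameter values are hypothetical, and this is not the method of the linked paper. Held-out samples drawn from exactly the same two groups as the reference samples are projected onto the reference-only axis, and the spread of their coordinates is compared with that of the reference samples.

```python
import math
import random


def simulate(n, p, shift, rng):
    """n samples in p dimensions; two groups separated by `shift` along one axis."""
    rows = []
    for i in range(n):
        label = 1.0 if i % 2 == 0 else -1.0
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] += label * shift
        rows.append(row)
    return rows


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


def shrinkage_demo(n=20, p=1000, shift=4.0, seed=1, iters=300):
    """Return (RMS of reference PC1 scores, RMS of projected held-out PC1 scores)."""
    rng = random.Random(seed)
    reference = simulate(n, p, shift, rng)
    held_out = simulate(n, p, shift, rng)   # same distribution, not used for the axis

    means = [sum(r[j] for r in reference) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in reference]

    # Power iteration on the n x n Gram matrix (cheaper than p x p when p >> n)
    gram = [[dot(a, b) for b in centered] for a in centered]
    coef = [(-1.0) ** i for i in range(n)]
    for _ in range(iters):
        new = [dot(row, coef) for row in gram]
        norm = math.sqrt(dot(new, new))
        coef = [x / norm for x in new]

    # Turn the Gram-matrix eigenvector into a unit axis in the original space
    axis = [sum(coef[i] * centered[i][j] for i in range(n)) for j in range(p)]
    norm = math.sqrt(dot(axis, axis))
    axis = [x / norm for x in axis]

    ref_scores = [dot(c, axis) for c in centered]
    new_scores = [dot([x - m for x, m in zip(r, means)], axis) for r in held_out]
    return rms(ref_scores), rms(new_scores)


if __name__ == "__main__":
    ref_spread, projected_spread = shrinkage_demo()
    print("reference spread:", ref_spread)
    print("projected spread:", projected_spread)
```

The axis partly fits the noise of the reference samples, which inflates their own coordinates; projected samples do not share that noise, so they land closer to the origin even though they come from the same groups.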

Anonymous, May 12, 2014 at 8:57 AM
Oversampling of North Europeans, primarily Swedes, in the Skoglund 2014 PCA is of little consequence because the primary dimension is defined by the "southernmost" Yemeni vs the "northernmost" Finn (Ancients are of course projected). The secondary dimension is a lack of "Sardinian/Southwest European-ness" which is defined by Swedish Saami at the other end. Yemenis are as distant from Sardinians along this dimension as Finns are. One sample is enough to define a primary or secondary PCA dimension if it's divergent enough, which is best seen in the MA-1 or La Braña PCA's of West Eurasia Davidski did.

By the way, in which PCA there was a Basques-Adygei PC2? Lazaridis had one with a Basques-Maltese dimension, but that's not quite the same thing.

Davidski, May 12, 2014 at 1:43 PM
Here, I forgot I had this online.

Scroll down to the bottom. The first result for PL1 was achieved with ADMIXTURE, and the second with allele frequencies using a calculator. Check out the difference in the spread of the main components.

https://docs.google.com/spreadsheet/ccc?key=0Ato3EYTdM8lQdFh6SzZyOEdMT2kyUmY0cS1PaW1maXc#gid=0

Ryan, May 14, 2014 at 1:44 AM
Re: Ötzi and archaic DNA - I wish Hawks would go back and correct himself. His blog post claims the data was about to be published, but it looks like that didn't pan out.

From that PCA it looks like Ötzi is actually on the low end of archaic admixture for Eurasian populations. I wonder if that means the "basal Eurasian" component was not admixed with Neanderthals/Denisovans.

Unknown, May 17, 2014 at 10:33 PM
If the EEF is virtually the same in amount in the Pas_Vasco, South_french, Bergamo and Bulgarians, then clearly it's basically the same people that migrated into these areas, or is the EEF-WHG-ANE paper a lot of rubbish?