October 27, 2016

Mitochondrial DNA from post-Neolithic Santimamiñe (Basque Country)

Four human remains dated to the Bronze Age were sequenced for mitochondrial DNA in Santimamiñe cave (Kortezubi, Biscay, Basque Country), along with single instances from the Neolithic, Chalcolithic and Roman period.

J.C. López Quintana et al., NUEVOS DATOS SOBRE LA SECUENCIA DE USO SEPULCRAL DE LA CUEVA DE SANTIMAMIÑE (KORTEZUBI, BIZKAIA). Arqueología y Prehistoria del Interior Peninsular (ARPI), 2016. Freely accessible (PDF) → LINK [no DOI]

The mtDNA study is not "brand new" but a synthesis of a previous doctoral thesis and advance publications:

A first advance of this study was published in the monograph on the 2004 to 2006 Santimamiñe campaigns (Cardoso et al. 2011), and the complete set is included in the doctoral thesis of L. Palencia Madrid (Palencia 2015).

So we are talking about relatively old data that has partly remained within (sometimes absurdly greedy and anti-social) academic circles until now. The age of the DNA study matters when assessing it, because genetic analysis is evolving very fast and the rather closed and under-budgeted Spanish university circles tend, in most cases, to do things "the old way". We are therefore almost certainly dealing with HVS-I sequencing, something the paper does not state explicitly (I'm searching for Leire Palencia's thesis to make sure, but no luck so far).

If I am correct in this (and I should be), then we must understand that in many cases it is impossible to determine the exact haplogroup within the crucial upper-tier haplogroup R0, which includes HV and the extremely common H. Lacking the original HVS-I sequences for the moment, I can only take the authors' labels at face value, but I must warn that where it reads "R0" it is almost certainly H (HV0 or V are easy to recognize with this method, as is R0a), and where it reads "H1" it is probably H1 but not 100% certain.

For more details see the relevant PhyloTree page, where the HVS-I markers are the last block in blue, always beginning with the digits "16" (the other markers in blue, of lower numerical value, are HVS-II, more rarely used, and the ones in black are the coding region markers, which are in this case fundamental for proper assignment).

The mtDNA haplogroups (as reported) are:

  • Neolithic:
    • U5a2a (S2011-M2, c. 5100 BCE)
  • Chalcolithic:
    • T2b (S-1, c. 2000 BCE)
  • Bronze Age:
    • U5b (S2011-M1, c. 1700 BCE)
    • H1 (S2011-M4, c. 1700 BCE)
    • R0 (S2011-M6, c. 1500 BCE)
    • U3a (S2011-M3, c. 1300 BCE)
  • Roman period: 
    • R0 (S2011-M5, c. 300 CE)

Interpretation attempts

It's difficult to draw conclusions from them, but they should be compared with other sequences from the area, for which I recommend my 2013 synthesis. In general, treat "R0" as meaning "H", even if I chose to use a different color (magenta instead of red) for the sake of accuracy.

In order to aid that analysis, I reproduce here my 2013 graphic:

We cannot compare the single Neolithic and Roman Era individuals, but we can compare the Santimamiñe Chalcolithic+Bronze group of five sequences with De La Rúa's large peripheral Chalcolithic dataset:

  1. R*+H (very similar):
    1. Peripheral "Basque" Chalcolithic: ~40%
    2. Santimamiñe Chalcolithic+Bronze: 40% 
    3. Santimamiñe Bronze only: 50%
  2. U(xK) (very different):
    1. Peripheral "Basque" Chalcolithic: ~15%
    2. Santimamiñe Chalcolithic+Bronze: 40%
    3. Santimamiñe Bronze only: 50%
  3. Other lineages (all of them of certain Neolithic immigrant origin, very different too):
    1. Peripheral "Basque" Chalcolithic: ~45%
    2. Santimamiñe Chalcolithic+Bronze: 20%
    3. Santimamiñe Bronze only: 0%

However, one of the U(xK) lineages in Santimamiñe is U3, which is also quite certain to be of Neolithic immigrant origin, and a single sample carries a lot of weight when n=5, so we can also see it this way:
  1. Paleolithic lineages:
    1. Peripheral "Basque" Chalcolithic: ~55%
    2. Santimamiñe Chalcolithic+Bronze: 60%
    3. Santimamiñe Bronze only: 75%
  2. Neolithic lineages:
    1. Peripheral "Basque" Chalcolithic:  ~45%
    2. Santimamiñe Chalcolithic+Bronze: 40%
    3. Santimamiñe Bronze only: 25%
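
The Santimamiñe percentages in both breakdowns follow directly from the five reported haplogroups, and a minimal Python sketch makes the arithmetic explicit. The function names (`broad_class`, `origin_class`, `shares`) are purely illustrative, and the classification rules simply encode the assumptions made in this post: "R0" is counted together with H, U5 is treated as Paleolithic, and U3 and T2 as Neolithic. The peripheral Chalcolithic figures are not recomputed here because the underlying counts are not listed in this entry.

```python
from collections import Counter

# Haplogroups exactly as reported above (Chalcolithic + Bronze Age only).
samples = [
    ("S-1", "Chalcolithic", "T2b"),
    ("S2011-M1", "Bronze Age", "U5b"),
    ("S2011-M4", "Bronze Age", "H1"),
    ("S2011-M6", "Bronze Age", "R0"),
    ("S2011-M3", "Bronze Age", "U3a"),
]
bronze = [row for row in samples if row[1] == "Bronze Age"]


def broad_class(hg):
    """First breakdown: R*+H, U(xK), or other lineages."""
    if hg.startswith(("H", "R0")):  # assumption: "R0" is read as H
        return "R*+H"
    if hg.startswith("U"):  # K is written "K", so it never matches here
        return "U(xK)"
    return "Other"


def origin_class(hg):
    """Second breakdown: U3 moves to the Neolithic side (assumption of this post)."""
    if broad_class(hg) == "R*+H" or hg.startswith("U5"):
        return "Paleolithic"
    return "Neolithic"


def shares(rows, classify):
    """Percentage of samples per class, rounded to whole percent."""
    counts = Counter(classify(hg) for _, _, hg in rows)
    return {label: round(100 * k / len(rows)) for label, k in counts.items()}


print(shares(samples, broad_class))
print(shares(bronze, broad_class))
print(shares(samples, origin_class))
print(shares(bronze, origin_class))
```

With n=5 (or n=4 for the Bronze Age alone) every single sample is worth 20 or 25 percentage points, which is why reclassifying just U3a changes the picture so much.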

Seen this way, entries #1 and #2 (peripheral Chalcolithic and Santimamiñe Chalcolithic+Bronze) are much more similar. This could be important, because Santimamiñe is not a "peripheral" site like those in De La Rúa's dataset, but a rather central one with an extremely long and uninterrupted Paleolithic sequence, dating back to the Neanderthal-made Chatelperronian culture. It is still a single site with a small number of samples, but it does provide a counterpoint that, under one approach, produces similar results.

But, surprisingly, when we consider a distinct Bronze Age category, comparing #1 not with #2 but with #3, everything changes. This suggests a totally different interpretation of the available dataset, in which the "Chalcolithic interlude" (if real at all; more data is needed) would have been quickly reversed with the onset of the Bronze Age.

I am sorry but I cannot lean towards either interpretation: the data is just not extensive enough to allow for conclusions. I am tempted to support the continuity hypothesis, allowing only for lesser changes to happen, and keep the Chalcolithic dataset under a big question mark, but the question mark is admittedly a bit smaller now: something in demographic terms may have happened in the Chalcolithic period and may have been reversed in the Bronze Age. But "may" is not "for sure"; we need more data points.
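
To give a rough idea of how little five (or four) samples can tell us, here is a small sketch that computes a Wilson score interval for the "Paleolithic lineages" share. The choice of the Wilson interval and the helper name `wilson_interval` are my own assumptions for illustration, not something taken from the paper, and the sketch ignores the uncertainty in the peripheral Chalcolithic figure (~55%) itself.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


# "Paleolithic lineages" counts taken from the lists above.
for label, k, n in [("Chalcolithic+Bronze", 3, 5), ("Bronze only", 3, 4)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n}, interval {low:.0%} to {high:.0%}")
```

Under these assumptions the intervals span most of the possible range (roughly 23% to 88% for 3/5 and 30% to 95% for 3/4), so the ~55% of the peripheral Chalcolithic dataset is compatible with both the pooled group and the Bronze Age group alone.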

Feel free to discuss in a good mood, as always.

Thanks for the heads up to Jean Lohizun (again).


  1. U3a is a bit of an odd bird and the Bayesian prior would be for the particularly likely U3a1. Consider this from FamilyTreeDNA:

    "mtDNA haplogroup U3 is present in low percentages throughout Europe and Western Asia. It is an ancient haplogroup arising over 30,000 year ago from the very old haplogroup U. It rises to its greatest frequencies in the Near East and Southern Caucasus, that is the mountainous area of Western Iran, Georgia, Armenia, Azerbaijan, Turkey, Syria, Jordan and Iraq, where the percentages vary between 4 and 8 percent. Currently U3 can be divided into three subclades, U3a, U3b and U3c. The latter is a subclade of the original U3ac and split off from U3a 1000's of years ago. All three subclades occur in the above mentioned areas with the exception of Turkey and Armenia, where U3c appears to be absent and U3a is very rare, those countries being dominated by U3b.

    In Europe U3 is still common in Bulgaria and the eastern most islands of the Mediterranean Sea, Cyprus, Rhodes, Crete, where percentages rise to 3 or 4 percent, but becomes rarer and rarer as one moves west with one exception. Again Bulgaria, the Greek mainland and Etruscan Italy are dominated by U3b, whereas the Mediterranean Islands and the rest of Italy have all three subclades.

    The one exception is one sub-branch of subclade U3a called U3a1 which appears to have originated in Europe. At least at this point no instance of this clade has been observed in the Mid-East. This sub-branch dominates U3 in Western Europe especially along the Atlantic coast making up over 60% of U3 in these areas with frequencies rising to as high as 1% of the total population in Scotland and Wales and as high as 3 or 4% in Iceland. Also well over half of this sub-subclade is made up of one version of U3a1 called U3a1c, with a change at 16356, and which accounts for most of the distribution along the coastline from Norway to Northern Portugal.

    U3 also occurs along the North African coast which borders the Mediterranean. The subclade U3b dominates the eastern countries including Egypt and Ethiopia, whereas both U3a (in its older form found in the Near East) and U3b occur in the Berber occupied areas from Libya through Morocco, where again the percentage rises to as high as 1% of the total population.

    There are isolated pockets in the Near East where U3 occurs at a very high percentage of the population; U3 makes up 16% of the Adegei in the Northern Caucasus, about 18% of Iraqi Jews around Baghdad, 39% of Jordaneans of the Dead Sea Valley, 11% of the Qashqai in Southwest Iran (note these people speak a dialect closely related to Azerbaijani of the Caucasus) 17% in one study of Luri in the Western Zagros, 12% on the Greek island of Rhodes and also among the Romani (Gypsies) of Poland, Lithuania and Spain where percentages vary from nearly 40% to as high as 55%."

    This suggests that U3a1 was probably part of the Bell Beaker mix, or perhaps an earlier maritime Atlantic Mesolithic migration that would have also included mtDNA V (the hint that it might be pre-Bell Beaker is the presence in Norway).

    1. You're saying that U3a1c probably reflects some sort of homogeneous Megalithic or Bell Beaker spread in those areas, which is interesting, but we don't know whether the Santimamiñe U3a belongs to that subclade (it might, but it's uncertain). In any case U3 in general is undocumented in Europe before the Neolithic, while it is widely documented in West Asia and also among the Anatolian ancestors of Europe's first farmers, so it is almost certainly a "Neolithic lineage", in spite of being a subclade of U, whose sublineages U5 and U4 (and also U2 and some U*, with even an instance of U6) are very common, dominant even, in the European Paleolithic.

      U is an extremely old matrilineage, surely dating to c. 50,000 years ago, so some sublineages spread to Europe in the early Upper Paleolithic and others did not until the Neolithic. U3 is quite clearly in the latter case. I say this just in case there is any doubt: I presume you have none, but others may.

      As for V, I don't see any connection with Bell Beaker or Megalithism: its key regions are Catalonia, Kabylia and Sápmi (Lapland). The first two are clearly oriented to the Mediterranean, while the third is just too remote and away from any obvious connection with the others. The lineage (along with other HV0 variants, often taken to be V without guarantees) also appears in Europe in the Neolithic but, unlike U3 or K, it is not detected in early Neolithic West Asia, so it may have been picked up along the way (from where exactly?). In any case I don't see any apparent correlation in this case with Bell Beaker (only Catalonia would fit in) nor with Megalithism (Kabylia would fit, Catalonia only partly and Sápmi by no means): it was probably spread by Neolithic flows in semi-random ways, maybe from a pre-Neolithic European population still to be sequenced. Its presence in North Africa may also be pre-Neolithic, as other matrilineages (H1, H3, H4 and H7) are clearly of very ancient Iberian derivation and it may even be the case of U6, so why not V as well?

    2. Another issue that was mentioned to me by email today is that, if they are using HVS-I, misidentification of some other U or even H (?) haplogroup as U3a is apparently also possible.

  2. Are you talking about Gómez-Sánchez et al. (2014), a Chalcolithic individual that Eupedia places in the Burgos region?
    I am U3a1c1, with a maternal line 99% Iberian. I am tired to hear "Never been U3 in basque country", well, nowadays. U3a1c has been reported recently in Aracena (Huelva) in the same percentages as in Western Ireland.

    1. It is true, Miqui, that there was U3 in Chalcolithic El Mirador per that study you mention.

      The brief discussion above was mostly about: (1) whether U3 could be Paleolithic in Europe (I don't think so: it looks like a Neolithic arrival, ultimately from West Asia) and (2) the possible role that a sublineage, U3a1c, might have had within the Megalithic or Bell Beaker phenomena, both of which could have spread Iberian-like genetics northwards (an open question, but something Andrew defends).

      I did comment back in the day on Gómez Sánchez 2014, if that's your question, but U3 did not strike me as anything worth discussing, nor did anybody raise the issue in the comments.

      "I am tired to hear "Never been U3 in basque country""

      You can say now that it is definitely not the truth, that it is confirmed to have existed here since at least the Bronze Age -- and of course not far away, in Northern Burgos province, since older times. In essence it seems that U3 (and I'd dare say that almost any single Iberian matrilineage, and probably most of the patrilineages too) has been around since at least the Neolithic, which is quite an old time.

      Am I wrong, or has the issue arisen because U3 is also a common West/North Romani (Gitano) lineage, which they probably incorporated in Bulgaria or Thrace? Considering that Western and Northern Roma have a brutal founder effect bottleneck, I can only imagine that their U3 sublineages would be different and easy to track, but I have yet to see a study on that specific matter. In any case about 1/3 of Spanish Gitanos are U3.

    2. The most of Roma people have ancestral clade U3a'c. In a recent study about catalan and portuguese Roma, the 100% of portuguese gipsies tested were U3a'c while catalan gipsies were around 75% and rest U3a1. I think U3a1 came from South Germany with last Early Neolithic Farmers by atlantic coast to basque country and north castila. Nowadays is present in Asturias and Catalonia, why not in Euskadi? Well, you know more about basque matriarcal clan substitutions, it is a small haplogroup (1%) and U3 people I know are low condition.

    3. "I think U3a1 came from South Germany with last Early Neolithic Farmers by atlantic coast to basque country and north castila."

      Doesn't make any sense to me. The Early Neolithic did not spread the way you imagine at all: there is just no North-to-South migration by the Atlantic coast or anything even remotely similar in those times. The opposite is plausible, probable even, but North-to-South movements along the Atlantic coast did not exist until Viking times, or until Celts and such but further inland, that would be the Late Bronze Age.

      So the presence of U3 in Iberia is just one example of lineages coming most likely from the Balkans, and maybe ultimately from Anatolia and other parts of West Asia. The maritime (southern, Impressed-Cardial) and continental (northern, Painted+Linear) branches split in the Balkans and met again around the Rhine about 1000 years later. Linear Pottery does not migrate southwards at all, but probable offshoots of Cardial, like La Hoguette, migrated northwards instead. Also, later phenomena like Dolmenic Megalithism and Bell Beaker began in the south and only later spread northwards.

      "Nowadays is present in Asturias and Catalonia, why not in Euskadi?"

      Why yes? Basques and Iberians do not overlap too much in genetic terms, forming clusters as neatly distinct from each other as both are from the French (SW excluded, because they are Basque-like).

      More research is needed, because the French and Spanish academic establishments are generally extremely reluctant to do this kind of study, but that's the general trend found to date in international research. For example, the only study on Spain's autosomal diversity I recall "found" that Asturians and Valencians cluster with (oversampled) central Castilians (apparently: I have all kinds of doubts because it was a bad study), while Catalans and Andalusians diverged somewhat. Basques were not sampled, and every single area surrounding the Basque Country was avoided as well, as if some sort of taboo was preventing the researchers from doing their job properly.

      Asturians have all kinds of interesting genetics, but if someone asked me today whether Asturians and Basques are particularly related, my answer would be "not really". A clear example of these differences is all those lineages related to North Africa, so common (and surprisingly diverse) in Asturias but absent in Basque samples: Y-DNA E1b (several variants) is an Asturian lineage but very clearly not a Basque one, and mtDNA U6 is an Asturian lineage and not a Basque one.

      I'd be less certain regarding Catalonia, where at least some Y-DNA connection does seem to exist (although it's rather Catalonia-Gasconia-Basqueland). But in any case the geography of the Basque Country does not work too well to explain the genetic relations of Basques: when we look at the bigger European or West Eurasian picture, then the differences become much smaller but we are still not the same thing. There's maybe more distance between Basques, mainline Iberians and mainline French than between Irish and Russians (most North Europeans are surprisingly similar in autosomal genetics).

      It's a bit like asking if something is found in the Luhya and the Maasai, why not among the Hadza, all three being East African populations living not far from Lake Victoria? Well, the Hadza are an isolated and quite distinct population, so it is often not the case.

      In any case U3 had never been reported before among Basques, AFAIK, although in many studies U or U(xK) is reported as a single monolithic category, which does not help. For example, this one was a good study on Basque mtDNA but regarding U it is totally uninformative.


    4. ...

      "the 100% of portuguese gipsies tested were U3a'c while catalan gipsies were around 75% and rest U3a1."

      OK, that looks as if they are carrying a very specific lineage of their own. In any case it would not have been picked in Germany but rather in Bulgaria/Thrace, where it is known that the original Roma lived first in Europe (they arrived via Byzantium) and where U3 is pretty common, unlike elsewhere in Europe.

      Similarly, older arrivals of U3 are probably to be understood as founder effects within the Neolithic colonization, again rooted in the Balkans and West Asia. The arrival of U3 in Central Europe and in SW Europe must have happened separately, unless it was La Hoguette culture which brought it to Germany from further south. The mutation rate in mtDNA is too slow to detect such lesser differential migrations: usually the lineage would remain the same as at the origin in such a "short" time as 1000 years or so.

  6. Hi Maju, have you read this paper? It's full of surprises.

    If you add what Blasco Ferrer wrote in 2010 about the proto-Sardinian language and the Basque roots in Sardinia, you can figure out a rough picture of Euskera being a substrate language on the Mediterranean coast.

    1. TY, Olga. Into the "to do" list. I'm sorry but I don't feel like writing these days. Maybe tomorrow?

  7. http://onomastics.ru/sites/default/files/doi/10.15826/vopr_onom.2015.2.002.pdf
    Blasco Ferrer paper

    1. OK, not sure how good it is but I'll circulate around and may comment on it. TY.

    2. It was a good idea to circulate it among my linguist acquaintances. They are not impressed. Blasco Ferrer seems to be walking a difficult tightrope here between "conceding" to the "vasco-sardinian" theory (or vasconic theory on paleo-sardinian, and therefore to vasco-iberism and to all kinds of wider ancient family memberships for Basque) and attempting to appease the ivory tower popes of Basque linguistics such as Lakarra, whose monosyllabic segments he insists on using for this exercise, even if they generally make little sense. These "popes" are entrenched in the notion that Basque is a strict isolate and in trying to reconstruct it right away from "proto-human", which is nonsense but has somehow managed to stay as the paradigm of Basque linguistics (as made by Basques, or more exactly by Joseba Lakarra and his mafioso university camarilla).

      The "advantage" of Blasco is that he's fluent in English, among other languages, something that Lakarra and his minions usually are not. So he can cater to the "global market" more easily, while Lakarra's translations into English are apparently extremely poor. In any case it's clear that while Blasco is knowledgeable about Italian and Sardinian (he has a long list of publications on Sardinian Romance), he is also trying too hard to "behave" according to the unspoken hierarchical discipline of Basque linguistics, bowing unnecessarily to Lakarra's monosyllabic rants in this paper, which is a shame.

      FYI, there is a book on Paleo-Sardinian and Basque by a more serious, independent, Basque linguist, Juan Martin Elexpuru, that is going to be published next month. But so far only in Basque (I'm pressing for a translation into English, it'd be cool). I collaborate in it with an article on paleo-genetics and how these probably mean that Basque is the last survivor of the European Neolithic main family of languages.

      I'll write a note as soon as it's out of the oven.

    4. @Olga: the book I mentioned before will only begin the editing process in January, so it may be published by early summer. It may also be published in Italian, and just now we were considering a translation into English, but that would first need a translator and then a publisher, so it's too early to say.

  8. According to your writings in the other blog, this is no news to you, it is just additional data that confirms what many people already knew. But interesting, anyhow.

    1. As I said elsewhere: your (attempted) comments in my other blog indicate that you're a Nazi, so banned without possible appeal, all comments by "Jace Landry" will be deleted.

  11. Thank you very much, Olga. Best wishes to you too.


