July 16, 2013

Refined Basque-centric mitochondrial lineages

A new study has located some mitochondrial lineages that seem to be specific to Basques and nearby populations.

Sergio Cardoso et al., The Expanded mtDNA Phylogeny of the Franco-Cantabrian Region Upholds the Pre-Neolithic Genetic Substrate of Basques. PLoS ONE 2013. Open access [doi:10.1371/journal.pone.0067835]


The European genetic landscape has been shaped by several human migrations occurred since Paleolithic times. The accumulation of archaeological records and the concordance of different lines of genetic evidence during the last two decades have triggered an interesting debate concerning the role of ancient settlers from the Franco-Cantabrian region in the postglacial resettlement of Europe. Among the Franco-Cantabrian populations, Basques are regarded as one of the oldest and more intriguing human groups of Europe. Recent data on complete mitochondrial DNA genomes focused on macrohaplogroup R0 revealed that Basques harbor some autochthonous lineages, suggesting a genetic continuity since pre-Neolithic times. However, excluding haplogroup H, the most representative lineage of macrohaplogroup R0, the majority of maternal lineages of this area remains virtually unexplored, so that further refinement of the mtDNA phylogeny based on analyses at the highest level of resolution is crucial for a better understanding of the European prehistory. We thus explored the maternal ancestry of 548 autochthonous individuals from various Franco-Cantabrian populations and sequenced 76 mitogenomes of the most representative lineages. Interestingly, we identified three mtDNA haplogroups, U5b1f, J1c5c1 and V22, that proved to be representative of Franco-Cantabria, notably of the Basque population. The seclusion and diversity of these female genetic lineages support a local origin in the Franco-Cantabrian area during the Mesolithic of southwestern Europe, ~10,000 years before present (YBP), with signals of expansions at ~3,500 YBP. These findings provide robust evidence of a partial genetic continuity between contemporary autochthonous populations from the Franco-Cantabrian region, specifically the Basques, and Paleolithic/Mesolithic hunter-gatherer groups. 
Furthermore, our results raise the current proportion (≈15%) of the Franco-Cantabrian maternal gene pool with a putative pre-Neolithic origin to ≈35%, further supporting the notion of a predominant Paleolithic genetic substrate in extant European populations.

I'd say that the finding of these three lineages is in itself the interesting part. The molecular-clock-o-logical speculations are something that, as you know, I tend to ignore. However, such estimates almost invariably turn out to be mere fractions of the realistic dates once properly calibrated, so if the authors get a 10 Ka date, we can be reasonably sure that it is a minimum and that the actual date may well be twice that figure (although in the case of mtDNA this may depend on each particular branch, as these are very unequal).

Another important note is that the authors studied the sub-Pyrenean isthmic region rather than the original Franco-Cantabrian region, which corresponds mostly to what is now Southern France. I am always rather skeptical about attributing the legacy of this wider Paleolithic region exclusively to Basques, and I want to insist that Southern France's peoples (Gascons, Occitans, Perigordians and all others south of the Nantes-Lyon line) be studied in depth and detail alongside Basques, Cantabrians and Asturians. At the very least this study has an important Gascon sample (Bearn, Bigorre and Chalosse), although it is from the areas closest to the Basque Country.

In a previous study by the same team (2011), they detected apparent Basque (or is it Gascon?) centrality of lineages J1c and U5b. In this study they refine those findings by locating some sublineages that are more clearly Basque-specific: J1c5c1 and U5b1f. They also spotted another Basque-centric haplogroup within V: V22. 

Figure 1. Maximum parsimony trees of haplogroups U5b, J1c and V including the three autochthonous lineages U5b1f, J1c5c1 and V22.
These trees are extracted from the maximum parsimony phylogenetic tree of 76 complete mtDNA sequences of the Franco-Cantabrian region shown in detail in Fig. S1. Mutations are displayed along the branches. All mutations are transitions unless a suffix specifies a transversion (A, C, G, T). Recurrent mutations within the complete phylogeny of the Franco-Cantabrian area are underlined. The prefix "@" indicates a back mutation. Mutational hotspot variants such as 16182, 16183, or 16519, or a variation around position 310 or 523–524, as well as length heteroplasmies were not considered for the phylogenetic reconstruction. All the samples are colored according to their geographic origin, as shown in the legend. For phylogeny construction, five previously published mitogenomes belonging to subhaplogroups U5b1f (JX286537 and DQ156208), J1c5c1 (JQ702776 and JQ704051) and V22 (HQ384212) were included (GenBank accession numbers in the tree). German ethnicity was declared for sample JX286537 in GenBank; however, maternal ancestry in southwestern Europe cannot be ruled out owing to the absence of lineage U5b1f in populations outside the Franco-Cantabrian area (see Tables S2 and S3). French B.C. refers to samples from the French Basque Country.

As for the apportions, table S3 provides them for U5b1f and J1c5c1; however, I could not find anywhere the frequencies of V22 or of other lineages (although there is a list of lineages for a reduced sample in table S1). U5b1f, sorted by frequency:

  1. Lapurdi: 24.1%
  2. NE Navarre: 23.6%
  3. Zuberoa: 17.7%
  4. NW Navarre: 17.0%
  5. North Navarre: 15.2%
  6. Chalosse (Dax district): 15.0%
  7. Bearn: 12.5%
  8. Gipuzkoa: 11.8%
  9. SW Gipuzkoa: 9.5%
  10. Bigorre: 9.5%
  11. Low Navarre: 8.2%
  12. Central-West Navarre: 7.8%
  13. Burgos (Castile): 4.2%
  14. La Rioja: 3.8%
  15. Biscay: 3.6%
  16. Zaragoza (Aragon): 2.5%
  17. Catalonia: 0.7%
  18. Araba: 0.05%
Not detected in: West Biscay (Enkarterriak), Pas Valley, Cantabria, (North?) Aragon, Madrid, Perigord-Limousin.

So this U5b1f lineage seems concentrated around the Western Pyrenees with a highest density axis between, say, Baiona (Bayonne) and Zangoza (Sangüesa). It is a most important lineage in that core Eastern Basque and Southern Gascon area with frequencies above 5% and reaching almost to 25% in some cases.

While we can appreciate the clinal decline in Iberia, the lack of data for Gascony and most of the Paleolithic Franco-Cantabrian region (i.e. Southern France) leaves us without similar data for the northern cline, knowing only that in Perigord-Limousin it is 0%. This last detail may explain why the haplogroup is contained within the wider Basque area, as Perigord (and not the Basque Country) was the true center of the Franco-Cantabrian region, a district I have sometimes dubbed "the metropolis of Paleolithic Europe". Rather than looking for signals of expansion from the Basque Country only, researchers should look for such signals of expansion from Perigord particularly (as well as from the whole Franco-Cantabrian region, from Provence to Asturias and from the Loire to the Pyrenees).

Haplogroup U among Basques and Pasiegos (from fig. S1)

J1c5c1 (also from table S3), sorted by frequency:

  1. Zuberoa: 4.8%
  2. NW Navarre: 3.8%
  3. Chalosse (Dax): 3.3%
  4. Central-West Navarre: 3.1%
  5. Low Navarre: 2.7%
  6. North Navarre: 2.4%
  7. Gipuzkoa: 2.2%
  8. NE Navarre: 1.8%
  9. Bearn: 1.8%
  10. Lapurdi: 1.7%
  11. Zaragoza: 1.2%
  12. Araba: 1.0%
  13. Biscay: 0.08%
Not detected in: SW Gipuzkoa, West Biscay (Enkarterriak), Bigorre, La Rioja, Burgos, Catalonia, Pas Valley, Cantabria, (North?) Aragon, Madrid, Perigord-Limousin.

Again the lineage, even if much less common, is concentrated around the Western Pyrenees, with a highest density axis from, say, Leitza to Maule, spilling into Chalosse and Bearn to the NE but again reaching zero frequency in Perigord-Limousin (and we do not know how it behaves in between). In Iberia, outside of the Basque Country, it is only found in the city of Zaragoza.

As happens with U5b1f, J1c5c1 quickly declines towards the West in the Southern Basque Country, which may be grounds, especially if other lineages also follow this pattern, to consider two original Basque populations: one around the Western Pyrenees and another one around Biscay. This idea, while not commonly formulated, would not be new at all. For example, F. Krutwig already suggested in the mid 20th century that Central-Eastern Basques had a Dinaric-like morphology (pseudo-Dinaric, because "true Dinarics" are supposed to be brachycephalic while Basques are usually mesocephalic, like most Europeans), while Biscayans would rather be dominated by the rare Dalic anthropometric type. Debatable, of course, but that's what we are here for, aren't we?

Haplogroup JT among Basques and Pasiegos (from fig. S1)


The study does not directly provide the frequency data for V22 (the authors excuse themselves on technical HVS grounds, a very typical problem when working with HV lineages), so I had to work with supplemental table S1, where the HVS-I haplotypes and attributed haplogroups are listed (for a reduced sample of just 76 individuals). My synthesis is:
  1. Low Navarre: 1/4 (25.0%)
  2. Gipuzkoa: 2/10 (20.0%)
  3. Biscay: 1/9 (11.1%)
  4. West Navarre: 3/33 (9.1%)
  5. Pas Valley: 1/19 (5.3%)
Not detected in Araba's single-person sample: 0/1.
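
For anyone who wants to redo the tally, here is a minimal Python sketch. It assumes the counts exactly as I read them from table S1 (listed above); the names `v22_counts`, `v22_freqs` and `ranking` are just illustrative.

```python
# V22 carriers and sample size per population, as tallied above from table S1.
v22_counts = {
    "Low Navarre": (1, 4),
    "Gipuzkoa": (2, 10),
    "Biscay": (1, 9),
    "West Navarre": (3, 33),
    "Pas Valley": (1, 19),
    "Araba": (0, 1),
}

# Percentage per population, rounded to one decimal place.
v22_freqs = {pop: round(100 * k / n, 1) for pop, (k, n) in v22_counts.items()}

# Rank populations from highest to lowest frequency.
ranking = sorted(v22_freqs.items(), key=lambda item: item[1], reverse=True)

# Pooled totals over the whole reduced sample.
total_k = sum(k for k, _ in v22_counts.values())
total_n = sum(n for _, n in v22_counts.values())

for pop, pct in ranking:
    k, n = v22_counts[pop]
    print(f"{pop}: {k}/{n} ({pct}%)")
print(f"Pooled: {total_k}/{total_n}")
```

The per-population sample sizes add up to the 76 individuals of table S1, which is a quick sanity check on the tally.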

I wouldn't dare to reach any strong conclusions with such a small and irregular sample, but it does look like the Northern Basque Country and Gipuzkoa have the highest frequencies, which suggests that maybe the lineage extends to the North into Gascony, etc. Unlike the previously discussed lineages, in this case the decline towards the West is not as sharp.
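
To put a number on how shaky such small samples are, one option is a confidence interval for each proportion. The sketch below uses the Wilson score interval at roughly 95% (z = 1.96); this is my own choice of method for illustration, not something the study does, and the function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for the proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)

# V22 counts as tallied from table S1 (carriers, sample size).
low_navarre = wilson_interval(1, 4)
west_navarre = wilson_interval(3, 33)

for name, (lo, hi) in [("Low Navarre", low_navarre), ("West Navarre", west_navarre)]:
    print(f"{name}: {100 * lo:.1f}% to {100 * hi:.1f}%")
```

With only four individuals, the Low Navarre interval spans most of the possible range and overlaps widely with the West Navarre one, so the ranking above should be read as a hint at best.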

Haplogroup R0 among Basques and Pasiegos (from fig. S1)

See also:


  1. A local pre-Neolithic origin for U5b1f and V22 seems very plausible.

    But, I am deeply skeptical of the notion that J1c5c1 has a local pre-Neolithic origin in the region.

    My intuition is that this mtDNA haplogroup is probably not uncommon in this specific area and nowhere else due to a founder effect in a Copper Age migration (probably affiliated with the Bell Beaker people in some way). The haplogroup prospered in the Basque Country, relatively speaking, although an average of about 2% of the population of a small corner of Europe after many thousands of years isn't all that amazing statistically speaking. But it probably vanished, or continues to be very rare, in its place of origin, someplace in West Eurasia well to the east of Italy. An apparently greater time depth could have arisen from diversification of the haplogroup within a very geographically localized and tightly knit source population that then migrated en masse - perhaps residents of a displaced village and a distantly related small subclan of people linked by generations of cousin marriages who participated in the migration.

    1. In the Basque Country (and other parts of Europe) mtDNA J is found (ancient DNA) since Neolithic.

      In Los Cascajos (Neolithic Navarre) J was found at frequencies of 2/27, i.e. ~7%. It was not found in other contemporary Basque sites but this may be caused by its relatively low frequencies, which are consistent with modern ones. In Chalcolithic Basque data it was somewhat more frequent in some sites (~10%) but absent in others (Izagirre & De la Rúa 1999).

      We can only speculate whether most of this was already J1c variants, but it seems plausible, so I would think of a Neolithic origin rather than a later one. This does not totally exclude the possibility of an even older origin, but that is harder to justify with the available data.

      One problem is that we do not really know the origin of mtDNA J, just that it spread in Neolithic times (but not located yet in Neolithic West Asia).

      On U5b there are no such doubts: it is found in pre-Neolithic Northern Iberia (La Braña and NE Navarre), so it is at least Epipaleolithic (again the same case as in Central Europe). Although at least the La Braña haplogroup (the only one sufficiently detailed) is not the Basque-specific subclade discussed here. Undetermined U5 is found first of all in Solutrean Nerja (Andalusia) - so U5 is a very clear case of Paleolithic origins, at least in part.

      On V, hard to say again: it looks Paleolithic on its distribution (rather similar to H1) but it has never been sequenced in pre-Neolithic aDNA anywhere (first in Portugal and East Germany), so it's a case a bit like J or J1c.

    2. mtDNA J as Neolithic rather than Copper Age is plausible enough. After all, you are only talking about a thousand or two thousand years difference in dates in Basque Country anyway between those two because the Neolithic arrives there much later than the Fertile Crescent and Balkans, while the Copper Age technology spreads more rapidly, right?

      I would argue pretty strongly for pre-Neolithic for the reasons you describe for V (which is found in the Saami, in the Netherlands and Belgium especially in coastal area, and in Western Berbers as well, seemingly centering on the Franco-Cantabrian area including both Basques and Gascons), and we may never get any pre-Neolithic aDNA of V for preservation reasons even if it is older. But, I do think that it might be plausible that it appears not very much pre-Neolithic, perhaps 2,000 to 6,000 years or so probably via what was at the time a pre-Neolithic coastal maritime culture that held its own against the Neolithic influx better than terrestrial hunter-gatherers.

  2. "Doron M. Behar et al., The Basque Paradigm: Genetic Evidence of a Maternal Continuity in the Franco-Cantabrian Region since Pre-Neolithic Times" is the source of the three Gascon samples (Chalosse, Béarn and Bigorre) : you tackled it on your blog last year.


    Bearn (n=56):

    H1: 11 (20%)
    H2a: 2 (4%)
    H3: 3 (5%)
    V: 3 (5%)
    HV: 2 (4%)
    U: 16 (29%)
    K: 6 (11%)
    J: 4 (7%)
    T: 2 (4%)
    X: 3 (5%)
    Singletons: H5'36, H6, H9, H59

    So now, we discover that out of those 16 Bearnese individuals who were detected to belong to mtDNA U, 7 actually belong to a very specific Basque subclade U5b1f.

    I suppose the remainder belongs to a less "specific" and pan-European subclades such as U5b1b like myself.

    I'd enjoy a deeper analysis of the data obtained by Behar 2012, though it should be completed with samples from other areas in Gascony (the most densely populated areas of ancient Aquitania were the city of the Convenae in modern-day Comminges and the city of the Auscii around modern-day Auch).

    I'm rather intrigued by what appears to be the duality of the Basque people : a West Pyrenean pole and a Biscayan/Cantabrian one. If I find some time, I should upload Basque samples on my blogs even though my work has many limits.

    I'm not that surprised that Périgord/Limousin doesn't show mtDNA affinities with Basque lands : I'm pretty persuaded Périgord was already distinct in late Paleolithic times and that it suffered many subsequent waves of migration which have altered the gene pool of Perigordians. They really should be autosomally tested to detect the percentage of "Oriental" influx (you know, components labelled as "Med" and "West Asian" in Dienekes' World12 run).

    As for Alava, it clearly is a fascinating area, though one should know which Alavans were sampled : the ones from the Ribera or the ones from areas which lost the use of the Basque language in the 18th century ?

    1. Nice to see you are alive and well, Heraus. :)

      "I suppose the remainder belongs to a less "specific" and pan-European subclades such as U5b1b like myself".

      They may be "pan-European" but U5b1 has been sequenced in the Basque Country (Aispea, Navarre) since Epipaleolithic (see here, scroll down to last update). So it is not "less Basque" but just more distributed, less specific.

      This is one of the problems of studies focusing on ethnic-specific lineages like this one, that they do not dwell much in other also important stuff shared with neighbors.

      "I'm rather intrigued by what appears to be the duality of the Basque people : a West Pyrenean pole and a Biscayan/Cantabrian one".

      I would not agree with the existence of such a Biscayan-Cantabrian pole but rather a Biscayan-Araban one with some interpenetration with Gipuzkoa. Cantabria is in many senses another different region, even if in some cases Pasiegos (who are from just West of Biscay's borders) share some lineages with Biscayans. Notice that historical (and pre-historical) Cantabria only included about 2/3 at most of modern Cantabria. In between there was an ill-understood tribal region (Autrigones), who roughly inhabited most of the lands between Bilbao and Santander and between Vitoria-Gasteiz and Burgos. In Paleolithic times this area was generally more akin to the Basque region than to the Cantabrian one.

      "I'm not that surprised that Périgord/Limousin doesn't show mtDNA affinities with Basque lands"...

      I'm a bit surprised but not that much (were it Basque-looking Angoumois, I'd surely be more perplexed). My emphasis was rather on the overly simplistic assumption that only Basques are representative of the central population of the UP Franco-Cantabrian Region: researchers should try to study Southern France in general much more, even if obviously some of the ancestry today will necessarily be of Neolithic or later immigrant origins. That way the diversity would be better captured, and the populations migrating northwards would most likely be those already located near the northern "border" of the FCR, including Perigordians.

    2. I compared the U5 samples in the Behar et al 2012 paper (http://www.anthrogenica.com/showthread.php?409-Analysis-of-U5-in-the-Behar-2012-Basque-mtDNA-study) and the Asturian U5 samples from Pardina et al.

      If you exclude U5b1f from the Basque, the U5 distribution is remarkably similar in the Asturians and Basques, with about 5.7% U5 in both groups. So there seems to be a shared substrate of U5 in these populations that reflects a small, region-wide contribution from Mesolithic U5.

      What makes the Basque unique is that, apparently, there was a founder effect with a U5b1f woman who became the maternal ancestor to about 11% of the Basque. There is very little diversity in the U5b1f in the ten Cardoso FMS samples, and I estimate that this subgroup of U5b1f dates to about 3500 years ago. There are much older branches of U5b1f found in the region, from Portugal to Germany, although we have only a small number of U5b1f samples outside of the Basque region.

      If you are interested in more details, I can send you my spreadsheet comparing the Basques and Asturian U5 samples.


    3. Hi, Gail.

      First of all I do not wish to fall into the fallacy of U5=hunter-gatherers. That's an idea derived only from ill-selected data, mostly from Central Europe. In SW Europe a lot of hunter-gatherers carried mtDNA H (see links in the "see also" section): at least 50% in the Cantabrian subregion and the vast majority in Portugal. This is also largely true of areas like Karelia and was surely also the case in many other regions of Europe. The excessive focus on Central Europe and the impossibility of identifying mtDNA H with just HVS-I data are the actual culprits of the confusion, along with some individuals with a heavily loaded pseudoscientific agenda, stubbornly insisting on their pet hypotheses, even against hard data.

      "There is very little diversity in the U5b1f in the ten Cardoso FMS samples, and I estimate that this subgroup of U5b1f dates to about 3500 years ago. There are much older branches of U5b1f found in the region, from Portugal to Germany, although we have only a small number of U5b1f samples outside of the Basque region".

      Potentially interesting.

      However I distrust "molecular clock" age estimates systematically.

      "If you are interested in more details, I can send you my spreadsheet comparing the Basques and Asturian U5 samples".

      Thanks a lot but I will pass because I'm overloaded. But maybe someone else is interested... it's a very technical issue but with the right motivation and love for such details...

