Maria Lluïsa, of NeanderFollia[cat], points me to this interesting paper:
I am unsure about the access status of this particular paper but, at the time of writing, it is openly available online as an advance online publication.
Most interesting is surely the sizable mtDNA database covering most of West and Central Europe, from northern Spain to Denmark and Poland (including many populations from France), available in table S3 of the supplementary material.
Some fun with bidimensional representation
Let's begin with the linearized Fst distances in a bidimensional graph (colors are mine, see below):
To clarify that nightmare of acronyms a bit, I marked Basque and Bearnese samples in red, those from the Paleolithic Franco-Cantabrian region in orange and those from the Rhine-Danube region in blue. The larger orange dot FCT is Perigord (Dordogne), the most important single district of Paleolithic Europe.
You can probably guess some of the simpler acronyms, such as ENG (England) or AUS (Austria). I won't list them here (they are in the supp. material), but I must say for clarity that all those beginning with F are from France, initial G means Germany and initial I means Ireland. All the others have unique areal acronyms, and those ending in MI mean "miscellanea".
Not much apparent structure is observed: basically there is a big blob in the middle (albeit divided into two not too well defined subclusters; I drew a dotted line to mark this internal division) and then four very isolated samples that actually define the two axes of the graph.
These "polar" samples are the following ones:
- ALA is Araba (Álava in Spanish), a Southern Basque province mostly looking to the Upper Ebro. It defines the positive polarity of the first dimension and its main characteristic is to have 80% R-CRS (surely all H).
- PAS is Valle del Pas (Cantabria), a mountainous district famous for its soft cheese biscuits, its religious architecture and the frequent visits of geneticists... to their archived data (a pity because they miss the delicious quesadas). They define the negative pole of the first dimension. They are particularly high in haplogroup V (24%) but low in CRS (27%). They are also high in U5 (13%), I (6%) and T2b (8%).
- GUI is Gipuzkoa (Guipúzcoa in Spanish spelling). Their most salient characteristic, as far as I can tell, is an unusually high frequency of H2a (22% within H, 12% of all). They also have rather high frequencies of V (12%) and U5 (17%) and are quite to very low in the Neolithic clades (J, T, K, W, X).
- COR is Cornwall. Their most notable characteristic is very high J (20%).
It is curious that the whole first dimension of West and Central European diversity is summed up in a line of some 50 km, or maybe a little more, just SW of where I live. The authors seem to agree:
Northern Iberia appears to be microgeographically differentiated. Excluding the highly divergent Pasiego isolate (Maca-Meyer et al., 2003), there are also significant differences between the Basque provinces or between Catalonia at the north-eastern edge and Galicia-Asturias in the northwest. In fact, when these results were represented in an multidimensional scaling plot (Figure 2), the Pasiegos and Spanish Basques from Guipuzcoa and Alava were the most outstanding outliers, also followed by samples from Catalonia and Galicia, the French Basque sample and the British samples from Cornwall and Wales.
Even if we decide to disregard the extreme Araban and Pasiego samples, the next in line marking this polarity are Bearnese/North Basques (FSW) and Catalans (CAT), which implies a somewhat longer line from ESE to WNW along the Pyrenees. The direction and distance are somewhat different, but the geographical regions involved are not too different.
The second dimension is quite different, defining a S-N axis along the Atlantic coasts of France, between Cornwall and Gipuzkoa. After ignoring the outlier poles, we can redefine this second dichotomy as being between West Ireland and Cantabria or something like that (same axis mostly, though moved to the West a bit).
The two main clusters show tendencies along these two axes: one (low, left) tends to Cornwall and the Pasiegos (or Catalans if you wish), while the other (up, right) tends to Basques (Arabans and Gipuzkoans specifically). The clusters are not too well defined anyhow, and there is also a smaller third cluster formed by Biscayans, Provençals and Cantabrians, which stands between the Pasiegos and Gipuzkoa (which, excepting the Provençals, makes almost perfect geographic sense).
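For readers who want to build a similar bidimensional plot from their own pairwise Fst table, here is a minimal sketch. It assumes Slatkin's linearization, Fst/(1 - Fst), and classical (Torgerson) multidimensional scaling; the paper's exact MDS variant is not stated here, so this may differ from what the authors ran. The population labels and Fst values are made up for illustration, and the function names are mine.

```python
import math


def linearize_fst(fst):
    """Slatkin's linearization Fst / (1 - Fst); negative estimates are clipped to 0."""
    f = max(fst, 0.0)
    if f >= 1.0:
        raise ValueError("Fst must be below 1 to be linearized")
    return f / (1.0 - f)


def classical_mds(dist, dims=2, iters=500):
    """Classical (Torgerson) MDS of a symmetric distance matrix.

    Eigenvectors are found by power iteration with deflation, so only the
    standard library is needed. If the dominant eigenvalue found for an axis
    is negative (non-Euclidean distances), that axis is left at zero.
    """
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [math.sin(i + 1.0 + k) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient (numerator and denominator) gives the eigenvalue
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        vv = sum(x * x for x in v)
        eigval = sum(v[i] * bv[i] for i in range(n)) / vv if vv > 0 else 0.0
        unit = [x / math.sqrt(vv) for x in v] if vv > 0 else v
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][k] = unit[i] * scale
        # Deflate so the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * unit[i] * unit[j]
    return coords


if __name__ == "__main__":
    labels = ["A", "B", "C", "D"]  # hypothetical populations
    fst = [[0.00, 0.02, 0.05, 0.10],  # made-up pairwise Fst values
           [0.02, 0.00, 0.04, 0.09],
           [0.05, 0.04, 0.00, 0.12],
           [0.10, 0.09, 0.12, 0.00]]
    lin = [[linearize_fst(x) for x in row] for row in fst]
    for name, (x, y) in zip(labels, classical_mds(lin)):
        print(f"{name}: dim1={x:+.4f} dim2={y:+.4f}")
```

In such a plot, isolated samples naturally end up stretching the axes, which is why a handful of outliers can define both dimensions while everything else collapses into a central blob.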
I have tried to represent the findings from the Fst graph in maps:
1. The polarity axes: red is dimension 1 (dotted line after replacing Pasiegos by Catalans) and blue is dimension 2:
It is... curious, right?
2. The clusters: blue and red are the populations in each of the two main clusters, marked with stars the four "polar" or outlier populations, with colors representing the cluster they are closer to. Green is the third minor cluster. Magenta are two populations (Switzerland and Morbihan) for which two different samples exist and each falls in a different cluster. I ignored the "miscellanea" samples.
If we are to hypothesize a Franco-Cantabrian origin for some of this duality (which is surely more complex than just that), we'd see that northern Franco-Cantabrian populations (Dordogne, Hérault and Lyonnais) fall in the blue cluster, together with several Atlantic populations from France, Scotland and Ireland, the Danubian fraction of Central Europe and the Mediterranean fraction of Iberia.
The red cluster instead looks more specifically Atlantic, with Basques/Bearnese and Asturians being the only ones from the Franco-Cantabrian region, the rest being concentrated towards the North-West.
The lesser green cluster is entirely Franco-Cantabrian but, I suspect, it represents a peculiar intermediate alchemy rather than a distinctive ancestral group. Not that the larger clusters necessarily represent shared ancestry either, but they at least show an intriguing coincidence in their alchemy that asks for further exploration.
Some intriguing details of specific haplogroups
The authors seem to take a critical stand, based on previous work, on the Franco-Cantabrian or even Catalan origin of haplogroup V. Different papers have offered strikingly different results on the frequency of this clade, especially in Catalans (earlier claimed to be 24%, now just 3%) and Gipuzkoan Basques (initially said to be 20%, now more like half that amount). García and colleagues seem to hint that V may have a southern Iberian origin after all:
Diversity values for V are significantly higher in Southern Iberia than in the Cornice (P<0.05). Excluding Scandinavia, the lowest diversities are found in Northern Africa and the Iberian northeast.
I say that this would totally fit in my model of important Ibero-African contacts in the context of the Last Glacial Maximum and the genesis of Oranian culture in North Africa. It is also consistent with the known fact that North African mtDNA H (sister of V, together making up c. 30% of North African mtDNA) is of Iberian or otherwise SW European derivation.
There is also some mention of HV4, with a novel sublineage, HV4a1, of apparent origin in the Cantabrian strip (also found in one Italian and one continental European). Other HV4 sublineages are Eastern Mediterranean however, with its closest relative, HV4a2, being found in Jordan and Egypt.
In regards to H, it is worth mentioning the generally high frequencies in the Cantabrian strip and the especially high frequency of H6a among Cantabrians (12%), although it lacks diversity there.
H7 is confirmed as being most frequent in NE Iberia and SE France; it is also one of the four H subclades with significant presence in North Africa. The authors however yield to the Mediterranean origins temptation, claiming a presence in West Asia that is actually quite anecdotal (4/253 per Ennafaa 2009). Excepting Catalonia, H7 is rare in Iberia but it is quite common in France instead, where it is largely concentrated (Álvarez-Iglesias 2009). The newly revealed presence in Catalonia offers a plausible origin for its North African presence in the context of the LGM transmediterranean contacts, which would be quite parsimonious considering that in general all mtDNA H in the region is of Iberian origin (Cherni 2008). But whatever.
A key haplogroup however is H1, the largest H sublineage. Here the authors find surprising heterogeneity. While the highest frequencies are in the Cantabrian strip, the highest diversity seems to be in the Mediterranean area (Italy and the Balkans). Next in line come Scandinavia (Finns included), NE Europe and North Central Europe, all three tied at the same value (9.4). Paragroup H1(xH1a,H1b) also appears to have its greatest diversity in Italy-Balkans, followed by Scandinavia and then the Western Islands and NE Iberia (Catalonia and Aragon). H1a is clearly most diverse in NE Europe and North-Central Europe. H1b, a smaller scattered lineage, is most diverse in the southern Iberian Peninsula. Within H1:
FST pairwise comparisons based on haplotype frequencies detected unexpected heterogeneity. France showed close affinities with only the nearby north-east Iberian sample. In addition, the Scandinavians seem to be very different from north-central Europeans, showing more affinities to Slavs.
H3, probably the second largest H sublineage, is most diverse in North-Central Europe, in spite of being much more common in SW Europe (3-8%).
All this suggests that haplogroup H spread to SW Europe from Central Europe and not the other way around. At least H1b has been detected in Epipaleolithic Portugal (Chandler 2005, revised sequence assignment by me), which sets a latest possible date for this spread. I would therefore think that mtDNA H subclades expanded at the latest with the Gravettian wave, because there are no further cultural flows of that origin towards SW Europe before the late Bronze Age (or the late Chalcolithic if you wish to consider Bell Beaker; I do not).
It is quite surprising anyhow to find such relatively low nucleotide diversity levels for these very abundant clades, not just in the smaller NW and NE Iberian regions but especially in France, while South Iberia generally shows greater diversity instead. In this sense, the results raise more questions than they answer.
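Since so much of the argument hangs on diversity values, here is a minimal sketch of how two common measures can be computed from aligned sequences: nucleotide diversity (mean pairwise differences per site) and Nei's haplotype diversity. I am assuming these textbook estimators only for illustration; the exact estimator behind the paper's figures is not specified here, and the toy sequences are invented.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    pairs = list(combinations(seqs, 2))
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * len(seqs[0]))


def haplotype_diversity(seqs):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AATT"]  # invented sequences, not real data
    print(nucleotide_diversity(toy), haplotype_diversity(toy))
```

Note that a clade can be very frequent in a region and still score low on both measures if most carriers share the same or nearly the same sequence, which is exactly the frequency-versus-diversity contrast discussed above.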
Recently described haplogroups H1r and H1t were found to exist among Basques. H1t seems to be an Iberian-exclusive clade, while H1r seems instead continental (found in one French and one "European", as well as one Basque now).
In partial contrast, haplogroup K has its highest diversity in NE Iberia. However the differences are not too large across Europe, except for the NE, where diversity is very low.
T2b has highest diversity in Northern Iberia (both NW and NE), followed by North Africa.
Excluding two regions where its numbers are too low, the greatest diversity of W is in Balkans-Italy, followed by Iberia (all three regions).
Franco-Cantabrian post-Glacial expansion?
Apparently not. This seems to be the main conclusion of this paper and, in light of the perplexing diversity values, even for H3, I have to agree. H arrived here from Central Europe and probably Italy, already diversified to a large extent, and did not move much after that. This arrival probably happened in the Gravettian or Aurignacian periods.
It is however still possible that a male-biased expansion happened with the Magdalenian, as suggested by the patterns of R1b1b1a1a2, the most common R1b sublineage, at least in Europe, which seems to have most of its phylogenetic diversity (safer than nucleotide diversity, which is the one analyzed by García et al. in the mtDNA) around the Pyrenees (see here). However a local Central-North European component is still evident in its smaller brother haplogroup R1b1b1a1a1.
Another caution is that no meaningful sampling of French mtDNA has been undertaken since 2004 (Dubut et al., data recycled for this paper) and, considering the many errors and flukes happening in other cases without enough second and third samplings, it is very possible that a lot is still hiding in that area, so important in European prehistory.