Showing posts with label Gascony. Show all posts

August 30, 2015

France's autosomal genetics highlight Gascon-Basque distinctive cluster

A rather decent analysis of French autosomal genetics has been privately pre-published recently (thanks to Jean Secques for calling my attention to it and to the lead author for making it available online). 

Aude Saint Pierre et al. The fine-scale genetic structure of the French population. Submitted to the American Journal of Human Genetics in 2015. Freely accessible: LINK, direct PDF link.

A highlight of the study is that the samples all belong to people born in the 1930s and locations refer to their place of birth, so the results should be reflecting the historical demographics of the French Republic in the early 20th century. 

There are no supplemental materials available at this point, so it's only possible to get a glimpse of the general results and we can't go too much into fine detail. These general results are anyhow interesting. Let's see:

Figure 6: Prediction of geographic location of individuals from the test set (n=3,733) using multiple linear regression model. A) Expectation: The seven geographical regions of France according to the geographical coordinates of individuals in the test sample; B) Prediction of geographical coordinates according to the multiple linear regression model.

This figure alone synthesizes the findings: most French citizens cluster in a single unit, which geographically would correspond to NE France (GE region). Only the SW French (mostly Gascons and Basques) deviate very clearly, and they roughly fit their own geography towards the Bay of Biscay (or Bay of Gascony, as the French call it). Some samples from the SE (MED and RA regions) also follow this trend. A few outlier samples from the East (GE, RA) look rather Rhenish German, although the lack of controls from outside the Hexagon does not allow me to confirm this impression. 

You may have noticed that I ignored the IDF samples. That is because IDF is the Paris region (Île-de-France), which already in the 1930s was too cosmopolitan to be informative. This is of course reflected in all the results, where the "orange" dots show nearly every affinity. 
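To make the idea behind Figure 6 concrete, here is a minimal sketch of a multiple linear regression that predicts one geographic coordinate from two predictors. It assumes the predictors are principal-component scores, which the figure caption does not actually state. All the numbers (`pcs`, `lat`) are made up for illustration, and the helper names `fit_linear` and `predict` are hypothetical, not anything from the paper.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of predictor rows; an intercept column of 1s is added here.
    Returns [intercept, coef_1, coef_2, ...].
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for idx in range(k):
            if idx != col:
                f = A[idx][col] / A[col][col]
                A[idx] = [a - f * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, x):
    """Apply the fitted model to one row of predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))


# Hypothetical training set: (PC1, PC2) scores and invented latitudes.
pcs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
lat = [48.0 - 2.0 * p1 + 1.0 * p2 for p1, p2 in pcs]

lat_coefs = fit_linear(pcs, lat)
print(predict(lat_coefs, (1.5, 0.5)))  # predicted latitude for an unseen individual
```

Predicting a position on the map would need two such fits, one for latitude and one for longitude, each trained on the individuals whose birthplace is known and then applied to the test set.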

Next come the principal component analyses, whose most salient information is again the peculiarity of Southwesterners, i.e. Gascons, Basques, and nearby populations.


Figure 2: The scatter plot of the first three PCs from PCA performed on the SNP genotype data of the 4,433 individuals from the 3 Cities study. Individuals are coloured according to the region where they were born. (Note: the legend corresponds to both PCAs)

Other than the "Gascon" specificity, which takes over PC1, I'd say that PC2 shows an "anti-Mediterranean" tendency and that PC3 instead shows a "pro-Mediterranean" tendency. This I gather from the relative position of the "red" MED cluster. They both weigh the same.

Interestingly, the GO region (Mid-West between the Seine and the Garonne) stands out, which may indicate some sort of "Armorican" or "Briton-like" specificity. In appearance it could blend both the "pro-" and "anti-Mediterranean" tendencies, but without being able to discern the particular dots (ID and location), I cannot swear to that. 

Much clearer is the pattern among Gascons, Basques and allies: they show an "anti-Mediterranean" tendency when they are strongly detached from the main French cluster but, the closer they are to mainstream French, the more they show a "pro-Mediterranean" tendency instead, overlapping at the extreme with the MED cluster. This happens in both PC2 and PC3. 

Little more to say, honestly. Maybe that the small Eastern group of outliers prominent in "anti-Mediterranean" tendency in PC2 probably corresponds pretty well with the outliers of the first graph, which looked German-like. So I guess that the positive side of PC2 probably corresponds with a Northern European tendency.
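For readers unfamiliar with how a PC axis like these comes about, here is a toy sketch: it centers a tiny genotype matrix (0, 1 or 2 copies of an allele per SNP), builds the covariance matrix and extracts the first principal component by power iteration. The data are invented, the function name `first_pc` is hypothetical, and power iteration is used only to keep the example dependency-free; the post does not say which software or algorithm the authors used.

```python
def first_pc(rows, iters=200):
    """Return (loadings, scores) of the first principal component.

    rows: list of samples, each a list of allele counts per SNP.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Sample covariance matrix between SNPs.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    # Power iteration: repeated multiplication converges on the top eigenvector.
    v = [1.0] + [0.0] * (m - 1)
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered sample onto the component.
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centered]
    return v, scores


# Invented genotypes: the first three samples share one allele profile,
# the last three share another.
genotypes = [
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [0, 0, 1, 2],
]

loadings, scores = first_pc(genotypes)
for s in scores:
    print(round(s, 2))
```

In this toy case the two invented groups fall on opposite sides of zero along PC1, which is the same kind of separation that sets the Southwesterners apart in the real plot.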

I am particularly interested in what you, reader, have to say on this one.

February 26, 2012

Basque mtDNA

Today I finally got my hands on the latest study on Basque matrilineal genetics:


I have to commend the paper because the detail achieved is unprecedented, owing to a good and ambitious sampling strategy and to testing not just the whole hypervariable region (both HVS-I and HVS-II) but 22 coding region markers as well. As a result they have found a number of rare haplogroups and others that are common among Basques but apparently not elsewhere.

Fig. 1, showing the detailed sampling strategy

They have therefore achieved an unprecedented depth in the analysis of mtDNA H among Basques and neighboring populations, but they pay little attention to other haplogroups. In this sense I would have liked somewhat more attention to U(xK), which is an important Basque haplogroup, second only to H, and I missed a proper tabulation of the results for anything other than haplogroup H. This made me dedicate most of this Sunday to manually tabulating the information, which I believe is important knowledge to share and discuss.

But first the pearl of this work, the discovery of novel Basque-specific sublineages of haplogroup H. They are detailed in table 1:

Table 1

There is even more data in the supplemental materials. However, it is not well organized (especially the non-H sequences, which are merely tabbed in PDF format) and it takes some hard work to put together. As I said before, I dedicated some long hours to that task and came up with the following data:
A. Gascony:
Bearn (n=56):
  • H1: 11 (20%)
  • H2a: 2 (4%)
  • H3: 3 (5%)
  • V: 3 (5%)
  • HV: 2 (4%)
  • U: 16 (29%)
  • K: 6 (11%)
  • J: 4 (7%)
  • T: 2 (4%)
  • X: 3 (5%)
  • Singletons: H5'36, H6, H9, H59
Bigorre (n=48):
  • H1: 9 (19%)
  • H3: 2 (4%)
  • V: 4 (8%)
  • U: 11 (23%)
  • K: 5 (10%)
  • J: 2 (4%)
  • T: 4 (8%)
  • Singletons: H2a, H6, H14, H67, HV, R0, K, I, X, W, C
Chalosse (Dax district) (n=60):
  • H1: 9 (15%)
  • H2a: 4 (7%)
  • H6: 2 (3%)
  • H13: 4 (7%)
  • H74: 2 (3%)
  • V: 5 (8%)
  • HV: 2 (3%)
  • U: 13 (22%)
  • K: 3 (5%)
  • J: 5 (8%)
  • T: 3 (5%)
  • X: 3 (5%)
  • Singletons: H3, H4, H5, H8, H42
B. Northern Basque Country:
Lapurdi/Baztan (Lapurtera dialectal zone) (n=58):
  • H1: 15 (26%)
  • H2a: 2 (3%)
  • H3: 3 (5%)
  • H4: 2 (3%)
  • V: 8 (14%)
  • U: 13 (22%)
  • J: 8 (14%)
  • T: 2 (3%)
  • Singletons: H5, H6, H24, K, X
Lapurdi/Lower Navarre (Benafarrera dialectal zone) (n=73):
  • H1: 24 (33%)
  • H3: 4 (5%)
  • H5: 2 (3%)
  • H20: 2 (3%)
  • V: 6 (8%)
  • HV: 6 (8%)
  • U: 13 (18%)
  • K: 2 (3%)
  • J: 5 (7%)
  • T: 2 (3%)
  • X: 4 (5%)
  • Singletons: H2a, H6, H42
Zuberoa (n=61*):
  • H1: 16 (26%)
  • H2a: 3 (5%)
  • H5: 2 (3%)
  • HV: 3 (5%)
  • V: 2 (3%)
  • U: 14 (23%)
  • K: 5 (8%)
  • J: 9 (15%)
  • X: 3 (5%)
  • W: 2 (3%)
  • Singletons: H3, T
C. Southern Basque Country South (Spanish-speaking area since 19th century):
Araba (n=56):
  • H*: 2 (4%)
  • H1: 18 (32%)
  • H3: 6 (11%)
  • V: 3 (5%)
  • HV: 5 (9%)
  • U: 7 (13%)
  • K: 3 (5%)
  • T: 5 (9%)
  • J: 4 (7%)
  • Singletons: H58, N1, X
Central-Western Navarre (n=64):
  • H1: 10 (16%)
  • H3: 12 (19%)
  • H7: 2 (3%)
  • V: 7 (11%)
  • U: 10 (16%)
  • K: 2 (3%)
  • J: 3 (5%)
  • T: 7 (11%)
  • I: 2 (3%)
  • Singletons: H*, H2a, H5, H27, H42, H49, H81, N1, X
North-Eastern Navarre (Erronkari-Salazar) (n=55):
  • H*: 2 (4%)
  • H1: 9 (16%)
  • H3: 5 (9%)
  • H42: 4 (7%)
  • V: 6 (11%)
  • U: 17 (31%)
  • K: 2 (4%)
  • T: 6 (11%)
  • J: 3 (5%)
  • Singleton: K
D. Southern Basque Country North (Basque-speaking area in 20th century):
Biscay (n=59):
  • H1: 17 (29%)
  • H2a: 6 (10%)
  • H6: 2 (3%)
  • H53: 3 (5%)
  • V: 2 (3%)
  • HV: 3 (5%)
  • U: 9 (15%)
  • J: 6 (10%)
  • X: 2 (3%)
  • Singletons: H*, H14, H17, H24, H86, T, N1, I, K
Gipuzkoa (n=57*):
  • H1: 19 (33%)
  • H3: 7 (12%)
  • H17: 2 (4%)
  • V: 3 (5%)
  • U: 12 (21%)
  • K: 2 (4%)
  • J: 5 (9%)
  • T: 2 (4%)
  • X: 2 (4%)
  • Singletons: H2a, H14, W
Gipuzkoa SW (n=63):
  • H1: 24 (38%)
  • H2a: 3 (5%)
  • H3: 3 (5%)
  • H6: 2 (3%)
  • V: 3 (5%)
  • U: 16 (25%)
  • K: 2 (3%)
  • J: 3 (5%)
  • T: 2 (3%)
  • Singletons: H4, H58, HV, X, L3'4
North-Western Navarre (n=53):
  • H1: 10 (19%)
  • H2a: 3 (6%)
  • H3: 8 (15%)
  • H4: 2 (4%)
  • H5: 3 (6%)
  • V: 3 (6%)
  • U: 12 (23%)
  • T: 4 (8%)
  • J: 5 (9%)
  • Singletons: H24, HV, W
E. Southern Basque Country - West Biscay (Spanish-speaking since long ago):
Enkarterriak (n=21):
  • H1: 5 (24%)
  • H3: 3 (14%)
  • H15: 3 (14%)
  • HV: 3 (14%)
  • U: 2 (10%)
  • Singletons: H4, H24, H87, K, X
F. Spain (areas once within the Basque ethno-cultural area):
Northern Aragon (n=29):
  • H3: 5 (17%)
  • H4: 2 (7%)
  • HV: 3 (10%)
  • U: 5 (17%)
  • K: 2 (7%)
  • J: 6 (21%)
  • T: 3 (10%)
  • Singletons: H42, V, X
Northern Burgos province (n=24):
  • H1: 2 (8%)
  • H3: 4 (17%)
  • U: 8 (33%)
  • K: 2 (8%)
  • T: 2 (8%)
  • J: 2 (8%)
  • Singletons: H*, H4, V, L2
Cantabria (n=19):
  • H1: 7 (37%)
  • H3: 3 (16%)
  • H5: 2 (11%)
  • J: 2 (11%)
  • Singletons: H27, H30, U, K, T
La Rioja (n=52):
  • H*: 2 (4%)
  • H1: 13 (25%)
  • H3: 7 (13%)
  • H5: 3 (6%)
  • U: 8 (15%)
  • J: 4 (8%)
  • T: 5 (10%)
  • K: 3 (6%)
  • Singletons: H10, H13, H30, H51, H58, R0, I
Notes:

(1) * The sample size of Zuberoa is listed as 62 and that of Gipuzkoa as 56, but after checking and rechecking I'm pretty sure that one individual has swapped populations. So I'm assuming that n(Zuberoa)=61 and n(Gipuzkoa)=57 for all percentages. 

(2) Haplogroups in italic type are not named that way (or not named at all) in PhyloTree. I am confused by this and other nomenclature in this paper, and so far I haven't had time to study what it might mean. Ideas are welcome. 


(3) U means obviously U(xK). Just using the same terminology from the paper. Again, I haven't got any time to explore how much of that U is U5b, U5a, U4 or other clades. This is in my opinion the greatest shortcoming of the paper: ignoring U almost completely. 
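For anyone who wants to re-check the tabulation, here is a minimal sketch of the arithmetic behind each entry: a count divided by the sample size, rounded to a whole percentage, plus a check that named haplogroups and singletons add up to n. The Bearn figures are copied from the list above; the function names `percentages` and `total` are hypothetical helpers, and rounding halves upward is my assumption about how the figures were rounded.

```python
# Counts for Bearn (n=56), copied from the list above.
bearn = {"H1": 11, "H2a": 2, "H3": 3, "V": 3, "HV": 2,
         "U": 16, "K": 6, "J": 4, "T": 2, "X": 3}
bearn_singletons = ["H5'36", "H6", "H9", "H59"]
n_bearn = 56


def percentages(counts, n):
    """Whole-number percentage per haplogroup, rounding halves upward."""
    return {hg: int(100 * c / n + 0.5) for hg, c in counts.items()}


def total(counts, singletons):
    """Individuals accounted for: listed counts plus one per singleton."""
    return sum(counts.values()) + len(singletons)


pct = percentages(bearn, n_bearn)
print(pct)
print(total(bearn, bearn_singletons) == n_bearn)
```

The same two checks can be run on any other population in the list by swapping in its counts, singletons and sample size.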

Based on this data, I made some maps (official administrative divisions retained for reference; circle diameters are proportional to sample sizes):

Frequencies of mtDNA H1

Frequencies of mtDNA H3

Frequencies of mtDNA U(xK)

Frequencies of mtDNA J

Frequencies of mtDNA V

Notice that V is not as common among Basques as initially reported years ago.

See also: