Showing posts with label X-DNA. Show all posts

January 1, 2017

Reconstructing Sardinian population history

A very interesting pre-pub study, dealing with Sardinian genetics in great sub-national detail but also within the wider European and Mediterranean context, became available in recent weeks. I probably won't be able to do justice to it here, so please take a look yourselves.

Charleston W.K. Chiang et al., Population history of the Sardinian people inferred from whole-genome sequencing. bioRxiv 2016. Open access pre-pub LINK [doi:10.1101/092148]

Abstract

The population of the Mediterranean island of Sardinia has made important contributions to genome-wide association studies of traits and diseases. The history of the Sardinian population has also been the focus of much research, and in recent ancient DNA (aDNA) studies, Sardinia has provided unique insight into the peopling of Europe and the spread of agriculture. In this study, we analyze whole-genome sequences of 3,514 Sardinians to address hypotheses regarding the founding of Sardinia and its relation to the peopling of Europe, including examining fine-scale substructure, population size history, and signals of admixture. We find the population of the mountainous Gennargentu region shows elevated genetic isolation with higher levels of ancestry associated with mainland Neolithic farmers and depleted ancestry associated with more recent Bronze Age Steppe migrations on the mainland. Notably, the Gennargentu region also has elevated levels of pre-Neolithic hunter-gatherer ancestry and increased affinity to Basque populations. Further, allele sharing with pre-Neolithic and Neolithic mainland populations is larger on the X chromosome compared to the autosome, providing evidence for a sex-biased demographic history in Sardinia. These results give new insight to the demography of ancestral Sardinians and help further the understanding of sharing of disease risk alleles between Sardinia and mainland populations.

The authors call into question, at least to some degree, the extreme simplicity of the three-population model of Lazaridis and subsequent studies. They do not flatly reject it, but its lack of nuance seems to bother them a lot, as it does me. This is quite clear when they find, time and again, Sardinian-Basque lines of relationship that do not go through Italian, Spanish or French intermediaries, and also when they face the issue of the largest Y-DNA haplogroups on the island, I2a1a (M26, almost exclusively a Sardinian and Pyrenean haplogroup) and R1b1a2 (M269), which are not typically associated with Neolithic farmers. This suggests that there is more to Neolithic settlement than meets the eye in the overly simplistic three-population model. They even seem to consider whether Paleolithic peoples from Sardinia itself, or maybe from some other locations, contributed heavily to what they feel is a sex-biased genetic pool.

They do confirm that Sardinians have both strong "Neolithic" (Stuttgart) and "Paleolithic" (Loschbour) ancestry and no (even negative) "Steppe" (Yamnaya) ancestry, although the latter is truer of the most isolated sub-populations than of the more cosmopolitan ones.

They also estimate that Sardinians have been generally isolated from the rest of Europeans for some 330 generations, which reads as approx. 9900 years, i.e. since the very early Neolithic settlement of the island. We would actually have to reduce that time span a bit, but within reason; otherwise it becomes Epipaleolithic, which is most unlikely. Alternatively, as the main comparison is Northern Europe, this date could refer to the branching out of Painted-Linear (continental) and Impressed-Cardium (maritime) Neolithic cultures in the Aegean or the Balkans.
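The 9900-year figure corresponds to 330 generations at 30 years each. As a rough sketch of how much the date moves with that choice, here is the same arithmetic for a few generation lengths. The 25 and 28 year values are illustrative assumptions, not figures from the paper.

```python
def generations_to_years(generations, years_per_generation):
    """Convert a time span in generations to calendar years."""
    return generations * years_per_generation


# 330 generations is the estimate quoted above; the generation lengths
# other than 30 are assumptions used only to show the sensitivity.
for years_per_generation in (25, 28, 30):
    years = generations_to_years(330, years_per_generation)
    print(f"{years_per_generation} years/generation: about {years} years")
```

A shorter assumed generation brings the date down toward the 8th or 9th millennium before present, which is the kind of modest reduction mentioned above.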

November 2, 2015

Algerian complex genetics


This is a rather interesting study that deals with the genetics of the Republic of Algeria, with several new samples.


Asmahan Bekada, Lara R. Arauna et al. Genetic Heterogeneity in Algerian Human Populations. PLoS ONE 2015. Open access LINK [doi:10.1371/journal.pone.0138453]

Abstract

The demographic history of human populations in North Africa has been characterized by complex processes of admixture and isolation that have modeled its current gene pool. Diverse genetic ancestral components with different origins (autochthonous, European, Middle Eastern, and sub-Saharan) and genetic heterogeneity in the region have been described. In this complex genetic landscape, Algeria, the largest country in Africa, has been poorly covered, with most of the studies using a single Algerian sample. In order to evaluate the genetic heterogeneity of Algeria, Y-chromosome, mtDNA and autosomal genome-wide makers have been analyzed in several Berber- and Arab-speaking groups. Our results show that the genetic heterogeneity found in Algeria is not correlated with geography or linguistics, challenging the idea of Berber groups being genetically isolated and Arab groups open to gene flow. In addition, we have found that external sources of gene flow into North Africa have been carried more often by females than males, while the North African autochthonous component is more frequent in paternally transmitted genome regions. Our results highlight the different demographic history revealed by different markers and urge to be cautious when deriving general conclusions from partial genomic information or from single samples as representatives of the total population of a region.


Y-DNA frequencies


Supplementary Table 2: Y chromosome haplogroup frequencies among the studied populations (% in parentheses)

Population                Algiers(1)    Oran(1)       Reguibate(1)  Zenata(1)     Mozabite(2)   Oran(3)       Algiers(4)    Tizi Ouzou(4)
Abbreviation              ALG1          ORN1          RGB           ZNT           MZB           ORN2          ALG2          TZO
Number of individuals     26            80            60            35            20            102           35            19
A-M91                     -             1 (1.25)      -             -             -             -             -             -
C-M216                    -             1 (1.25)      -             -             -             -             -             -
E1a-M33                   1 (3.84)      -             -             1 (2.86)      -             -             1 (2.86)      -
E1b1a-M2                  -             8 (10)        2 (3.33)      8 (22.86)     2 (10)        8 (7.84)      -             -
E1b1b1a-M78               4 (15.38)     2 (2.50)      -             1 (2.86)      -             6 (5.88)      4 (11.43)     -
E1b1b1b-M81               14 (53.85)    33 (41.25)    48 (80)       17 (48.57)    16 (80)       46 (45.10)    14 (40)       9 (47.37)
E1b1b1-M35                -             3 (3.75)      3 (5)         -             -             -             1 (2.86)      2 (10.53)
E2-M75                    -             1 (1.25)      -             -             -             -             -             -
F-M89 (xJ, K, Q, R1)      2 (7.69)      4 (5)         1 (1.67)      -             -             -             4 (11.43)     2 (10.53)
J-M304 (xJ2)              5 (19.23)     18 (22.50)    6 (10)        4 (11.43)     -             23 (22.55)    8 (22.86)     3 (15.79)
J2-M172                   -             1 (1.25)      -             -             -             5 (4.90)      2 (5.71)      -
K-M9                      -             -             -             -             -             -             1 (2.86)      -
Q-M242                    -             2 (2.50)      -             1 (2.86)      -             1 (0.98)      -             -
R1-M173                   -             6 (7.50)      -             3 (8.57)      2 (10)        13 (12.75)    -             3 (15.79)
Diversity GD (h)          0.6677        0.7674        0.3520        0.7092        0.3579        0.7245        0.7782        0.7427
Diversity GD (sd)         0.0806        0.0356        0.0757        0.0625        0.1266        0.0325        0.0499        0.0831

(1) Present study
(2) Shi et al. 2010
(3) Robino et al. 2008
(4) Arredi et al. 2004



The most common lineage is E1b-M81, which is centered around Morocco and has a mostly NW African distribution. The Reguibate sample (Arabic speakers from near Southern Morocco and West Sahara) shows extremely high frequencies (80%) of it. This is also true of the Mozabites. Otherwise the frequencies range between 40% and 54%.
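The dominance of M81 among the Reguibate and Mozabites is also what pulls their haplogroup diversity down to about 0.35 in the table above. As a minimal sketch, assuming the GD row is Nei's unbiased gene diversity, h = n/(n-1) × (1 − Σ p²) (the table itself does not say so), the value can be recomputed from the haplogroup counts:

```python
def gene_diversity(counts):
    """Nei's unbiased gene diversity from a list of haplogroup counts."""
    n = sum(counts)
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Non-zero haplogroup counts copied from Supplementary Table 2.
y_counts = {
    "ALG1": [1, 4, 14, 2, 5],   # E1a, M78, M81, F, J
    "RGB": [2, 48, 3, 1, 6],    # M2, M81, M35, F, J
    "MZB": [2, 16, 2],          # M2, M81, R1
}

for population, counts in y_counts.items():
    print(f"{population}: h = {gene_diversity(counts):.4f}")
```

Under that assumption the three samples come out at 0.6677, 0.3520 and 0.3579, matching the published row, so the low diversity is simply a reflection of one lineage making up 80% of the sample.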

Tropical African lineages are mostly represented by E1b-M2, which peaks among the Zenata Berbers of the Southern Atlas and Northern Sahara but also has some notable presence in Oran and among the Mozabites (North Sahara) and Reguibate (West Sahara). However, these lineages are nearly absent among the Northeast Kabyle Berbers (Tizi Ouzou) and have only a token presence in Algiers (E1a).

E1b-M78, a lineage centered in NE Africa, seems to peak in Algiers, with low frequencies in Oran, and is effectively absent in the other populations.

J1, presumably the same as J(xJ2), is strongest on the coast (Algiers, Oran) but has significant frequencies in other populations (except the Mozabites).

J2, although quite rare, is worth mentioning because its presence may indicate areas of true Arabic settlement (of course J1 is more common in Arabia but it is unthinkable that one goes without the other in such a recent time frame). It seems that Oran has the strongest such settlement, although some is also apparent in Algiers.

R1 peaks among Kabyles (16%) and is also present in Oran and among the Mozabite and Zenata Berbers. Sadly the study does not analyze what fraction of it is R1b-M412 (Western European) or R1b-V88 (Afro-Mediterranean), as both lineages have been detected in North Africa in previous studies but almost certainly have different histories.

The remaining F is quite intriguing. The few Q and K* individuals are within expectations (at least my expectations), but there are a lot of F* individuals, notably among Kabyles and in Algiers. Are they within haplogroup G or are they something else? G reaches almost 10% in Egypt but previous studies had not found more than 6% in NW Africa (Bouhria Berbers, see here).

Update (Nov 4): Chris makes a very interesting suggestion in the comments section about all this F*: what if it is (partly or in full) haplogroup I, a typical European Y-DNA lineage that is clearly rooted in the Paleolithic of the region? The lineage has been documented in ancient Berbers from the Canary Islands and, according to Chris, also in Sudan. It would make perfect sense if it was also present among modern NW Africans, being consistent with other genetics that seem to originate in Paleolithic Europe (~30% of mtDNA, a good share of autosomal DNA, maybe also part of the Y-DNA R).



Mozabites are close to "pure North Africans"

Autosomal analysis shows that this Berber population of the Algerian Atlas has the lowest range of admixture from any external source, be it Europe, West Asia or Tropical Africa. Some individuals appear extremely unadmixed.


Fig 3. Plots for the analysis of genome-wide SNPs.
PC analysis (upper figures) based on autosomal data, and X-chromosome SNPs. ADMIXTURE proportions (bottom figures) at k = 2,3, and 4 based on autosomal data and X-chromosome SNPs. Algeria, stands for general Algerian sample [3]; Mozabite, stands for the Algerian Berber Mozabites [32]; and Zenata, stands for Algerian Berber Zenata (present study).



X-chromosome conundrum

Genetic studies do not often analyze the X chromosome, probably because its interpretation can be confusing. Intuitively it seems that the X chromosome is passed down through a mostly female line, but this is not really correct: (ignoring partial recombination) a man's single X chromosome can come from either his maternal grandfather or his maternal grandmother, while a woman has one from her father and another from her mother. Ironically, only a woman's father-inherited X chromosome can be automatically traced to a woman two generations back: the paternal grandmother. Complicated, right?
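The inheritance rules just described can be written down as a small sketch that lists every ancestral line along which an X chromosome may have travelled. It ignores recombination, as above, and the function name and the "mother"/"father" path notation are only illustrative choices.

```python
def x_ancestor_lines(sex, generations):
    """List the ancestral paths that can have contributed X-chromosome DNA.

    sex is "M" or "F" for the focal person; each path is a tuple of
    "mother"/"father" steps read from the focal person backwards.
    """
    lines = [((), sex)]
    for _ in range(generations):
        next_lines = []
        for path, current_sex in lines:
            # Everyone receives an X chromosome from the mother.
            next_lines.append((path + ("mother",), "F"))
            # Only women also receive one from the father.
            if current_sex == "F":
                next_lines.append((path + ("father",), "M"))
        lines = next_lines
    return [path for path, _ in lines]


for sex in ("M", "F"):
    print(sex, x_ancestor_lines(sex, 2))
```

For a man, two generations back, the only lines are mother's mother and mother's father. For a woman there is a third one, father's mother, and no line ever passes through two men in a row. That is why the X chromosome leans toward female history without being a strictly maternal marker like mtDNA.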

As is probably apparent in Fig. 3 above, and made clearer in Fig. 4 below, the study detected differences between autosomal (overall) ancestry and X-chromosome ancestry.


Fig 4. Correlation plots of the ancestry proportions at k = 4 in the ADMIXTURE analysis comparing autosomes and X-chromosome SNPs.
North African, sub-Saharan, Middle Eastern, and European ancestry proportions are shown in different plots. Solid black lines represent linear correlations between autosomal and X-chromosome components.


The authors interpret these results as indicating female bias in the European and West Asian components. This may be true at least in the European case because it correlates well with the differential between European mtDNA (~30%) and Y-DNA (<10%), which suggests that European ancestry used to be more important in the past and that male-biased migrations (Capsian culture is probably one of the culprits) altered this. 

But is it also true for the West Asian ancestry? I can't say, really. I remember a study from a decade or so ago (I don't have the reference right now, sorry) showing that in a Colombian coastal town X-chromosome ancestry was almost entirely European, while mtDNA was almost exclusively Native American. The interpretation was a continuous influx of men from Europe marrying local women, who managed to retain the aboriginal mtDNA generation after generation (mtDNA never leaves the strict maternal line) but not the X-chromosome ancestry, which was altered time and again by male immigrants.
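A toy model may help to see how this kind of scenario plays out. The sketch below is not taken from either study: it assumes discrete generations, a constant fraction s of fathers who are newly arrived immigrant men with no local ancestry, and mothers who are always local. It then tracks the share of local ancestry in mtDNA, autosomes and the X chromosome.

```python
def simulate_male_influx(s, generations):
    """Local-ancestry fractions under recurrent male-only immigration.

    s is the (assumed constant) fraction of fathers per generation who are
    immigrants carrying no local ancestry. Returns one dict per generation.
    """
    mt = 1.0             # mtDNA comes only from (local) mothers
    auto = 1.0           # autosomal local ancestry, same in both sexes
    x_f = x_m = 1.0      # X-chromosome local ancestry in women and men
    history = []
    for _ in range(generations):
        local_fathers = 1.0 - s
        # Half of the autosomes come from each parent.
        auto = 0.5 * auto + 0.5 * local_fathers * auto
        # Daughters get one X from each parent, sons only from the mother.
        x_f, x_m = 0.5 * x_f + 0.5 * local_fathers * x_m, x_f
        # Women carry two X chromosomes and men one (equal sex ratio assumed).
        x_pop = (2.0 * x_f + x_m) / 3.0
        history.append({"mtDNA": mt, "autosomal": auto, "X": x_pop})
    return history


for generation, state in enumerate(simulate_male_influx(0.3, 10), start=1):
    print(generation, {k: round(v, 3) for k, v in state.items()})
```

In this toy model mtDNA stays fully local, while X-chromosome ancestry erodes with every generation of male immigration, only more slowly than autosomal ancestry. So a sharp contrast between mtDNA and X is perfectly possible, and it is the X versus autosome comparison, not X alone, that carries the information about sex bias.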

I don't really dare to subscribe to the authors' interpretation without a more nuanced analysis, one that I don't feel able to perform myself at the moment either. If they are correct, anyhow, it means that there were important male-biased demographic expansions of specifically African origin, either in NW Africa itself (which could well be supported by the vigor of E1b-M81) or in NE Africa prior to a westward migration with the Capsian culture. Or both.


Mitochondrial DNA data

In case anyone wants to try their luck at this complicated analysis (North Africans are indeed a complex and most intriguing population), I'm adding here the raw mtDNA table:

Supplementary Table 5: mtDNA haplogroup frequencies (%) distribution among Algerian populations

Populations: ALG = Algiers; ORN1 = Oran; ZNT = Zenata; RGB = Reguibate; ORN2 = Oran (Bekada et al. 2013); MZB = Mozabite (Corte-Real et al. 1996)

Haplogroup          ALG     ORN1    ZNT     RGB     ORN2    MZB
Number of samples   62      93      73      108     240     85
H/HV                19.35   35.48   12.33   30.56   30.83   23.53
HV0                 4.84    2.15    5.48    6.48    3.75    8.24
I                   1.61    -       1.37    -       0.83    -
J (16069 16126)     14.52   3.23    2.74    0.93    3.33    3.53
K (16224 16311)     -       4.30    4.11    3.70    1.67    -
L                   -       -       -       0.93    -       -
L0                  1.61    3.23    1.37    -       0.42    -
L1b                 1.61    2.15    9.59    6.48    3.75    -
L1c                 -       -       1.37    0.93    0.83    -
L2                  -       -       5.48    4.63    0.83    -
L2a                 9.68    5.38    15.07   3.70    5.42    5.88
L2b                 1.61    2.15    5.48    -       0.42    1.18
L2c1                -       -       1.37    -       1.25    -
L2d                 -       -       -       1.85    -       -
L2e                 -       1.08    -       -       -       -
L3b                 1.61    3.23    2.74    3.70    1.67    2.35
L3b/d               -       -       4.11    -       -       1.18
L3d                 -       -       4.11    -       1.25    -
L3e1                1.61    -       -       -       0.42    -
L3e2                4.84    -       5.48    -       0.83    2.35
L3e3                1.61    -       -       -       -       -
L3e5                11.29   -       -       -       0.42    -
L3f                 -       4.30    8.22    3.70    2.08    -
L3h1b1a             1.61    -       1.37    -       -       -
L4b2                -       -       -       -       0.42    -
M1                  3.23    5.38    -       1.85    7.08    4.71
N                   1.61    1.08    -       0.93    0.42    -
R                   -       -       -       0.93    -       -
R0a                 -       -       -       0.93    1.67    -
R0a1a               -       -       -       8.33    -       -
T*                  -       -       -       0.93    1.67    -
T1a                 1.61    2.15    2.74    -       3.33    4.71
T2                  -       1.08    -       0.93    0.42    -
T2b                 -       -       2.74    -       2.92    -
T2c                 -       -       -       -       0.83    -
U                   -       1.08    -       0.93    0.42    -
U1                  -       1.08    -       0.93    0.83    -
U3                  -       1.08    -       -       1.25    10.59
U4                  1.61    -       -       -       1.67    1.18
U5                  -       -       -       -       0.42    -
U5a                 1.61    3.23    -       -       1.67    -
U5b                 1.61    1.08    -       2.78    0.42    -
U6a                 -       4.30    -       7.41    6.67    -
U6a1a               -       1.08    -       -       -       12.94
U6a1a1              -       3.23    -       3.70    -       14.12
U6a1b               -       1.08    -       -       -       1.18
U6a5                -       -       -       -       0.83    -
U6c                 -       -       1.37    -       0.83    -
U8b1                -       1.08    -       -       -       2.35
V                   -       -       -       -       3.75    -
V7a                 -       1.08    -       1.85    -       -
W                   3.23    1.08    -       -       1.25    -
X                   8.06    2.15    -       -       -       -
X2                  -       1.08    1.37    -       1.25    -
Diversity (h)       0.9175  0.8630  0.9376  0.8823  0.8853  0.8891
Diversity (sd)      0.0174  0.0325  0.0117  0.0236  0.0166  0.0169
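As a possible starting point for that analysis, here is a small sketch that adds up the L rows of the table per population. Reading the L lineages (excluding M and N) as the Tropical African share of the maternal pool is an assumption of this sketch, and the grouping can be swapped for any other set of rows.

```python
# L rows copied from Supplementary Table 5; "-" means not observed.
L_ROWS = """
L - - - 0.93 - -
L0 1.61 3.23 1.37 - 0.42 -
L1b 1.61 2.15 9.59 6.48 3.75 -
L1c - - 1.37 0.93 0.83 -
L2 - - 5.48 4.63 0.83 -
L2a 9.68 5.38 15.07 3.70 5.42 5.88
L2b 1.61 2.15 5.48 - 0.42 1.18
L2c1 - - 1.37 - 1.25 -
L2d - - - 1.85 - -
L2e - 1.08 - - - -
L3b 1.61 3.23 2.74 3.70 1.67 2.35
L3b/d - - 4.11 - - 1.18
L3d - - 4.11 - 1.25 -
L3e1 1.61 - - - 0.42 -
L3e2 4.84 - 5.48 - 0.83 2.35
L3e3 1.61 - - - - -
L3e5 11.29 - - - 0.42 -
L3f - 4.30 8.22 3.70 2.08 -
L3h1b1a 1.61 - 1.37 - - -
L4b2 - - - - 0.42 -
"""

POPULATIONS = ["ALG", "ORN1", "ZNT", "RGB", "ORN2", "MZB"]


def sum_by_population(rows_text, populations):
    """Sum the percentage columns of whitespace-separated table rows."""
    totals = dict.fromkeys(populations, 0.0)
    for line in rows_text.strip().splitlines():
        fields = line.split()
        # fields[0] is the haplogroup name, the rest are the six columns.
        for population, value in zip(populations, fields[1:]):
            if value != "-":
                totals[population] += float(value)
    return totals


l_totals = sum_by_population(L_ROWS, POPULATIONS)
for population in POPULATIONS:
    print(f"{population}: {l_totals[population]:.2f}% L lineages")
```

On these rows the L share ranges from about 13% among the Mozabites to about 66% among the Zenata, with Algiers at about 37%, which already hints at how uneven the maternal side is compared with the Y-DNA picture.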
 


Good luck (and let me know if you have any ideas).