November 15, 2015

Minor issue with comments

For a week or so the comments feed, on which the list of latest comments in the right margin of this blog depends, had been glitched.

When I checked yesterday I could not get the gadget to work again. However, looking around, I found three comments awaiting approval, which should not happen because no comment pre-moderation is in place right now. After I approved them all, the gadget seems to be working again, so I'm guessing that those wrongly queued comments were causing a glitch in the feed.

My apologies for any inconvenience.

November 2, 2015

Algerian complex genetics


This is a rather interesting study that deals with the genetics of the Republic of Algeria, with several new samples.


Asmahan Bekada, Lara R. Arauna et al. Genetic Heterogeneity in Algerian Human Populations. PLoS ONE 2015. Open access: LINK [doi:10.1371/journal.pone.0138453]

Abstract

The demographic history of human populations in North Africa has been characterized by complex processes of admixture and isolation that have modeled its current gene pool. Diverse genetic ancestral components with different origins (autochthonous, European, Middle Eastern, and sub-Saharan) and genetic heterogeneity in the region have been described. In this complex genetic landscape, Algeria, the largest country in Africa, has been poorly covered, with most of the studies using a single Algerian sample. In order to evaluate the genetic heterogeneity of Algeria, Y-chromosome, mtDNA and autosomal genome-wide makers have been analyzed in several Berber- and Arab-speaking groups. Our results show that the genetic heterogeneity found in Algeria is not correlated with geography or linguistics, challenging the idea of Berber groups being genetically isolated and Arab groups open to gene flow. In addition, we have found that external sources of gene flow into North Africa have been carried more often by females than males, while the North African autochthonous component is more frequent in paternally transmitted genome regions. Our results highlight the different demographic history revealed by different markers and urge to be cautious when deriving general conclusions from partial genomic information or from single samples as representatives of the total population of a region.


Y-DNA frequencies


Supplementary Table 2: Y chromosome haplogroup frequencies among the studied populations (% in parentheses)

| Population | Algiers [1] | Oran [1] | Reguibate [1] | Zenata [1] | Mozabite [2] | Oran [3] | Algiers [4] | Tizi Ouzou [4] |
|---|---|---|---|---|---|---|---|---|
| Abbreviation | ALG1 | ORN1 | RGB | ZNT | MZB | ORN2 | ALG2 | TZO |
| Number of individuals | 26 | 80 | 60 | 35 | 20 | 102 | 35 | 19 |
| A-M91 | - | 1 (1.25) | - | - | - | - | - | - |
| C-M216 | - | 1 (1.25) | - | - | - | - | - | - |
| E1a-M33 | 1 (3.84) | - | - | 1 (2.86) | - | - | 1 (2.86) | - |
| E1b1a-M2 | - | 8 (10) | 2 (3.33) | 8 (22.86) | 2 (10) | 8 (7.84) | - | - |
| E1b1b1a-M78 | 4 (15.38) | 2 (2.50) | - | 1 (2.86) | - | 6 (5.88) | 4 (11.43) | - |
| E1b1b1b-M81 | 14 (53.85) | 33 (41.25) | 48 (80) | 17 (48.57) | 16 (80) | 46 (45.10) | 14 (40) | 9 (47.37) |
| E1b1b1-M35 | - | 3 (3.75) | 3 (5) | - | - | - | 1 (2.86) | 2 (10.53) |
| E2-M75 | - | 1 (1.25) | - | - | - | - | - | - |
| F-M89 (xJ, K, Q, R1) | 2 (7.69) | 4 (5) | 1 (1.67) | - | - | - | 4 (11.43) | 2 (10.53) |
| J-M304 (xJ2) | 5 (19.23) | 18 (22.50) | 6 (10) | 4 (11.43) | - | 23 (22.55) | 8 (22.86) | 3 (15.79) |
| J2-M172 | - | 1 (1.25) | - | - | - | 5 (4.90) | 2 (5.71) | - |
| K-M9 | - | - | - | - | - | - | 1 (2.86) | - |
| Q-M242 | - | 2 (2.50) | - | 1 (2.86) | - | 1 (0.98) | - | - |
| R1-M173 | - | 6 (7.50) | - | 3 (8.57) | 2 (10) | 13 (12.75) | - | 3 (15.79) |
| Y Haplogroup Diversity GD (h +/- sd) | 0.6677 +/- 0.0806 | 0.7674 +/- 0.0356 | 0.3520 +/- 0.0757 | 0.7092 +/- 0.0625 | 0.3579 +/- 0.1266 | 0.7245 +/- 0.0325 | 0.7782 +/- 0.0499 | 0.7427 +/- 0.0831 |

[1] Present study
[2] Shi et al. 2010
[3] Robino et al. 2008
[4] Arredi et al. 2004

The most common lineage is E1b-M81, which is centered around Morocco and has a mostly NW African distribution. The Reguibate sample (Arabic speakers from near Southern Morocco and Western Sahara) shows extremely high frequencies (80%) of it. This is also true of the Mozabites. Otherwise the frequencies range between 40% and 54%.

Tropical African lineages are mostly represented by E1b-M2, which peaks among the Zenata Berbers of the Southern Atlas and Northern Sahara but also has some notable presence in Oran and among the Mozabites (North Sahara) and Reguibate (Western Sahara). However these lineages are absent from the Northeast Kabyle Berbers (Tizi Ouzou) and only have a token presence in Algiers (E1a).

E1b-M78, a lineage centered in NE Africa, seems to peak in Algiers, with low frequencies in Oran, and is effectively absent in the other populations.

J1, presumably the same as J(xJ2), is strongest on the coast (Algiers, Oran) but has significant frequencies in other populations (except the Mozabites).
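As a side check, the diversity row of the table looks consistent with Nei's unbiased gene diversity, h = n/(n-1) * (1 - sum of squared frequencies), computed from the haplogroup counts. This is my assumption, as the formula is not quoted here from the paper. A minimal Python sketch, using the Reguibate and Mozabite counts from Supplementary Table 2:

```python
def gene_diversity(counts):
    """Nei's unbiased gene diversity from a list of haplogroup counts."""
    n = sum(counts)
    # Sum of squared haplogroup frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1 - sum_sq)

# Non-zero haplogroup counts taken from the table above.
reguibate = [2, 48, 3, 1, 6]   # E1b1a-M2, E1b1b1b-M81, E1b1b1-M35, F-M89, J-M304
mozabite = [2, 16, 2]          # E1b1a-M2, E1b1b1b-M81, R1-M173

print(round(gene_diversity(reguibate), 4))
print(round(gene_diversity(mozabite), 4))
```

Both values round to the figures listed in the table (0.3520 and 0.3579), which illustrates how the dominance of E1b-M81 in these two samples drags their diversity down.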

J2, although quite rare, is worth mentioning because its presence may indicate areas of true Arabic settlement (of course J1 is more common in Arabia but it is unthinkable that one goes without the other in such a recent time frame). It seems that Oran has the strongest such settlement, although some is also apparent in Algiers.

R1 peaks among Kabyles (16%) and is also present in Oran and among the Mozabite and Zenata Berbers. Sadly it is not analyzed what fraction of it is R1b-M412 (Western European) or R1b-V88 (Afro-Mediterranean), as both lineages have been detected in North Africa in previous studies but almost certainly have different histories. 

The "other F" category is quite intriguing. The few Q and K* individuals are within expectations (at least my expectations) but there are a lot of F* people, notably among Kabyles and in Algiers. Are they within haplogroup G or is it something else? G reaches almost 10% in Egypt but previous studies had not found more than 6% in NW Africa (Bouhria Berbers, see here).

Update (Nov 4): Chris makes a very interesting suggestion in the comments section about all this F*: what if it is (partly or in full) haplogroup I, a typical European Y-DNA lineage that is clearly rooted in the Paleolithic of the region? The lineage has been documented in ancient Berbers from the Canary Islands and, according to Chris, also in Sudan. It would make perfect sense if it was also present among modern NW Africans, being consistent with other genetics that seem to originate in Paleolithic Europe (~30% of mtDNA, a good share of autosomal DNA, maybe also part of the Y-DNA R).



Mozabites are close to "pure North Africans"

Autosomal analysis shows that this Berber population of the Algerian Atlas has the lowest range of admixture from any external source, be it Europe, West Asia or Tropical Africa. Some individuals appear extremely unadmixed.


Fig 3. Plots for the analysis of genome-wide SNPs.
PC analysis (upper figures) based on autosomal data, and X-chromosome SNPs. ADMIXTURE proportions (bottom figures) at k = 2,3, and 4 based on autosomal data and X-chromosome SNPs. Algeria, stands for general Algerian sample [3]; Mozabite, stands for the Algerian Berber Mozabites [32]; and Zenata, stands for Algerian Berber Zenata (present study).



X-chromosome conundrum

It is not common for genetic studies to analyze the X chromosome. One reason is probably that its interpretation can be confusing. Intuitively it seems that the X chromosome is passed down a mostly female line, but this is not really correct: ignoring partial recombination, a man can have an X chromosome from either the maternal grandfather or the maternal grandmother, while a woman will have one from her father and another from her mother. Ironically, only a woman's father-inherited X chromosome can be automatically traced to a woman two generations back: the paternal grandmother. Complicated, right?

As is probably apparent in fig. 3 above, and made clearer in fig. 4 below, the study detected differences between autosomal (overall) ancestry and X-chromosome ancestry.
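The inheritance rule can be sketched in a few lines of Python. The function name and the path notation are my own choices for illustration: "MF" reads as "mother's father", and recombination is ignored, so the output lists the ancestors who can contribute X-chromosome DNA, not how much each one contributes.

```python
def x_ancestors(sex, generations):
    """List ancestor paths that can contribute X-chromosome DNA.

    Paths read from the individual outwards: 'MF' = mother's father.
    """
    paths = [("", sex)]
    for _ in range(generations):
        next_paths = []
        for path, person_sex in paths:
            # Everyone receives an X chromosome from their mother.
            next_paths.append((path + "M", "female"))
            # Only daughters receive an X chromosome from their father.
            if person_sex == "female":
                next_paths.append((path + "F", "male"))
        paths = next_paths
    return sorted(path for path, _ in paths)

print(x_ancestors("male", 2))    # ['MF', 'MM']
print(x_ancestors("female", 2))  # ['FM', 'MF', 'MM']
```

A man's X can come from his maternal grandfather or maternal grandmother, and a woman additionally has the paternal grandmother ('FM') as a guaranteed source, while the paternal grandfather never appears.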


Fig 4. Correlation plots of the ancestry proportions at k = 4 in the ADMIXTURE analysis comparing autosomes and X-chromosome SNPs.
North African, sub-Saharan, Middle Eastern, and European ancestry proportions are shown in different plots. Solid black lines represent linear correlations between autosomal and X-chromosome components.


The authors interpret these results as indicating female bias in the European and West Asian components. This may be true at least in the European case because it correlates well with the differential between European mtDNA (~30%) and Y-DNA (<10%), which suggests that European ancestry used to be more important in the past and that male-biased migrations (Capsian culture is probably one of the culprits) altered this. 
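To see why an excess of a component on the X chromosome relative to the autosomes hints at female bias, here is a rough Python sketch. It uses a common simplification, not the authors' ADMIXTURE analysis: women carry two thirds of the X chromosomes in a population but only half of the autosomes, so a source population contributing a fraction `female_share` of the mothers and `male_share` of the fathers is expected to leave roughly (2f + m)/3 on the X and (f + m)/2 on the autosomes. The input values are purely illustrative assumptions, loosely echoing the mtDNA and Y-DNA figures mentioned above.

```python
def expected_ancestry(female_share, male_share):
    """Approximate long-run ancestry from a source population.

    female_share / male_share: fraction of mothers / fathers coming from
    the source population (hypothetical inputs, ignoring drift and selection).
    """
    autosomal = (female_share + male_share) / 2
    x_chromosome = (2 * female_share + male_share) / 3
    return autosomal, x_chromosome

# Illustrative female-biased contribution: 30% of mothers, 10% of fathers.
autosomal, x_chromosome = expected_ancestry(0.30, 0.10)
print(f"autosomal: {autosomal:.3f}, X chromosome: {x_chromosome:.3f}")
```

Under these assumptions the X-chromosome share exceeds the autosomal one whenever the female contribution is larger than the male one, and the reverse holds for male-biased contributions.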

But is it also true for the West Asian ancestry? I can't say, really. I remember a study from a decade or so ago (I don't have the reference right now, sorry) showing that in a Colombian coastal town X-chromosome ancestry was almost entirely European, while mtDNA was almost exclusively Native American. The interpretation was a continuous influx of men from Europe marrying local women, who retained the aboriginal mtDNA generation after generation (mtDNA never leaves the strict maternal line) but not the X-chromosome ancestry, which was altered time and again by male immigrants.

I don't really dare to subscribe to the authors' interpretation without a more nuanced analysis, one that I don't feel able to perform myself at the moment either. If they are correct, anyhow, it means that there were important male-biased demographic expansions of specifically African origin, either in NW Africa itself (which could well be supported by the vigor of E1b-M81) or in NE Africa prior to migration to the West within Capsian. Or both.
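The Colombian scenario can be mimicked with a toy Python simulation. Everything here is an assumption made for illustration: all founding women are local, a fixed share of the fathers in every generation are European immigrants (100% by default), and drift and recombination details are ignored. Since mtDNA passes only from mothers, and all mothers descend from local women, the native mtDNA share stays at 100% throughout.

```python
def simulate_male_influx(generations, immigrant_father_share=1.0):
    """Track the European share of X-chromosome ancestry under male immigration.

    Returns a list of (x_share_in_women, x_share_in_local_men) per generation.
    """
    x_women = 0.0  # European X share among local women
    x_men = 0.0    # European X share among locally born men
    history = []
    for _ in range(generations):
        # Fathers are a mix of immigrants (fully European X) and local men.
        x_fathers = immigrant_father_share + (1 - immigrant_father_share) * x_men
        # Daughters get one X from the father and one from the mother.
        new_x_women = 0.5 * (x_fathers + x_women)
        # Sons get their single X from the mother.
        new_x_men = x_women
        x_women, x_men = new_x_women, new_x_men
        history.append((x_women, x_men))
    return history

for generation, (x_women, _) in enumerate(simulate_male_influx(5), start=1):
    print(generation, x_women)
```

With every father an immigrant, the European X share among women halves its distance to 100% each generation, so after a handful of generations the X chromosome looks almost fully European even though the maternal line never changed.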


Mitochondrial DNA data

In case anyone wants to try their luck at this complicated analysis (North Africans are indeed a complex and most intriguing population), I'm adding here the raw mtDNA table:

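Supplementary Table 5: mtDNA haplogroup frequencies (%) distribution among Algerian populations

| Populations | Algiers | Oran | Zenata | Reguibate | Oran (Bekada et al. 2013) | Mozabite (Corte-Real et al. 1996) |
|---|---|---|---|---|---|---|
| Abbreviation | ALG | ORN1 | ZNT | RGB | ORN2 | MZB |
| Number of samples | 62 | 93 | 73 | 108 | 240 | 85 |
| H/HV | 19.35 | 35.48 | 12.33 | 30.56 | 30.83 | 23.53 |
| HV0 | 4.84 | 2.15 | 5.48 | 6.48 | 3.75 | 8.24 |
| I | 1.61 | - | 1.37 | - | 0.83 | - |
| J (16069 16126) | 14.52 | 3.23 | 2.74 | 0.93 | 3.33 | 3.53 |
| K (16224 16311) | - | 4.30 | 4.11 | 3.70 | 1.67 | - |
| L | - | - | - | 0.93 | - | - |
| L0 | 1.61 | 3.23 | 1.37 | - | 0.42 | - |
| L1b | 1.61 | 2.15 | 9.59 | 6.48 | 3.75 | - |
| L1c | - | - | 1.37 | 0.93 | 0.83 | - |
| L2 | - | - | 5.48 | 4.63 | 0.83 | - |
| L2a | 9.68 | 5.38 | 15.07 | 3.70 | 5.42 | 5.88 |
| L2b | 1.61 | 2.15 | 5.48 | - | 0.42 | 1.18 |
| L2c1 | - | - | 1.37 | - | 1.25 | - |
| L2d | - | - | - | 1.85 | - | - |
| L2e | - | 1.08 | - | - | - | - |
| L3b | 1.61 | 3.23 | 2.74 | 3.70 | 1.67 | 2.35 |
| L3b/d | - | - | 4.11 | - | - | 1.18 |
| L3d | - | - | 4.11 | - | 1.25 | - |
| L3e1 | 1.61 | - | - | - | 0.42 | - |
| L3e2 | 4.84 | - | 5.48 | - | 0.83 | 2.35 |
| L3e3 | 1.61 | - | - | - | - | - |
| L3e5 | 11.29 | - | - | - | 0.42 | - |
| L3f | - | 4.30 | 8.22 | 3.70 | 2.08 | - |
| L3h1b1a | 1.61 | - | 1.37 | - | - | - |
| L4b2 | - | - | - | - | 0.42 | - |
| M1 | 3.23 | 5.38 | - | 1.85 | 7.08 | 4.71 |
| N | 1.61 | 1.08 | - | 0.93 | 0.42 | - |
| R | - | - | - | 0.93 | - | - |
| R0a | - | - | - | 0.93 | 1.67 | - |
| R0a1a | - | - | - | 8.33 | - | - |
| T* | - | - | - | 0.93 | 1.67 | - |
| T1a | 1.61 | 2.15 | 2.74 | - | 3.33 | 4.71 |
| T2 | - | 1.08 | - | 0.93 | 0.42 | - |
| T2b | - | - | 2.74 | - | 2.92 | - |
| T2c | - | - | - | - | 0.83 | - |
| U | - | 1.08 | - | 0.93 | 0.42 | - |
| U1 | - | 1.08 | - | 0.93 | 0.83 | - |
| U3 | - | 1.08 | - | - | 1.25 | 10.59 |
| U4 | 1.61 | - | - | - | 1.67 | 1.18 |
| U5 | - | - | - | - | 0.42 | - |
| U5a | 1.61 | 3.23 | - | - | 1.67 | - |
| U5b | 1.61 | 1.08 | - | 2.78 | 0.42 | - |
| U6a | - | 4.30 | - | 7.41 | 6.67 | - |
| U6a1a | - | 1.08 | - | - | - | 12.94 |
| U6a1a1 | - | 3.23 | - | 3.70 | - | 14.12 |
| U6a1b | - | 1.08 | - | - | - | 1.18 |
| U6a5 | - | - | - | - | 0.83 | - |
| U6c | - | - | 1.37 | - | 0.83 | - |
| U8b1 | - | 1.08 | - | - | - | 2.35 |
| V | - | - | - | - | 3.75 | - |
| V7a | - | 1.08 | - | 1.85 | - | - |
| W | 3.23 | 1.08 | - | - | 1.25 | - |
| X | 8.06 | 2.15 | - | - | - | - |
| X2 | - | 1.08 | 1.37 | - | 1.25 | - |
| mtDNA haplogroup diversity (h +/- sd) | 0.9175 +/- 0.0174 | 0.8630 +/- 0.0325 | 0.9376 +/- 0.0117 | 0.8823 +/- 0.0236 | 0.8853 +/- 0.0166 | 0.8891 +/- 0.0169 |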

Good luck (and send me feedback if you have any ideas).

Selection against Neanderthal introgression?

Quickies

A couple of papers have been pre-published these days discussing the apparent selection against most (but not all) of the Neanderthal inheritance among modern ex-Africa humans.

Ivan Juric, Simon Aeschbacher & Graham Coop, The Strength of Selection Against Neanderthal Introgression. BioRxiv 2015 (pre-pub). Freely accessible: LINK [doi: http://dx.doi.org/10.1101/030148]

Abstract

Hybridization between humans and Neanderthals has resulted in a low level of Neanderthal ancestry scattered across the genomes of many modern-day humans. After hybridization, on average, selection appears to have removed Neanderthal alleles from the human population. Quantifying the strength and causes of this selection against Neanderthal ancestry is key to understanding our relationship to Neanderthals and, more broadly, how populations remain distinct after secondary contact. Here, we develop a novel method for estimating the genome-wide average strength of selection and the density of selected sites using estimates of Neanderthal allele frequency along the genomes of modern-day humans. We confirm that East Asians had somewhat higher initial levels of Neanderthal ancestry than Europeans even after accounting for selection. We find that there are systematically lower levels of initial introgression on the X chromosome, a finding consistent with a strong sex bias in the initial matings between the populations. We find that the bulk of purifying selection against Neanderthal ancestry is best understood as acting on many weakly deleterious alleles. We propose that the majority of these alleles were effectively neutral (and segregating at high frequency) in Neanderthals, but became selected against after entering human populations of much larger effective size. While individually of small effect, these alleles potentially imposed a heavy genetic load on the early-generation human-Neanderthal hybrids. This work suggests that differences in effective population size may play a far more important role in shaping levels of introgression than previously thought.
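The core idea in this abstract, that the same weakly deleterious allele can be effectively neutral in a small population but visible to selection in a larger one, can be illustrated with a common rule of thumb: selection is swamped by drift when |2·Ne·s| is below about 1. The Python sketch below is only that rule of thumb, not the method of the paper, and the effective sizes and selection coefficient are hypothetical values chosen for illustration.

```python
def is_effectively_neutral(selection_coefficient, effective_size):
    """Rule of thumb: drift dominates when |2 * Ne * s| < 1."""
    return abs(2 * effective_size * selection_coefficient) < 1

# Hypothetical values, for illustration only.
s = -2e-4                 # weakly deleterious allele
neanderthal_ne = 1_000    # assumed small effective population size
human_ne = 10_000         # assumed larger effective population size

print(is_effectively_neutral(s, neanderthal_ne))  # True: drift dominates
print(is_effectively_neutral(s, human_ne))        # False: selection can act
```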


Kelley Harris & Rasmus Nielsen, The Genetic Cost of Neanderthal Introgression. BioRxiv 2015 (pre-pub). Freely accessible: LINK [doi: http://dx.doi.org/10.1101/030387]

Abstract

Approximately 2-4% of the human genome is in non-Africans comprised of DNA introgressed from Neanderthals. Recent studies have shown that there is a paucity of introgressed DNA around functional regions, presumably caused by selection after introgression. This observation has been suggested to be a possible consequence of the accumulation of a large amount of Dobzhansky-Muller incompatibilities, i.e. epistatic effects between human and Neanderthal specific mutations, since the divergence of humans and Neanderthals approx. 400-600 kya. However, using previously published estimates of inbreeding in Neanderthals, and of the distribution of fitness effects from human protein coding genes, we show that the average Neanderthal would have had at least 40% lower fitness than the average human due to higher levels of inbreeding and an increased mutational load, regardless of the dominance coefficients of new mutations. Using simulations, we show that under the assumption of additive dominance effects, early Neanderthal/human hybrids would have experienced strong negative selection, though not so strong that it would prevent Neanderthal DNA from entering the human population. In fact, the increased mutational load in Neanderthals predicts the observed reduction in Neanderthal introgressed segments around protein coding genes, without any need to invoke epistasis. The simulations also predict that there is a residual Neanderthal derived mutational load in non-African humans, leading to an average fitness reduction of at least 0.5%. Although there has been much previous debate about the effects of the out-of-Africa bottleneck on mutational loads in non-Africans, the significant deleterious effects of Neanderthal introgression have hitherto been left out of this discussion, but might be just as important for understanding fitness differences among human populations.
We also show that if deleterious mutations are recessive, the Neanderthal admixture fraction would gradually increase over time due to selection for Neanderthal haplotypes that mask human deleterious mutations in the heterozygous state. This effect of dominance heterosis might partially explain why adaptive introgression appears to be widespread in nature.