January 14, 2012

North African autosomal genetics (again)

Two weeks ago I performed almost the same exercise here on this blog, but now a paper with the academic seal of approval has been published in PLoS Genetics:

The basic results are very similar, if not outright identical, to what I obtained. That is because the sampling strategy was also very similar. If anything, Henn and colleagues used a broader sample of Tropical African populations.

From Fig. 1

The choice of North African populations is exactly the same as mine, except that I also included 10 HGDP Mozabites (actually the samples I used are from a previous paper by Brenna Henn, who is doing a great job exploring African genetics, and must be the same as in this paper). The choice of West Eurasian populations is different but does not seem to produce any relevant difference in the result. The main differences are in the choice of Tropical African samples, which does produce some differences.

Tunisian Berbers' endogamy

But first of all let's explain what happens with the weird Tunisia sample:

... the Tunisian Berber population displayed an excess of pairs of individuals sharing 200–1200 cM IBD. This bimodal distribution indicates that many 1st and 2nd cousin genetic equivalent pairs were present in this sample, even though donors declared themselves to be unrelated during the sampling process. Analysis of long runs of homozygosity (ROH) indicate that the Tunisian population averaged almost twice as much of their genome is in ROH than other North African populations, 230 Kb versus 120 Kb respectively (Figure S3). The pattern of ROH and pairwise IBD in the Tunisian Berbers is likely the result of endogamy due to geographic isolation or cultural marriage preferences.

That's why they behave so oddly, always clustering with themselves before showing any other affinity. This makes them a poor and mostly uninformative sample.
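To make the ROH idea concrete, here is a minimal sketch of how runs of homozygosity can be detected in one individual's genotypes. It is only an illustration and not the method used in the paper: the function names, the thresholds and the toy data are all hypothetical, and real tools (PLINK's --homozyg option, for instance) use sliding windows and tolerate a few heterozygous or missing calls, which this sketch does not.

```python
from typing import List, Tuple


def find_roh(positions: List[int], genotypes: List[int],
             min_snps: int = 3, min_length_bp: int = 1000) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run of homozygous calls.

    genotypes holds the count of alternate alleles per SNP (0, 1 or 2);
    0 and 2 are homozygous, 1 is heterozygous. Thresholds are hypothetical.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")

    runs = []
    start = None  # index of the first SNP of the run currently being extended
    for i, g in enumerate(genotypes):
        homozygous = g in (0, 2)
        if homozygous and start is None:
            start = i
        elif not homozygous and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(genotypes) - 1))

    result = []
    for first, last in runs:
        n_snps = last - first + 1
        length_bp = positions[last] - positions[first]
        # Keep only runs that are long enough in both SNP count and base pairs
        if n_snps >= min_snps and length_bp >= min_length_bp:
            result.append((positions[first], positions[last], n_snps))
    return result


def total_roh_length(roh: List[Tuple[int, int, int]]) -> int:
    """Sum of the lengths (in bp) of all detected runs."""
    return sum(end - start for start, end, _ in roh)


# Toy data (made up): eight SNPs on one chromosome
toy_positions = [100, 600, 1500, 2000, 2600, 4000, 5200, 5300]
toy_genotypes = [0, 2, 2, 1, 0, 0, 2, 1]
toy_roh = find_roh(toy_positions, toy_genotypes)
print(toy_roh, total_roh_length(toy_roh))
```

An endogamous sample like the Tunisian Berbers would be expected to show a larger total from such a scan, averaged over individuals, than a less isolated sample.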

Different sampling strategy, somewhat different results

As I say, the main difference between my sampling strategy and Henn's (and the results produced) is in Tropical Africa: I used Ethiopians, Mandenka and Fulani; Henn also used Fulani, plus West and East Africans, but different ones. I found apparent Mandenka (West African) admixture in NW Africa (and almost no East African/Ethiopian admixture), while Henn found apparent Luhya (East African) admixture instead (and almost no West African admixture).

I don't know yet how these two apparently different findings can be reconciled, but the difference is in itself interesting and outlines a strategy for future research.

Something that is quite obvious, as Henn uses such a broad Tropical African sample, is that many of the components discerned are simply one for each Tropical African population. This is interesting, underlining the immense genetic diversity of Africa, but not very informative in regard to North Africans.

That (and the fact that I went down to K=11, and even K=12, although this one was uninformative) is probably the reason why I, with a smaller peripheral sample, could detect what I believe is a very old layer of North African specificity.

So the sampling strategy (both the choice of samples and the number of individuals in each sample), as well as the K-depth reached, can affect the results of these analyses of population genetic structure. There is almost always a different approach that can produce complementary information as a result.

Back to Africa but when?

But overall the results are very similar: the bulk of North African genetic affinity is with West Eurasia and not so much with Africa as such. That is the most obvious result and indicates that the Out-of-Africa migration had an important backflow which affected several parts of Africa but most especially the North.

Nothing unexpected, at least for me. But it really deals a blow to those who, quite lightly, equate Y-DNA with overall ancestry: if there is a contradiction between Y-DNA and mtDNA, as is the case in North Africa, where most of the patrilineages are African but most of the matrilineages are of Eurasian origin, in most cases the mtDNA is right and the Y-DNA is nothing but a varnish.

The main exceptions seem to be areas of sustained male inflow across generations, notably some parts of Latin America. But this kind of sustained industrialized migration pattern is unlikely to have ever existed in prehistory or even pre-Modern history anywhere. 

Anyhow, the authors carry out an interesting exercise to estimate the time of Eurasian arrival. The result comes with reasonable caveats:
Since this model neglects migration, we expect our results to form a lower bound on the population divergence time, as similar levels of population divergence would require a longer separation in the presence of migration.

So at least this old:

Fig. 3 (edited to correct a color typo)

Although these divergence time estimates may not be precise, as they do not adequately model ancient migration, they do suggest that the population divergence between the ancestral Maghrebi population and neighboring Mediterranean populations occurred at least 12,000 ya and indeed more likely predated even the Last Glacial Maximum.
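The lower-bound reasoning can be illustrated with the textbook relation for two populations that split and then drift apart without migration, Fst ≈ 1 − exp(−T / 2Ne), which gives T = −2Ne · ln(1 − Fst) in generations. This is a sketch under that assumption only: I am not claiming it is the exact model or parametrization used in the paper, and the Fst, Ne and generation-length values below are made up for illustration.

```python
import math


def divergence_time_generations(fst: float, ne: float) -> float:
    """Divergence time in generations under pure drift (no migration).

    Inverts Fst = 1 - exp(-T / (2 * Ne)). Both the model and the
    parameter values passed in are illustrative assumptions.
    """
    if not 0.0 <= fst < 1.0:
        raise ValueError("fst must be in [0, 1)")
    if ne <= 0:
        raise ValueError("ne must be positive")
    return -2.0 * ne * math.log(1.0 - fst)


def generations_to_years(generations: float, years_per_generation: float = 30.0) -> float:
    """Convert generations to years; the generation length is an assumption."""
    return generations * years_per_generation


# Hypothetical inputs, not taken from the paper
example_fst = 0.05
example_ne = 5000
t_gen = divergence_time_generations(example_fst, example_ne)
print(round(t_gen), "generations,", round(generations_to_years(t_gen)), "years")
```

Migration after the split keeps Fst lower than drift alone would, so the same observed Fst implies a longer real separation. That is why an estimate of this kind can only be read as a minimum age.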

It is interesting anyhow that the Fst distances to Europe are lower than those to Arabia, and that the window for a possible migration from Europe fits well with the Oranian genesis c. 22,000 years ago, which I am pretty sure is related to the Iberian Gravetto-Solutrean. Oranian was called Iberomaurusian back in the day for a reason and, regardless of revisionisms, the Oranian dates for the West are considerably older than those for the East. Never mind that at least 25% of North African mtDNA (and maybe 10% of Y-DNA) is of European (and most likely Iberian) derivation, and that European affinity remains apparent, distinct and important (especially in Algeria and North Morocco) even after North African specific components have become obvious and dominant.

So the old theory of migration from Iberia c. 22,000 years ago (maybe with some backmigration northwards as well) is not a colonial construct but something most probable, as both archaeological and genetic data indicate very consistently.

Less clear is whether there was a previous West Eurasian flow c. 40,000 years ago with the Dabban industries (so far only known in Libya, although unmistakably "Aurignacoid" in character). Genetically the main support for this first Eurasian backflow is mtDNA U6 (derived from Eurasian U), whose origin is probably in Morocco, where most of the basal diversity accumulates (and then around it, in the Canary Islands and Iberia).

It is not impossible that the lineage might have arrived, still as undefined U*, via Europe; however, the structure of the autosomal DNA, as illustrated in this study or in my exercise from December, shows that the North African specific components (except the "Aterian" one) are most akin to West Eurasia (by far). So there was probably a first migration from West Asia c. 40 Ka and then an Iberian layer overlapped.

This paper also suggests more recent migrations from Tropical Africa, although I am unsure whether I can take their timeline conclusions at face value, especially regarding the East African component, which I imagine to be much older (they suggest, in Table 1, just 25 generations ago, which I find very hard to believe for such a notable impact).

I would personally conclude that the North African genetic composition appears to be made up of:
  1. A deep, quite diluted, 'Aterian' layer
  2. A dominant North African specific layer of mostly West Eurasian roots
  3. An Iberian or European layer (Oranian)
  4. Maybe an East African (Capsian) layer (needs clarification but agrees with Y-DNA)
  5. Maybe a lesser Arab layer and also maybe some recent Tropical African input


  1. the main issue is that "european" and "west asian" are both compounds. especially for "west asian" lots of genome analysis suggests a strong difference between a northern west asian branch, centered in the south caucasus and eastern anatolia, and an arabian branch, centered in arabia. in terms of Fst the northern branch is somewhat closer to northern europeans than to arabians (though these groups may all be mixed in various ways, so it might not describe a real phylogeny). to make a long story short i think that in terms of Fst from the 'maghrebi' it would really be west asian < european < southwest asian.

    i will probably do this analysis at some point by merging behar, hgdp, and henn et al.

  2. "if there is Y-DNA and mtDNA contradiction, as is the case in North Africa, where most of the patrilineages are African but most of the matrilineages are of Eurasian origin, in most cases the mtDNA is right and the Y-DNA is nothing but a varnish".

    If Y-DNA E was part of that back-migration the above comment would not hold. The mt-DNA and Y-DNA would have moved in together. But on the whole I agree with your statement. Y-DNA is more easily and rapidly replaced than is mt-DNA.

    "there was probably a first migration from West Asia c. 40 Ka. ago and then an Iberian layer overlapped"

    That could be when Y-DNA DE entered, or expanded. That would place D as rather late in East Asia, of course. But that is not impossible.

  3. @Razib: you can design a specific exercise to discern that: just compare North Africans one on one (or together, but you'd need more K depth) with various populations of interest to you, like Turks. I've never thought that they are a candidate ancestor (haploid DNA does not suggest that, nor does the archaeology I know) but if you think they can be, you can design the exercise.

    I'd do it this way: get non-inbred North African samples and a selected array of putative Eurasian source populations and then go down to K=10 or whatever. You should see things clearly after doing that, not just putative "immigrant" components but also Fst distances with each hypothesized source population.

    From other exercises we know that SW Europeans and SW Asians are as distant from each other as any other two West Eurasian populations. Also both have some affinity to the respective "northern" components, the NE European and the West Asian Highlander, so I would not expect any oddities. Iberians and Arabs represent Turks and Lithuanians well enough for this purpose.

    While there may be a lot of subtle distinctions in regards to Tropical Africa, extremely diverse, West Eurasia is more amorphous and homogeneous. Should not be a problem and before you posted I did not even consider it could be at all.

  4. @Razib (again):

    On second thought, I already compared all relevant West Eurasian populations and components in my first foray with Admixture, in mid-December, which you already read.

    And now I see what you mean: the "Berber" component is slightly closer to the "Caucasian" (Highland West Asia) one. That is however a pattern that happens with all WEA components, which appear as distinct peripheral outgrowths of a central NEE-Cau trunk.

    However there is strong SW Euro haploid affinity and it has been claimed that NW African mtDNA H (c. 25% of all the mtDNA pool) is all of Iberian derivation (Cherni).

  5. "If Y-DNA E was part of that back-migration the above comment would not hold".

    Y-DNA E is an African lineage without doubt. E1b is almost surely from Ethiopia (although the diversity of Sudan is still a bit of a mystery), E1a appears concentrated in West Africa, although I'm a bit unsure of where exactly most diversity lies. E2 and E* haven't been detected outside Africa.

    Your comment is without merit. Spare me that waste of time, please.

  6. I wish they had more diverse samples for Sub-Saharan Africa. They are lacking crucial samples from South Sudan, who are the purest Nilotes.

    The Maasai are not good examples of Nilotes, they are too admixed with Horn Africans (Cushites) and Bantus. Neither are the Bulala/Bilala great examples of Nilotes because they are too mixed with Chadic people.

    The estimates for Nilotic migration to Egypt are very inaccurate because they lack pure Nilotic samples (the South Sudanese are needed for accurate estimates).

  7. Absolutely, Eze: I also wish there were some samples from Sudan.

    However I suspect that the East African affinity (which is Luhya and not Nilotic in the case of NW Africa) actually represents the Afroasiatic component which IMO would have arrived with Capsian culture. So instead of being concerned about "purer Nilotics" and fearful of Chadic distractions, I'd be interested in all kinds of Sudanese and notably the Afroasiatic ones of deep local roots like Chadic or Beja.

    That does not mean that there is no Nilotic genetics (if such a thing exists at all) but I'd look at Nubians rather than at Maasai. Whatever the origin of the genes, it was the Afroasiatic language family which clearly led that expansion (if it's the one I imagine - and there are no other candidate components for it).

    But agreed, Sudanese peoples would be most interesting to compare with.

  8. @Razib:

    One detail I have been chewing on is that the authors are comparing populations and not components. Hence it is very unlikely that you will be able to obtain lower Fst (= more recent divergence times), because there is very clear Iberian input in North Africa and, in the All West Eurasia comparisons, the other component very apparent in North Africa is the SW European one(s), not the Caucasian one. Both West Asian components are very minor, at least in Morocco, while NE Euro is nil.

  9. What I can add is that E-M81 in North Africa is due to a strong founder effect. Some groups of Berbers show more marker diversity than others, and in particular less E-M81 predominance. For instance, among the Kabyle people in Algeria there is a group of 19 individuals sampled in Tizi-Ouzou: http://en.wikipedia.org/wiki/Kabyle_people#Genetics
    You can observe a higher diversity in both mtDNA and Y-DNA haplogroups than what occurred among Moroccan Berbers. Another example is that one group of Berbers had 1/3 of its mtDNA assigned as U6 while another group showed absolutely no trace of the U6 haplogroup. This confirms the founder-effect nature of the various peoples. Finally, the uniparental markers found in Berbers were already too young on a human scale to think that they were a population that developed independently from Western Eurasians and Africans, and now it is confirmed that they did not.

    I would dare to add that, even before genetics, various archaeological findings (Iberomaurusian culture, Mechta-Afalou (of which a reconstruction dated to 20,000 years old was made), the women of Tassili rock art) showed traces of an out-of-Africa population's presence in Africa.

  10. The molecular clock has been demonstrated to be nonsense at least in Y-DNA, and I'd dare say that in mtDNA it should also be treated with the utmost caution, especially as there are many large branches that appear to stop mutating just after they coalesce (for example H), which I contend is caused by the "cannibal mum" effect (my moniker), in which the established variants "drift out" any new mutations by the power of numbers (mtDNA, unlike Y-DNA or autosomal DNA, mutates quite slowly, probably one consolidated mutation per line every several millennia).

    In addition key calibration points like the Pan-Homo fork are systematically underestimated, from 15% to as much as 100%.

    So in general I'd discard all molecular clock estimates and consider things only based on distribution patterns, phylogeny and archaeology.

    That way my hypothesis A (never put all eggs in one basket: keep your thought flexible) is that the immediate precursor of E-M81 may have arrived with the Dabban or some other early migration (probably not but I would not even discard Aterian totally - haven't studied the matter so much). Similarly L3k and maybe other L lineages arrived with Aterian, while U6 should be Dabban instead.

    The next layer is of Iberian origin and corresponds with Oranian/Iberomaurusian (although may have been reinforced later with Megalithism, specially in the male side): includes the typical European lineages: Y-DNA R1b1a2-M269 and I (documented in Guanche mummies at the very least) and mtDNA H (H1, H3, H4 and H7 which make up 24% of the North African mtDNA on average and have been demonstrated in Tunisia to be derived from Iberia). MtDNA V (important in Tunisia and parts of Algeria) could also be from this group (or maybe vice versa: original from N. Africa and migrated to Europe?)

    I also think it is very possible that with the Oranian genesis there was a back-flow of some sort into Iberia, bringing the North African design for points with wings and stem (so "Aterian"!) and lineages that are concentrated in West Iberia: U6 and E-M81. We could argue that this happened in the Neolithic also (founder effect), but the fact that they are important in areas like Asturias, with important Paleolithic settlement and unlikely Neolithic founder effects, suggests to me that they may in fact have arrived earlier. The peculiarities of Solutrean allow for such a flow (Asturian Solutrean is "Iberian", while further East it is "French" instead).

    This is another reason I have to support a very old M81 presence in North Africa, pre-Oranian possibly.

    Finally J1 and E-M123 would be of Capsian origin, which explains their absence in Iberia (and low levels in "refuge" parts like Morocco). There's no archaeologically distinct Neolithic in North Africa that I know of other than some Cardial or Epicardial areas in North Morocco. Correct me if I'm wrong, but I understand that, essentially, the North African Neolithic is a reformed Capsian and the term Capsian Neolithic is common.

    This does not exclude "Epi-Neolithic" arrivals like Megalithism or later Phoenicians, etc. but they were less likely to make a massive impact once people were already farming and herding everywhere.

    My 2 cents.

  11. A point of the 2 cents :

    I thought the Aterian culture was part of the Mousterian culture which was found throughout Europe and the Near East. One of the known Aterian areas (Taforalt in Morocco) was studied and the mtDNA found was of Eurasian origin: http://www.google.fr/search?sourceid=chrome&ie=UTF-8&q=taforalt#sclient=psy-ab&hl=fr&source=hp&q=MTDNA+of+taforalt&pbx=1&oq=MTDNA+of+taforalt&aq=f&aqi=&aql=&gs_sm=e&gs_upl=4455l5509l0l5741l9l7l0l0l0l6l528l2015l2-,or.r_gc.r_pw.,cf.osb&fp=70f9df7ca4a9bdf9&biw=1280&bih=933

  12. It's not Mousterian certainly: it has maybe conceptual affinities (mode 3, levallois technique) but these are almost universal, at least in the Western half of the old world.

    Furthermore, the Jebel Irhoud remains, which are apparently pre-Aterian, are generally considered among the oldest known anatomically modern humans or Homo sapiens. Archaic within this set but not more than their contemporary Herto man (Idaltu) from Ethiopia. And very similar to some of the early skulls of Palestine (Skhul/Qafzeh).

    Not just that, Aterian and early Palestinian Sapiens sites share the earliest known usage of ornamental shells, only many millennia later found in Southern Africa.

    On the other hand I'd say that both Mediterranean populations were essentially dead ends and that the Eurasian subset of Humankind arose from an East African population via Arabia. However I see no reason why some of the genetics of the Aterians could not have survived to the present day, even if diluted: while intermittent aridity is no doubt a problem in North Africa in the long run, I can't accept that it was ever so severe as to totally impede human (and animal) life, as some have claimed rather ignorantly.

    And that's what I believe I detected in Southern Morocco (14.4% but not more than 1.5% anywhere else). Because it is a component with extreme Fst distances to both West Eurasians and Tropical Africans, suggesting it diverged very early on.

    The Taforalt population looks totally Eurasian and I'd dare say largely "Iberian" or "European", but they are Oranian ethno-culturally, which is a far cry from the first arrivals to North Africa:

    1. Jebel Irhoud (no survivals?)
    2. Aterian (mtDNA L3k for example)
    3. Dabban (possible source of early Eurasian layer, like mtDNA U6 and the very Berberid stock - however only known in Libya as of today).
    4. Oranian (Iberian/European genetics: mtDNA H, etc.)
    5. Capsian (NE African wave: mtDNA U6a and maybe some L, Y-DNA J1, E-M78)
    6. Post-neolithic lesser inputs

    That's my best reconstruction at this point, which of course is fragmentary and has some blanks, notably the issue of Dabban industries and the apparent Eurasian stock leading to the Berberid type. However the end of Aterian as such is not well resolved either, so there is room for the tentative reconstruction I make.

  13. http://www.dnatribes.com/dnatribes-digest-2012-02-01.pdf

  14. I never liked DNA Tribes' approach, really: seems good enough for the illiterate in population genetics but quite horrible for the rest.

    Anyhow, they commit two errors here, I think:

    (1) They do not allow North African specificity to show up and treat it by default as mere admixture from other regions. This we know is quite a wrong approach.

    (2) The proportions from West Asia and Europe seem completely wrong in light of what we see here and in my December exercise (very similar results): unlike what DNA Tribes say, North Africans (excepting Egyptians) have more European than Arabian specificity once their specific component shows up, which will happen in any unsupervised run with a regional approach. This is possibly an error induced by the first error: not allowing their specific distinctiveness to show up.

  15. PS.- ... or comparing with CEU instead of with Spaniards maybe

    In any case they have the wrong approach: dividing Humankind in neatly pre-defined blocs and then making everyone fit in them. I really hate that: reality is always more complex and they are certainly missing clearly distinct blocs as well.

  16. Regarding DNA tribes, genetic regions are determined by both SNPs and STRs not by the humans.

    Also, regarding the North African component, you are talking about an old DNA Tribes analysis: if you look at their latest analysis, from March 2012 (p. 23), you can see that there is a native North African component that peaks in Sahrawi and Mozabites. Their results are quite similar to yours and Henn's.


    1. That sounds to me like evading responsibility: sample sizes and strategies are everything when it comes to determining clusters. DNA Tribes' standardized clusters are just one approximation to reality, not The Truth(TM). If you use a different sampling strategy you get different results.

      As for my own little work with Admixture and the like, I know that it is almost impossible that North Africans do not show up as distinct when compared with either West Eurasians or other Africans.

      "you are talking about an old Dna Tribes analysis"

      I'm talking about the junk in their propaganda (publicity) sheets. I never went beyond that: they really irked me.

  19. There are a few more North African samples available at:


    "194 individuals both men and women representing two ethnicities (Arabs and Amazighs) from three geographic locations (Agadir, Boutroch and Ighrem) and two lifestyles (Urban and Rural) "

    from this study:

    "Geographical genomics of human leukocyte gene expression variation in southern Morocco"


    Maybe it could be interesting to include them in an analysis if there are enough SNPs in common with Henn's samples, especially because there are both Arabs and Berbers

    1. I thank you very much for these materials but I'm just a very noobish amateur when it comes to use Admixture and the like - so mixing samples and all that is something that really confuses me. However other bloggers with a better technical grasp of the matter like Etyopis or Razib may be interested, I imagine.

      As for the paper, it is interesting, showing for example that Arabs are not genetically distinct in the context of Morocco, even if they do not retain the somewhat apparent local homogeneity of the various Amazigh groups. In essence, Arabs appear as a mixture of local Berbers.

      Sadly the lack of context (outgroups) and of any structure-style analysis that is deeper than K=3 does not allow us to interpret any such differences in the broader context.

