March 9, 2011

African diversity and the possible origins of Humankind

There is a massive consensus nowadays on a shared origin of modern humans (Homo sapiens) in Africa (south of the Sahara), maybe some 200,000 years ago, regardless of minor hybridization episodes with other human species after the migration out of Africa.

However, it is not so clear where exactly in Africa, if anywhere in particular, our shared origins lie. Earlier work on mitochondrial DNA (Behar 2008) suggested an East African origin, a result I largely replicated myself on the same data (with some lesser differences).

However now a new research paper on autosomal DNA suggests instead a Southern African origin, quite intriguingly:

The paper has some reasons for such a claim but, after reading it, I really do not feel fully persuaded. Importantly, there is a huge critical sampling blank in the Upper Nile area, notably the tribal zones of Ethiopia and South Sudan, one of the leading candidate regions for the origin of Humankind. The East African sampling is somewhat restricted: two isolated hunter-gatherer ethnicities (Hadza and Sandawe), one Nilo-Saharan pastoralist nation (Maasai) and one Bantu farmer group (Luhya). Certainly it would have improved massively if two or three South Sudanese and Ethiopian peoples had been included in this research (this Upper Nile area has the greatest basal diversity in mitochondrial DNA at several successive levels, much larger than what we find in Southern Africa or more southerly parts of East Africa, like Tanzania).


But well, this is what we have and this is what we get:

Above (fig. 1) we can see some of the runs in Admixture.

Niger-Congo peoples retain considerable homogeneity all around Africa, reinforcing the idea of the Bantu expansion being largely a demic colonization (yet Mozambicans look different and in any case this matter should be explored separately and with proper sampling strategies: you can't just "jump" over all the Congo and Zambezi basins, for example).

All other groups show their own distinctive components at k=8, and in some cases (Maasai) two different specific components. 

The distances between the various "purified" components (i.e. not the populations but the genetic components in them as shown in k=8 above) are dealt with in table 1. Interestingly, the closest component to non-Africans (represented by Tuscans) is the yellow one (East Africa), followed by the purple one (Sandawe), the red one (West Africa) and the pink one (Maasai specific). This last component is suggested to be relatively close to some North African component, but this is not shown in quantified form and the lack of sampling in the Upper Nile does not help in discerning this matter either.

On the other hand the Hadza reveal themselves (their light blue component) as an isolated group with large Fst distances to all other components (less remote are East African, Sandawe, Pygmies and West Africans).

Instead, the less studied Sandawe (with mtDNA and Y-DNA similar to the Hadza, see table S3) reveal themselves as a well connected and highly diverse population (see below). They are genetically closest to the East African and West African components and are also the closest relative of the Maasai-specific component.


The best representation of this is maybe found in fig. 2B:

Fig. 2B annotated by me

Look especially at the vertical axis, where linkage disequilibrium (LD) is plotted (the horizontal axis shows distance from a putative "origin" in Angola). Greater LD means lower diversity.
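To make the LD axis more concrete, here is a minimal sketch of one common pairwise LD measure, r², computed from haplotype counts at two biallelic loci. The choice of r² is an assumption made for illustration (the exact LD statistic behind fig. 2B is not stated here), and the function name and counts are hypothetical.

```python
def ld_r2(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic loci, from the four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n                  # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n          # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / n          # frequency of allele B at locus 2
    d = p_AB - p_A * p_B             # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom

# Hypothetical counts, for illustration only.
all_four_haplotypes = ld_r2(25, 25, 25, 25)   # every combination equally common
only_two_haplotypes = ld_r2(50, 0, 0, 50)     # alleles always travel together
```

With the first set of counts the alleles combine freely and r² is at its minimum; with the second only two haplotypes remain and r² is at its maximum. That is the intuition behind reading higher LD as lower haplotype diversity, for instance after a founder effect.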

Fig. 2C
Among the annotations I made there are four grey square marks: they identify four populations that are "pure" (or almost) in the Admixture analysis above (k=8). I understand that the authors used components rather than whole populations when determining diversity clines (left), and only that explains the concentration of greatest diversity in Southern Africa rather than in Gabon-Cameroon, where the Biaka and the Fang live.

Honestly, I got lost when they shifted from fig. 2B (where the Biaka and Fang are as diverse as the San peoples of Southern Africa) to fig. 2C (and 2D, which uses Fst data instead but produces similar results), where the diversity clines are mysteriously concentrated towards Southern Angola. This hat trick really left me baffled and unable to explain here what the heck is going on. I think that they decided to use genetic components instead of genuine populations, but this can be argued to be an error because the ability of Admixture and similar tools to discern such components as "absolute" is very limited.

Back to fig. 2B (above), we can see two implicit groups that deviate from the 45-degree line relating LD to hypothetical distance:
  • On one side, groups like the Sandawe, Maasai or Mandinka look highly diverse even though they are far away from Angola. Southern Moroccans can also be included in this group.
  • On the other side, peoples like the Hadza, the Fulani and the Kaba are considerably less diverse than expected, suggesting a more or less marked founder effect or some other kind of bottleneck at the origin of these peoples. Tunisians are also considerably less diverse than their neighbors and hence fall in this category, probably because they are less African than the rest (my best guess).
The high diversity of Mandinka and Yoruba in West Africa, as well as that of Maasai and Sandawe in East Africa, appears to indicate a strong retention of ancestral diversity from before the Out of Africa episode, in spite of them being agro-pastoralist groups (except the Sandawe). This strongly suggests that the Neolithic spread at least largely by cultural and not demic diffusion, respecting to a great extent the ancestral diversity that existed before the arrival of domestication technology: there was surely no population replacement at that stage, even if a later process in the Iron Age (the Bantu expansion) may indeed have been largely a demic colonization.

In general, most Tropical African populations fall to the right of the 45-degree line: they are more diverse than expected and they rather seem to follow a 30-degree line, if anything. On the other hand, North Africans fall along a purely vertical reference line (unrelated to distance to Angola), so that 45-degree reference line is a bit of an artifact: what we have here is two different curves with a discontinuity between them.
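The "more diverse / less diverse than expected" reading above can be sketched as residuals from a fitted line. This is only an illustration under stated assumptions: an ordinary least-squares line stands in for the figure's reference line, the helper names are hypothetical, and the numbers are made up (they are not read off fig. 2B).

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def ld_residuals(points):
    """points: {name: (distance, ld)} -> {name: observed LD minus fitted LD}."""
    names = list(points)
    xs = [points[n][0] for n in names]
    ys = [points[n][1] for n in names]
    slope, intercept = fit_line(xs, ys)
    return {n: y - (slope * x + intercept) for n, x, y in zip(names, xs, ys)}

# Made-up (distance, LD) pairs for four placeholder populations.
toy = {
    "pop_A": (1000, 0.10),
    "pop_B": (2000, 0.20),
    "pop_C": (3000, 0.22),
    "pop_D": (4000, 0.48),
}
res = ld_residuals(toy)
# Negative residual: lower LD than the line predicts, i.e. more diverse than expected.
more_diverse = sorted(n for n, r in res.items() if r < 0)
# Positive residual: higher LD than predicted, as after a founder effect.
less_diverse = sorted(n for n, r in res.items() if r > 0)
```

In this toy data only pop_C falls below the fitted line. If Tropical and North African populations really follow two different curves, the same function could be run on each group separately instead of forcing a single line through both.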

Final remarks

In brief: rather inconclusive and asking for more data (especially from the Upper Nile) and better, clearer analysis that can confirm the conclusions a bit more strongly (if appropriate).

Still an interesting and informative paper that will no doubt be referred to in future publications. Also, on occasion, people ask for references that illustrate how Africans are more diverse than non-Africans... well, this is a good reference for that as well.

It is important in any case to expand and intensify our knowledge of African genetics, which is so critical to understanding Human genetics overall and so poorly researched so far. In this sense, this paper is no doubt an important step forward and I do welcome it for all the information it provides.


  1. I would have expected low diversity in the Fulani (who have some outlier Eurasian components) but not the others.

  2. The Fulani are very distinct within Africa but they seem mostly or only African anyhow. See here.

    They may partly originate in an old early migration to West Africa (L1b). I have yet to read any evidence of Fulani having any Eurasian component -- it's maybe possible but I have not found it in any paper, so it's at best an unwarranted claim and at worst a false claim.

  3. Also, admixture increases diversity, it does not decrease it. For example, South African San are even more diverse than Namibian San because they have some Bantu and European admixture.

    I was also surprised by the Hadza's low diversity, but it's true that they are a small isolated population and still have as much diversity as North Africans in general, who in turn are probably quite a bit more diverse than Europeans (I could not find that comparison but I can imagine).

    I'd suspect that the Tunisian low diversity is due to a sampling accident. It does not seem to make any sense, considering the intense migration history of the country. On the other hand, Tunisians are maybe the least "African" of all North Africans, so maybe that's the reason.

  4. One in six Fulani men in North Cameroon have Y-DNA haplogroup T, and the Fulani elsewhere in the Sahel haven't been sampled very well. They are the only Niger-Congo language speaking ethnic group with significant proportions of non-African (i.e. other than haplogroups A, B and E) Y-DNA.

  5. "The high diversity of Mandinka and Yoruba in West Africa, as well as that of Maasai and Sandawe in East Africa, appears to indicate a strong retention of ancestral diversity from before the Out of Africa episode"

    In the case of the Mandinka the diversity may be the result of admixture. Although the Mandinka language is widespread, that is probably a result of the unification of the Mali Empire in the Middle Ages.

  6. @Andrew:

    The Fula originated in Westernmost Africa (Futa Toro and Futa Djalon). Central and East African Fula are typically mixed or have experienced successive founder effects, all of them in the last few centuries, since they began expanding as herders first and then filling the political void in the Western Sahel left by the colonial adventures of Morocco.

    I'd like to see clear evidence. Are you sure it's T and not K* anyhow? Y-DNA K* has been found in large amounts in Cape Verde but makes no sense as an Iberian founder effect, especially as it's neither T nor any other Western K (R, etc.). Some odd K* and P* have also been detected in NW Africa.

    Whatever the case these Y-DNA flows could be extremely old and do not necessarily indicate admixture or at least admixture we can detect anymore.

    If you want to study the Fula/Peul for real you need to research them in Senegal and Guinea, where they originate (they are still 40% of the population of Guinea-Conakry), not in their "colonial" scatter through the Sahel, which is very recent and may have been subject to all kinds of founder oddities.

  7. @Terry:

    While it's true that the ethnic concept Mandinka is relatively recent (Mali Empire) the populations making them up must have been from the area: it was a convergence of already related and neighboring peoples into a single identity, not a mass colonization of overseas settlers as in New Zealand. It's more like the formation of the concept "French", mind you.

  8. "not a mass colonization of overseas settlers as in New Zealand. It's more like the formation of the concept 'French', mind you".

    I was certainly thinking along the 'French' model rather than the 'New Zealand' model. The Mandinka are a conglomeration of groups. But, interestingly, I felt I could generally tell them apart from the Wolof when I was traveling around Senegal, Gambia and Mali.

  9. A nice short summary of the most recent Fulani data from Tishkoff et al. is found here, in a reply to a misinterpretation of their data by Clyde Winters.

  10. In truth the original paper is not even providing the data properly (the Structure data is not shown and the supp. material fails to download properly).

    Obviously, when Winters claims an origin of the Fulani in Nubia, he is talking nonsense. It is well known that the Fulani originated in Westernmost Africa ("the futas", plateaus largely inhabited by Peul, aka Fula), from where they expanded in the historical period, filling the void left by the destruction of Songhai by Morocco, between the 17th and the 19th century, when France and other European powers destroyed their states.

    It is possible that the Fula got admixed in this expansion with other peoples, notably those also dedicated to the seminomadic exploitation of the Sahel, such as Afroasiatic or Nilo-Saharan speaking groups. For that reason, in order to understand the Fula it is important to study them in the futas: in Futa Toro in Senegal and Futa Djallon in Guinea, where their origin no doubt lies. If a small population of Fula in Sudan or Chad has this or that lineage, that won't say much or even anything at all about the origin of the Fulani overall.

    It is my understanding that the Fula look (genetically) very much native African, but of a stock that is different from both West African and East African. This is confirmed by their high frequencies of mtDNA L1b, which represents an early expansion into West Africa and is very common among Fulani of all peoples:

    "... the nonsignificant values of the D and F_S statistics provide an indication of reduced demographic expansion hitherto observed mainly in hunter-gather populations such as the Pygmies or Khoisan (Excoffier and Schneider 1999)".

    We are therefore looking at a population that has long been distinct, like Pygmies or Khoisan, regardless of lesser admixture in recent times, which differs by region and therefore does not define the Fulani as a whole.

