There is broad consensus nowadays that modern humans (Homo sapiens) share an origin in Africa south of the Sahara, maybe some 200,000 years ago, regardless of minor hybridization episodes with other human species after the migration out of Africa.
However, it may not be so clear where exactly in Africa, if anywhere in particular, our shared origins lie. Earlier work on mitochondrial DNA (Behar 2008) suggested an East African origin, a result that I largely replicated on the same data (with some minor differences).
However, a new research paper on autosomal DNA now suggests, quite intriguingly, a Southern African origin instead.
The paper has some reasons for such a claim but, after reading it, I really do not feel fully persuaded. Importantly, there is a huge and critical sampling blank in the Upper Nile area, notably the tribal zones of Ethiopia and South Sudan, one of the leading candidate regions for the origin of Humankind. The East African sampling is somewhat restricted: two isolated hunter-gatherer ethnicities (Hadza and Sandawe), one Nilo-Saharan pastoralist nation (Maasai) and one Bantu farmer group (Luhya). The research would certainly have improved massively if two or three South Sudanese and Ethiopian peoples had been included (this Upper Nile area has the greatest basal diversity in mitochondrial DNA at several successive levels, much larger than what we find in Southern Africa or in more southerly parts of East Africa, like Tanzania).
But well, this is what we have and this is what we get:
Above (fig. 1) we can see some of the runs in Admixture.
Niger-Congo peoples retain considerable homogeneity all around Africa, reinforcing the idea of the Bantu expansion being largely a demic colonization (yet Mozambicans look different and, in any case, this matter should be explored separately and with proper sampling strategies: you can't just "jump" over all the Congo and Zambezi basins, for example).
All other groups show their own distinctive components at k=8, and in some cases (Maasai) two different specific components.
The distances between the various "purified" components (i.e. not the populations but the genetic components in them, as shown at k=8 above) are dealt with in table 1. Interestingly, the closest component to non-Africans (represented by Tuscans) is the yellow one (East Africa), followed by the purple one (Sandawe), the red one (West Africa) and the pink one (Maasai-specific). This last component is suggested to be relatively close to some North African component, but this is not shown in quantified form and the lack of sampling in the Upper Nile does not help in discerning this matter either.
On the other hand, the Hadza reveal themselves (through their light blue component) as an isolated group with large Fst distances to all other components (the least remote are the East African, Sandawe, Pygmy and West African ones).
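For readers unfamiliar with Fst, here is a minimal sketch of the textbook idea behind it: the share of total heterozygosity that is explained by differences between groups. It handles a single biallelic locus and two equally weighted groups, which is an assumption of mine for illustration; the paper's own estimator, applied to Admixture components across many markers, is surely more elaborate. The function name and the example frequencies are hypothetical.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Wright-style Fst for one biallelic locus and two equally weighted groups.

    p1, p2: frequency of the same allele in each group (illustrative inputs).
    """
    p_bar = (p1 + p2) / 2
    # Expected heterozygosity if both groups were pooled into one.
    h_total = 2 * p_bar * (1 - p_bar)
    # Average expected heterozygosity within each group.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    if h_total == 0:
        return 0.0  # locus is fixed for the same allele everywhere
    return (h_total - h_within) / h_total


# Identical frequencies give no differentiation; fixed differences give the maximum.
print(fst_two_pops(0.5, 0.5))  # 0.0
print(fst_two_pops(0.0, 1.0))  # 1.0
```

A "large Fst distance", as described for the Hadza component, means that allele frequencies in that component differ strongly from those in the others.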
Instead the less studied Sandawe (with mtDNA and Y-DNA similar to the Hadza, see table S3) reveal themselves as a well connected and highly diverse population (see below). They are genetically closest to the East African and West African components and are also the closest relative of the Maasai-specific component.
The best representation of this is maybe found in fig. 2B:
|Fig. 2B annotated by me|
Look especially at the vertical axis, where linkage disequilibrium (LD) is plotted (the horizontal axis shows distance from a putative "origin" in Angola). Greater LD means lower diversity.
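As a toy illustration of what LD measures, the sketch below computes the common r² statistic between two biallelic loci from a list of haplotypes. This is only the standard pairwise definition; I am assuming nothing about how the paper summarizes LD per population, and the function name and sample haplotypes are made up.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic loci.

    haplotypes: list of (a, b) tuples, each allele coded 0 or 1 (toy input).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # at least one locus is monomorphic, LD is undefined
    return d * d / denom


# Alleles that always travel together: complete LD.
print(ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)]))  # 1.0
# All four combinations equally common: no LD.
print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))  # 0.0
```

Recombination erodes these associations over many generations, so a large and old population tends to show low LD, while a bottlenecked one carries few haplotypes and shows high LD.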
Among the annotations I made there are four grey square marks: they identify four populations that are "pure" (or almost so) in the Admixture analysis above (k=8). I understand that the authors used components rather than whole populations when determining the diversity clines (left), and only that explains the concentration of greatest diversity in Southern Africa and not in Gabon-Cameroon, where the Biaka and the Fang live.
Honestly, I got lost when they shifted from fig. 2B (where the Biaka and Fang are as diverse as the San peoples of Southern Africa) to fig. 2C (and 2D, which uses Fst data instead but produces similar results), where the diversity clines mysteriously concentrate towards Southern Angola. This sleight of hand left me baffled and unable to explain here what the heck is going on. I think that they decided to use genetic components instead of genuine populations, but this can be argued to be an error because the ability of Admixture and similar programs to discern such components as "absolute" is very limited.
Back to fig. 2B (above), we can see that there are two implicit groups that deviate from the 45-degree line of LD against hypothetical distance:
- On one side, groups like the Sandawe, Maasai or Mandinka look highly diverse even though they are far away from Angola. Southern Moroccans can also be included in this group.
- On the other side, peoples like the Hadza, the Fulani and the Kaba are considerably less diverse than expected, suggesting a more or less marked founder effect or some other kind of bottleneck at the origin of these peoples. Tunisians are also considerably less diverse than their neighbors and hence fall in this category, probably because they are less African than the rest (my best guess).
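The eyeballing of the two deviation groups above can be made explicit with a small sketch: fit a straight line of LD against distance by least squares and label each population by the sign of its residual. Everything here is hypothetical: the population labels, the numbers and the tolerance are invented for illustration and are not read off the figure, and an ordinary least-squares line is my stand-in for the reference line drawn in it.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def classify(points, tol):
    """points: dict of name -> (distance, ld). Labels each name by its residual."""
    names = list(points)
    xs = [points[name][0] for name in names]
    ys = [points[name][1] for name in names]
    slope, intercept = fit_line(xs, ys)
    labels = {}
    for name, x, y in zip(names, xs, ys):
        residual = y - (slope * x + intercept)
        if residual < -tol:
            labels[name] = "more diverse than expected"   # LD below the trend
        elif residual > tol:
            labels[name] = "less diverse than expected"   # LD above the trend
        else:
            labels[name] = "on trend"
    return labels


# Invented toy values in arbitrary units, NOT data from the paper.
toy = {
    "pop_A": (0, 0), "pop_B": (1, 2), "pop_C": (2, 2),
    "pop_D": (3, 2), "pop_E": (4, 4),
}
for name, label in classify(toy, tol=0.5).items():
    print(name, label)
```

With real data the choice of tolerance, and whether a single line is appropriate at all, would matter a great deal, which is part of my complaint about the figure.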
The high diversity of the Mandinka and Yoruba in West Africa, as well as that of the Maasai and Sandawe in East Africa, appears to indicate a strong retention of ancestral diversity from before the Out of Africa episode, in spite of them being agro-pastoralist groups (except the Sandawe). This strongly suggests that the Neolithic spread, at least largely, by cultural rather than demic diffusion, respecting to a great extent the ancestral diversity that existed before the arrival of domestication technology: there was surely no population replacement at that stage, even if a later process in the Iron Age (the Bantu expansion) may indeed have been largely a demic colonization.
In general, most Tropical African populations fall to the right of the 45-degree line: they are more diverse than expected and seem to follow a 30-degree line, if anything. North Africans, on the other hand, fall along a purely vertical reference line (unrelated to distance from Angola), so the 45-degree reference line is a bit of an artifact: what we have here are two different curves with a discontinuity between them.
In brief: rather inconclusive, and asking for more data (especially from the Upper Nile) and better, clearer analysis that can confirm the conclusions a bit more strongly (if appropriate).
Still an interesting and informative paper that will no doubt be referred to in future publications. Also, on occasion, people ask for references that illustrate how Africans are more diverse than non-Africans... well, this is a good reference for that as well.
It is important in any case to expand and intensify our knowledge of African genetics, which is so critical to understanding human genetics overall and has been so poorly researched so far. In this sense, this paper is no doubt an important step forward and I do welcome it for all the information it provides.