March 17, 2013

Khoesan and Coloured autosomal DNA in context

A number of studies on Khoesan genetics have come out recently, but this one does not seem redundant: it provides some extra information.

Desiree C. Petersen et al., Complex Patterns of Genomic Admixture within Southern Africa. PLoS Genetics 2013. Open access [doi:10.1371/journal.pgen.1003309]


Within-population genetic diversity is greatest within Africa, while between-population genetic diversity is directly proportional to geographic distance. The most divergent contemporary human populations include the click-speaking forager peoples of southern Africa, broadly defined as Khoesan. Both intra- (Bantu expansion) and inter-continental migration (European-driven colonization) have resulted in complex patterns of admixture between ancient geographically isolated Khoesan and more recently diverged populations. Using gender-specific analysis and almost 1 million autosomal markers, we determine the significance of estimated ancestral contributions that have shaped five contemporary southern African populations in a cohort of 103 individuals. Limited by lack of available data for homogenous Khoesan representation, we identify the Ju/'hoan (n = 19) as a distinct early diverging human lineage with little to no significant non-Khoesan contribution. In contrast to the Ju/'hoan, we identify ancient signatures of Khoesan and Bantu unions resulting in significant Khoesan- and Bantu-derived contributions to the Southern Bantu amaXhosa (n = 15) and Khoesan !Xun (n = 14), respectively. Our data further suggests that contemporary !Xun represent distinct Khoesan prehistories. Khoesan assimilation with European settlement at the most southern tip of Africa resulted in significant ancestral Khoesan contributions to the Coloured (n = 25) and Baster (n = 30) populations. The latter populations were further impacted by 170 years of East Indian slave trade and intra-continental migrations resulting in a complex pattern of genetic variation (admixture). The populations of southern Africa provide a unique opportunity to investigate the genomic variability from some of the oldest human lineages to the implications of complex admixture patterns including ancient and recently diverged human lineages.

The array of Khoesan populations sensu stricto analyzed in this study is much smaller than that of Schlebusch 2010, but this study has the advantage of including Cape Coloureds and their Baster relatives, who are partial descendants of the otherwise extinct pastoralist Khoekhoe (Hottentots, now considered a derogatory term), a people who lived in much of Southern Africa upon the arrival of Bantu and Europeans. It also includes the amaXhosa, a Bantu people who clearly display marked Khoesan admixture.

Figure 1. Map of southern Africa showing distribution of sampling per population identifier and significant historical events that likely shaped ancestral contributions.

There is a brief mention of maternal and paternal DNA. It is worth noting that mtDNA is mostly aboriginal (L0d/L0k) among the Khoesan (86-100%), the Coloureds (68%) and even the Xhosa (47%, all L0d), while aboriginal Y-DNA (essentially A2b and A2c2, plus occasional B2) is concentrated among the Ju/'hoan, with the !Xun being instead dominated by E1b1-M275, of putative East African (Nilotic?) origins. This is consistent with the !Xun being historically pastoralists. European patrilineages, notably R1b, are dominant among the Baster (92%) and Cape Coloured (71%).

Coloureds only make up some 9% of South African population but they dominate the countryside in much of the former Cape Province. Namibian Basters are a subset of them who migrated northwards in 1868.

Figure 2. PCA and STRUCTURE analysis

We can see in the graphics above how the North Cape Coloured and Baster only display minor Bantu admixture, being essentially a variable mix of European and Khoesan ancestry, with probably also some Malay input (apparent in the increase of the blue component relative to the European reference). In contrast, the East Cape and Cape Town (D6) Coloured appear to have a greater proportion of Bantu ancestry and, especially the latter, a notable increase of the East Asian input.

The STRUCTURE graph, particularly at K=9, is also informative about other African populations but I won't dwell on that here.

The authors also performed an interesting analysis using Ancestry Informative Markers on the !Xun and Xhosa:

Figure 4. Ju/'hoan-Yoruba ancestry informative markers (AIMs) defined ancestral contributions to the !Xun and amaXhosa, providing evidence for two distinct !Xun lineages with differing ancestral contributions.

It seems evident that much of the !Xun ancestry (up to 70%) does not fall into either category (Ju/'hoan or Yoruba) but is something else, probably specific to this people. The Khoesan ancestry of the Xhosa also seems closer to the pastoralist !Xun than to the (likely more genuinely ancient) Ju/'hoan.

There is some more info in the paper but I feel that the essentials are sufficiently covered here. 


  1. "make up some 9% of South African population but they dominate the countryside in much of the former Cape Province."

    When you say "dominate" do you mean that they are a superstrate population who are a ruling class relatively speaking, or do you mean that they make up a majority of the population there?

    1. Majority of the population.

      They are >60% in most districts, the main exception being the Xhosa country in much of the East Cape.

  2. Am I right that Hadza have a very particular ancestry? It seems to me that they are very different from all others - in all but one component. Also their language is a language isolate and they are among the last hunter-gatherers in Africa. They have retained clicks but they share just a little bit of common ancestry with Ju/'hoan and !Xun and not more than with other Africans. Also Nilotic speaking Maasai and tentatively Khoisan Sandawe share a common ancestry, although they do not belong to the same language family. Sandawe are genetically not very close to Khoisan speakers. Perhaps (Sub-Saharan) Africa was full of click languages before the expansion of ydna E (and Eurasian markers K and R) in Africa.

    1. It's probably an isolation artifact. They do have quite low diversity by Tropical African standards. See my discussion of Henn 2011.

      "Perhaps (Sub-Saharan) Africa was full of click languages before the expansion of ydna E"...

      We don't know. Eastern and Southern Africa have deep genetic links (the L0 branch to be overly simplistic, see here for more details). So it's plausible that click languages are a feature of only that branch of Humankind.

    2. "Also Nilotic speaking Maasai and tentatively Khoisan Sandawe share a common ancestry, although they do not belong to the same language family. Sandawe are genetically not very close to Khoisan speakers."

      This is very likely due to admixture in the last several thousand years, not a deep rooted linkage. Nilotic populations arrived in that region post-Bantu if I recall correctly.

    3. It's difficult for me to tell from the data provided here whether what links the Sandawe to the Maasai is recent or ancient. But in essence it looks like a shared component specific to East Africa, which the Maasai should have, and I have no reason to imagine recent admixture, because the contact populations of the Sandawe are Bantu, not Nilotic.

      "Nilotic populations arrived in that region post-Bantu if I recall correctly".

      Maybe but pastoralism did spread through East and Southern Africa long before the Bantu Iron Age expansion. This pastoralist current probably had sources similar in geography to later Nilotic and Afroasiatic ones.

  3. if you want more visitors read this and post it.

      this is the public url

    2. It's pay per view. If you wish to send me a free copy by email, I'd appreciate it and will see what I can get out of it. It looks potentially interesting.

  4. A Revised Timescale for Human Evolution Based on Ancient Mitochondrial Genomes

  5. As for their hg pool, Hadza seem to be very particular because they have extreme amounts of mtDNA L4g. Now I am wondering if Maasai also have L4g. This haplogroup must have come from the North and be, as you say, a component specific to East Africa. On the paternal side, Hadza have a lot of B2b and then a set of haplogroups related to Bantu and other later migrations. Sandawe have much less B2b and in addition they share A3b2 with Khoesan. Sandawe also share L0 with Khoesan groups. So, based on this, it is not surprising that their language is closer to Khoesan languages.

    It might well be true that clicks are part of the culture of ydna A2 and A3 (former nomenclature) and mtdna L0, and at some point of time this culture was one of the most advanced cultures and these clicks were adopted also by people further North who were mostly of ydna B and these people also mixed with them, or, alternatively, ydna A was widespread in East Africa and decreased later on but Hadza and Sandawe retained traits of these ancient languages.

