September 23, 2012

Khoesan genetics helping to understand the evolutionary history of Humankind as a whole

A reader sent me a copy of this letter or short paper on Southern African autosomal genetics:

Carina M. Schlebusch, Genomic Variation in Seven Khoe-San Groups Reveals Adaptation and Complex African History. Science 2012. Pay per view [doi:10.1126/science.1227721]

[Note-update (Oct 2): the supplemental material is free and very very extensive: a must read for genetic data-miners and all those interested in getting deeper and more extensive info, even if just on the ethno-historical background of the populations considered in the study, something that most people, including myself, only know rather shallowly].

The paper has several points of interest but is especially useful, complemented by previous studies like Pickrell 2012, for better understanding the aboriginal and modern genetics of Southern Africa, which it analyzes, for example, through principal component (and other) analyses relative to geography.

Fig. 1.

(A) Sampling locations.
(B) Principal components analysis (PCA) of African individuals showing PC1 and PC2 rotated to fit geography.
(C) PCA for Khoe-San populations (∼ 2.3M SNPs).
(D) Pairwise FST for sub-Saharan populations (excluding Hadza, see fig. S24 for comparison)
(E) Prediction of the genetic components from geographic, linguistic and subsistence covariates. The predictive error relative to geography is given for each combination of covariates (values < 1 show improved predictive capacity compared to geography).
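
For readers wondering what "rotated to fit geography" in panel B can mean in practice, here is a minimal sketch of one way to do it: a rotation-only 2-D Procrustes fit. This is my own illustration and not necessarily the authors' procedure; the coordinates are made-up toy values and the function names (`center`, `best_rotation_angle`, `rotate`) are hypothetical.

```python
import math

def center(points):
    """Subtract the centroid so that rotation happens around the origin."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    return [(x - mx, y - my) for x, y in points]

def best_rotation_angle(pcs, geo):
    """Angle (radians) that rotates centered PC coordinates onto centered
    geographic coordinates with the least squared error (rotation only)."""
    dot = sum(px * gx + py * gy for (px, py), (gx, gy) in zip(pcs, geo))
    cross = sum(px * gy - py * gx for (px, py), (gx, gy) in zip(pcs, geo))
    return math.atan2(cross, dot)

def rotate(points, theta):
    """Rotate 2-D points counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

# Toy data (assumption): longitude/latitude of four invented sampling sites.
geo = center([(18.0, -30.0), (22.0, -20.0), (25.0, -27.0), (20.0, -24.0)])
# Fake "PC1/PC2" coordinates: the same layout turned by -40 degrees.
pcs = rotate(geo, math.radians(-40))

theta = best_rotation_angle(pcs, geo)
aligned = rotate(pcs, theta)
print(round(math.degrees(theta), 6))
```

With real data the PCs would also need rescaling (they are not in degrees of longitude and latitude) and the fit would never be exact; the residual after rotation is what tells you how well genetics mirrors geography.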

There is also an Admixture analysis with an estimated divergence tree that is off in chronology by about 100% or even more. When will geneticists learn to calibrate their "molecular clock" speculations on archaeology? When?!

Here you have it, annotated by me (in red):

Fig. 2.
(A) Rooted population topology from a concordance test approach (14). Nodes with bootstrap support < 50% are collapsed (dashed lines), all other nodes have bootstrap support > 85%.
[Annotations in red by Maju]
(B) Clustering of 403 sub-Saharan African individuals (∼ 270k SNPs), assuming 2 to 11 clusters.
(C) Clustering of 118 southern African individuals (∼ 2.3M SNPs), assuming 2 to 8 clusters. Compare with fig. S16 that includes recently admixed individuals.

Additionally the authors think that they have located a number of key genes that appear to have been selected for among some Khoesan groups and/or diversified around the time of the first human split c. 100-200 millennia ago, such as:
  • MYPN (myopalladin) - associated with muscle growth and function
  • ACTN3 - associated with “fast twitching” muscles and elite athletic performance
  • MHC - major histocompatibility complex
  • PRSS16 and POM121L2 - thought to protect against infectious diseases
  • ERCC4 regulators - related to pigmentation
  • ROR2 - involved in regulating bone and cartilage development

Also the following regions appear to have suffered intense selective pressures among early Homo sapiens in general, always according to the authors:
  • SPTLC1 - involved in hereditary sensory neuropathy
  • SULF2 - that regulates cartilage development
  • RUNX2 - related to morphological differences with other Homo species, notably Neanderthals (frontal bossing, clavicle morphology, bell-shaped rib cage, and regulating the closure of the fontanel which is crucial for brain expansion)
  • SDCCAG8 - involved in microcephaly
  • LRAT - associated with Alzheimer's disease 

"Thus, three of the top five regions contain genes involved in skeletal development, and syndromes associated with mutations in these genes display similar morphological features."
While also:
"Including SULF2, three of the top five candidate regions are thus associated with neuronal function."

All this falls within expectations, I'd say, but nevertheless most interesting to know in such detail and precision.


  1. This is a very important paper, the supplementary file alone has a wealth of information in it with over 176 pages, which will take quite a while to go through....

    1. Very true, I did not stop at the supp. materials, which are free, and I wrote the entry based entirely on the main article. However there's a wealth of information in those 176 pages: from ethno-historical info on all those ethnicities, which are usually lumped into more simplified categories like Khoesan or Bushmen, to detailed PCA and clustering analyses for Southern Africans, Africans and humans in general, etc.

      It's a good document for data miners and people who do not resign themselves to the most basic analysis. I'm going to add a note in this sense, thanks.

