March 24, 2014

Lactase persistence genetics in Africa

This month we get to know a bit more about the ability of humans to digest milk sugars (lactose) in adulthood, with particular emphasis on Africa:

Alessia Ranciaro et al., Genetic Origins of Lactase Persistence and the Spread of Pastoralism in Africa. AJHG 2014. Pay per view (free access after 6-month embargo). [doi:10.1016/j.ajhg.2014.02.009]


In humans, the ability to digest lactose, the sugar in milk, declines after weaning because of decreasing levels of the enzyme lactase-phlorizin hydrolase, encoded by LCT. However, some individuals maintain high enzyme amounts and are able to digest lactose into adulthood (i.e., they have the lactase-persistence [LP] trait). It is thought that selection has played a major role in maintaining this genetically determined phenotypic trait in different human populations that practice pastoralism. To identify variants associated with the LP trait and to study its evolutionary history in Africa, we sequenced MCM6 introns 9 and 13 and ∼2 kb of the LCT promoter region in 819 individuals from 63 African populations and in 154 non-Africans from nine populations. We also genotyped four microsatellites in an ∼198 kb region in a subset of 252 individuals to reconstruct the origin and spread of LP-associated variants in Africa. Additionally, we examined the association between LP and genetic variability at candidate regulatory regions in 513 individuals from eastern Africa. Our analyses confirmed the association between the LP trait and three common variants in intron 13 (C-14010, G-13907, and G-13915). Furthermore, we identified two additional LP-associated SNPs in intron 13 and the promoter region (G-12962 and T-956, respectively). Using neutrality tests based on the allele frequency spectrum and long-range linkage disequilibrium, we detected strong signatures of recent positive selection in eastern African populations and the Fulani from central Africa. In addition, haplotype analysis supported an eastern African origin of the C-14010 LP-associated mutation in southern Africa.

The study detects in essence four alleles that explain lactase persistence in Africa: one (Fulani and other Central Africans) is the same as in Europe, another (Afroasiatics and some Nilotics from NE Africa) is related to Arabia, but at least two other variants are African-specific:

Figure 3. Contour Maps of Africa Show the Allele Frequency Distribution for the Four Primary SNPs Associated with the LP Trait in the Current Study

from Fig. 4 (haplotype network)
Interestingly, three of the alleles are related in the haplotype network (to the right), suggesting a common remote origin, maybe in the shared Neolithic origins of West Asia. The fact that they mostly affect peoples of Afroasiatic language (and to a lesser extent some other groups of the Sahel) rather suggests a common origin in either West Asia or the Nile (which most likely had some sort of impact on the Western Neolithic, judging from the distribution of the lineage E1b1b1 in the present as in the past and the minor African affinity of early European farmers).

By contrast, the C-14010 allele, present in East Africa around Lake Victoria and to some extent also in SW Africa, seems quite unrelated.

Alleles and their variants 

The European T-13910 allele has three main haplotype variants in Africa: 
  • The Mozabite one is related to West Asia
  • The main Fulani and Bulala one is related to Europe
  • A third variant includes only Fulani and Arabic Baggara (strictly African therefore)
Judging from the haplotype tree to the right, the Mozabite/West Asian variant probably derives from Europe, but the case for the Central African ones is much less clear.

The authors estimate the allele to have an age between 12,000 and 5,000 years, i.e. approximately early Neolithic. This would allow it to have arrived in Africa and Europe separately before booming. 

The G-13915 allele is attributed by the authors to Arabian origins c. 4000 years ago, based on previous studies and its presence only in Afroasiatic and Nilo-Saharan speakers. However, I can't help but notice that the haplotype structure above is centered in East Africa, with Arabian and Middle Eastern sequences found only at the branches, which makes me put a cautious question mark on that claim.

The G-13907 variant is clearly NE African in origin and distribution, with the greatest frequency being among the Beja (Afroasiatic pastoralists of coastal Sudan). This one is restricted to Afroasiatic speakers only, mostly of the Cushitic branch. The authors did not produce an age estimate for this branch.

The C-14010 allele could be as old as 23,000 years or as recent as only 1,200 years, going by the 95% C.I. Again it is essentially present among Nilo-Saharan and Afroasiatic (Southern Cushitic) populations, rather than among Bantu speakers, suggesting again an association with pastoralism. 

Notably this allele has reached Southern Africa but it seems more directly related to the Khoe-Khoe pastoralist history and maybe other populations of possible East African origins (Himba, Herero) than to anything Bantu. 

Lactase persistence in hunter-gatherers and other roles of the enzyme

To the surprise of the authors, the Hadza hunter-gatherers have a high frequency (47%) of the lactase persistence phenotype (not associated with any of the previously mentioned alleles), something they try to explain on the grounds that the same enzyme that allows lactose digestion also works as phlorizin hydrolase, which acts on the substance phlorizin, found in many Rosaceae (the main group of fruit-producing plants). Besides its anti-diabetic properties, phlorizin has also been used to fight against malaria, which may well be another pathway of positive selection, alternative to the usefulness of lactose digestion.

This is a particularly interesting note because it could potentially also feed alternative explanations for the strong selection of lactase persistence alleles in other populations, particularly Europeans. Just tentatively, it might have helped to fend off diabetes (type II) in the high-carbohydrate diets typical of the Neolithic onwards. The fact that people of West Eurasian and African ancestry are somewhat less prone to this type of diabetes than people of other Asian and Native American ancestry may support this function of lactase persistence.

However in the case of the Hadza a role against malaria (same enzyme) seems more likely. 

Still plenty of room for further research

There are still way too many pockets of lactase persistence phenotype which remain unexplained and more rarely (Afghan Tajiks particularly) also pockets of lactose intolerance lacking explanation, as I discussed in 2010. Map of worldwide actual LP phenotype from Yuval Itan 2010:

In the case of Africa the Western Sahel pocket is most intriguing. So is the alleged 88% LP phenotype in the Sudanese, which is only half explained (it is not apparent in the map but the data is from the same study; maybe they meant Western Sudanese from the Sudan region, i.e. the Sahel, and not the Republic of Sudan?). Even Germans and Italians have a much greater ability to digest milk than their known alleles can explain so far. 


  1. This is interesting. I have not read the paper myself, but after seeing the network diagrams you posted for G-13915, I too was wondering why they would propose an Arabian origin for it when it is clearly centered in East Africa in the network diagram. However, if you look at the frequency map you show, the allele is more frequent in the peninsula, so I think one of the images must be wrong, either the frequency one or the network one? Thanks for the info anyhow...

    1. I just sent you a copy of the paper, so maybe you can dig in more detail into some of these issues (or whatever others). Discerning some of the details you mention may require taking a look at the referenced bibliography, because it seems clear that in some cases they are talking about their own (mostly African but slanted to the East) sample (the haplotype network is surely one such case) and in others about the expanded sample and the conclusions previously reached by other researchers (frequency maps, conclusions on G-13915).

      I look forward to whatever you can mine from it.

    2. Ok, I think I understand: as you say, they are using references plus their own data. The red points in the contour map are samples from previous studies, while the network diagrams include only samples from the current study. That is why they don't show many haplotypes from the peninsula in the network: only the Yemeni (Jews), Lebanese and Palestinians were sampled from the Near East, and they had 15.2%, 0% and 4.5% of G-13915 respectively.
      I must say this is a very interesting allele; they are implying that it may be one of the signatures of the Islamic expansion. It has a low presence in Northern Ethiopia with the Beta Israel, with an increasing presence as you go to southern Ethiopia, with the Burji and the Konso, and reaches its maximum in Northern Kenya with the Gabra (20%), but also has a high presence with the Watta, Gareh and Orma. Then as you go further North from Northern Ethiopia it picks up frequency again with the Beja (19%). It is almost as if there is a discontinuity of this allele in Northern and central Ethiopia.
      But the question is, if it is associated with the Islamic expansion, why can't you see it in North Africa? And why would it be present in the Konso, albeit moderately at 8%? And why is it present in the Yemeni Jews but totally absent in the Lebanese? The Lebanese were certainly affected by the expansion...

    3. In my opinion it is not impossible at all that all three related alleles at the top of the chart originated in Africa (Sudan for example).

      The West Asian origin can't be discarded at this stage, but the recent research on the European early Neolithic shows an African element in it (even if mediated via West Asia): E1b-V13 was there (Catalan Cardium Pottery) and must have arrived in Thessaly (most common in Greece/Albania, Thessaly is at the origin of most of the European Neolithic) via some sort of founder effect, probably with partial origin in Palestine (rather than Anatolia only). Not only does E1b suggest a Palestinian connection but maybe also G, which is not rarer in the Southern Levant than in Turkey (although less studied). Critically, the early European farmers of Lazaridis show that "Palestinian" deviation in autosomal DNA, which this researcher explains as "Basal Eurasian" (as opposed to mainstream Eurasians) but which I strongly suspect to be African-influenced instead (as modern Palestinians and Negev Bedouins are).

      This can be explained because the Afroasiatic core expansion was Mesolithic and clearly affected Palestine, at the very least the more semi-desertic fringes, which in due time generated the Semitic phenomenon. So there was some African influence in the Levantine Neolithic and for some odd reason it affected Thessaly strongly and, via Thessaly, much of Europe.

      As this study ponders that the LP phenotype (the enzyme manifested by all alleles) could have an original role in fending off malaria, and it also has another known role in preventing diabetes mellitus type II, I think it would make good sense if the whole triple branch of alleles originated in Sudan in Mesolithic times, helping against malaria and also against the risk of diabetes that greater consumption of cereal (carbohydrates in general) tends to cause. As far as I know there is indeed a "Nubian Mesolithic" rich in wild cereal consumption, even if there is no evidence that it evolved locally into Neolithic before West Asian back-influences arrived.

      So I would keep very open the possibility that the three alleles have a common origin in Africa for all these reasons.

      IF some of them actually evolved elsewhere, the "Islamic" explanation sounds very far-fetched and inconsistent with the actual data (all those questions that you pose and some others that I raise on my own). If anything it'd be some sort of Neolithic flow.

