July 21, 2011

Nasty gut bacterium genetics in Asia

A new paper on the human-hosted bacterium Helicobacter pylori may be of some interest for understanding humankind.

H. pylori may be present in more than half of humankind, but only a fraction of infected people display symptoms, which can be vague or quite nasty: stomach pain, heartburn, regurgitation, vomiting, belching, flatulence, nausea. If untreated, the infection may degenerate into serious illnesses like gastroesophageal reflux disease, peptic ulcers or even cancer.

I don't really appreciate the authors' pretense of using this parasite's genome as a substitute for genuine human markers, nor do I agree with some of their conclusions; however, the data is interesting.

There are three major clades of H. pylori in the sampled region, named hpEastAsia, hpAsia2 and hpEurope. They are found as follows:

1. Population East Asia (hpEastAsia)

hpEAsia - Fig. 2 - NJ tree C (subpopulation hspEAsia) is a subset of B

This macro-clade would seem, as I understand it, to be the one ancestral to the region, and it matches expectations well, including the ordered divergence of hpSahul, hspAmerind, hspMaori (Austronesian-specific) and the large hspEAsia, which also shows the expected south-to-north fanning structure found in so many other markers.

Some north-to-south colonization pattern (interspersed with south-to-north bouts) is also apparent in the tree C but it is hard to understand because it is not clear how it relates to hspEAsia in tree B. It may be related to the Han Chinese expansion, to Neolithic flows or to a combination of the two.

2. Population Asia 2 (hpAsia2)

hpAsia2 - Fig. 3

This one is a bit more difficult to interpret. It could be originally from South Asia; however, notice how the Bengali and Malaysia Indian clades seem derived (and not ancestral) in an otherwise fully SE Asian subtree. The Ladakh clade does not help much because Ladakh is a region of largely East Asian genetics and we have already seen hpEastAsia over there (most closely related to Yunnan, surely via Tibet).

This clade seems to demand better, wider sampling if we are to understand its evolutionary dynamics. 

3. Population Europe (hpEurope)

hpEurope - Fig. 1

This macro-clade shows two subpopulations: 'Europe' (including in fact West Asians) and 'Asia', which seems centered in India. The migration of this clade to SE Asia should be related to the Hindu (and Buddhist) influences of some 2200 years ago (otherwise it cannot be explained its presence in Thailand and Cambodia).

A curiosity is the presence of a Basque strain in the Philippines, which is clearly a product of Spanish colonization (in which Basques were quite active, especially as mariners, priests and businessmen). It clearly shows that, under the right circumstances, no particularly important demographic flow is needed to cause an impact with this nasty bug.

I suspect that the lack of the hpEastAsia clade in the Philippines is due to some oddity, be it a founder effect or some dietary reason. It suggests that Filipinos (at least in the sampled areas) lacked the bug until the arrival of European colonialism or maybe Muslim influences (do they explain the hpAsia2 clade?).

Update (Jul 22): 

For background and a wider picture, it is worth taking a look at this older paper, which is freely available as an author's manuscript at PubMed Central:

Bodo Linz et al., An African origin for the intimate association between humans and Helicobacter pylori. Nature, 2007.

Most interesting is surely this image (fig. 1):

While this paper does not yet mention the Asia 2 clade, it does indicate the other relevant clades as well as their African relatives, some of which are closer, while others are more distant (b).

The first split in this human parasite seems to be between Africa 2 (Southern Africa) and the rest, in agreement with what human genetics tells us about our early history in Africa.

Then the lineage seems to divide between a branch remaining in Africa (Africa 1, centered in West Africa, and Europe 2, aka AE2, centered in East Africa) and another one migrating to Asia (Europe 1, centered in India, and East Asia). It seems that Helicobacter pylori also experienced an Out of Africa migration... in our stomachs.
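For those who like to see such a branching order written down explicitly, here is a minimal Python sketch of the topology as I read it from the 2007 figure. It is only my interpretation: the clade labels are illustrative names of my own choosing (not identifiers from the paper) and no branch lengths are implied.

```python
# Branching order as read in the post: Africa 2 splits first, then the rest
# divides into an African branch and an Asian (Out of Africa) branch.
# Labels are illustrative assumptions, not identifiers taken from the paper.
TREE = ("Africa2", (("Africa1", "Europe2"), ("Europe1", "EastAsia")))


def to_newick(node):
    """Return a Newick-style string for a nested tuple of clade names."""
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"


def leaves(node):
    """List the clade names at the tips of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]


newick = to_newick(TREE) + ";"
print(newick)
print(leaves(TREE))
```

The resulting string can be pasted into any tree viewer that reads the Newick format, which makes it easier to compare this reading with the published figure.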

Breurec 2011 (the main paper mentioned above) says:

Strains of the hpEurope population were shown to be hybrids of two ancestral populations, AE1 from central Asia and AE2 from northeast Africa while modern hpEastAsia strains are almost pure descendants of ancestral EastAsia. 

This explanation (hat tip to Gioello) was what led me to find this other paper by Linz et al.


  1. The European strain may be misnamed. It appears to have its most basal branches in the Levant, Iran and India, suggesting an origin probably somewhere in Iran, although I agree with you that the Basque strain is probably a legacy of colonial impact in the last few hundred years.

    The basic theory that a human symbiont or human parasite will track human migrations is basically sound, much like the lice DNA studies, but I'm not sure that we understand the transmission of gut bacteria very well. It is possible to actually do a gut bacteria transplant, which basically involves ingesting fecal material from someone with good gut bacteria, and gut bacteria can change for other reasons during a person's life. We don't know a whole lot about where people get the original set of gut bacteria (from mom or nursemaids, perhaps). We also have an organ, once thought to be useless, called the appendix, whose purpose is now believed to be to reboot the body's gut bacteria system after starving or having extreme GI disease or purging to rid oneself of a parasite.

  2. Sure: of course "European" is just a name and the clade surely coalesced between West and South Asia (there seems to be a lot of diversity in Iran but only of one of the two major subclades, the other is clearly South Asian).

    And I also agree that we do not understand the transmission mechanisms well. A possible path is that of sharing cutlery (if not well cleaned) or maybe other mouth stuff like pipes and such. Transmission via excrement seems less likely at first thought but it's not impossible: sharing latrines, being hygienically careless with dogs (which get their snouts into shit and vomit too easily) and other such unhygienic behaviors can more or less randomly get H. pylori jumping from gut to gut.

    But I guess it's easier if there is no previous strain of H. pylori in the gut, otherwise they should compete and the older one has the advantage of numbers - so, everything else equal...

  3. "The migration of this clade to SE Asia should be related to the Hindu (and Buddhist) influences of some 2200 years ago (otherwise it cannot be explained its presence in Thailand and Cambodia)"

    Hem... I will certainly not allege there is a link, but I can't help noticing that several Bronze Age south Siberian R1a1a matches (of Europoid-like populations) were found in Thailand (in Keyser et al., 2009: mislabelled as Malaysian in the article but put in Thailand on the map, IIRC) and that the name of Cambodia is supposed to be derived from the Kambojas, a north-west Indo-Iranian tribe (located maybe as far away as Afghanistan or even the Pamir (Tajikistan), IIRC). They are even mentioned in the very ancient RigVeda tradition if I'm not mistaken.


    The name of Cambodia, in Khmer "Kampuchea" (ព្រះរាជាណាចក្រកម្ពុជា Preăh Réachéa Nachâk Kâmpŭchea), derives from Sanskrit Kambujadeśa (कम्बोजदेश; "land of Kambuja"). It is not unique to the modern kingdom of Cambodia: the same name (i.e. Kamboja/Kambuja) is also found in Burmese and Thai chronicles referring to regions within those kingdoms


    "Kamboja or Kambuja is the name of an ancient Indo-Iranian kingdom. They are believed to have been located originally in Pamirs and Badakshan in Central Asia.

    The name has a long history of attestation, both in the Iranian and the Indo-Aryan spheres.

    In Sanskrit literature, it appears from the middle Vedic period (Iron Age). While not reflected in the Vedic samhitas, it is attested in the later Brahmana stage (ca. 7th century BCE) in the Vamsa Brahmana, as well as in Yaska's Nirukta. Kamboja becomes tangible as a Mahajanapada kingdom in the Hindukush from the Epic Sanskrit stage. Kambojas enter India proper with the Indo-Scythian invasion and the name becomes established as the dynastic name of a number of ancient and medieval kingdoms of Bengal, Tibet, South India, Sri Lanka and Indochina"


    "The Kambojas (Punjabi: ਕਮ੍ਬੋਜ, Hindi: कम्बोज) were a kshatriya tribe of Iron Age India, frequently mentioned in Sanskrit and Pali literature.

    They were an Indo-Iranian tribe situated at the boundary of the Indo-Aryans and the Iranians, and appear to have moved from the Iranian into the Indo-Aryan sphere over time."

  4. Can't you see the phylogenetic tree? The subclade is clearly NOT European but South Asian. There are two subpopulations in the "Europe" macro-clade and one is not related to Europe at all.

  5. HpEurope is a mix of two ancestors: one from North East Africa and the other from Central Asia, and this could be the ancient origin of Europeans (Caucasians). But certainly the paper demonstrates that Indo-Aryans in India came from the West (Europe?), and this denies all the pretensions of Indians that India was the origin of Indo-European languages and also of R1a1. As I said to one of them, they so far lack R1a/M420, the ancestor, which is present in Europe, including Italy. The paper is very interesting for many reasons, many of which we already knew, for instance that the Chinese and the Sino-Tibetan languages formed in the middle course of the Hoang He, i.e. North West China, and then the prehistoric link with Indo-European itself, etc.

  6. @Gioello: Why do you say it's a mix of two ancestors? How do you know?

    For all I know, South Asia is at the origin of Y-DNA P, R, R1, R2 and R1a (not of Q or R1b, which coalesced in West Asia) and maybe even major subclades within R1a (hard to say on current available data).

    This is irrelevant in relation to the origin of Indoeuropean languages, which coalesced in the Samara basin in the Neolithic period (from an unknown origin however, in part because not deep enough archaeology has been done in the area, sadly enough, and in part because Linguistics becomes very blurry beyond those ages).


    @Waggg: Did I misunderstand you? If you meant all that in relation not so much to Indoeuropeans as to Indians (IE or not), then it may make sense.

    Still notice that the "Europe" clade does not show up in Indoeuropean areas of India like Bengal but in Andhra Pradesh, which is of Dravidian language.

  7. “Strains of the hpEurope population were shown to be hybrids of
    two ancestral populations, AE1 from central Asia and AE2 from
    northeast Africa [13] while modern hpEastAsia strains are almost
    pure descendants of ancestral EastAsia”.

    See the paper, page 3. We are speaking of hp, but we can presume that the people also have the same origin.

  8. Thanks, very interesting. This brings us (via citation) to this 2007 paper which is interesting in its own right and which I will later mention in an update.

    Overall, by fig. 1b, there seem to be two major clades of H. pylori: Ancestral Africa 2 (Southern Africa) and the rest, which in turn is (was?) divided in [Ancestral Africa 1 plus Ancestral Europe 2] and [Ancestral Europe 1 plus Ancestral East Asia].

    Actually Ancestral Africa 1 (West Africa) and Ancestral Europe 2 (centered on Red Sea) are more like the African side of the OoA clan, while Ancestral Europe 1 (centered in India) and Ancestral East Asia would be the Asian branch of the OoA clan. However AE2 is also found in Eurasia (surely because of the E1b and similar African influences in West Eurasia).

    I don't want to rush to conclusions but that would be it at first sight. The branching pattern seems related to the OoA.

  9. "Some north-to-south colonization pattern (interspersed with south-to-north bouts) is also apparent in the tree C but it is hard to understand"

    Is it really 'hard to understand'? I think not.

  10. It is not well explained how C fits in B to begin with: it is obviously a branch within hspEAsia.

    Then C itself begins with branchings in the South (Yunnan), follows the North (Manchuria/Xian), follows the center (Hangzhou, small sample), follows the North (Beijing, small sample), follows the South (Vietnam, Guangzhou, Cambodia), follows the South (Chongqing, Taiwan), follows the South (Hong Kong), follows the South (Malaysia, Taiwan, Singapore).

    If you can see any meaningful pattern in this...

  11. I suspect that H. pylori is more a symbiote than a parasite. It is my theory that it may be associated with lactose tolerance. I would love to see a study done to note if all persons with H. pylori are also lactose tolerant.

