December 9, 2011

Autosomal genetics of South Asia in the wider context

Estonian geneticist Mait Metspalu has in the past carried out leading research on the gene pool of South Asia, which is crucial to understanding not just the subcontinental populations but all of Eurasia. Once again he and his team provide us with valuable material for understanding this region and its wider continental context:

The authors added 142 samples from India to pre-existing catalogs and found that:
30% of SNPs found in Indian populations were not seen in HapMap populations and that compared to these populations (including Africans) some Indian populations displayed higher levels of genetic variation, whereas some others showed unexpectedly low diversity.
This reinforces the generally acknowledged notion that India hosts very large, albeit largely untapped, genetic diversity.

Nothing really new in the wider picture, but it is always worth recalling the basics (principal component analysis of Eurasians):

Supp. Fig. 12

Supp. Fig. 2 (part)

The Pakistan-India (ANI-ASI) duality

It may get a bit more interesting when they analyze the equivalent of Reich's Ancestral North/South Indian components (ANI & ASI), around which much speculation (sometimes quite wild) has built up. 

These two components are apparent in the PC analysis (PC2 and PC4) but perhaps more clearly in the ADMIXTURE cluster analysis. The authors decided to use K=8 where I would have used K=13 (preferred by the combination of both check algorithms shown at Supp. Fig. 4 b and c), but for this purpose the only difference in the result is whether Caucasian populations are included in the ANI-equivalent component (k5 in the maps below).

Iranians are always included, as are Central Asians, though much less emphatically at K=13 than at K=8, as the affinity splits between the Baloch (ANI) component and the Caucasus-specific one. Russians, however, do not show any Caucasus-specific affinity and show instead a strong influence of the ANI component, which seems to correlate well with Y-DNA R1a, especially once the Caucasus affinity is detached at K=13.

Whatever the case at K=8:

The authors do in fact make an effort to discern whether the Baloch-ANI component could represent the much discussed Indo-European (or Aryan) invasion (hardly doubted on the linguistic plane but not clearly supported on the genetic one). They conclude, however, that the arrival of the ANI component in South Asia should be much older, at least 12,500 years old, that is: clearly pre-Neolithic - and in any case not related to the Indo-Aryan invasion.

Barely outlined South Asian internal structure

It is interesting that at deeper K levels (K=18) a Gujarat-centered component (middle green), distinct from the two mentioned so far, appears and takes a dominant role in most populations, particularly displacing the Baloch (light green) component:

Cut from Supp. Fig. 4a
I would like to encourage readers to transcend the limitations of the chosen K=8 level of analysis and dive into the K=18 analysis found in the Supplemental Figures' PDF (fig. 4). As said before, the optimal level of analysis seems to be K=13 or maybe K=12, rather than the chosen K=8; above K=10 in any case. However, many of the improvements in resolution take place outside South Asia, so for most purposes there is no difference (other than the inclusion or exclusion of the Caucasus populations in the ANI bloc).
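
For readers wondering how one would pick a K in practice, here is a minimal sketch. It assumes that one of the checks is something like the cross-validation error that ADMIXTURE can report for each K (lower is better). The numbers below are invented purely for illustration and are not taken from the paper or its supplement, and the function names and the tolerance value are my own.

```python
# Hypothetical cross-validation errors per K. These values are made up
# for illustration only; they are NOT read from the paper's Supp. Fig. 4.
cv_error = {8: 0.5630, 10: 0.5612, 12: 0.5601, 13: 0.5598, 15: 0.5607, 18: 0.5621}


def best_k(errors):
    """Return the K with the lowest cross-validation error."""
    return min(errors, key=errors.get)


def near_optimal(errors, tolerance=0.0005):
    """Return, in ascending order, every K whose error lies within
    `tolerance` of the minimum (the tolerance is an arbitrary choice)."""
    lowest = min(errors.values())
    return sorted(k for k, err in errors.items() if err - lowest <= tolerance)


print(best_k(cv_error))
print(near_optimal(cv_error))
```

With these made-up values the sketch picks K=13 and flags K=12 as nearly as good, which is the kind of situation in which one would say "K=13 or maybe K=12".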

Something else that I miss here is a regional, South Asian specific (maybe with the inclusion of some West Asian and SE Asian controls), analysis. It may have offered interesting insights but it is just outlined, with just four South-Asian-specific components at K=18: more than enough for the pan-Eurasian analysis but surely quite limited to discern the details of population structure in South Asia alone. 

Diabetes-related allele

One of the most specific findings of this survey is the detection of a group of alleles (at genes DOK5, CLOCK) that have apparently been selected for in South Asians but that have become harmful as diet and lifestyles change today, favoring type 2 diabetes.


  1. Outside of linguistics, is there any evidence for an Aryan invasion of India? Last I heard, and as that Wiki article admits, no physical evidence of any such invasion has been found.

    1. There is statistical evidence in terms of the R1a haplogroup that is steady at around 50% of the population on a line running from North India through Tajikistan (Hindu or Buddhist till 1000 A.D.) all the way to Poland. I am not sure what this means as I am a statistician and not a biologist.

  2. The main evidence is linguistic (and this one is very strong) because the archaeological evidence is disputed. We have the IVC (which was probably Dravidian-speaking, judging from the Brahui) and we have a number of cultures with likely but contested connections with Central Asia.

    I can't put my hand on fire, so to say, for the cultural continuity from Central Asia because I have not studied the matter in sufficient depth, but the linguistic arrival needs a vehicle and this sequence of cultures is the most likely and AFAIK the only plausible one.

    That says nothing about genetics directly, even if Uttar Pradesh upper castes look "Pakistani" in the genetic aspects, suggesting that they arrived from there (but not directly from Central Asia) in this period, usually known as Vedic period.

  3. But anyhow, suffice it to say that only Hindutva nationalists (the Hindu equivalent to Islamic Fundamentalism) and maybe some people influenced by their ideas less directly are opposed to this theory. Among Indian secularists and outside India there's little if any doubt that the Indo-Aryan conquest (if not "migration") necessarily took place when it is believed to have happened.

    What we are really debating here is not the conquest as such but whether it carried a lot of Central Asian or European genetics, which is not at all apparent in the new data.

  4. The physical evidence includes the sudden appearance of cremations of human remains that are ubiquitous in Bronze Age IE cultures but were not present in the previous stratum, and the appearance of metallurgy (including metal working sites, not just trade goods) at pretty much the same time that the cremations appear. These correlate with Rig Vedic references to a cultural change. Material horse/chariot culture also starts to appear in South Asia at about this time and had made its way to the Mittani by 1500 BCE.

    The sequence of the transition is a bit hard to puzzle out. The Near East had a very extreme arid period for a few decades or so around 2000 BCE that led to the fall of the Akkadian empire and one of the intermediate periods in Ancient Egyptian history. The appearance of Indo-European material culture and evidence of their religious practices happens within a few centuries of a dramatic realignment of the Indus River Valley river system that turns about half of the highly urbanized area into desert almost "overnight" when something causes the course of the main source of the river to divert into another drainage basin. This could be a case of collapse leading to a socio-political vacuum that is filled by Indo-Aryans (there is no evidence of large scale warfare at the transition era).

    More radically, one could argue that Indo-European culture transitions from a bunch of steppe herders into an elite ruling class of farming societies because it receives an infusion of Harappan refugees who were forced into a mass migration by this aridity/river realignment event and that Harappan refugees, perhaps creolizing with existing populations of Central Asia, are a core part of Indo-Aryan ethnogenesis which definitely picks up elements of a BMAC cultural package that was part of the Harappan high intensity trade network.

    Then again, maybe the Indo-Aryans simply swooped in off the steppe and the Harappans, weakened by climate and river system issues and never having had much of a military culture (their civilization appears to have been a united political entity pretty much since the Neolithic), just quit and surrendered before there was any fighting, since they knew that they would be beaten after some very selective shows of force. Perhaps there was even a substrate acculturation of the superstrate, similar to the way that the Greeks were conquered by the Romans but the resulting culture was heavily Greek influenced, with the Byzantines eventually adopting Greek as the language of governance rather than Latin.

  5. I do not think we can reconstruct the collapse of IVC so easily, much less pin it on simple reasons. However we can make some not-so-unlikely extrapolations from what we see today in the AfPak area. I can well imagine a people like the Pashtuns (with another name and another IE language) coalescing back in the day into a roaming horde that took advantage of whatever opportunity happened in nearby "Pakistan" (and beyond). This has happened in fact many times in the history of India, culminating in the Moguls, and we can trace the first indications to the Swat valley, a strategic hotspot today as well.

    But we can hardly reconstruct the details, even with the help of Hindu scripture, which may be suggestive however.

    We have seen such linguistic changes happen to other civilizations: Sumer became Semitic, Elam became Iranian, Egypt became Arabic... so this probably happened time and again in the Metal Ages as well, even more easily as the writing and bureaucratic/religious support that would save Latin and later Chinese (the only two such survivors, in a sense) was probably weaker.

    The Sumerian language lingered for centuries as a religious tool but was eventually superseded by Semitic dialects, Coptic is still used... by a religious minority, Elamite was not that lucky and we know next to nothing of Pelasgian or whatever people spoke in Bulgaria before the Thracians. Languages are replaced by elite pressure, no matter what you or others may think of it.

  6. @VA_Highlander,

    The origin of the Indo-Aryan Brahmins is Sintashta Culture + BMAC:

  7. I hate to break the news to you guys, but unless someone can point to actual, physical evidence of an Indo-Aryan invasion, or even large-scale migration, occurring in the 2nd millennium BCE on the Indian sub-continent, then this is all just supposition, at best.

    Can anyone point to a type site on the Indian sub-continent where we find evidence of this invading cultural complex? No, no one can and they can't point to any material discontinuity indicative of any invasion, either.

    Can anyone explain why the IVC must be pre-Vedic? No, they can't. The claim is based on Max Müller's dating of the Vedas, a claim he later retracted.

    And Maju, for you of all people, to suggest that questioning the Indo-Aryan invasion hypothesis might be politically motivated is too funny for words, my friend. It's a classic example of the pot calling the kettle black.

  8. PConroy, thanks, but I'm aware of BMAC and Sintashta.

    Personally, I find B. B. Lal's demolition of the BMAC connection rather convincing:

    Let not the 19th century paradigms continue to haunt us!

    Where do you see his argument failing?

  9. The origin of the Indo-Aryan ethno-culture may be there, Conroy, but the genetic origins surely not. That's what we are finding now: that the Indo-Aryan invasion process had a very limited genetic impact and, if anything, brought people from Pakistan into India rather than people from Central Asia or Europe into the subcontinent. The upper castes (brahmins and ksatriyas) of Uttar Pradesh look "Pakistani" (Balochi) but not European/Central Asian.

  10. "Can anyone explain why the IVC must be pre-Vedic?"

    It's obvious once you realize that the Brahui are genetically Baloch (or vice-versa). IVC must have spoken Dravidian, possibly something close to the Brahui language. Dravidian languages in turn may have expanded through South Asia in connection with the Neolithic and overall IVC influence.

    "Can anyone point to a type site on the Indian sub-continent where we find evidence of this invading cultural complex?"

    I think that we can see quite clearly in this map the process of Indo-Aryan flow into the subcontinent, beginning at Swat culture (aka Gandhara Grave), which is closely related to BMAC, as mentioned by Conroy, and includes horse burials. However anthropometry suggests local genetic continuity even if the culture is imported.

    Swat is the pivot (as far as we know because there's surely a lot more to unearth over there, especially in Pakistan and Afghanistan).

    The next phase is Cemetery H, where the practice of cremation becomes common, being AFAIK the first Indian culture to adopt this kind of burial, which has managed to continue until today. Again there is no apparent biological discontinuity, just cultural.


    The archaeology is in principle quite supportive of this invasion and cultural sweep, however no genetic/anthropometric change is apparent.

    Linguistics gives no choice anyhow, as Indo-Aryan and its parent Indo-Iranian are closely related to the European branches of Indo-European, so the language must have originated in the same place as those of Europe: the Samara Valley. As far as I can tell Indo-Iranians were the Indo-Europeans who stayed behind in the steppes after much of Europe was invaded and acculturated in the Chalcolithic; eventually they invaded South Asia (always via assimilation of local tribes) and, later, Iran as well (almost historical).

    "And Maju, for you of all people, to suggest that questioning the Indo-Aryan invasion hypothesis might be politically motivated is too funny for words, my friend. It's a classic example of the pot calling the kettle black".

    Uh? I have no idea what you're talking about: I follow quite strictly the mainstream theory of Indo-European expansion, first outlined by Marija Gimbutas in the mid 20th century. This model has been challenged but with no substantial backing other than wishful thinking (all alternative hypotheses are junk in fact and can be easily debunked, while the Kurgan model is as solid as it can get).

  11. Va_Highlander: "Can anyone explain why the IVC must be pre-Vedic?"

    Can anyone explain how it could be otherwise?
    Cause I can't see how ancient Indians and ancient Europeans (even western ones) could share a chalcolithic/bronze age vocabulary if it weren't so (wheel, metal, etc..).

  12. Maju
    It is pointless to argue with so-called "Hindu Nationalists" like @Va_Highlander. It is something like arguing with Creationists. Everybody in India knows that people vary in physical features as one moves from North-North West India towards South-South East India, and also that upper castes look different from lower-caste Dalits and Central and Eastern Tribals. Anybody who knows the differences in physical and facial features and skin colour of people across the whole of India will not be surprised by any evidence which supports the Aryan Invasion/Migration theory, or a series of migrations particularly from the North West and also from the East; it is only some upper-caste Hindus who make noise against such issues. One has to remember that if Aryan invasion theory is proved or supported by genetic evidences, the whole Foundation of Hindu Nationalism (which is based upon autochthonous origin of upper castes/ priestly classes) will collapse and the vested interests don't want it at any cost.

  13. "One has to remember that if Aryan invasion theory is proved or supported by genetic evidences, the whole Foundation of Hindu Nationalism (which is based upon autochthonous origin of upper castes/ priestly classes) will collapse"...


    Yes, it's curious about the subtle (and sometimes not-so-subtle) ideological motivations on these matters.

    On the other hand, I am surprised to read that Va-Highlander is a "Hindu Nationalist", from the name I always pictured him as a Virginian (VA) of Scottish ancestry (Highlander). But of course this is not necessarily straightforward - and I never asked.

    In any case it is not always useless to debate with people who hold stubborn ideas, especially in public. You may not persuade your opponent but at least third parties, occasional readers, may find the debate informative and learn something. You may also learn something in the meantime: your opponent may be stubborn and biased and still know or notice something that you do not, or, more commonly, by digging into aspects of the matter you may not know well enough, you often discover or re-discover information of interest. But the main point is that if the debate is public, there's almost always some third, fourth, etc. people lurking or peeking later on who may benefit from it.

    In a sense there's always a democratic element in all scientific debates, not at all too different from the song contests of Inuits and other peoples: proponents of each position try to win the public (either specialized or general public) over to their position and either erode the existing consensus with their alternative proposals (if novel or marginal) or forge/keep a consensus around a paradigmatic theory.

    Naturally the difference between the true scientist and the pseudo-scientist is that the true scientist is ready to drop a position as soon as the evidence becomes too heavy against it (or adopt a disliked theory if the evidence supports it strongly enough - or at the very least concede defeat reluctantly). A scientifically minded person is critical but also self-critical and can change stand when need be without further ado. The faith person, the pseudo-scientist, won't and will bend the evidence and/or wage a propaganda war against the scientifically-supported evidence never opening his/her mind to what the others are saying.

    In this sense, a private debate is indeed, as you say well, pointless. But the public debate is always worth trying: let not the public be misled by charlatans.


