A recent study finds "five" components, although in practice they can be reduced to three.
Analabha Basu et al., Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure. PNAS 2015. Freely accessible → LINK [doi: 10.1073/pnas.1513197113]
India, occupying the center stage of Paleolithic and Neolithic migrations, has been underrepresented in genome-wide studies of variation. Systematic analysis of genome-wide data, using multiple robust statistical methods, on (i) 367 unrelated individuals drawn from 18 mainland and 2 island (Andaman and Nicobar Islands) populations selected to represent geographic, linguistic, and ethnic diversities, and (ii) individuals from populations represented in the Human Genome Diversity Panel (HGDP), reveal four major ancestries in mainland India. This contrasts with an earlier inference of two ancestries based on limited population sampling. A distinct ancestry of the populations of Andaman archipelago was identified and found to be coancestral to Oceanic populations. Analysis of ancestral haplotype blocks revealed that extant mainland populations (i) admixed widely irrespective of ancestry, although admixtures between populations was not always symmetric, and (ii) this practice was rapidly replaced by endogamy about 70 generations ago, among upper castes and Indo-European speakers predominantly. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform.
One of the components, very distant from the rest, is the Andamanese one (Jarawa, Onge). However, these isolated islands are not really in South Asia but in SE Asia (south of Myanmar, belonging to India only by historical accident), which reduces the structure of South Asia to what we can see in the following graph:
(A) Scatterplot of 331 individuals from 18 mainland Indian populations by the first two PCs extracted from genome-wide genotype data. Four distinct clines and clusters were noted; these are encircled using four colors. (B) Estimates of ancestral components of 331 individuals from 18 mainland Indian populations. A model with four ancestral components (K = 4) was the most parsimonious to explain the variation and similarities of the genome-wide genotype data on the 331 individuals. Each individual is represented by a vertical line partitioned into colored segments whose lengths are proportional to the contributions of the ancestral components to the genome of the individual. Population labels were added only after each individual’s ancestry had been estimated. We have used green and red to represent ANI and ASI ancestries; and cyan and blue with the inferred AAA and ATB ancestries. These colors correspond to the colors used to encircle clusters of individuals in A. (Also see SI Appendix, Figs. S2 and S3.)
It is quite apparent that the AAA (Ancestral Austroasiatic) component behaves like the ASI (Ancestral South Indian) one but with a tendency towards the ATB (Ancestral Tibeto-Burman) one, strongly suggesting that it is basically a product of admixture and not a truly autonomous ancestral component.
This may be more apparent in the wider pan-Asian context:
In this wider mapping (which would be even clearer if West Asian populations were included), we see that:
- ANI (Ancestral North Indian) strongly tends to the West. In other analyses it is very similar to the Caucasus modal component, so a logical conclusion is that we are looking at a Neolithic immigrant element, much as happens in Europe.
- ATB (Ancestral Tibeto-Burman) strongly tends to the East, more specifically SE Asia, and is therefore the reverse of ANI, although much less influential.
- ASI (Ancestral South Indian) is the true aboriginal (pre-Neolithic) component of India, better preserved in southern populations but more clinal than the sample choice allows us to perceive.
- AAA (Ancestral Austroasiatic) is very similar to ASI but has some SE Asian admixture, as is logical to expect, since Austroasiatic is a SE Asian language family that likely expanded in the Neolithic.
So ASI and AAA are basically the same thing, and that's why I say that the "five" components can be simplified to just three. That said, it is indeed possible that there is underlying complexity within the ASI+AAA component, but this study does not help us clarify that.
It is true that K=4 (after exclusion of the Andamanese; K=5 with them) fits the parsimony criterion best, but K=3 is also a good fit and shows AAA exactly as I describe it: largely ASI ("aboriginal") with a significant ATB (Eastern) component. The AAA component can therefore be perceived as consolidated, homogenized, ancient admixture. Prove me wrong on this and I'll eat my words.
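To make the "AAA as admixture" reading concrete, here is a minimal sketch of how one could estimate a mixing fraction if AAA were modelled as a simple two-way blend of ASI and ATB. Everything in it is an assumption for illustration: the function name `mixing_fraction`, the linear per-marker model and the allele frequencies are all hypothetical, the numbers are synthetic and not taken from the paper, and this is not the method the authors used.

```python
def mixing_fraction(target, source_a, source_b):
    """Least-squares fraction f of source_a in target, under the toy model
    target ≈ f * source_a + (1 - f) * source_b at every marker."""
    num = sum((t - b) * (a - b) for t, a, b in zip(target, source_a, source_b))
    den = sum((a - b) ** 2 for a, b in zip(source_a, source_b))
    if den == 0:
        raise ValueError("sources are identical; the fraction is undefined")
    # Clamp to [0, 1] so noisy inputs cannot yield an impossible proportion.
    return min(1.0, max(0.0, num / den))


# Synthetic allele frequencies (hypothetical, NOT data from the study).
asi_like = [0.10, 0.80, 0.35, 0.60, 0.20]
atb_like = [0.70, 0.30, 0.75, 0.10, 0.50]

# Build a fake "AAA-like" population that is 75% ASI-like and 25% ATB-like.
true_f = 0.75
aaa_like = [true_f * a + (1 - true_f) * b for a, b in zip(asi_like, atb_like)]

estimate = mixing_fraction(aaa_like, asi_like, atb_like)
print(round(estimate, 2))  # recovers the 0.75 used to build the toy data
```

With real data the fit would never be exact, and a poor fit (large residuals) would be a reason to treat AAA as a component in its own right rather than as a blend.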
Caste apartheid stopped genetic flow
Quite interestingly, the authors also dwell on how the admixture process was stopped by the Gupta laws (Middle Ages), which imposed apartheid (the caste system), enforced endogamy and caused the now apparent genetic isolation of the multiple groups.
We have provided evidence that gene flow ended abruptly with the defining imposition of some social values and norms. The reign of the ardent Hindu Gupta rulers, known as the age of Vedic Brahminism, was marked by strictures laid down in Dharmaśāstra—the ancient compendium of moral laws and principles for religious duty and righteous conduct to be followed by a Hindu—and enforced through the powerful state machinery of a developing political economy (15). These strictures and enforcements resulted in a shift to endogamy. The evidence of more recent admixture among the Maratha (MRT) is in agreement with the known history of the post-Gupta Chalukya (543–753 CE) and the Rashtrakuta empires (753–982 CE) of western India, which established a clan of warriors (Kshatriyas) drawn from the local peasantry (15). In eastern and northeastern India, populations such as the West Bengal Brahmins (WBR) and the TB populations continued to admix until the emergence of the Buddhist Pala dynasty during the 8th to 12th centuries CE. The asymmetry of admixture, with ANI populations providing genomic inputs to tribal populations (AA, Dravidian tribe, and TB) but not vice versa, is consistent with elite dominance and patriarchy. Males from dominant populations, possibly upper castes, with high ANI component, mated outside of their caste, but their offspring were not allowed to be inducted into the caste. This phenomenon has been previously observed as asymmetry in homogeneity of mtDNA and heterogeneity of Y-chromosomal haplotypes in tribal populations of India (6) as well as the African Americans in United States (34). In this study, we noted that, although there are subtle sex-specific differences in admixture proportions, there are no major differences in inferences about population relationships and peopling whether X-chromosomal or autosomal data are used. We have also found our inferences to become more robust when our data are jointly analyzed with HGDP data.
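The abstract's figure of "about 70 generations ago" only becomes a calendar date once a generation length is chosen. The sketch below shows that arithmetic. The generation lengths tried (22.5, 25 and 28 years) and the 2015 reference year are my assumptions, not values confirmed from the paper, and the helper name `generations_to_year_ce` is hypothetical.

```python
def generations_to_year_ce(generations, years_per_generation, reference_year=2015):
    """Convert 'N generations before the reference year' into a calendar year (CE)."""
    return reference_year - generations * years_per_generation


# Assumed generation lengths; the true value is uncertain.
for gen_len in (22.5, 25, 28):
    year = generations_to_year_ce(70, gen_len)
    print(f"{gen_len} years/generation -> about {year:.0f} CE")
```

Under these assumptions the estimate ranges from roughly 55 CE to 440 CE, so the match with the Gupta period (conventionally dated to about the 4th to 6th centuries CE) depends on using a generation length near the short end of that range.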
I can't help but find it quite curious how, once again, Indian and European histories behave so similarly: in Europe a simpler but also "god-sanctioned" caste system (designed by Augustine of Hippo) was likewise imposed upon the collapse of the Roman Empire (very similar dates). However, popular revolutions gradually but systematically destroyed it. The same is happening in India now but on a delayed timeline. Muslim West Asia (and surroundings), by contrast, had no caste system, and that's probably why it was so successful back in the day: it allowed relatively more freedom and intellectual pursuit than neighboring social systems. Of course, this stopped being the case after the Mongol conquests, roughly coincident with the European Renaissance, when Islam cocooned itself into reactionary mode, leading to stagnation and eventually to colonial subservience.