Haplogroup R1a, most of which is R1a1, is dominant in Northern South Asia and Eastern Europe, as well as in much of Central Asia. It has been giving headaches to population geneticists, academic and amateur alike, because key downstream markers had not been identified, which made most of the haplogroup look like an amorphous goo, the same in India as in Europe. It seems that this may change now:
Horolma Pamjav et al., Brief communication: New Y-chromosome binary markers improve phylogenetic resolution within haplogroup R1a1. AJPA 2012. Pay per view (doi: 10.1002/ajpa.22167)
Abstract
Haplogroup R1a1-M198 is a major clade of Y chromosomal haplogroups which is distributed all across Eurasia. To this date, many efforts have been made to identify large SNP-based subgroups and migration patterns of this haplogroup. The origin and spread of R1a1 chromosomes in Eurasia has, however, remained unknown due to the lack of downstream SNPs within the R1a1 haplogroup. Since the discovery of R1a1-M458, this is the first scientific attempt to divide haplogroup R1a1-M198 into multiple SNP-based sub-haplogroups. We have genotyped 217 R1a1-M198 samples from seven different population groups at M458, as well as the Z280 and Z93 SNPs recently identified from the “1000 Genomes Project”.
The two additional binary markers present an effective tool because now more than 98% of the samples analyzed assign to one of the three sub-haplogroups. R1a1-M458 and R1a1-Z280 were typical for the Hungarian population groups, whereas R1a1-Z93 was typical for Malaysian Indians and the Hungarian Roma. Inner and Central Asia is an overlap zone for the R1a1-Z280 and R1a1-Z93 lineages. This pattern implies that an early differentiation zone of R1a1-M198 conceivably occurred somewhere within the Eurasian Steppes or the Middle East and Caucasus region as they lie between South Asia and Eastern Europe. The detection of the Z93 paternal genetic imprint in the Hungarian Roma gene pool is consistent with South Asian ancestry and amends the view that H1a-M82 is their only discernible paternal lineage of Indian heritage.
Not having access to the paper right now, I can't say much more but I believe that the abstract alone is very informative already.
Update:
Fig. 1 - MJ trees
A reader already sent me a copy of the paper and I think that it has two aspects:
On one side, the paper effectively detects these markers and studies them, along with R-M458, in Hungarians and related ethnic groups (Csangos, Szeklers, Hungarian Roma), as well as in Malaysian Indians, Uzbeks and Mongols. This part is informative, even if the selected Asian populations may not be the best choice (Mongols are low in R1a, and so are Tamils, who make up the bulk of Malaysian Indians).
On the other side, the authors attempt to read too much, not just into these haplogroups but especially into molecular-clock-o-logic estimates (based on the Zhivotovsky mutation rate, now considered obsolete even by molecular clock enthusiasts). A corrected age estimate would be roughly twice as old [ref 1, ref 2], and that means that neither the Kurgan expansion nor the Neolithic one could account for its arrival in Europe.
Even Underhill's age estimates would, after the due correction, imply at least LGM dates for the arrival in Europe. The authors' own dates, after the due x2 correction, give Late Upper Paleolithic dates for the haplogroups researched here.
Also the authors insist on arguing against a South Asian origin of R1a1 (Underhill 2010) on what sound like weak and fallacious arguments:
Previous publications have pointed out that regions of highest haplogroup frequencies do not always indicate the territory of origin (Cinnioglu et al., 2004) and high STR diversity may not be exclusively an indicator of in-situ diversification but could also be the consequence of repeated gene flow from different sources (Zerjal et al., 2002; Sharma et al., 2009).
Basically they are nagging: "Underhill could hypothetically be wrong in his conclusions but we have no evidence whatsoever that he is - just saying".
The real reason is that they seem to hope to find a more westerly origin for the lineage and attribute it again to Indo-European expansions, in line with classic speculations for which the high South Asian STR diversity levels are a big problem. However, it is most unlikely that a bunch of horse-riding nomads could so radically alter the genetic landscape of the whole subcontinent, all the more so when its agriculture was already fully developed, no doubt sustaining high population densities.
But notwithstanding all those highly questionable opinions, the discovery of new haplogroups adding to our comprehension of this major lineage is a great advance.
Update:
It seems that some of the data presented in this paper was already floating around in some circles, because ISOGG already includes the "new" haplogroups in its phylogenetic synthesis. Most interestingly, the two "European" clades (along with a third one, whose geography I do not know so far) make up a larger haplogroup (R1a1a1b1a - S198/Z282), which sits on the branch that is "brother" of the "Asian" one (R1a1a1b2 - S202/Z93).
As I was just commenting elsewhere, the key to the origins of R1a lies not so much in these low-level haplogroups but in the higher "asterisk" paragroup, which (from memory) used to be concentrated in Pakistan and nearby areas of India, etc.
But once it reached the level of R1a1a1b (S224/Z645), this lineage seems to have split in two: one branch (R1a1a1b1 - S339/Z283) which we can describe as "European" and another (R1a1a1b2 - S202/Z93) which we can describe as "Indian".
The European half is treated in this paper only as two of its subclades, handled separately, which may be confusing. Hence I am adding here a synthesis of the current ISOGG phylogeny of R1a, with some annotations, for easier reference:
- R1a* ··> Iran, Persian Gulf, Turkey
- R1a1 (L120/M516, L122/M448, M459, Page65.2/SRY1532.2/SRY10831.2)
  - R1a1* ··> Iran, Caucasus, Greece, Scandinavia
  - R1a1a (L168, L449, M17, M198, M512, M514, M515)
    - R1a1a* ··> where? (not clear)
    - R1a1a1 (M417, Page7)
      - R1a1a1* ··> where?
      - R1a1a1a (L664/S298) ··> where?
      - R1a1a1b (S224/Z645, S441/Z647)
        - R1a1a1b* ··> where?
        - R1a1a1b1 (S339/Z283)
          - R1a1a1b1* ··> where?
          - R1a1a1b1a (S198/Z282)
            - R1a1a1b1a*
            - R1a1a1b1a1 (M458) ··> Central & East Europe
            - R1a1a1b1a2 (S204/Z91, S466/Z280) ··> Europe, Central Asia
            - R1a1a1b1a3 (S221/Z284, S443/Z289) ··> where?
        - R1a1a1b2 (S202/Z93) ··> India, Central Asia
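For readers who prefer to check the branching mechanically, here is a minimal Python sketch of the list above. It assumes the longhand naming used here, in which every extra character in a name is one level deeper in the tree, so that the parent of a clade is simply its name minus the last character. That shortcut holds for the names in this list but would not survive a branch with a two-digit index. The function names (`parent`, `are_siblings`, `common_ancestor`) are my own, purely for illustration, and each clade is mapped to just one of its listed markers.

```python
# Longhand names copied from the list above, each mapped to one of the
# markers quoted there. Paragroups ("*") are left out on purpose.
MARKERS = {
    "R1a1": "M459",
    "R1a1a": "M198",
    "R1a1a1": "M417",
    "R1a1a1a": "L664",
    "R1a1a1b": "Z645",
    "R1a1a1b1": "Z283",
    "R1a1a1b1a": "Z282",
    "R1a1a1b1a1": "M458",
    "R1a1a1b1a2": "Z280",
    "R1a1a1b1a3": "Z284",
    "R1a1a1b2": "Z93",
}


def parent(clade: str) -> str:
    """Parent under the one-character-per-level assumption stated above."""
    return clade[:-1]


def are_siblings(a: str, b: str) -> bool:
    """Two distinct clades are siblings when they share the same parent."""
    return a != b and parent(a) == parent(b)


def common_ancestor(a: str, b: str) -> str:
    """Deepest shared node: the longest common prefix of the two names."""
    shared = 0
    for x, y in zip(a, b):
        if x != y:
            break
        shared += 1
    return a[:shared]


# Where do the M458 and Z93 branches meet?
node = common_ancestor("R1a1a1b1a1", "R1a1a1b2")
print(node, MARKERS[node])  # R1a1a1b Z645

# Where do the two "European" clades of the paper (M458, Z280) meet?
node = common_ancestor("R1a1a1b1a1", "R1a1a1b1a2")
print(node, MARKERS[node])  # R1a1a1b1a Z282

# Which node is the direct sibling of the Z93 clade?
print(are_siblings("R1a1a1b1", "R1a1a1b2"))  # True
```

Under these assumptions the M458 and Z280 clades meet at Z282, while either of them meets the Z93 clade only further up, at Z645.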
All the data on the geography of top-level "asterisk" paragroups is from Underhill 2010, already mentioned above. It suggests a West Asian origin for R1a overall, and a spread to the West and East from the R1a1a level or lower.
I used colors to emphasize the clades discussed here (purple for the larger haplogroup, blue for the European-leaning clade and red for the Indian-leaning one).
Clades in italics are "proposed", not yet consolidated.