July 19, 2012

Iranian Y-DNA

There is a new study on Iranian Y-DNA:


Abstract


Knowledge of high resolution Y-chromosome haplogroup diversification within Iran provides important geographic context regarding the spread and compartmentalization of male lineages in the Middle East and southwestern Asia. At present, the Iranian population is characterized by an extraordinary mix of different ethnic groups speaking a variety of Indo-Iranian, Semitic and Turkic languages. Despite these features, only few studies have investigated the multiethnic components of the Iranian gene pool. In this survey 938 Iranian male DNAs belonging to 15 ethnic groups from 14 Iranian provinces were analyzed for 84 Y-chromosome biallelic markers and 10 STRs. The results show an autochthonous but non-homogeneous ancient background mainly composed by J2a sub-clades with different external contributions. The phylogeography of the main haplogroups allowed identifying post-glacial and Neolithic expansions toward western Eurasia but also recent movements towards the Iranian region from western Eurasia (R1b-L23), Central Asia (Q-M25), Asia Minor (J2a-M92) and southern Mesopotamia (J1-Page08). In spite of the presence of important geographic barriers (Zagros and Alborz mountain ranges, and the Dasht-e Kavir and Dash-e Lut deserts) which may have limited gene flow, AMOVA analysis revealed that language, in addition to geography, has played an important role in shaping the nowadays Iranian gene pool. Overall, this study provides a portrait of the Y-chromosomal variation in Iran, useful for depicting a more comprehensive history of the peoples of this area as well as for reconstructing ancient migration routes. In addition, our results evidence the important role of the Iranian plateau as source and recipient of gene flow between culturally and genetically distinct populations.

Figure 1. Frequencies of the main Y-chromosome haplogroups in the whole Iranian population (inset pie), in the 14 Iranian provinces under study and in East Turkey [23], Iraq [20], Saudi Arabia [26] and Pakistan [24].
(a) Azeris and Assyrians, (b) Armenians, Assyrians and Zoroastrians, (c) Persians and Zoroastrians, (d) Bandari and Afro-Iranians. Pie areas are proportional to the population sample size (small pies, N<50; intermediate pies, 50<N<100; large pies, N>100) and the areas of the sectors are proportional to the haplogroup frequencies in the relative population.

See also the table of lineage frequencies inside the Iranian borders (for the rest of the region check supplemental materials).

Some notes:
  • B is found only in Hormozgan province and in Arabia. This is especially interesting in relation to the presence of this African lineage among the Hazaras of Afghanistan, probably the northernmost and easternmost extension of this lineage. 
  • E(xE1b) is also concentrated in Hormozgan and Arabia but, unlike B, it is only found in the Bandari community and, in Arabia, in the coastal states and not in Saudi Arabia. Notice that neither lineage is found among Afro-Iranians, suggesting that their presence in the area is pre-Modern.
  • E1b comes in several flavors among Iranians:
    • E1b1b1a1 (M78) - particularly common among Tehran Zoroastrians
    • E1b1b1b2a (M123) - most common among Kurds and nearby peoples
    • E1b1b1b2a1b (M2) - concentrated in the South
  • G among Iranians is mostly G2a, mostly G2a* and G2a3b1 (P303).
  • J1 is seldom found above 10%, while J2 is quite common, sometimes even dominant, which places Iranians among what I call Highland West Asians, dominated by J2. The main exception is Khuzestan (ancient Elam, nowadays Arabic-speaking).
  • Both R1a1 (M198) and R1b1a2 (M269) are common in Iran. R1a1 has only been found in its "asterisk" variant (i.e. not belonging to any subhaplogroup known so far). 

The authors suggest an original demic base of mostly J2a people, enriched since the Neolithic (???) mostly by Western and Northern gene flows, with less important flows from Africa and South Asia. They also propose a post-LGM colonization of much of West Asia from a refuge in or near Kurdistan, especially J1 flows southwards.

I strongly suggest taking these ideas with the proverbial tablespoon of salt and other spices. It is very possible that the suggested flows are much older (for example, the J1/J2 split could well date, in my opinion, from the original colonization of West Eurasia c. 50 Ka ago, while some of the other flows may also be much older than the authors imagine). 

While some diversity indicators suggest that J1 may ultimately have originated in Iraq, the reality is that Palestinians remain an ill-researched population which may hide many surprises, especially considering their high autosomal diversity and uniqueness. Palestine has been continuously occupied by our species since at least the "Aurignacoid" colonization of some 55,000 years ago (Emirian culture).

Also, except for the Indo-European invasion from the steppes in the Iron Age, which gave the country its name and main language, there are no particular reasons to imagine any major gene flows from elsewhere in West Eurasia, except maybe some localized ones. Instead, the possibility of flows from Europe at the end of the Upper Paleolithic (rock art of Turkey, alleged Epigravettian influences on the Zarzian culture) remains open.


Update: IJ* at the Caspian shores!

Waggg makes a couple of interesting remarks in the comments section, one being the high basal diversity of haplogroup Q (not really new but worth underlining because this lineage, so important among Native Americans, probably coalesced in or near Iran, something that many do not seem to realize). 

[Edited!] But the big hit is the finding of IJ(xI,J) in 1/42 Fars Persians and 1/74 Mazandarani, which is surely a clue to the origin of the macrohaplogroup IJ or at least one of its offshoots (it could still be J* or I*). It does seem to underline the notion that IJ and its local variant J originated in that area of Iran or its surroundings.
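
To put those two counts in perspective, here is a minimal sketch (Python, not taken from the paper) of the uncertainty around a frequency estimated from a single carrier. It uses the standard Wilson score interval at roughly 95% confidence; the function name `wilson_interval` is my own choice and the sample sizes are the ones quoted above.

```python
import math

def wilson_interval(carriers, sample_size, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = carriers / sample_size
    z2 = z * z
    denom = 1 + z2 / sample_size
    centre = (p + z2 / (2 * sample_size)) / denom
    half = z * math.sqrt(p * (1 - p) / sample_size
                         + z2 / (4 * sample_size ** 2)) / denom
    # Clamp to the valid range of a proportion.
    return max(0.0, centre - half), min(1.0, centre + half)

# Counts quoted in this post: IJ(xI,J) carriers per sampled group.
samples = {"Fars Persians": (1, 42), "Mazandarani": (1, 74)}

for group, (k, n) in samples.items():
    low, high = wilson_interval(k, n)
    print(f"{group}: {k}/{n} = {k / n:.1%} (interval {low:.1%} to {high:.1%})")
```

With a single carrier in a few dozen samples the interval is very wide (its upper bound is many times its lower bound), so these counts are better read as "present" than as a precise regional frequency.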

Important: it is clear in the Molecular Analysis section of the paper that the authors tested for both defining SNPs P209 (J) and M170 (I), so it is genuine IJ(xI,J), the first to be found on Earth as far as I know. 
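
The logic behind that call can be sketched in a few lines of Python. This is only an illustration of how a paragroup like IJ(xI,J) is assigned, not the authors' pipeline: the function name `classify_ij` and the "derived"/"ancestral" encoding are my own assumptions, while the markers are the ones named in the paper and in the comments (M429 for IJ, M170 for I, P209 for J).

```python
def classify_ij(calls):
    """Assign a sample to I, J or the paragroup IJ(xI,J) from its SNP calls.

    `calls` maps a marker name to "derived" or "ancestral";
    a missing marker means it was not tested (hypothetical encoding).
    """
    if calls.get("M429") != "derived":
        return "not IJ (or M429 not tested)"
    if calls.get("M170") == "derived":
        return "I"
    if calls.get("P209") == "derived":
        return "J"
    # Both downstream markers must be tested and ancestral to exclude I and J.
    if calls.get("M170") == "ancestral" and calls.get("P209") == "ancestral":
        return "IJ(xI,J)"
    return "IJ, unresolved (I or J not excluded)"


# Illustrative calls, not real genotypes from the study.
print(classify_ij({"M429": "derived", "M170": "ancestral", "P209": "ancestral"}))
print(classify_ij({"M429": "derived", "M170": "ancestral"}))
```

The second call shows why the P209 test matters: with only M170 checked, the sample could still belong to J.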

Thanks to Etyopis for noticing (see comments).

23 comments:

  1. "B is found only in Hormozgan province and in Arabia. This is especially interesting in relation to the presence of this African lineage among the Hazaras of Afghanistan, probably the northernmost and easternmost extension of this lineage".

    Yes, interesting. Especially when we remember that Y-DNA CT (or whatever you wish to call it) is actually a branch of B. Perhaps the Hormozgan/Hazara B emerged from Africa at the same time as CT?

  2. No, CT and B are parallel subhaplogroups of BT, as they call it at ISOGG.

    Otherwise I have no clear opinion on the origin of B and E(xE1b) in West Asia but it does look pre-Modern, rather unrelated to the Indian Ocean trade routes of the last thousand years or so. However it's rare enough to have almost any origin.

  3. "R1a1 has only been found in its "asterisk" variant"

    I think it's the usual "ubiquitous" R1a branch, M198 (not M193), that is called R1a1a* now (apparently there are also a few R1a* in Iran):

    http://3.bp.blogspot.com/-9ItFg3ZDOCc/UAfrJKKY7aI/AAAAAAAAFC8/WQfxUO6_9Vw/s1600/journal.pone.0041252.t001.jpg

    This presence of R1b1a2a1a is interesting too (it was already shown in Myres et al 2010 IIRC). Any ideas about it?

    It's also interesting that there are so many different Q.

    Clearly Q and R appeared not too far from there (not surprising).

    Surprisingly, there are also some IJ!
    And both I1 and I2! (part of the Zarzian culture maybe (at least one of these clades)??)

    Replies
    1. Right: M198 (not M193). Corrected, thanks.

      "This presence of R1b1a2a1a is interesting too (it was already shown in Myres et al 2010 IIRC). Any ideas about it?"

      It's only found in the Persian Gulf (Hormozgan Bandari and Khuzestan Arabs) at very low levels (2/134 in Hormuz and 1/57 in Khuzestan). My bet would be Portuguese or Dutch (powers who occupied Hormuz)... but who knows? In the case of Khuzestan it could even be Macedonian Greek (the first remnant ever found of Alexander's mass weddings?). No idea, really: it doesn't look Earth-shaking to me but rather anecdotal.

      "It's also interesting that there are so many different Q".

      I'll mention it but it's nothing new after all: it's more and more clear that Q must have coalesced in or near Iran, considering the haplogroup's basal diversity. R's and R1's diversity instead seem more concentrated in South Asia (Pakistan or nearby areas of India). P(xQ,R) seems most common towards Bengal.

      Overall it suggests to me an East to West migration from a putative MNOPS origin in SE Asia, surely in the counter-tide phase of the Great Eurasian expansion some 60 millennia ago (?) It seems to me that a people back-migrated through Northern India and Pakistan, rather quickly, leaving as genetic legacy Y-DNA P and mtDNA R (and some other N).

      (Sadly this is likely to ignite an endless debate with Terry about the coalescence area of N, which I think, based on basal diversity, lies towards the SE of Asia, while he imagines it towards the SW instead - I'll try not to get entrapped again).

      "Surprisingly, there are also some IJ!"

      Whoa! IJ(xI1,I2,J) is most impressive! That's a true phylogenetic finding!

      And it is roughly where it should be: at Mazandaran and Gilan, the southern coast of the Caspian Sea. Great!

      "And both I1 and I2!"

      They are both almost exclusively Armenian and Armenians are peculiar, concentrating more European lineages than any other West Asian population (also in R1b). That, I suspect, is because of a Thraco-Phrygian founder effect, as it was claimed back in the day that Armenians were a Phrygian colony (of course most Armenian lineages are local but some seem to retain that W>E signature). The only exception, 1/64 I1 in Gilan, may even be a Varangian (Swedish or Eastern Slavic) leftover... who knows?! But I'd bet on an Armenian origin in fact.

    2. I correct myself: IJ* is found in 1/42 Fars Persians (not in Gilan) and 1/74 Mazandaranis. It could be J* or I* or true IJ(xI,J). My bet would be J* however.

    3. "It could be J* or I* or true IJ(xI,J). My bet would be J* however."

      I don't think it could be J* or I* since they tested for the binary polymorphisms P209 and M170 respectively, so it could only be something downstream of IJ (M429) but either upstream or parallel to both I and J, and definitely not downstream of either I or J.

    4. How do you know that? P209 appears neither in Table 1 nor in Supplemental Table 1. However, you are correct about M170 (in table S1), which I did not notice initially.

      Maybe they mention it elsewhere in the paper?

    5. Ok I found it in section "molecular analysis". Thanks for the clarification, Etyopis.

  4. "(Sadly this is likely to ignite an endless debate with Terry about the coalescence area of N, which I think, based on basal diversity, lies towards the SE of Asia, while he imagines it towards the SW instead - I'll try not to get entrapped again)".

    I agree with your comments regarding MNOPS and R though:

    "Overall it suggests to me an East to West migration from a putative MNOPS origin in SE Asia, surely in the counter-tide phase of the Great Eurasian expansion some 60 millennia ago (?) It seems to me that a people back-migrated through Northern India and Pakistan, rather quickly, leaving as genetic legacy Y-DNA P and mtDNA R".

    I'd disagree with the inclusion of 'and some N' of course.

    "Whoa! IJ(xI1,I2,J) is most impressive! That's a true phylogenetic finding!"

    Indeed. It basically eliminates South Asia as an origin for the haplogroup. And presumably places IJK's origin somewhere near Iran as well.

    Replies
    1. I never thought IJ was South Asian by origin. Now, IJK probably was (although this cannot be argued without a good understanding of the distribution of upstream F and downstream K, neither of which looks "Western" at all - IJ and G are therefore exceptional within the overall structure of F and may have coalesced even very late in time; at least I do suspect that for G, not so much for IJ though).

  5. "And presumably places IJK's origin somewhere near Iran as well."

    The thing is, even absent other evidence, Iran surely had an extremely low total population 70,000 to ~15,000ya. I have a hard time seeing this as a hot-bed for haplogroup formation.

    Replies
    1. I would not rule it out altogether because it had two western border areas that were surely important at times in terms of prehistoric demography: the Persian Gulf Oasis (now submerged marshes, since the OoA) and the Zagros area (roughly Kurdistan, probably since the beginnings of the Westward colonization).

      Also, there's no objective reason why a low demography area cannot produce a large founder effect given the right circumstances of opportunity. A lot is pure luck in the end: being in the right place at the right moment.

      My two caveats are instead that:

      (1) we do not see any such "Western" founder effect in the mtDNA until, IMO, the Westward backflow process

      ... and...

      (2) nearly everything upstream of IJK and downstream in the K branch (until Q and R1b) happens East of the Baluchistan arid border

      So I see no particular reason for the near-IJK "clan" to have crossed this Balochistan buffer zone first Westwards (as pre-IJK) and then Eastwards (as pre-K). It is more reasonable, more parsimonious, to imagine a single Westward movement of pre-IJ on their own.

      When? IMO when the Westward flow began, not before c. 60 Ka. ago.

  6. I recently had my results back and I was in I2b...

    My father was Iranian.

    Has this haplogroup been seen in Iran up to now?

    Martin

    Replies
    1. Y-DNA I is known to exist in Iran at low frequencies, especially to the North and Northwest. Now, specific subclades... that's harder to say, more so when the nomenclature is changing all the time.

      What's the defining SNP or SNPs? Is it I2b (L415, L416, L417) as per ISOGG 2013, or is it rather I2b as defined in 2010: I2b (L35, L37, M436/P214/S33, P216/S30, P217/S23, P218/S32), now renamed as I2a2? Or something else?

  7. I'm not too sure of the exact grouping as I'm a newbie to all this but my markers are :


    DYS393 DYS390 DYS19 DYS391 DYS385 DYS426 DYS388 DYS439 DYS389I DYS392 DYS389II DYS458 DYS459
    11 23 16 10 12-14 11 13 11 15 11 31 20 08-09


    DYS455 DYS454 DYS447 DYS437 DYS448 DYS449 DYS464 DYS460 Y-GATA-H4 YCAII DYS456 DYS607 DYS576 DYS570 CDY DYS442 DYS438
    11 11 24 16 20 25 11-14-15-15 10 10 19-21 13 13 18 22 33-34 11 10


    Replies
    1. Those are STR markers, microsatellites (short repeated "junk" sequences, the value being the number of times they repeat), which are used for secondary reference but are not always too reliable. It is possible to infer the haplogroup from them but not with 100% certainty. They are mostly used to explore affinities within haplogroups, so if you confirm that you are I2b and you find someone from, say, Hamburg, with I2b who also has many similar STR markers, then you can very reasonably infer that you two are distant relatives by paternal line and that you shared a common patrilineal ancestor X generations ago (but take these time estimates with caution, they may well be older).

      You should also have been tested for SNP markers (single nucleotide polymorphisms: a single "letter" of the "GATTACA" code that changes - no repetitions, no deletions - these are the most reliable markers by all accounts because the chances of the same change happening twice in human history are extremely remote). I can only imagine that you were SNP-tested but don't know where to look. I fear I can't help you with that: ask your DNA company.
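
As a rough illustration of how such affinities are explored, here is a minimal Python sketch that counts the repeat differences between two STR haplotypes. The plain stepwise sum used here is a common simple convention, not necessarily what any testing company applies; the function name `str_distance` is an assumption, the first haplotype takes a few single-copy values from the comment above, and the second one is invented for the example.

```python
def str_distance(hap_a, hap_b):
    """Sum of absolute repeat-count differences over markers typed in both."""
    shared = sorted(set(hap_a) & set(hap_b))
    if not shared:
        raise ValueError("no markers in common")
    distance = sum(abs(hap_a[m] - hap_b[m]) for m in shared)
    return distance, len(shared)

# Single-copy markers taken from the values posted above.
posted = {"DYS393": 11, "DYS390": 23, "DYS19": 16, "DYS391": 10,
          "DYS437": 16, "DYS448": 20, "DYS438": 10}

# Hypothetical second haplotype, invented purely for illustration.
other = {"DYS393": 11, "DYS390": 24, "DYS19": 16, "DYS391": 10,
         "DYS437": 16, "DYS448": 19, "DYS438": 10}

distance, compared = str_distance(posted, other)
print(f"{distance} repeat steps apart over {compared} shared markers")
```

Multi-copy markers such as DYS385 or DYS464 are left out of the sketch because they need special handling, and a small distance only hints at relatedness: it does not replace SNP testing.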

  8. Sorry for the delay. FamilyTreeDNA groups me as I-M170. Has that been seen in Iran up to now?

    Replies
    1. AFAIK yes but I'm a bit uncertain about where and the exact frequencies.

      For example the old McDonald compilation (2005) reflected a strong presence of I among ethnic Persians but those results do not seem reflected in studies like this one, so the actual frequency is probably much smaller (and this review may be confusing markers, something not too rare so many years ago). In this paper, for example, I is within the "other" category, which does not reach such high frequencies anywhere.

      But in any case I is not unheard of in Iran or other northerly parts of West Asia (Turkey, Kurds, Afghans). But there is little focus on I outside of Europe.

    2. Correcting: this very study actually has data on I (table 1): Iranians have 0.2% I1 and 0.3% I2 (in a sample of almost 1000 people). It seems to be a very rare lineage in Iran. There's also some other 0.2% IJ* (which might be I* but not likely because it had never been observed before).

      I1 is found in Gilak and among Tehran Armenians; I2 in the Bandari of Hormozgan, Kurds and again Tehran Armenians.

  9. Stupid question from science-illiterate arts student - How "Arya" Turki and Arab is the Iranian gene mix?

    Replies
    1. By "Arya" you mean Indoeuropean, right?

      Anyhow, IMO the bulk of the ancestry of Iranians, as well as other peoples of the region, is older than any of those three expansions, which only contributed very modestly to the modern genetic pools. Almost all the lineages are local West Asian from times long before any of those ethnicities were formed.

      People change language and ethnic identity with certain ease.

      Genetic pools seem relatively stable since at least the early Metal Ages, when invasions were no longer a matter of farmers looking for land to settle but rather of aristocratic warriors (previously steppe/semidesert herders in most cases) looking for serfs and slaves, land included. This way elite domination changed the language and ethnic identity of many peoples with a very modest input in their genetic pool.

  10. I can't understand your scientific discussion. Can you answer my question? Is there any evidence of the Genghiz Khan invasion in this genetic data?

    Replies
    1. No, there's not even the slightest signal.

      I don't believe in the "Genghis Khan theory" anyhow: the ancestor of that lineage must IMO be much older, maybe a forgotten or legendary Turkic leader, but in any case haplogroup C is negligible in Iran and the so-called GK lineage is a particular subhaplogroup of C.

