March 2, 2013

West Asian autosomal genetics: two clusters confirmed

You may be familiar with my understanding (also suggested by others, I guess) that West Asia has (at least) two ancient populations whose spread now overlaps but is relatively easy to discern. A clear example is the duality between Y-DNA haplogroups J1 (south/SW or lowlands) and J2 (north/NE or highlands). To my knowledge this is however the first study that formally asserts that duality for autosomal DNA.

Marc Haber et al., Genome-Wide Diversity in the Levant Reveals Recent Structuring by Culture. PLoS Genetics 2013. Open access [doi:10.1371/journal.pgen.1003316]

Abstract

The Levant is a region in the Near East with an impressive record of continuous human existence and major cultural developments since the Paleolithic period. Genetic and archeological studies present solid evidence placing the Middle East and the Arabian Peninsula as the first stepping-stone outside Africa. There is, however, little understanding of demographic changes in the Middle East, particularly the Levant, after the first Out-of-Africa expansion and how the Levantine peoples relate genetically to each other and to their neighbors. In this study we analyze more than 500,000 genome-wide SNPs in 1,341 new samples from the Levant and compare them to samples from 48 populations worldwide. Our results show recent genetic stratifications in the Levant are driven by the religious affiliations of the populations within the region. Cultural changes within the last two millennia appear to have facilitated/maintained admixture between culturally similar populations from the Levant, Arabian Peninsula, and Africa. The same cultural changes seem to have resulted in genetic isolation of other groups by limiting admixture with culturally different neighboring populations. Consequently, Levant populations today fall into two main groups: one sharing more genetic characteristics with modern-day Europeans and Central Asians, and the other with closer genetic affinities to other Middle Easterners and Africans. Finally, we identify a putative Levantine ancestral component that diverged from other Middle Easterners ~23,700–15,500 years ago during the last glacial period, and diverged from Europeans ~15,900–9,100 years ago between the last glacial warming and the start of the Neolithic.

Take the age estimates with all due caution, as always. My own estimates suggest a much older divergence, soon after the settling of West Eurasia, which could be 40-30 Ka ago.

As I have also said before, modern Jews (in this case Ashkenazim, but in general all Western or Hellenistic Jews in other studies, academic and amateur) cluster quite strongly towards the Northern or Highlander cluster, close to Cypriots, Turks and Lebanese but far away from the Palestinians (who are way too diverse for this to be an artifact of endogamy or of a recent arrival from the genetically quite unrelated Peninsular Arabia, or any other place). Therefore the bulk of the ancestry of modern Jews does not come from ancient Palestine but from some other areas further North, where we know that Judaism (and its offshoot Christianity) proselytized heavily in Antiquity.

Very interesting is also that the "lowlander" component is rather intensely scattered towards North and East Africa, with almost no influence in Europe and South Asia. Instead the "highlander" one has influenced especially Europe with minor influence into South Asia, as well as, to some extent, parts of North Africa (but not The Horn).

Figure 4. Comparisons of the Levantine and Middle Eastern modal components.
A) ADMIXTURE analysis based on 10 constructed ancestral components, with only the Levantine and Middle Eastern components highlighted. B) Frequency of the Middle Eastern component in world populations. C) Frequency of the Levantine component in world populations. Intensity of the colors reflects the frequency of a component in the plotted populations. Maps were produced using a weighted average interpolating algorithm, and therefore should be used as a guide rather than a precise representation of the frequency distribution.

I must say, as an aside, that the terms chosen to designate the components sound horrible and imprecise, because nobody really knows for sure what Levant and Middle East mean. West Asia is a clearer and more neutral nomenclature and North and South (or, as I choose, highlands and lowlands) are very descriptive, even if they overlap, especially about their likely origins in or near Kurdistan and Palestine respectively (most likely, as they are two archaeologically very rich regions in the Paleolithic as well as in the Neolithic).

18 comments:

  1. especially about their likely origins in or near Kurdistan and Palestine respectively (most likely, as they are two archaeologically very rich regions in the Paleolithic as well as in the Neolithic).

    If cluster peaks are an indication of origin then the "lowland" component has peaks much further south than Palestine, this is what they say about its distribution:

    "ADMIXTURE identifies at K = 10 an ancestral component (light green) with a geographically restricted distribution representing ~50% of the individual component in Ethiopians, Yemenis, Saudis, and Bedouins, decreasing towards the Levant, with higher frequency (~25%) in Syrians, Jordanians, and Palestinians, compared with other Levantines (4%–20%)."


    But the truth is that these peaks in components say nothing about where they may have originated, you should know this by now I'm sure, for if you change or add samples, some of these components appear/disappear just like phantoms. For instance they used for the Ethiopian samples those from Behar 2010, which were Cushitic/Semitic speaking Ethiopians, say they added the Omotic and Nilo-Saharan Ethiopians from Pagani as well and kept their K constant at 10, do you think we will see the same clusters along with their distribution formed?

    1. Indeed, I was not thinking of the peak densities because, in the case of Palestine for example, there must have been more inflow of the Northern component than in Arabia. As you correctly say, "these peaks in components say nothing about where they may have originated", at least not in any obvious manner. I was thinking instead of archaeology first of all (the two core regions of the Neolithic: Palestine and the Zagros area of roughly Kurdistan, which are also the regions with the most Upper Paleolithic findings as far as I know). And also, admittedly without any exhaustive research on my side, I have the impression that both regions are pretty diverse in haplogroups and could well be at the origins of J1 (Palestine) and J2 (Kurdistan). Feel free to correct me if you know something I may have missed.

      "For instance they used for the Ethiopian samples those from Behar 2010, which were Cushitic/Semitic speaking Ethiopians, say they added the Omotic and Nilo-Saharan Ethiopians from Pagani as well and kept their K constant at 10, do you think we will see the same clusters along with their distribution formed?"

      We'd see a sharper decrease in frequency towards the SW, I guess, also samples from the Sudans could well alter the density (higher in North Sudan, especially around Khartoum and essentially zero in the South instead, at least I'd imagine so). But the essentials of the graph would not be altered.

      Unless they allowed the Admixture exercise to reach a level where the African and Asian components of Ethiopians get merged into one (intermediate judging from Fst) which is an Ethiopian-specific one. That in fact happened to me when I was analyzing North Africans a year ago at K=10 and K=11 (Ethiopians were one of several "control" populations) but from the Fst values of the resulting homogeneous Ethiopian component I judge that it is an old blend but the product of intercontinental admixture in any case.

      What ADMIXTURE does in any case is to split the tested populations into a given number of clusters (K value) and then assign each population a proportion of affinity. From what I have seen, recent admixture should not "blend" as the Ethiopian and Fulani cases do in my exercise (that's the product of a long process of genetic homogenization) but the Fst distances are revealing. West Eurasian components (Ara, Mor and Ibe) have an Fst of around 0.080 among them, while their distance to "pure" African components (Mand for example) is almost 0.200 instead (0.195-0.196 for Arab and Iberian, 0.148 for the Morocco main component). The homogenized Ethiopian component instead shows milder values in both directions (c. 0.100 to West Eurasian components, c. 0.110 to West African ones), strongly suggesting admixture (followed by a long process of homogenization, as said before).
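
      The comparison described above can be sketched in a few lines. This is only an illustration of the reasoning, not the original exercise: it reads the quoted Ethiopian figures as 0.100 and 0.110 (an assumption about the intended decimals), and the helper name is hypothetical.

```python
# Fst figures quoted in the comment above. Reading the Ethiopian values
# as 0.100 and 0.110 is an assumption about the intended decimals.
WEA_TO_AFRICAN = 0.195        # Arab/Iberian components vs. "pure" African (Mand)
ETHIOPIAN_TO_WEA = 0.100      # homogenized Ethiopian component vs. West Eurasian
ETHIOPIAN_TO_AFRICAN = 0.110  # homogenized Ethiopian component vs. West African


def looks_intermediate(to_a, to_b, a_to_b):
    """Return True if a component is closer to both A and B than A is to B."""
    return to_a < a_to_b and to_b < a_to_b


ethiopian_is_intermediate = looks_intermediate(
    ETHIOPIAN_TO_WEA, ETHIOPIAN_TO_AFRICAN, WEA_TO_AFRICAN
)
print(ethiopian_is_intermediate)
```

      A component that passes this check sits between the two reference components in Fst terms, which is compatible with admixture but does not by itself prove it.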

      Hope you get my point on how I combine not just the cluster graph but also consider the Fst values produced by admixture in parallel, which IMO are most revealing of what each cluster means in relation to the others.

      Other analyses, with different populations, may refine this understanding but I doubt at this point that the general result will be altered in the essentials.

    2. PS- Seems I did not correctly introduce the link to the exercise I talk about in the last part of my post. It is THIS ONE.

    3. And what population on earth would not meet your standards of "a long process of genetic homogenization"? I have even shown you that West Eurasians themselves are a result of long-term homogenization: the first cluster that pops up for West Eurasians in the K progression for ADMIXTURE has an intermediate Fst between Africans and East Asians. Therefore there is nothing really exceptional about Ethiopians and Fulani having an intermediate Fst between other SSA and West Eurasians in clusters that are formed much later, but for some reason you like to emphasise them quite a bit. So please explicitly define what 'long term homogenisation' means, otherwise this just looks more and more like some type of fetish of yours*

      *Actually it is a fetish started by anthropologists a couple hundred years ago not just particularly yours, but you know what I mean....

    4. I don't know experimentally but groups of recent admixture like African Americans, Mexicans or other Creole admixed populations will surely not show homogenization at any K-depths. But of course you are right that any admixed population with millennia of inbreeding dominating the autosomal exchange will show that kind of homogenization at some point.

      It'd be indeed interesting to research how exactly different populations with different founder admixture ages behave under ADMIXTURE (with Fst control) and compare the results but I doubt I can do that (mostly no time but also technical issues).

      What you say about global K=2 is meaningless. I already explained to you that it does not work that way and that it depends a lot on population sampling levels, with cases where Africans are made to appear as Europeans instead (you just deny it, but it's obvious). That's part of the ADMIXTURE clustering mechanism and is just noise (unless you actually have two and only two clear-cut source populations, an overly simple case). That's why cross-validation is important: to know that we are looking at the most correct K depths, and that's usually neither K=2 nor K=3. Try cross-validation with your global exercise and you'll see. You know a lot about population genetics and such, but in this you have fallen into a very serious misunderstanding, more typical of people looking at these algorithms for the first time.
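
      As a minimal sketch of the cross-validation step mentioned above: ADMIXTURE can report a cross-validation error for each K when run with its --cv option, and the K with the lowest error is usually taken as the best supported. The log-line format parsed here is an assumption, and the error values are made up for illustration; they are not results from any exercise discussed in this thread.

```python
import re

# Assumed format of the cross-validation line in an ADMIXTURE log,
# e.g. "CV error (K=3): 0.5612".
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")


def best_k(log_text):
    """Return (K with the lowest CV error, dict of all parsed errors)."""
    errors = {int(k): float(err) for k, err in CV_LINE.findall(log_text)}
    return min(errors, key=errors.get), errors


# Made-up values, only to show the selection step.
sample_log = """\
CV error (K=2): 0.5889
CV error (K=3): 0.5612
CV error (K=4): 0.5547
CV error (K=5): 0.5581
"""

k, errors = best_k(sample_log)
print(k, errors[k])
```

      In practice the error curve is often shallow around its minimum, so several neighbouring K values may be almost equally well supported.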

      In order to demonstrate this properly (at the levels of empirical demonstration you are demanding), we would need to design a new exercise with Ethiopians, some other Tropical Africans, some West Eurasians and at least one East Asian (and/or Melanesian) control. I expect the Ethiopian component's Fst distances to WEA (and probably also to other Tropical Africans) to be quite a bit smaller than those between West and East Eurasians (and/or Melanesians), which in turn should be somewhat but not extremely smaller than the Fst between Tropical Africans and the various Eurasian/Oceanian components.

      In fact a previous exercise showed me that the East Asian (or actually Siberian) distance to West Eurasians (incl. Berbers) main components is in c. 1:2 relation or even slightly higher. I quote the results:

      Fst (components):

      Siberian-Berber 0.131
      Siberian-European 0.112
      Berber-European 0.054


      The specific Fst figures seem to vary across exercises, so they are only comparable in their proportions, not as raw figures. If moved to this exercise, with due other-African controls, the Ethiopian homogenized component should measure ~0.070 vs. Europeans and Berbers but maybe ~0.140 or ~0.150 vs. Siberians. Other Africans would be near ~0.200 vs. all (except Ethiopians).
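
      The "c. 1:2 or slightly higher" relation can be checked directly from the three figures quoted above. This is just the arithmetic, taking the Berber-European distance as the baseline (that choice of baseline is an assumption made here for illustration).

```python
# Component Fst values quoted above (Siberian, Berber and European components).
fst = {
    "Siberian-Berber": 0.131,
    "Siberian-European": 0.112,
    "Berber-European": 0.054,
}

# Express every distance relative to the within-West-Eurasian one.
baseline = fst["Berber-European"]
ratios = {pair: value / baseline for pair, value in fst.items()}

for pair, ratio in ratios.items():
    print(f"{pair}: {ratio:.2f}x the Berber-European distance")
```

      The Siberian distances come out at roughly 2.1 and 2.4 times the Berber-European one, which is the proportion that would be carried over to another exercise rather than the raw values.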

      Sadly I don't have time these days to get this kind of stuff designed and running but, if you do not do it first yourself, I'll try to do it in May, when I hope to have more time.

    5. Maju, my question is simple: what does 'long term homogenization' mean? By it you are also transparently implying that there were people who were isolated from each other for a long time and came together to form the modern Ethiopian gene pool. Yet you have absolutely no evidence for such a genetically divergent set of people coming to Ethiopia to form the modern gene pool; what we have rather is irrefutable evidence of all non-Africans holding a subset or a fraction of the genetic diversity found in Africans, just like all Native Americans holding a subset of the genetic diversity found in Asia. Beyond this, or secondary to it rather, you also have back and forth migration between Africa and outside of the continent, that's it, simple! The green cluster you see above is probably one such piece of evidence of millennia upon millennia of back and forth migration between Africa and Asia.

    6. What it means is that the individuals of a population get their genome so similar among them (because of long term repeated recombinations) that at some K-levels they appear as a distinct cluster, even if originally they had ancestry from two or more distinct populations.

      In the case of Ethiopians, I understand that they clearly show two different deep ancestries: Old African and Eurasian (or West Asian more specifically). These two populations were apart since at least 90-80 Ka ago, maybe earlier (up to 125 Ka ago) until whenever they met again, probably c. 50 Ka ago (relation between Aurignacoid and LSA) or maybe later if I'm missing something.

      This is backed in the haploid side of the evidence by notable amounts of Y-DNA and mtDNA haplogroups in Ethiopians that come from Asia (long after the OoA and after going through South Asia).

      ADMIXTURE measures affinity, not diversity. It says (in my example): "With K=1-9 clusters, given these populations, Ethiopians appear as 50% akin to the WEA cluster and 50% akin to the African one. With K=10-11 clusters instead they gain their own distinct cluster because they are more akin to themselves than to any of the other components" (long-term homogenization). However the genetic distance values (Fst) still show them as intermediate.

    7. The Eurasian ancestry that you see in Ethiopians is not really Eurasian but Afro-Eurasian, it has layers of historical imprints, starting from OOA, to the Neolithic all the way to historical times, it is not as simple as just Eurasian.

      I never said that ADMIXTURE measured diversity, but it is an established fact that non-African genes are a subset of African genes, just as Native American genes are a subset of Asian genes. If you want to use the intermediateness of FST distances to prove your point then you have to explain the intermediate FST distance of the West Eurasian cluster between the African and the East Asian one at K=3 first, then you can explain what is going on for K>4.

      With respect to your dates, well you already know where I stand on that: until I see concrete skeletal evidence of Homo sapiens outside of Africa before 50 KYA (including outside of the Arabian peninsula), a pre-Toba migration of modern humans out of Africa will remain in the realm of speculation for me, sorry.

    8. "The Eurasian ancestry that you see in Ethiopians is not really Eurasian but Afro-Eurasian, it has layers of historical imprints, starting from OOA, to the Neolithic all the way to historical times, it is not as simple as just Eurasian".

      I don't see it that way exactly: I think that the bulk of the Eurasian component is not merely post-OoA but post the Great Asian-plus expansion. I.e. returned from South Asia to West Asia (where it replaced almost completely the first OoA populations, as well as Neanderthals) and then to Ethiopia, probably in one or two waves rather than many.

      "If you want to use the intermediateness of FST distances to prove your point then you have to explain the intermediate FST distance of the West Eurasian cluster between the African and the East Asian one at K=3 first"...

      In your exercise? There is no West Eurasian component at K=2 there, so there is no Fst comparison of any sort (for a moment I was even going to run ADMIXTURE but then realized that no result would produce anything like what you say). However the very forced division of West Eurasians between an African and an East Asian cluster should weaken the Fst between these two, because WEA individuals are not just forced to "split allegiance", so to say, but also participate in the constitution of both components.

      "With respect to your dates, well you already know where I stand on that: until I see concrete skeletal evidence of Homo sapiens outside of Africa before 50 KYA (including outside of the Arabian peninsula), a pre-Toba migration of modern humans out of Africa will remain in the realm of speculation for me, sorry".

      Up to you. But there are skulls in Palestine, you know, from c. 125 Ka ago, which is also the first date of many other probably African-originated cultures in Arabia. Then there is an industry almost identical to Southern African MSA in Southern India pre- and post- Toba. Then there are skulls in China dated to ~67 Ka and 50-something, and then there is the fact that all Eurasian-plus genetics point to a first expansion in South and SE Asia (plus probably also Oceania and parts of NE Asia) before the colonization of West Eurasia. There may be exceptions but they are all L(xM,N) lineages, which are only somewhat relevant in Arabia.

      I did not know (or had forgotten) that you were so conservative about the OoA. Pity.

    9. "In your exercise? There is no West Eurasian component at K=2"

      I said at K=3, not K=2, at K=3 the West Eurasian cluster is intermediate between the African and the East Asian cluster in terms of Fst.

      Fst     Pop0    Pop1
      Pop0
      Pop1    0.101
      Pop2    0.131   0.159

      Much like Cavalli-Sforza predicted it 16 years ago.
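
      One way to read the K=3 table above is to find the most distant pair of components and see which one is left over. A minimal sketch, using only the three values quoted; which label corresponds to which population is not stated in the comment, and the helper name is hypothetical.

```python
# Pairwise Fst between the three K=3 components, as quoted above.
fst = {
    frozenset({"Pop0", "Pop1"}): 0.101,
    frozenset({"Pop0", "Pop2"}): 0.131,
    frozenset({"Pop1", "Pop2"}): 0.159,
}


def middle_population(pairwise):
    """Return the population left out of the most distant pair."""
    most_distant = max(pairwise, key=pairwise.get)
    populations = frozenset().union(*pairwise)
    (middle,) = populations - most_distant
    return middle


middle = middle_population(fst)
print(middle)
```

      With these figures the leftover component is Pop0, so Pop0 would be the cluster described as intermediate.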

    10. I can't see anything in that link other than coordinates.

      Let me guess: Pop0 and Pop1 are Eurasians and Pop2 Africans. You do get a very intermediate result but I have not seen any other example where the result is so extremely intermediate, but rather: 8:2 or 7:3 Asian:African at K=2, when Africans do even make a cluster (depends on sample sizes). The African affinity of WEA populations should be caused by minor African inflow, absorption of minor archaic components in places like Arabia (which may show up as distinct at deeper K-levels) and maybe the sometimes argued East Asian bottleneck (i.e. their founding numbers and diversity appear smaller, at least in the autosomal aspect).

      I'll work out some experiments on this because I think that your exercise produces strange values.

      "Much like Cavalli-Sforza predicted it 16 years ago".

      He "detected" a 2:5 apportion, not a 1:1. Otherwise why are there so few African lineages in West Eurasians?

  2. Would it be sensible to read the lowlanders/"greens" as representing a population originally from Arabia who then spread West during the early days of Islam...?

    1. Not really, because a) we know that the population movement was relatively modest in comparison to the existing populations in those areas of the Middle East that came under the rule of the Caliphate and b) populations from Arabia had been moving out to the larger Middle East for thousands of years already. Arabs themselves first appear in historical records in about 750 BCE, where they are recorded in Babylonia. Half a century later Assyrians settled defeated Arabs in northern Palestine.

      To me the Green areas fit much better the pre-Islam extent of Semitic languages.

    2. We just had a lengthy discussion on almost the same matter but in relation to Y-DNA haplogroup J1 (clearly related to the lowlander cluster - and also relatedly J2, tightly connected to the highlander one).

      I understand that there is some mystery still to clear up in this matter but that J1 in North Africa and Sudan must be pre-Islamic in any case and most likely pre-Semitic (i.e. pre-Phoenician). Earliest Neolithic (i.e. pre-PPNB) or even (Epi-)Paleolithic are the reasonable options to my eyes.

      But you best read the discussion and draw your own conclusions.

      Better research of Y-DNA J1 should help clear up things in any case.

  3. You will not get decent research on Y chromosome J1 for a long time. The haplogroup is tainted by the Semitic tag, Jews and Arabs, and most studies are Eurocentric, i.e. I, R1b, R1a. J2 has been whitewashed as a minority of Europeans have this haplogroup and it is found in the UK, Germany, Norden countries not just Italy, Greece and Iberia. Just look at FTDNA's phylogenetic root to see the bias.

    According to Eupedia, J1 is found in Europe mostly in highland, hilly areas of France, Greece, Italy and so on. That Eupedia fact, if that is what it is, does not gel with your southern lowland theory. Using modern frequencies not backed up by ancient dna has its perils. R1b's origin is controversial. mtDNA H's origin point is controversial. How haplogroup IJ split very conveniently into I in Europe and J in Asia, has never been explained. And Jews and Arabs are hardly fitting ethnic groups to stake any claims. Who are they? Where did they come from? What did they come from? Someone said that Arabs were mentioned in ancient times. Does that mean those Arabs are the same as Saudi Arabs and genetically contiguous? Arab just means nomad in ancient Semitic, it can be applied to a group of any phenotype and any haplogroup.


    1. My most sincere apologies, Ponto. Your comment was sent to the spam folder by the Blogger spam filter without my receiving any notification. I just saw it and corrected that; better late than never, I guess. Next time please, if you notice, send me an email (address in profile).

      "You will not get decent research on Y chromosome J1 for a long time. The haplogroup is tainted by the Semitic tag, Jews and Arabs, and most studies are Eurocentric"...

      Maybe, but still there's a growing interest in other regions, with researchers of non-European background signing more and more papers, so I truly hope, I even dare to expect, that we will get to know more eventually. It is a major haplogroup in any case, so it should arouse some interest.

      A key difficulty may be the endless Palestinian conflict: it's very likely that J1 originated in that region, and revealing Palestinians at the epicenter of major prehistoric events when Zionist propaganda insists on declaring them recent immigrants from Arabia (which is total nonsense) may hinder the research, especially as many studies on those matters are Israeli.

      But if the Mossad itself was badly hacked yesterday (as it was) we can truly hope that other obstacles will also be overcome.

      "J1 is found in Europe mostly in highland, hilly areas of France, Greece, Italy and so on. That Eupedia fact, if that is what it is, does not gel with your southern lowland theory".

      My usage of "Lowland West Asia" has nothing to do with the habitat of all J1 peoples (for example Eastern Caucasians) it is a way of speaking of two regions in West Asia: one centered in Kurdistan and the other in Palestine most probably. Use North-South or NE-SW nomenclature if you wish. These two regions are also apparent in material Prehistory: in the Paleolithic as in the Neolithic.

      "Using modern frequencies not backed up by ancient dna has its perils".

      We have to use the information we have. There's no other possible way unless we wish to fall in nihilism. We have to interpret according to the data we have... but be ready to change our conclusions if new evidence arises... when it does.

      "How haplogroup IJ split very conveniently into I in Europe and J in Asia, has never been explained".

      For me it's fairly simple: I is the IJ branch that arrived in Europe with Aurignacoid industries (best guess, see here), while J is the one that remained in West Asia (along with other less important lineages: G, T, etc.). R1b could have entered Europe along with I or later with Gravettian. On R1a I admit I still don't have a good hypothesis.

  4. Your point on the labels is well taken.

    The "B" component that they name "Middle Eastern", broadly speaking, fits the distribution of linguistically Afro-Asiatic populations, while a "Middle Eastern" label is suggestive of purely Semitic populations despite many Northern and Eastern African inclusions that would be outside of the Middle East. It does, as you note, seem to largely track Y-DNA J1.

    The "C" component that they name "Levantine", however is misnamed to a much greater extent. West Asian or Northern would be a better description indeed, although I don't necessarily concur that "highland" would be a good description of a component common in Northern Mesopotamia, the Pontic-Caspian steppe and the Northern Levant, all of which are lowlands. This component looks like a major contributor to Europe's first wave of demic contributions in the Neolithic. Its very low frequency in Central Asia, South Asia, Greece, and much of the Balkans, however, strongly disfavors it as a generalized Indo-European component. It is concordant, however, with a legendary highland place of origin for a population that was a substantial contributor to the Druze. The core area of this component looks almost Kassite-Hurrian-Caucasian. It does, as you note, largely seem to track Y-DNA J2.

    I would note that there are subclades within Y-DNA haplogroup E, when you dig deep enough, that also show strong alignments with either the North or the South autosomal components reported here, and that the split is not basal within Y-DNA haplogroup E. Instead, there are parallel North-South splits within many of its subclades, suggesting that the North-South split has origins at a time when many subclades of E had been part of a homogenized population that subsequently split on North-South lines.

    For example, E-V13 has a mostly Northern distribution while E-V32 has a mostly Southern distribution, even though both are part of E-V68, but E-M123, which is part of a sister clade to E-V68, has a similar distribution to E-V13 (although more strongly Semitic). E-M81 meanwhile, which shares a clade with E-M123 but not E-V68, has a wide presence in Europe radiating from Iberia, and E-M293, which also shares the E-M81 and E-M123 clade, has a Southern African and Eastern African distribution.

    1. Well, labels apart, the blue component IMO tracks J2 (not in Ethiopia, stronger in Europe, etc.), while the green component tracks J1 instead (lowland West Asia and North/East Africa almost exclusively). I don't see any component here tracking E variants specifically nor in group. If you're being misled by the Egyptian point, notice that it surely only represents the delta, which is relatively high in J2.
