For what they were... we are: Eurasia

August 22, 2015

Ket genetics: strong "ANE" and a paleo-Eskimo link

Quantity over quality series.

Pavel Flegontov et al. Genomic study of the Ket: a Paleo-Eskimo-related ethnic group with significant ancient North Eurasian ancestry. bioRxiv 2015 (pre-pub). Freely accessible → LINK [doi: http://dx.doi.org/10.1101/024554]

Abstract

The Kets, an ethnic group in the Yenisei River basin, Russia, are considered the last nomadic hunter-gatherers of Siberia, and Ket language has no transparent affiliation with any language family. We investigated connections between the Kets and Siberian and North American populations, with emphasis on the Mal'ta and Paleo-Eskimo ancient genomes, using original data from 46 unrelated samples of Kets and 42 samples of their neighboring ethnic groups (Uralic-speaking Nganasans, Enets, and Selkups). We genotyped over 130,000 autosomal SNPs, determined mitochondrial and Y-chromosomal haplogroups, and performed high-coverage genome sequencing of two Ket individuals. We established that the Kets belong to the cluster of Siberian populations related to Paleo-Eskimos. Unlike other members of this cluster (Nganasans, Ulchi, Yukaghirs, and Evens), Kets and closely related Selkups have a high degree of Mal'ta ancestry. Implications of these findings for the linguistic hypothesis uniting Ket and Na-Dene languages into a language macrofamily are discussed.

May 18, 2014

Siberian genetics with focus on Yakutia

Informative study on the populations of Sakha Republic (Yakutia) and Siberia in general:

Sardana A. Fedorova et al., Autosomal and uniparental portraits of the native populations of Sakha (Yakutia): implications for the peopling of Northeast Eurasia. BMC Evolutionary Biology 2014. Open access → LINK [doi:10.1186/1471-2148-13-127]

Abstract

Background

Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.

Results

We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.

Conclusions

Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.

The matrilineal mitochondrial DNA pool is dominated by C4, C5, D4 and D5, with some instances of other lineages (see fig. 1). All these and most of the rest are common Siberian lineages of East Asian roots.

Among the oddities, the extremely rare haplogroup R3 has been found among North Yukaghirs in this study (previously it was only found in Jordan, as far as I know with any certainty). The authors mention that R3 and R1 derive from the same root, sharing two coding region mutations, and they therefore rename R3 as R1b. R1 is also a rare Indian matrilineage.

The patrilineal Y-DNA pool (see fig. 2) is massively dominated by N1c, which also dominates most Uralic-speaking peoples. This is unusual for a Turkic-speaking population but it has long been known. Other important lineages are C2 (formerly C3, typical of NE Asia and some North American populations), N1b and R1a. Some instances of I, E1b1b1, J, O, F and L are also reported. C2 is more important among the Northern (non-Turkic) populations of Sakha Republic, reaching 30-40%.

In the autosomal DNA pool, the heatmap (fig. 4) shows that among all sampled populations the Selkup are particularly isolated. Koryaks and Chukchis from the far NE Siberia form a small cluster of their own and so do Shors and Kets (West Siberians). Native American populations also show great individual isolation in comparison with most Eurasians.

Otherwise there are three major clusters: West/South/Central Eurasians, East Asians and Siberians, who generally also cluster with East Asians.

Some of this is also apparent in the PCA (fig. 5) although not as neatly:

PCA of the native populations of Sakha in the context of other Eurasian and American populations.

Maybe more illustrative is the ADMIXTURE analysis:

ADMIXTURE plots. Ancestry proportions of the 758 individuals studied (from 55 populations) as revealed by the ADMIXTURE software at K = 3, K = 4, K = 6, K = 8, and K = 13.

The analysis reveals, from K=6 upwards, the following clusters: West Eurasian (dark blue), South Asian (green), East Asian (orange, also light green at K=13) and several Siberian and Native American specific clusters (yellow, light and dark brown, red, etc.)

The persistence of the blue West Eurasian component in Aleutians and Greenlanders should raise some eyebrows. However, Greenlanders do not really cluster with West Eurasians in the heatmap, so this is almost certainly an artifact, one that indicates that a much greater K-depth is needed in order to properly classify this most diverse human sample. Thirteen clusters are obviously not enough.

April 12, 2014

Genetic paleohistory of domestic cows shows major differentiation in Africa

A new study reveals that African Bos taurus breeds are quite deeply diverged from the Eurasian branch, showing an early differentiation of the two continental populations, admixture with the African wild aurochs and livestock export from Europe to Asia.

Jared E. Decker et al., Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle. PLoS Genetics 2014. Open access → LINK [doi:10.1371/journal.pgen.1004254]

Abstract

The domestication and development of cattle has considerably impacted human societies, but the histories of cattle breeds and populations have been poorly understood especially for African, Asian, and American breeds. Using genotypes from 43,043 autosomal single nucleotide polymorphism markers scored in 1,543 animals, we evaluate the population structure of 134 domesticated bovid breeds. Regardless of the analytical method or sample subset, the three major groups of Asian indicine, Eurasian taurine, and African taurine were consistently observed. Patterns of geographic dispersal resulting from co-migration with humans and exportation are recognizable in phylogenetic networks. All analytical methods reveal patterns of hybridization which occurred after divergence. Using 19 breeds, we map the cline of indicine introgression into Africa. We infer that African taurine possess a large portion of wild African auroch ancestry, causing their divergence from Eurasian taurine. We detect exportation patterns in Asia and identify a cline of Eurasian taurine/indicine hybridization in Asia. We also identify the influence of species other than Bos taurus taurus and B. t. indicus in the formation of Asian breeds. We detect the pronounced influence of Shorthorn cattle in the formation of European breeds. Iberian and Italian cattle possess introgression from African taurine. American Criollo cattle originate from Iberia, and not directly from Africa with African ancestry inherited via Iberian ancestors. Indicine introgression into American cattle occurred in the Americas, and not Europe. We argue that cattle migration, movement and trading followed by admixture have been important forces in shaping modern bovine genomic variation.

The triple division (indicine or zebuine, Eurasian taurine, African taurine) is apparent even in the limited scope of Principal Component Analysis: African taurine breeds stand apart from the other taurines on PC2, while PC1 shows the pre-Neolithic taurine-indicine distinction:

Figure 1. Principal component analysis of 1,543 animals genotyped with 43,043 SNPs.
Points were colored according to geographic origin of breed; black: Africa, green: Asia, red: North and South America, orange: Australia, and blue: Europe.

An admixture-enabled phylogeny shows more clearly the deep divergence of the African branch of taurine cows:

Figure 4. Phylogenetic network of the inferred relationships between 74 cattle breeds.
Breeds were colored according to their geographic origin; black: Africa, green: Asia, red: North and South America, orange: Australia, and blue: Europe. Scale bar shows 10 times the average standard error of the estimated entries in the sample covariance matrix. Common ancestor of domesticated taurines is indicated by an asterisk. Migration edges were colored according to percent ancestry received from the donor population. Migration edge a is hypothesized to be from wild African auroch into domesticates from the Fertile Crescent. Migration edge b is hypothesized to be introgression from hybrid African cattle. Migration edge c is hypothesized to be introgression from Bali/indicine hybrids into other Indonesian cattle. Migration edge d signals introgression of African taurine into Iberia. Migration edges e and f represent introgression from Brahman into American Criollo.

Admixture K=3 is also consistent with this triple pattern:

Figure 6. Ancestry models with 3 ancestral populations (K = 3).
Blue represents Eurasian Bos t. taurus ancestry, green represents Bos javanicus and Bos t. indicus ancestry, and dark grey represents African Bos. t. taurus ancestry. See Supplementary Figures S5, S6, S7, S8, S9, S10 for other values of K.

The authors find that modern Anatolian breeds are not representative of early Neolithic cows:

Anatolian breeds (AB, EAR, TG, ASY, and SAR) are admixed between blue Fertile Crescent, grey African-like, and green indicine-like cattle (Figures 5 and 6), and we infer that they do not represent the taurine populations originally domesticated in this region due to a history of admixture. Zavot (ZVT), a crossbred breed [25], has a different history with a large portion of ancestry similar to Holsteins (Figures 2 and S8, S9, S10). The placement of Anatolian breeds along principal components 1 and 2 in Figure 1 [23], the ancestry estimates in Figure 6, their extremely short branch lengths in Figures 2–4, and significant f₃ statistics confirm that modern Anatolian breeds are admixed (see Methods for explanation of f-statistics).

As mentioned above, they also find that African taurines are much more deeply diverged from Eurasian taurines than would be expected if they had all diverged in a simple model from early Neolithic cows. According to the study, this is partly due to a later history of back-migration (or export) of European cows to Asia, including the Far East:

We conclude that there were two waves of European introgression into Far East Asian cattle, first with Mediterranean cattle (which carried African taurine and indicine alleles) brought along the Silk Road [29] and later from 1868 to 1918 when Japanese cattle were crossed with British and Northwest European cattle [25].

However, there is more: African breeds also appear to have important levels of admixture (~26%) with the native African wild aurochs:

The second factor that we believe underlies the divergence of African taurine is a high level of wild African auroch [30], [31] introgression. Principal component (Figure 1), phylogenetic trees (Figures 2 and 3), and admixture (Figure 6) analyses all reveal the African taurines as being the most diverged of the taurine populations. Because of this divergence, it has been hypothesized that there was a third domestication of cattle in Africa [32]–[36]. If there was a third domestication, African taurine would be sister to the European and Asian clade. When no migration events were fit in the TreeMix analyses, African cattle were the most diverged of the taurine populations (Figures 2 and 3), but when admixture was modeled to include 17 migrations, all African cattle, except for East African Shorthorn Zebu and Zebu from Madagascar which have high indicine ancestry, were sister to European cattle and were less diverged than Asian or Anatolian cattle (Figure 4), thus ruling out a separate domestication. Our phylogenetic network (Figure 4) shows that there was not a third domestication process, rather there was a single origin of domesticated taurine (Asian, African, and European all share a recent common ancestor denoted by an asterisk in Figure 4, with Asian cattle sister to the rest of the taurine lineage), followed by admixture with an ancestral population in Africa (migration edge a in Figure 4, which is consistent across 6 separate TreeMix runs, Figure S4). This ancestral population (origin of migration edge a in Figure 4) was approximately halfway between the common ancestor of indicine and the common ancestor of taurine. We conclude that African taurines received as much as 26% (estimated as 0.263 in the network, p-value<2.2e-308) of their ancestry from admixture with wild African auroch, with the rest being Fertile Crescent domesticate in origin.

As is well known, African breeds also show variable frequencies of indicine (zebu) ancestry, which is c. 0-20% in West Africa and as much as 74% in some East African breeds, owing to greater exchanges with Asia in historical times.

... we revealed two clusters of indicine ancestry possibly resulting from the previously suggested two waves of indicine importation into Africa, the first occurring in the second millennium BC and the second during and after the Islamic conquests [25], [34], [48].

However, the study notes that, after controlling for the admixture effect of the African wild aurochs, the apparent indicine admixture in some breeds collapses to zero (and is reduced in other cases):

Thus, we conclude that contrary to the assumptions and conclusions of [55] cattle with pure taurine ancestry do exist in Africa.

Other results are a confirmation of SE origin of European cows, a specific founder effect in Europe for shorthorn breeds and significant (8-23%) African admixture in Iberian breeds. Some American breeds are indeed a colonial mix of taurine and indicine.

Figure 5. Worldwide map with country averages of ancestry proportions with 3 ancestral populations (K = 3).
Blue represents Eurasian Bos t. taurus ancestry, green represents Bos javanicus and Bos t. indicus ancestry, and dark grey represents African Bos. t. taurus ancestry. Please note, averages do not represent the entire populations of each country, as we do not have a geographically random sample.

May 22, 2013

Ancient West Siberian mtDNA

Kristiina called my attention recently to this open access article on the ancient mtDNA of a district of South-Western Siberia known as Baraba.

V.I. Molodin et al., Human migrations in the southern region of the West Siberian Plain during the Bronze Age: Archaeological, palaeogenetic and anthropological data. Part of a wider book published by De Gruyter (2013). Open access → LINK

Fig. 1

Quite interestingly, we see in the data that before 3000 BCE this part of Western Siberia (see locator map at the right) already shows signs of West-East admixture, much earlier than Central Asia did.

This fact is consistent with the apparently old admixture detected among the Khanty in autosomal DNA and also with the Epipaleolithic presence of East Asian mtDNA (C1) in NE Europe and the putative Siberian origins of the Uralic family of languages and Y-DNA haplogroup N in NE Europe.

Fig. 2 (left) | Chronological time scale of Bronze Age Cultures from the Baraba region
Fig. 3 (main) | Phylogenetic tree of 92 mtDNA samples obtained from the seven Bronze Age cultural groups from the Baraba region. Color coding of the groups as in Figure 2

The Ust-Tartas culture is part of the wider Combed Pottery culture, usually thought to be at the origins of Uralic peoples in NE Europe and Western Siberia, and shows an almost balanced proportion of Eastern lineages (C, Z, A, D) and Western ones (U5a, U4, U2e), suggesting that the process of admixture was by then already consolidated.

However, the Odinovo cultural phase shows a change in this trend, with a clear hegemony of Eastern lineages (notably D) and the near disappearance of Western ones, a trend that continues in broad terms in the Early Krotovo phase.

Odinovo is part of the wider phenomenon known as Seima-Turbino, initiator of the Bronze Age in wide parts of Northern Asia and believed to have originated in Altai. However, the lineages do not correspond at all with the Altaian Bronze Age genetic pool, which is fully Western in affinity, except for those from Mongolian Altai, which are all D. Hence the apparent demic replacement happening in this period must have come from the Mongolian part of Altai or some other region and not from the core Altai area.

The oriental affinity of Early Krotovo is instead caused by a more diverse array of lineages (less D, more CZ and A), which is interpreted materially as reflecting migrations from Northern Kazakhstan (Petrovo culture). However, as mentioned before, the known mtDNA pool of Central Asia in that period is completely of Western affinity, so we must in principle discard Kazakhstan as the origin of the probable demic flows.

Let me mention here that the authors insist on continuity through these three phases. However, I see a very different picture in the same data, with Western lineages almost vanishing with Odinovo and Eastern ones clearly changing in frequency well beyond what random fluctuations could reasonably explain.

It is only in Late Krotovo that Western lineages reappear in significant numbers, probably reflecting, this time indeed, migrational flows from the South. This trend is clearly reinforced in the Andronovo, Baraba Late Bronze and transition to Iron Age phases, suggesting growing influence from Andronovo culture (early Indo-Iranians).

August 13, 2012

Neanderthal allele located in non-African modern humans

This should be of some interest:

Fernando L. Méndez et al., A Haplotype at STAT2 Introgressed from Neanderthals and Serves as a Candidate of Positive Selection in Papua New Guinea. AJHG 2012. Pay per view (for six months) → LINK [doi:10.1016/j.ajhg.2012.06.015]

Abstract

Signals of archaic admixture have been identified through comparisons of the draft Neanderthal and Denisova genomes with those of living humans. Studies of individual loci contributing to these genome-wide average signals are required for characterization of the introgression process and investigation of whether archaic variants conferred an adaptive advantage to the ancestors of contemporary human populations. However, no definitive case of adaptive introgression has yet been described. Here we provide a DNA sequence analysis of the innate immune gene STAT2 and show that a haplotype carried by many Eurasians (but not sub-Saharan Africans) has a sequence that closely matches that of the Neanderthal STAT2. This haplotype, referred to as N, was discovered through a resequencing survey of the entire coding region of STAT2 in a global sample of 90 individuals. Analyses of publicly available complete genome sequence data show that haplotype N shares a recent common ancestor with the Neanderthal sequence (∼80 thousand years ago) and is found throughout Eurasia at an average frequency of ∼5%. Interestingly, N is found in Melanesian populations at ∼10-fold higher frequency (∼54%) than in Eurasian populations. A neutrality test that controls for demography rejects the hypothesis that a variant of N rose to high frequency in Melanesia by genetic drift alone. Although we are not able to pinpoint the precise target of positive selection, we identify nonsynonymous mutations in ERBB3, ESYT1, and STAT2—all of which are part of the same 250 kb introgressive haplotype—as good candidates.

According to Wikipedia, STAT2 is a gene which offers immunity against adenoviruses, which are related to respiratory illnesses (common cold, pneumonia, bronchitis...) and some other common diseases like conjunctivitis or gastroenteritis. Most infections are mild and require no specific treatment.

This suggests to me that the selective pressure was quite weak, if there was any at all, so its introgression is most likely the product of a fluke and not selection. However, it is not totally impossible that in the past some viral strain was particularly deadly, causing adaptive selection.

(Slashed out text is edited: wrong notions)

~~Whatever the case it is also interesting to take a look at SNPedia, which lists five SNPs in this gene:~~

~~Rs1883832 - whose T variant is almost exclusively non-African~~
~~Rs4810485 - whose T variant is also almost exclusively non-African~~
~~Rs2066808 - whose T variant is dominant outside Africa but also somewhat common in Africa~~
~~Rs1927914 - whose T variant is dominant outside Africa but also somewhat common in Africa~~
~~Rs10983755 - whose A variant is almost exclusive of East Asians~~

~~I can only imagine (as I have not got access to the paper, so I can't double check) that the introgressed haplotype includes the first two SNPs in their T variants~~ (see below). If so, the Neanderthal allele should cause: increased risk of osteopenia in women, some increase in the likelihood of lymphoma, among other things (arthritis, asthma?) which I'm not sure about. I do not see any indication of the haplotype being beneficial in any way but you tell me.

Hat tip to Jean.

Update: I just got a copy of the paper, so I share these key figures:

We can see in them that the genomic positions at 55,030,689, 55,030,712 and 55,036,471 do not seem to correspond with the SNPs listed in SNPedia (so my previous inference was wrong, it seems).

We can also see in the map how the haplotype N is distributed in what would seem to be random founder effects.

There is a chance that the Denisova variant (haplotype D) is found in some Papuans but, as it is defined by just a single transition, this is not certain.

As you know, I dislike molecular-clock-o-logy, which I consider close to pseudoscience. However, considering that some recent paper has claimed (as they usually do: as if it were rocket science instead of a mere educated guess) low divergence ages for Neanderthals and H. sapiens, I feel almost obliged to mention that this paper estimates the haplotype divergence at some 500-731 Ka, which, after correcting for the usual underestimate of the Pan-Homo divergence, can be consistent with the classical archaeological understanding of the Neanderthal-Sapiens divergence before a million years ago, with the spread of Acheulean and H. heidelbergensis.

December 12, 2011

On the origin of mitochondrial macro-haplogroup N

The notion that the migration of Homo sapiens out of Africa had to pivot around West Asia has been deeply entrenched in our minds, partly because of geographical common sense, partly because of Eurocentrism, partly maybe because of the Judeo-Christian-Muslim religious background of most influential researchers historically...

However, in recent years this idea has been challenged by the coastal migration theory, which proposes a migration mostly along the coasts of the Indian Ocean rather than through the interior of Asia. This theory was first outlined by population geneticists, who needed to explain the facts of haplogroup distribution in Eurasia: diversity is not at all greater towards the West, as we would expect from the classical models pivoting around the Fertile Crescent, but rather towards the East and very especially in South Asia. Later it has also been corroborated, perhaps less clearly, by archaeologists who have sought material support in Arabia and India and found it.

While the origin of mitochondrial macro-haplogroup M in South Asia is seldom contested, that of its "sister" N is seldom agreed upon. The reason is that it is distributed somewhat evenly through all Eurasia, Australasia and even America.

This map, from the Metspalu 2005 paper (open access), illustrates the issue and how even renowned geneticists were unsure not long ago about where to place the urheimat of the haplogroup:

The phylogeny has anyhow been refined in these six and a half years and you may notice that Australasia is not even included in the map, although it does play an important role, being surely more important than West Eurasia. In any case the map is illustrative of this state of confusion. Confusion that I will try (once again and hopefully for good) to dispel in this article.

The facts of mtDNA N

Macro-haplogroup N has 15 acknowledged basal haplogroups scattered through all Eurasia and Aboriginal Australia. They have diverse numerical importance but what matters to me here is how many mutations (coding region transitions, to be more precise) they are downstream of the N node. Why? Because this is surely indicative of the timing of their respective expansions in relation with N as such.

Looking at this measure we find the following classes of N sub-haplogroups:

Elder daughters: one coding region mutation downstream of N: N1'5, N9, N11, S and R. Notice that among these R holds a special place, not for any phylogenetic reason but because it has a scatter as wide as that of her mother N, suggestive of a very early coalescence and some sort of association between both expansions.
Two mutations downstream of N: N10 and O.
Four mutations downstream of N: N2 (incl. W), A and X.
Extremely long stems, rare clades without any known node under N: N8, N13, N14, N21, N22.

This distinction is not very important but I always keep it in mind in any case, because it implies that the various classes of subhaplogroups expanded at different moments after the N node. Notably there is a "pause" at the place of the third mutation and then again after the fourth. So we can well imagine the expansion of N as a double explosion: first the first two categories, and then the third and maybe the fourth.
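As a bookkeeping aid, the classes listed above can be tabulated in a few lines of Python. This is only a sketch: the step counts are copied from the list, while the names `STEPS_BELOW_N` and `group_by_steps`, and the use of `None` for the long-stem clades with no known intermediate node, are illustrative choices and not part of any phylogeny tool.

```python
from collections import defaultdict

# Coding-region steps below the N node, as listed in the text.
# None marks the long-stem clades with no known node under N (an
# illustrative convention, not a measured value).
STEPS_BELOW_N = {
    "N1'5": 1, "N9": 1, "N11": 1, "S": 1, "R": 1,
    "N10": 2, "O": 2,
    "N2": 4, "A": 4, "X": 4,
    "N8": None, "N13": None, "N14": None, "N21": None, "N22": None,
}


def group_by_steps(steps):
    """Group clade names by their number of steps below the parent node."""
    groups = defaultdict(list)
    for clade, n_steps in steps.items():
        groups[n_steps].append(clade)
    return dict(groups)


groups = group_by_steps(STEPS_BELOW_N)

# Steps with no basal clade at all correspond to the "pause" in the text.
observed = {n for n in groups if n is not None}
pauses = [n for n in range(1, max(observed) + 1) if n not in observed]
print(groups)
print("pause at step(s):", pauses)
```

With the counts as given in the list, the only empty step between 1 and 4 is step 3, which is the pause discussed above.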

Representing each haplogroup as a dot, where they might have coalesced (often a hunch within the local region), the result is as follows:

1.- Estimated coalescence of basal subhaplogroups of N

The size of the dots represents only the "class", that is: how many mutational steps they are under N. The larger the dot, the closer the clade is to N and the earlier it must have coalesced (according to the laws of probability). The peculiar macro-haplogroup R (whose approximate coalescence location was estimated in the past and which I will not explain here) has been painted in a lighter blue and given a slightly larger size.

I have also outlined the cloud of N expansion at mutational steps 1 and 2 (no difference), which are followed by an apparent pause at mutational step 3, as mentioned above. The cloud has been pushed northwards a bit in East Asia in order to avoid disputes on where exactly N9 coalesced (it does not make much of a difference in the end if you prefer Beijing over Shanghai for this clade's coalescence).

Notice that this N cloud is almost identical to what the M cloud would be (not shown but look here for a reference if you wish). Whether they were simultaneous or, as I think, N coalesced and expanded a bit after M did, their geography was the same: South Asia, East Asia and Australasia without distinctions. This T-shaped region (with the East on top) was the homeland of the first Eurasian (or more properly non-African) population of Homo sapiens (except for those who remained in Arabia, who are another story).

The geographic origin of N

Alright, I have described the scatter of N subhaplogroups and the most likely sequence of the expansion but my main purpose here is to estimate the origin, the urheimat of N: where did the N matriarch, the ultimate matrilineal ancestor of all N people today, live?

I apply the statistical principle by which the derived basal haplogroups should tend to remain not too far away from the common origin, with the most distant ones being exceptions and never the rule. It does make sense, right?

Hence if we can estimate the centroid of the geometry described by the 15 haplogroups, we will have found the origin of N, or at least a raw estimate of it. There are several methods to estimate centroids but I chose to use the geometric one. In fact, for simplicity, I divided the subhaplogroups in three sets of five (so they all weigh the same) and estimated their centroids by geometric decomposition. Then I estimated the centroid of the resulting triangle.

If I am correct the raw centroid of N is at the lower Mekong:

2.- Possible origins of mtDNA N (blue flowers): A - 'raw' geometric centroid, B - corrected against directionality.
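The three-sets-of-five procedure can be sketched in Python as follows. This is a simplified stand-in, not the exact method used for the map: it takes the plain arithmetic mean of each set of points and treats (longitude, latitude) pairs as flat coordinates, and the points in `demo_groups` are arbitrary placeholders, not the estimated coalescence locations. The function names are hypothetical.

```python
from statistics import fmean


def centroid(points):
    """Arithmetic mean of (x, y) pairs, treated as flat coordinates."""
    xs, ys = zip(*points)
    return (fmean(xs), fmean(ys))


def centroid_of_groups(groups):
    """Centroid of each group first, then the centroid of those centroids."""
    return centroid([centroid(group) for group in groups])


# Placeholder points only: three sets of five, standing in for the
# 15 basal subhaplogroups. They are NOT real coalescence estimates.
demo_groups = [
    [(0, 0), (2, 0), (4, 0), (6, 0), (8, 0)],
    [(0, 10), (2, 10), (4, 10), (6, 10), (8, 10)],
    [(10, 1), (10, 3), (10, 5), (10, 7), (10, 9)],
]

raw_centroid = centroid_of_groups(demo_groups)
print(raw_centroid)
```

Because the three sets are of equal size, the centroid of the triangle formed by their centroids coincides with the mean of all 15 points, which is why the split into sets makes every subhaplogroup weigh the same.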

I have argued on occasion that, in order to compensate for the directionality of the expansion, a correction can be applied to the geometric centroid or raw estimate of the origin. This correction should pull the origin towards the parent node, in this case L3 in East Africa (estimated here). How much? Maybe 1/4, maybe 1/3... this step, even if probably very reasonable, is a guess and not rocket science. Here I chose to use 1/4 and then look for the closest coast, which is that of Bengal - alternatively I can use a crooked line that follows the geography and get the same result (even less ambiguously Bengal again).

Had I chosen a 1/3 value for the correction, it would fall in a more central part of India; with 1/5, surely in Burma. We can't be sure of where exactly that happened but we can be more than reasonably sure that it was between India and Cambodia.

And nowhere else: not in West Asia, not in Altai... thanks for the suggestions but I have heard that before... many times... always without a single piece of evidence nor well-reasoned backing of any sort.

The data says otherwise: around the Bay of Bengal or even further East maybe.
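The directionality correction described above amounts to moving the raw centroid a fixed fraction of the way towards the parent node. A minimal Python sketch, assuming a straight line on flat coordinates (it does not include the "closest coast" step or the crooked line following the geography), could look like this. The function name `pull_toward` and the coordinates are hypothetical placeholders, not the positions used for the map.

```python
def pull_toward(raw, parent, fraction):
    """Move `raw` the given fraction of the way towards `parent`.

    fraction = 0 leaves the raw estimate untouched; fraction = 1
    lands exactly on the parent node.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return tuple(r + fraction * (p - r) for r, p in zip(raw, parent))


# Placeholder coordinates, for illustration only.
raw_estimate = (100.0, 10.0)   # stands in for the raw centroid of N
parent_node = (40.0, 0.0)      # stands in for the estimated origin of L3

for fraction in (1 / 5, 1 / 4, 1 / 3):
    print(round(fraction, 3), pull_toward(raw_estimate, parent_node, fraction))
```

Comparing the three fractions side by side shows how sensitive the corrected origin is to this guess: the larger the fraction, the further the estimate slides towards the parent node.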

Getting R into the picture

I have said before (and it is obvious for anyone interested in population genetics) that mtDNA R is peculiar. While it is not different phylogenetically from other subclades of N which are separated by just one coding region mutation, its geographic distribution is very different, because R, like its mother N, is everywhere.

In order to show it more clearly, I drew approximate origins of all basal R-subclades (in lighter blue). The size of the circles follows the same logic as those of N above, representing only the distance from the mother node (R in this case, which means one step further downstream in relation to N), and hence a probable order of coalescence:

3.- Scatter of N (deep blue) and R (cyan) subhaplogroups. The flower indicates the possible common origin.

The scatter of R fits very curiously within that of N(xR). They do not overlap much, and at first sight it looks as if R could have pushed the other N lineages out to the margins of the common expansion cloud. However this does not seem to happen with M, so maybe another explanation is needed, such as undifferentiated N and R traveling together, mostly under the leadership of the latter, and causing different founder effects in different locations.

Whatever the case, it is worth some careful thought, because it is possible that both haplogroups (mother N and daughter R) coalesced in rapid succession in a single region (Bengal probably).

April 15, 2011

Stoning at Dmanisi

H. georgicus

Lots of cobbles have been found in a gully near Dmanisi, Georgia, the oldest known site of Homo erectus (or habilis or georgicus) in Eurasia. The cobbles could not have arrived there naturally and are not found in other areas of the site (Oldowan-style tools have been found instead).

The researchers suspect that the cobbles accumulated in the gully because it was a pass where our ancient relatives attacked carnivores by stoning them and robbed them of their prey. The unusually high frequency of carnivore bones at the site would seem to support this kind of strategy and confirm our earliest relatives as active scavengers who could rob other predators of their prey by using the skills nature gave them: sociality, intelligence and two free hands.

Full story at The Great Beyond (found at Archaeology in Europe).

December 23, 2010

Denisova hominins, Neanderthals, Melanesians and so on...

A new "bomb" has been dropped by the Pääbo team and their Neanderthal Genome Project. This review is just a very preliminary approach to really heavy material, dealing essentially with the autosomal DNA of the Denisova hominins, now sequenced, but also with their relationship to Neanderthals and to us.

A tooth found in the same cave carried mtDNA very similar to that of the finger bone. The tooth is morphologically distinct from both H. sapiens and H. neanderthalensis.

David Reich et al., Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 2010. Pay per view (supplementary material is freely available).

Denisova closer to Neanderthals?

NJ tree

You are by now probably familiar with the Denisova hominin, a mere finger bone found in a cave in Altai (Mousterian context). While the mitochondrial DNA placed Denisova's lineage almost twice as distant from us as our divergence from Neanderthals, the autosomal DNA makes Denisovans a closer relative of Neanderthals (left).

However I'd take this with a pinch of salt because autosomal DNA is subject to admixture and may therefore indicate a hybrid population or even individual.

For example it could well be the case that Denisovans were a hybrid population of H. erectus and H. neanderthalensis (or a related species such as H. heidelbergensis). Or your best guess.

You may also notice that in the above tree H. sapiens populations appear unusually divergent. This is not a distortion of this graph alone: it also holds when the chimpanzee outgroup is taken into account, yielding age estimates for the autosomal divergence of our species that are several times older than those obtained by comparing haploid lineages or justified by the archaeological record.

So I am quite uncertain about how to read this, and about whether errors are clouding our understanding.

Are Melanesians more admixed with Denisovans? Are Native Americans less admixed with Neanderthals?

The possibility of Melanesians being slightly admixed with Denisovans is probably the most explosive aspect of the paper. In Supplementary Information 8, the authors find a somewhat greater similarity between Melanesians and Denisovans than between Denisovans and any other Eurasian population.

A visual explanation is in the following eigenvector graphs:

Notice that the second image is nothing but a high-resolution zoom of the central clump in the first one (H. sapiens). Only at such high resolution can three micro-clusters be noticed, apparently reflecting different admixture levels with Neanderthals and Denisovans.

To further clarify this matter, the authors resort to statistical methods that confirm these clusters and maybe add some information on the admixture levels of several individual populations (no longer just the four Eurasian populations represented above but others as well). These calculations show that Melanesians are indeed slightly but significantly closer to Denisovans, while also retaining the general Neanderthal admixture of all non-Africans (or almost all).

And I say almost all because the Karitianas (a Native American nation of Brazil) are found to have much less Neanderthal blood than other non-Africans.

The estimates of Neanderthal admixture in Eurasians are c. 3% overall, with the following variations:

Cambodian 4.4%
Mongolian 4%
Han Chinese 3.2%
French and Sardinians 2.6%
Melanesians 2.5%
Karitianas 0.9%

Additionally Melanesians have c. 4.8% of Denisovan genetic contribution, totaling c. 7.4% of archaic admixture.

Update (Dec 25): it may well be only 4.8% of total archaic admixture if Denisovans were hybrids of Neanderthals and H. erectus (see here - scroll to near bottom).

Note: I have a technical doubt. In table S8.2 the French appear considerably closer to Neanderthals than the Sardinians, who seem less admixed than all other non-Africans but the Karitiana, yet in table S8.3 both are given the same admixture values. At the moment I really do not understand the reason for this difference.

In the same table S8.2, French, Han and Cambodians (and only they) also appear to show some admixture with Denisovans, though maybe a third or a quarter of that of Melanesians.

Affinities of the Denisova tooth, chronology of the Denisova cave.

In Supplementary Information 12, the authors deal with the possible paleo-anthropological affinities of the Denisova tooth (a molar), finding that it is closest in morphology to those of Australopithecus sp., H. habilis, African (but not Chinese) H. erectus and (oddly enough) Oase 2 (a H. sapiens that does not cluster with the rest of our species in this aspect).

Indonesian H. erectus is also very close if it is a second molar but not if this is a third molar.

H. sapiens (other than Oase 2), H. neanderthalensis, Chinese H. erectus, H. georgicus (Dmanisi) and H. antecessor/heidelbergensis (Atapuerca) do not cluster with it in any case.

In this section, they also deal with the radiocarbon chronology of the site, concluding that:

... we propose the following scenario: a first hominin occupation of the cave more than 50,000 radiocarbon years ago by the Denisova hominins, and a second occupation during the Upper Palaeolithic, at 30,000 years BP or later, probably by modern humans.

Feel free to discuss.

Update (Dec 23): Denisova mtDNA "modern"?

Dienekes mentions today that Niccolo Caldararo has published an article in Nature (freely available as PDF) suggesting that the Denisova mtDNA sequence may be corrupt. If this were true, then the sequence would be that of a H. sapiens.

This could explain some of the anomalies in the autosomal NJ tree and the related age estimates, which would make Chinese and French (for instance) diverge by more than 500,000 years, which is totally absurd.

However, considering that a very similar sequence was also successfully obtained from the tooth, this claim seems less likely.

Still, many questions remain open because there are issues, such as the divergence estimates for various H. sapiens, especially Eurasian H. sapiens, that just do not make any sense at all. So I'd say it's best to sit back a bit and wait patiently for more brilliant insights, which will no doubt come.

Update (Dec 25): see this new post for a more elaborate review of mine on this matter, including some intriguing hypotheses I am launching, partly based on feedback provided by commenters.

December 22, 2010

Horse had multiple domestication events (ancient equine mtDNA)

German researchers have successfully retrieved mtDNA from 85 ancient specimens from diverse Eurasian regions and periods, ranging from c. 12,000 BCE to the Middle Ages. They have found that most of the extant mtDNA diversity in the species existed before domestication.

M. Cieslak et al., Origin and History of Mitochondrial DNA Lineages in Domestic Horses. PLoS ONE 2010. Open access.

Abstract

Domestic horses represent a genetic paradox: although they have the greatest number of maternal lineages (mtDNA) of all domestic species, their paternal lineages are extremely homogeneous on the Y-chromosome. In order to address their huge mtDNA variation and the origin and history of maternal lineages in domestic horses, we analyzed 1961 partial d-loop sequences from 207 ancient remains and 1754 modern horses. The sample set ranged from Alaska and North East Siberia to the Iberian Peninsula and from the Late Pleistocene to modern times. We found a panmictic Late Pleistocene horse population ranging from Alaska to the Pyrenees. Later, during the Early Holocene and the Copper Age, more or less separated sub-populations are indicated for the Eurasian steppe region and Iberia. Our data suggest multiple domestications and introgressions of females especially during the Iron Age. Although all Eurasian regions contributed to the genetic pedigree of modern breeds, most haplotypes had their roots in Eastern Europe and Siberia. We found 87 ancient haplotypes (Pleistocene to Mediaeval Times); 56 of these haplotypes were also observed in domestic horses, although thus far only 39 haplotypes have been confirmed to survive in modern breeds. Thus, at least seventeen haplotypes of early domestic horses have become extinct during the last 5,500 years. It is concluded that the large diversity of mtDNA lineages is not a product of animal breeding but, in fact, represents ancestral variability.

The paper provides ample insight into the various haplotypes and where each is first found. Not all lineages come from the putative area of first domestication (the Eurasian steppe); some come from other regions (East Asia, mainland Europe, Iberia).

Fig. 2 - Ancient horse mtDNA haplotypes with timeline and region

Pottoka

They also researched primitive horse breeds. 39 haplotypes were confined to one of these breeds. Notable among them are those of the Iberian Peninsula (Lusitano, Marismeño, Cartujano, Garrano), which appear to have roots in pre-Neolithic local wild horses. ~~Similarly the Basque pony known as pottoka also has roots in ancient mainland European horses (X1, derived from D)~~. [Correction (Apr 6 2011): Pottoka's matrilineage X1 is of apparent Siberian origins: C - only attested in one individual. X1 as such is only documented since the Iron Age, in mainland Europe].

Other primitive breeds with unique haplotypes are Arabian, Cheju, Akhal Teke, Sicilian Oriental, Yakut, Debao and Fulani. The authors argue that the Altai Mountains and the Taklamakan and Gobi deserts were barriers partly impeding gene flow from the Eurasian steppe to East Asia, where several of these breeds belong.

Update (Apr 6 2011): see also this more recent post: Horse's double origins, on new research that uses autosomal DNA diversity to support a double origin of modern horses in the steppes and SW Europe.
