Fu 2013: new ancient mtDNA sequences and "molecular clock" madness

It took me quite a while to get time to look at this study in some depth and when I finally did I must say I was rather disappointed. In any case, popular demand makes it necessary to discuss it.

Qiaomei Fu et al., A Revised Timescale for Human Evolution Based on Ancient Mitochondrial Genomes. Current Biology 2013. Pay per view [doi:10.1016/j.cub.2013.02.044]

The study has two aspects: one, of great interest, which is the sequencing of a number of ancient remains, the other a complex and quite poorly explained and rendered speculation on how these sequences could be used to produce a refined molecular clock. 

Ancient mtDNA sequences

Most of the sequences used by Fu et al. in their molecular clock speculations are new and that part is very interesting:

I have highlighted in lime green the new sequences, otherwise also noted by the marker b. It is of note that the "Crô-Magnon 1" sequence produced a C14 age of just a few centuries, being therefore removed from the collection. Other Crô-Magnon 1 remains produced no useful data. 

The authors also decided to discard as possibly contaminated the UP sequence from Paglicci Str. 4b. I have highlighted in red why they decided to do so: because the C→T misincorporation rate, characteristic of ancient remains, is too low, which makes contamination at least a serious possibility.

So we have as new data for the Upper Paleolithic landscape in Europe that the people of Dolni Vestonice carried lineages U* (found also in Swabian Magdalenian) and U8, in the line of haplogroups K, U8a (Basque) and U8b (Eastern Mediterranean). Also some late UP and Epipaleolithic sequences from Oberkassel (Low Rhineland, Germany), Loschbour (Luxemburg) and Continenza (Abruzzo, Italy) are U5b variants, consistent with other findings from various parts of Europe. In Paglicci (Apulia, Italy), another sequence yielded U2'3'4'7'8'9, surely an extinct variant of the ancestor of U8 and U2 (among other lineages). No radiocarbon date is available for any of the Italian remains.

In East Asia, Boshan, with B4c1a, provides one of the first Epipaleolithic sequences for the region.

Molecular clock madness

The authors seem to intend, or so declare, to refine the molecular clock estimates by means of using these sequences as intermediate calibration references. Here I get the first big question: with all the literature on ancient DNA, why only these sequences? No idea.

Then the contradictions arise. I believe that I have synthesized the most obvious ones in the following marginal annotations (in red) to their molecular clock estimates:

Furthermore, the authors claim in the text that U5 is the oldest branch to diverge from U. However, their TMRCA figure for it is only 34.4 Ka BP (coding region), while Kostenki 14 has an age of 38 Ka BP and already carried U2, which makes this claim extremely unlikely: U2 and its ancestor U2'3'4'7'8'9 should be considered the oldest U sublineage.

I do not understand either why they force age estimates for many lineages for which they have no working aDNA references and instead refrain from estimating the age of lineages for which they have several calibration points, like U2'3'4'7'8'9 or B4'5 (aka B).

In brief: the claims of this paper on molecular-clock-o-logy are ill-explained, confusing, incoherent... a total mess. The raw data on ancient mtDNA is however good looking and of doubtless interest.


  1. I recently found this paper (in fact, it is a 2013 paper) http://link.springer.com/content/pdf/10.1007%2Fs13219-012-0069-z.pdf The paper contains an endocranial study of Cro-Magnon 1. But now that it is clear that Cro-Magnon I is a fossil from the Iron Age, I think most of the (more specific) conclusions of the study are obviously wrong (they assume it is at least a Gravettian skull). I wonder how many studies, models and hypotheses about the evolution of skeletal morphology during the Upper Pleistocene might need a revision.

    1. "Hailat Araka and the South Arabian Neolithic" http://onlinelibrary.wiley.com/doi/10.1111/aae.12009/references this site seems to imply that someone studied ancient DNA from Arabia.

    2. It'd be interesting to get a copy. Even if it's just genetic testing of modern populations from Dhofar and Mahra, they are interesting enough. It's plausible however from the abstract that they also tested Neolithic aDNA.

  2. Update to my comment: A colleague just told me that the sample dated as Cro-Magnon I for the Current Biology paper is no longer considered part of the Cro-Magnon I fossil. So, I suppose a Gravettian date still makes sense.

    1. Indeed. The radiocarbon date corresponds to a specific bone of the Crô-Magnon 1 collection, which has therefore been discarded. It does not affect the whole Crô-Magnon 1 set, which otherwise looks legit - but did not provide any usable genetic material.

  3. In all sincerity, it looks like you may be running out of arguments with respect to the mutation rate issue, at least when it comes to mtDNA. It is quite hard to argue against these ancient-mtDNA-calibrated rates, and it is only going to get harder. For instance, the new paper out on Euro mtDNA genomes says this in the abstract:

    "Dated haplogroup H genomes allow us to reconstruct the recent evolutionary history of haplogroup H and reveal a mutation rate 45% higher than current estimates for human mitochondria."

    Paul Brotherton et al. (2013)

    ....consistent with the findings of Fu et al. (2013).

    1. I have not read Brotherton's paper (care to send me a copy?) so I can't judge his method and therefore his conclusions (which you quote but lack all the details)... but really, Fu and co. are not even considering a single ancient mtDNA H sequence. And, if you care to read the paper they are very messy in their exposition and logic: it's a very poor paper in regards to the molecular clock issues. Each time I tried to read it in depth I was perplexed: where's the logic?, where's the data? And that's largely why I took so much time to comment on it: because I could not understand... until I understood... that it did not have a well explained or even performed logic, except possibly for U5.

      The dates they mention for U5 and U are possibly correct but for the rest... nuthin'!

    2. “(care to send me a copy?) “

      Sorry, only read the abstract so far, I don't have access to it.

      “And, if you care to read the paper they are very messy in their exposition and logic: it's a very poor paper in regards to the molecular clock issues. Each time I tried to read it in depth I was perplexed: where's the logic?,”

      There is nothing 'messy' about it, to the contrary it is a clean and simple method. If it can be verified that a certain ancient remain lived within a certain time in the past, using for example carbon dating, then this time frame becomes an independent data-point. If that same ancient remain can also securely have its mitochondrial genome sequenced, then by comparing its sequence with that of modern peoples you have another completely independent data-point for the accumulation of mutations between the ancient remain and modern one. Solving with these two independent data-points simultaneously will thus give the rate at which those mutations in the mtDNA sequence accumulated since the time the ancient remain was alive. What is not logical about this?
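
The calibration idea described in this comment can be sketched as a simple regression: older sequences sit closer to the root, so the slope of "mutations from the root" against "age of the sample" gives a rate. This is only a minimal illustration under stated assumptions, not the procedure Fu et al. actually ran; the function name, the sample values and the alignment length are all hypothetical.

```python
def estimate_rate(samples, n_sites):
    """Least-squares substitution rate per site per year.

    samples: list of (age_years_bp, mutations_from_root) pairs, where the
    mutation count is measured from a common root to each sequence.
    """
    n = len(samples)
    mean_age = sum(age for age, _ in samples) / n
    mean_mut = sum(mut for _, mut in samples) / n
    sxx = sum((age - mean_age) ** 2 for age, _ in samples)
    sxy = sum((age - mean_age) * (mut - mean_mut) for age, mut in samples)
    slope = sxy / sxx          # mutations per year; negative, older tips are shorter
    return -slope / n_sites    # substitutions per site per year


# Hypothetical toy data, NOT taken from Fu et al.: (age in years BP, mutations from root)
toy_samples = [(0, 50), (10_000, 46), (30_000, 38)]
TOY_SITES = 16_000  # hypothetical alignment length, for illustration only

rate = estimate_rate(toy_samples, TOY_SITES)
print(f"{rate:.2e} substitutions/site/year")
```

The toy data are built to lie exactly on a line (0.0004 mutations per year over 16,000 sites), so the recovered rate is 2.5e-8. Real data scatter around the line, and a single rate is only meaningful if the lineages involved accumulate mutations at comparable speeds.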

    3. I'll ask around for the Brotherton paper. I'd like to ponder it properly.

      But about the Fu one, you will have to explain to me how it is that you see "a clean and simple method" where I only see confusing statements and conclusions that don't seem to correspond with the stated methodology. As I said above: why no estimate for U2'3'4'7'8'9 or B4'5? What calibration do they use for H, C, J, etc. when they have no such calibration points in their (quite self-limiting) working sample?

      So, yes, probably their method is good enough for U or U5 but all the rest seems spawned out of thin air.

      "Solving with these two independent data-points simultaneously will thus give the rate at which those mutations in the mtDNA sequence accumulated since the time the ancient remain was alive. What is not logical about this?"

      That they have no data points for most of the haplogroups they "resolved" and that they did not resolve most of the haplogroups for which they had two or more data points. U5 (and overall U) is the only haplogroup for which they seem to have applied their method: all the rest is thin air, while they failed to give us estimates for two haplogroups for which they do have enough data points: U2'3'4'7'8'9 and B4'5 (aka B).

      The logic is semi-OK but the application of it is chaotic and certainly not methodical, as one would expect from a study that claims to be scientific. So I'm not even questioning the logic: I'm just saying that they did not use that logic when producing most of the age estimates.

    4. I'm already reading Brotherton's study and I will comment on it this weekend, but I can say in advance that I find it interesting and well explained, even if I have minor differences about their pooling of MNE ("Middle Neolithic", which includes Late Neolithic and Early and Middle Chalcolithic, four cultural layers in total). That is a matter of detail, although it may affect the interpretation of the data.

      They do argue for a recent mtDNA H age but that does not seem too important and actually, IMO, damages the quality of the study, because they are not considering all the available ancient mtDNA H dates known to date, totally ignoring the Paleolithic and Epipaleolithic ones, which should be used when addressing these matters. They claim "10.9–19.1 kya" for H but there are aDNA sequences known quite older than the lower boundary.

      If we took the average age of 15.9 Ka BP, we would have to conclude that H originated in Iberia, the only place where H of about that age has been sequenced with total certainty. However there are many reasons to think that H has been sequenced only with HVS-I in other even older cases: Taforalt in Morocco (12 Ka BP, Kéfi 2005) and Sunghir in Russia (Gravettian period, H17'27 judging from HVS-I data). These two cases are not well documented but I see no reason to discard them as false either.

      Importantly, when discussing H of all lineages we have to understand that in most cases it cannot be detected with near 100% certainty using only HVS-I, because H overall and most of its sub-branches have very few basal mutations, so honest researchers without an agenda and intelligent researchers without mental blinders should always consider:

      (1) The oldest certain known cases (Cantabrian Magdalenian) - which these papers ignore (again), which is simply not acceptable.
      (2) The oldest possible (even if uncertain) cases, some of which I mentioned above (but there are many more with R* that may be, and probably is, H), which should introduce a great margin of uncertainty, to be clarified, if possible, by future research.

      So, unless you are treating the actual evidence with all the seriousness it deserves, don't expect me to take you seriously. It's easy to jump to false conclusions out of (a) intellectual laziness, (b) self-deceit and/or (c) intentional manipulation.

    5. By the way, I just noticed that I already had to disqualify Qiaomei Fu last year for another "junk" paper promoting the Neolithic replacement agenda.

    6. Maju, I think it's about time you abandon your insistence on older dates and reluctance to accept the significance of the Neolithic in terms of human population dispersals. The ancient DNA evidence is piling up against you. You have no argument against the molecular clock derived from Fu et al., and now there's even a paper specifically tailored toward mtDNA H. It's best to keep an open mind in this age of ancient DNA.

    7. Yesssir, yes General Lank. ;-)

      Or rather not. Try persuading me with good science first.

      "You have no argument against the molecular clock derived from Fu"...

      Yes I do: they are making up all the lines that are not U5, while not providing the data for the only lines for which they have calibration points (U2"9 and B4'5). It stinks of poor science. Care to read what I wrote above? Care to read the study itself and forge your own opinion?

      Also both U and H are anomalous within R and in general: one has way too many mutations and the other way too few. Both contradict the basic assumptions of molecular clock of random (and therefore roughly equal) accumulation of mutations. Building a "clock" only on U will almost necessarily be wrong.

      "and now there's even a paper specifically tailored toward mtDNA H"...

      Which is inconsistent with the data we know in everything regarding the molecular clock. More tomorrow.

    8. "Yes I do: they are making up all the lines that are not U5, while not providing the data for the only lines for which they have calibration points (U2"9 and B4'5). It stinks of poor science. Care to read what I wrote above? Care to read the study itself and forge your own opinion?"

      What do you mean by "making up"? All of the age estimates, U or non-U, are derived from the same mutation rate obtained by analyzing the branch lengths of the ancient samples. Certainly this is by far the most reliable mutation rate obtained so far; much more reliable than deriving a mutation rate from the age of the human/chimp split, which we don't even know (and even if we did, their method is more relevant to modern human mutation rates). And now there is even another mutation rate and age estimate based specifically on ancient mtDNA H samples. What else do you ask for?

      Fine-tuned mutation rates for specific lineages will be available as more aDNA becomes available. But the basic picture is clear enough already.

    9. How're you going to have the same mutation rate for H and U when they are such extremely different lineages in their mutational behavior?!

      See this please: http://forwhattheywereweare.blogspot.com.es/2011/06/mitochondrial-dna-and-molecular-clock.html

      Where most HV lineages have a mere 5-6 mutations downstream of R to present, most U ones have around 11. And this gets worse when we only consider the coding region. So, in the best pro-MCH hypothesis, HV has half the mutation rate of U on average, while in my own pet theory, there's no measurable mutation rate because of the "cannibal mum" effect: i.e. the dominant lineages tend to drift out almost every novel one. A friend and I spent some time checking equations and it seems to work for any minimally sizable population: fixation is easily achieved and minority haplogroups tend to be driven out quickly, unless the population is extremely small, which evens the odds between mutant and ancestral non-mutated lineages.
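
The claim that novel lineages are usually drifted out can be illustrated with a textbook neutral Wright-Fisher simulation of a single new mutant among N mothers. This is a minimal sketch with hypothetical parameters and a hypothetical function name, not the equations mentioned in the comment; it shows only how often one new variant is lost, and does not by itself settle whether a usable clock exists.

```python
import random


def fixation_fraction(pop_size, trials, seed=0):
    """Fraction of trials in which a single new mutant lineage reaches fixation.

    Neutral haploid Wright-Fisher model: every generation, each of the
    pop_size mothers is drawn at random from the previous generation.
    """
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        count = 1  # one mother carries the new mutation
        while 0 < count < pop_size:
            p = count / pop_size
            # Binomial draw: how many mothers of the next generation carry it
            count = sum(rng.random() < p for _ in range(pop_size))
        if count == pop_size:
            fixed += 1
    return fixed / trials


# Hypothetical parameters, chosen only to keep the run short
frac = fixation_fraction(pop_size=50, trials=2000, seed=1)
print(f"fixed in {frac:.1%} of trials; neutral expectation is 1/N = {1 / 50:.1%}")
```

Under neutrality the expected fixation fraction is 1/N, so the larger the population, the more certainly any single new variant is lost, while in a very small population its odds improve.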

      Personally I do not think that the MC hypothesis can be applied to mtDNA easily. Certainly not without carefully considering this issue of unequal lengths, with the intriguing result that it is precisely the star-like haplogroups (i.e. those with clear signs of very fast expansion) which show the smallest average branch lengths (very notably M relative to N and H relative to all R and even R0).

  4. Maju, in these age estimates, do they take into account the Central Asian H? In my previous comment I referred to this remark: “in contrast to that found in Europeans, sub-Hgs H6 and H8 among Central Asian/Altaian populations are characterized by distinctly divergent haplotypes. This finding may reflect a long-time separation of Asian and European H6 and H8 mtDNA pools and/or an earlier expansion of H6 in the eastern part of its present range. Indeed, the coalescence age of H6 in Central Asians is very deep—40,400 years.” The study also says that “The Near Eastern samples cluster together with Central Asian mtDNAs in the sub-Hgs H6b and H8, which are very rare in Europe. The finding is demonstrating a separate flow of maternal lineages south of the Caspian and the Black Sea in addition to well-known long-lasting migrations of pastoral nomads alongside the steppe belt that connects the Danube Basin, over the Pontic-Caspian, with Central Asia, Altay, and Manchuria.” http://mbe.oxfordjournals.org/content/21/11/2012.long

    As for the origin of H, it is indicative, IMO, that the highest frequencies of HV are found in West Asia (Iran, Turkey) and also in Europe. The highest frequency of HV in Europe appears to be in Finland (12%; according to Wikipedia). Hg H has very deep roots in West Asia and the story in Iberia and the expansion from Iberia must be a quite recent postglacial event. If there was an important expansion of people from Iberia extending all the way to Asia, it is noteworthy that the haplogroup V, sister haplogroup of H, has quite remarkable frequencies even in the Urals (among Bashkirs who have also ydna R1b, Tatars, Chuvash, Mordvinians and Mari). The highest frequency of V is in Northern Scandinavia among the Saami (42%), as we know. This Iberian postglacial expansion theory is based on the high frequency of V among the Basques (12.4%). Other important frequencies are the following: Maris (10%), Sardinians (6%), Slovenes (5.5%), Ukrainians and Poles (c. 4.5%), Germans (4.3%). Do you have an opinion on this?

    As for the Northwestern Russian Yuzhnyy Oleni Ostrov burials and the haplogroup H detected there, according to this research: http://www.sarks.fi/fa/PDF/FA25_3.pdf, radiocarbon dates on human skeletal samples from five burials date the cemetery to approximately 7700–7300 BP, which is about 7000–6200 cal BC. I suppose that these dates have not been taken into account in this Fu paper.

    1. I'll send you a copy if I have your email so you can judge by yourself, Kristina. All the exposition is very sloppy in Fu's paper, so I cannot reply directly to your question of "do they take into account the Central Asian H?" Neither the paper nor the supp. materials are explicit about this or many other details.

      We will surely discuss this matter better on Brotherton's paper, which I plan to discuss at length tomorrow; it deals directly with H and is much better explained. In this case I have to disagree with their method (especially ignoring the Magdalenian H, never mind Sunghir, etc.) and resulting estimates but at least the study is clear and transparent.

      "The highest frequency of HV in Europe appears to be in Finland (12%; according to Wikipedia)".

      Frequency is not too informative, diversity is instead. It's plausible that Finns have very specific lineages because of founder effects and the well known accelerated drift caused by relative endogamy. Anyhow I presume that they mean HV(xH,V), right?

      "... haplogroup V, sister haplogroup of H"...

      They are not really sisters: H is derived directly from HV-root, while V belongs to HV0, a derived clade, more like aunt and niece. H and V anyhow have no particular phylogenetic relation other than both being part of HV. They have been paralleled in the literature because of a certain similarity in distribution (from North Africa to Northernmost Europe) and maybe origins but their actual relation is somewhat oblique in terms of phylogeny. They are not more related than U5 and U4 (or U2, or K) for example.

      "I suppose that these dates have not been taken into account in this Fu paper".

      No idea, as I say the paper is messy and they do not actually seem to consider any ancient H reference at all. Brotherton certainly ignored them, as he did with the Iberian sequences (notably Magdalenian but also Neolithic). Tomorrow...

    2. PS- I don't seem to have your email. Email me to lialdamiz[at]gmail.com and I'll send you copy of both papers, so you can judge on your own. (Whoever else wants them, also, feel free to ask).

  5. I just checked the frequencies of R1b among the Volga-Ural people (Mari 10%, Komi 16%, Udmurts 3%, Hanti 19%), and it is exciting to observe that there may be a correlation between ydna R1b and mtdna V in Central Asia, although the Komi, Udmurts and Hanti do not carry hg V and Mansi only 0.7%.

  6. Well, Central Asians seem to carry R1b1a1 (R-M73) which is not found in Europe. If there is correlation between R1b and V, the point of expansion must be somewhere in Caucasus and Turkey and not in Europe.

  7. I will send you my email address! It is true that V was a bit beside the point, but I am just personally interested in its origin and I find its distribution exciting. Yes, Finns should have HV, if what they say on Wikipedia is correct:
    HV0 and HVSI C16298T: Defining mutation C/T at location 16298 in segment I, one of the hypervariable segments, is labeled as HV0 as of 2012. The percentage of people that tested positive for the above mutation in a study of western European populations in 2002 is given below:
    Population #No % of population
    Finland 50 12%
    Is this HV in Finland then preHV, the ancestor of V? It would be very logical.

    1. It does not have to be ancestral to V but parallel branches like HV0a1 or even V itself. The Wikipedia article is not clear on the matter and the referenced 2002 study is pay per view (so I can't check if they tested for V markers or not). V would test positive for that marker (as part of HV0) but maybe was segregated by another test (or possibly not). Considering that a subclade of V is extremely common among the Sámi, I would imagine that at least part of that HV0 is V (AFAIK most, but not all, European HV0 is V), but I'm not sure.

      In any case, including V, the population on all Earth with the highest frequency of V and HV0 are the Sámi (about 40-50%), but all is a single subclade of V (can't recall which one), clearly caused by founder effect + subsequent drift (very small population all the time).

  8. Yes, that percentage on Wikipedia must be misleading. The percentage of H is 40% and the percentage of HV + V is 6.5%. However there seems to be HV in Finland. I went to the following Family Tree page at the address http://www.familytreedna.com/public/Finland,Finland,Finland/default.aspx?section=mtresults (I am not sure if you're allowed to see it) and found the following frequencies: HV = 9, HV0 = 4, HV8 = 2 and HV9 = 1. The amount of V in this chart was 69 and the amount of H over 700. Two of these HV0 are HV0a1 which seems to be parallel to V! Lines HV6 and HV9 are found also among Eastern and Western Slavs. Many of all different HV lines are found among Eastern and Western Slavs: HV 4 and HV 6 - HV9. http://mbe.oxfordjournals.org/content/25/8/1651.full

    By the way, I checked if I could find any rare LBK lineages in Finland based on that Family Tree page. I could find only one H26c. Instead, I found many other rare lines: H52, H69, H72, H11, H13, H17, H24, H27, H28, H31, H35, H36, H39, H45, H49, H85. I am perplexed. So many rare lines and this is only the list of a small country.

    When we are doing these comparisons between a handful of ancient samples from 1 burial place and a whole nation like the Germans and the French, I would say that there are many uncertainties. I would bet that the genetic landscape was much more patchy in the earlier times, like the linguistic map of Native Americans. If we examine one burial place, it is probably typical of that small place and people are much more related to each other than to people 300 kilometers away.

    1. It is interesting what you say, although it's difficult to evaluate without a wider context (i.e. in which other places do these lineages show up, frequency, diversity if available). HV(xH,V) do exist in Europe and West Asia but, being rather infrequent they are not well studied.

      The more you sample the more rare lineages you will find and NW Europe, including Finland surely for this case, is very densely sampled by companies like FTDNA (this depends on the interest of people and probably also purchasing power, etc.)

      "... I would say that there are many uncertainties. I would bet that the genetic landscape was much more patchy in the earlier times"...

      Very possibly so. And we have very wide blanks.

  9. I think that they should study people in the areas around these burial places and test if they can find any continuity between them and the earlier people.

    I agree with you that H is probably Paleolithic in Europe. HV seems to have spread in Eastern Europe very early, it could well be with the first wave, i.e. 40,000 years ago. H may have developed in Southern Russia and spread in all directions. Iberian H developed into H3, and there also seem to be many other H lines that developed in Europe after the Ice Age. I would say that different H lines were moving with many different ydna lines, but ydna R could be the most important.

    I do not understand why so many want to be descendants of Neolithic farmers. I really do not understand this zeal to believe that "most" lines are of Neolithic farmers from the Middle East. This weekend I had very interesting discussions with a person who is writing a book on the history of dogs. He again said that Paleolithic Europeans were taller and stronger and much better nourished than Neolithic farmers and the later agriculturalists.

    1. AFAIK there is no Paleolithic cultural flow from Russia or elsewhere in Eastern Europe to the rest of the continent, so it is much more likely that H expanded from Central Europe (where it has high diversity) to East and West, either in Aurignacian or Gravettian times (arriving to North Africa with Oranian, which probably derives from the somewhat peculiar Southern Iberian Solutrean with marked Gravettian substrate). However the little aDNA data we have for Central European (late) UP (two samples from a single site) may suggest that U*-CRS was dominant, at least in Swabia, in the Magdalenian period.

      The somewhat more extensive sample of the Epipaleolithic period (n=5) suggests a U5+U4 hegemony in the Epi-Magdalenian of that region, which can't be derived from U*-CRS, suggesting the expansion of a different group or groups in that period.

      In any case together they suggest a re-expansion of H into Central Europe after the Epipaleolithic. The first signs of this re-expansion of H (from the Danube?, Balkans?) came with LBK Neolithic but that alone explains nothing. Further expansions must have followed and may have arrived from the West, as the strong BB association suggests.

      "I do not understand why so many want to be descendants of Neolithic farmers".

      I guess it's influenced by the image of "industrious", "civilized" and (very arguably) "peaceful" associated with agriculture in our societies. By contrast the hunter-gatherers would be perceived as "barbaric", "hippie" and "marginal", while the Kurgan invaders may be associated with "violence", "slavery" and "barbarism" as well. I guess that the yeoman farmer is the closest that modern bourgeois mentality can identify with in past times other than traders and crafters, too specific and generally non-dominant.

      So I guess that there is a semi-conscious choice between:
      1. hippie primitive forager
      2. industrious semi-civilized farmer
      3. barbaric warlord raider from the steppe

      A lot of middle-class people prefer option #2. That's surely also the reason why Renfrew detached himself from Gimbutas and promoted the theory of Neolithic Indoeuropean: because, as a modern Indoeuropean, he (and his followers) found it difficult to identify themselves with the Scythians (probably the closest historical thing to primitive Indoeuropeans), exactly as they would always find it difficult to understand Genghis Khan or Attila.

      Greco-Roman culture, Middle Eastern religion, bourgeois mentality...

  10. I like your typology. It is amusing and probably so true! This friend of mine said that these farmers had to work hard day in day out (with some exaggeration). The Near Eastern week is a 1-day rest after 6 days of work. I think that in Kalevala (our national epos) there was 1-day rest after 2 or 3 days work. On the other hand, the problem is that so many believe that migration is necessary for cultural diffusion.

    I suggested Southern Russia (or Ukraine), because there was so much U in Central and Northwestern Europe and I linked it with Gravettian culture and the ydna I. Aurignacian culture spread from the East, but I think that H is not that old. It should be HV instead. I suppose that your idea is that this HV developed into H in Central Europe from this Aurignacian HV. Might be. My idea was that H spread in Europe later with ydna R: with R1b to Iberia through the Mediterranean and with R1a to more Northern areas. However, it is true that the ancestral R1b came from Turkey and Caucasus, not from Southern Russia. However, I think that R1a took a more Northern route to Europe. It is clear that the answer boils down to the age of haplogroup H!

    As for the other lines, there are already so many of them that are clearly Southern compared to H. There is JT in Iraq/Caucasus area, R0 and N1a in Saudi Arabia, N1b in Egypt, N1c in Gulf area, N2a, I and U7 in Iran, preHV and U3 in the Levant, U6 in Maghreb. Europe is big and we deserve at least a few autochthonous lines, and the best candidates are H, U5.

    1. As far as I know both Aurignacian and Gravettian spread from Central Europe, with West Asian precursors. Then, since the LGM there are no important flows into Europe detected in the archaeological record, but there are some expansions from SW France (Solutrean first, Magdalenian later...) AFAIK there's not a single expansion from Eastern Europe before Neolithic.

      So the archaeologically-based structure of pre-Neolithic Europe should be something like:
      1. Aurignacian Layer (pan-Euro, maybe different subgroups), Kostenki in Eastern Europe (distinct, later "Aurignacized"). Center: Central Europe with West Asian precursors.
      2. Gravettian Layer (pan-Euro). Center: Central Europe with West Asian precursors.
      3. Solutrean Layer (SW Europe with possible offshoots to Central Europe, North Africa). Center: Dordogne. Epi-Gravettian persistence in Italy, Eastern Europe.
      4. Magdalenian Layer (SW and Central Europe). Center: Dordogne. Epi-Gravettian persistence in Italy, Eastern Europe. Ahrensburgian-Hamburgian in North Sea area.
      5. Epipaleolithic: complex, ill-understood sometimes, but essentially descendant of previous cultural layers. Expansion to the North.

      The main expansions from Eastern Europe seem to be in the Chalcolithic: (1) Pitted Ware, derived from Dniepr-Don Neolithic (can probably be tracked therefore to Gravettian roots), (2) Kurgan cultures from the Lower Volga (Samara Valley), pre-Neolithic roots unknown (not yet dug). And in parallel to both but further North the Finno-Ugric expansions that can be associated with Combed Pottery.

    2. " I think that in Kalevala (our national epos) there was 1-day rest after 2 or 3 days work".

      That's very interesting and intriguing. In Basque language, Monday is said "week's first" (astelehena), Tuesday "week's middle" (asteartea) and Wednesday "week's last" (asteazkena), as if the original "week" also had just three or four days (aste implies workday, asti instead is leisure, holiday).

      For the record the other weekdays have the following names:
      · Thursday: Osteguna (day of Ost, Ortzi or Urtzi: the personalization of the sky, possibly a foreign deity originally (Uranus?), later assimilated to Jupiter and the Christian God: "et Deus vocant Urcia" wrote the Medieval pilgrim).
      · Friday: Ostirala (fern field of Ost, it was the holiday of the primitive Basque religion: the akelarres or witches' sabbats took place that day).
      · Saturday: Larunbata ("lauren bata" probably: the unification of the four: reference, I guess, to the lauburu or Basque curved swastika, which is NOT an Indoeuropean cultural element but clearly pre-IE everywhere).
      · Sunday: Igandea (igo(-ko) handia?? = the great of above?, seems a Christianized name but very unclear).

      I really wonder in any case if this 4-day "week" was widespread once.

    3. Well, this friend of mine who is writing this book said that there were four days of work. Then it seems to be the same! Our names are from Swedish and the Swedes replaced the names of Latin gods with the names of their own gods with a similar character. Sunday is Sun's day, and you could say in a way that it is something big high up. Saturday is completely different from Latin, it is "washing day". Finns used to bathe in the sauna on Saturday. Nowadays, people bathe in the sauna on any day, even several times a week. As for akelarres, we have instead ghosts' mountain where dead women haunt. In Finnish tradition, it is a place where well-behaved spinsters go when they die. It was made of glass, so only virgins could climb it. According to the Swedish mythology, it is a mountain where the witches go on their brooms for witches' sabbat.

    4. This spinsters' mountain is Kyöpelinvuori in Finnish, Blåkulla in Swedish and it has nothing to do with weekdays!

    5. What you say is again very interesting. Basques and Finns are two very different cultures but it's clear that we share some pre-Indoeuropean traditions, which must have been widespread, and the same kind of cultural hostility from IEs and later Christians.

      In Basque mythology, Mari (goddess) and Sugaar (god) dwell in caves within mountains, which are locally relevant ones. The religion was chthonic and it was said that they met on (some?) Friday evenings to procreate, generating the storms (Hodei, also Eate, possibly from assimilation with the Celtic god Teutates). In correlation with that mythical recreation of life (storms fertilize the earth) poorly known religious activities took place, apparently involving a black he-goat, whose genitals or arse had to be kissed (so say the inquisitors at least). The black he-goat is iconic of Mari (also red cows, rams, horses), while the snake is associated with Sugaar. The witches or sorginak were both mythical servants of Mari and real people initiated in the cult. Sorgin seems to mean creator or midwife (sor(-tu) is to create but also to give birth).

      Of course, under Christian domination all that iconography was assimilated to Satanism and the negative sense of witchcraft. This kind of beliefs must have been widespread in the pre-Indoeuropean past and we can find them as substrate elements in Greek, Celtic or even Scandinavian mythology. Even in Christianity, the equivalent of the triplet Mari, Sugaar and Hodei in a cave can be found in the Christmas iconography.

      Basque mythology seems to be full of magical beings but almost all are benign... unless mistreated. There's no fear of ghosts or the dead, no known otherworld to worry about, only the mysterious gaueko or inguma sometimes haunts the night (but most normally iratxoak, i.e. "little ferns" or imps, will come by night to plow your farm in exchange for a little food present). The ethics seem to have been of communal responsibility (witches punished the wrongdoers in the name of Mari).

  11. Thank you for the list! It is a very informative package. So, H does not have much to do with Indo-European. In the Palaeolithic Europe, there were groups of people carrying H and U mtdna and F, G2, I, x R1* ydna and speaking undefined Proto-European languages. Where do you place R1b? Is it Magdalenien to you? I could imagine that R1b language in Europe was Basque-like and, in any case, had a different origin from R1a.

    Are we then back to this steppe hypothesis, i.e. Chalcolithic Pitted Ware and Kurgan cultures, rich in R-M198 and mtdna lines such as HV, U5a, W, X, T2 (plain H is uninformative) and speaking this mythical Indo-European proto-language? There seems to have been a steppe language that spread to Europe, i.e. to the Baltic area and the Balcans and from there to further West. I think that Albanian and Lithuanian are closer than any other Indo-European language to the proto-language. Can you detect in the archaeological record that there were people coming from the East to the Baltic area and the Balcans?

    However, I would not connect Indo-European language family with farming. In fact, I find it problematic to connect any language family or ydna with farming. I believe more in cultural diffusion, and all the more so because the main haplogroups seem to have spread before farming.

    I think that European languages must contain substrates from many extinct languages. I also believe more and more in layers and substrates in all existing languages.

    1. I understand that, regardless of genetics, Indoeuropean expanded with the Kurgan complex from Samara Valley in Eastern European Russia (would be Asia for Ancient Greeks, who placed the boundary at the Volga, but whatever). No other hypothesis explains Indoeuropean nearly as well as the Kurgan model does.

      The origin of Samara Valley culture is not known because no digs have gone that deep, i.e. to layers older than Samara culture itself, which is already Neolithic. The Paleolithic of Europe East of the Don basin is not really researched AFAIK, except for the Caucasus area maybe. It's possible that Samara has a similar origin to Dniepr-Don (i.e. Epi-Gravettian) but there are clear cultural differences also, so maybe it has a distinctive origin of some sort.

      Whatever the case, the first thing is to understand the material-cultural prehistory as well as possible, then try to fit genetic data into that frame (not the other way around as many try to do with quite limited success in most cases).

      "Where do you place R1b? Is it Magdalenien to you?"

      IMO the most likely possibility is that R1b arrived in Europe with Gravettian or Aurignacian. One reason is that R1b should be (again IMO) at least 25 Ka old, another is that the largest subclade of it, R1b-S116, appears to have spread from SW Europe, probably Southern France, which almost necessarily demands a Solutrean-Magdalenian chronology (otherwise no such large population expansions are known to stem from that area) - see here.

      "I could imagine that R1b language in Europe was Basque-like and, in any case, had a different origin from R1a".

      There is no such thing as an "R1b language" nor an "R1a language", etc. People change languages but they cannot change genes. However it is true that the Vasconic substrate area (Vennemann) and the R1b area have a notable overlap. The main exception is however Italy, which is generally low in R1b but seems to have strong Vasconic substrate. IMO the Balcans also seem to have some Vasconic substrate (for example the Ibar river in Kosovo, ibar=river bank in Basque, akin to Iber→Iberus→Ebro, see also Hevros river in Thrace, words like Serbo-Croat gore=up, so similar to Basque gora=up, reka=river compare with erreka=creek, etc. - ok now "creek" is also suspected of having Vasconic roots, although it could also be a non-Latin IE borrowing by Basque).

      So nowadays I suspect that Vasconic languages are Neolithic, regardless of genetics, and that Paleolithic languages did not survive in most of Europe. But Paleolithic genetics probably did and in big numbers, although mixed with Neolithic lineages (Y-DNA E-V13, G2a, J2b - also I2a, although this one is of Balcanic European origin surely) and they re-expanded with European-specific Neolithic and Chalcolithic complex dynamic flows. Same on the mtDNA side (most H and U would be Paleolithic but incorporated to Neolithic/Chalcolithic dynamics, which altered their proportions very substantially).


    2. ...

      "I find it problematic to connect any language family or ydna with farming"...

      As I just said, I strongly suspect that the Vasconic family is the one that can be more tightly correlated with European Neolithic of Thessalian roots (i.e. both the Balcano-Danubian inland branch and the Impressed-Cardium Mediterranean one). If Vasconic languages would have to fit in that scheme, then I hypothesize two sub-families:
      1. Balcano-Danubian Vasconic (partly erased in the Balcans by the Vinca-Dimini secondary migration - from Syria or Kurdistan probably). Replaced by Indoeuropean before or at the beginning of the Bronze Age.
      2. Mediterranean-Atlantic Vasconic, with a secondary Atlantic branch expanding maybe from Portugal with Megalithism (and BB). Basque and Iberian would belong to this subfamily, as would be the case of Ligurian as well. It's plausible to imagine that c. 4000 years ago a language of this family, spoken originally in Southern or Central Portugal, was a lingua franca for trade and maybe also a prestige language for religion and culture (which was then only oral, it seems), spanning from Scandinavia to Iberia and to Italy and North Africa.

      However none of this affected Eastern Europe in any important way, as it had its own locally rooted Neolithic (Dniepr-Don centrally but also some other smaller cultures). Danubian only made lesser inroads in SW Ukraine, near Moldova (the famous Cucuteni-Tripolje culture), and the influence of Megalithism in the Baltic is limited to some areas around Denmark.

      "I think that European languages must contain substrates from many extinct languages".

      I do wonder also but it's extremely difficult to discern with so many layers atop of them.

    3. According to Eupedia, the frequency of R1b in North Italy is 55% and it decreases towards the south, being only 22% in Sardinia. I can't remember the source, but I have the impression that once I read that in Italy there is more R1b on the Alps and Apennines and inland areas. With the Roman empire, Greek colonists, Barbarian invasions, etc., Italy may have had a bigger share of immigrants of all sorts compared to Iberia.
      When you said that "you suspect that Vasconic languages are Neolithic", I did not quite understand it, if you claim that R1b is Paleolithic. Do you mean that Vasconic languages are modern and they have been evolving all the way to the present day and have absorbed many features from cultures around them and may have substrates from other languages?

    4. I guess this map and this blogpost may be referential for R1b frequencies in Italy. However it'd be nice to know the fine print of subclades and for that the best I know is this set of maps from Myres 2010, complemented by my own mapping of his data (R1b only).

      From this it'd seem that the excess of R1b in North-Central Italy is partly R1b-U106 (North clade), which may correspond also partly to Celtic expansion. Instead R1b-S116 (South clade) seems more centered in NW and Central Italy, whatever the reason. There's also some notable R1b-M269(xU106,S116) ("ancestral clade") which must have originated in the Balcans, Central Europe or Italy itself prior to the formation of the main Western clades. So much of that Italian R1b, especially in the South is not from Western Europe.

      It's clearly a different case, although it also seems as if there was some important flow, notably of R1b-S116 (Southern clade), from the West and, within it, mostly R1b-U152 ("Celtic" or Alpine clade) at some point. When? I never found any good explanation, maybe Celts and other Indoeuropeans in the Bronze and Iron ages... or maybe there was some ill-understood proto-Ligurian expansion in the time of Chassey-La Lagozza from SE France.

      But whatever the case more than 1/4 of that Italian R1b in the South and 1/5 in the North is unrelated to Western Europe. So only some 30% of Italian patrilineal ancestry has clear origins in the West or North of the Alps, it seems, whichever they are. By contrast R1b-Western (South or North clade but mostly the Southern one) is almost always >50% this side of the Alps (>80% in Wales, Basque Country, Ireland, Catalonia...).


    5. ...

      "When you said that "you suspect that Vasconic languages are Neolithic", I did not quite understand it, if you claim that R1b is Paleolithic".

      I mean that the influence of some 10-30% Neolithic settlers (in Iberian and French cases) was enough to impose their language, that people switch languages with culture even if they keep their genes and the Neolithic was a major cultural change. It's just a theory but I'm leaning towards it as of late.

      Alternatively (if Paleolithic) we'd have to search for migrational flows Eastward (to Italy especially) that are not too apparent in the archaeological record. But surprise me if you know something I do not, please.

      "Do you mean that Vasconic languages are modern and they have been evolving all the way to the present day and have absorbed many features from cultures around them and may have substrates from other languages?"

      All languages except Esperanto are equally old, well Esperanto too because it's based on Latin and other Indoeuropean: all languages have been evolving all the way since humans began speaking. How? That's another story...

      What I mean is that I hypothesize that Vasconic languages may have originated as such in Thessalian Neolithic (from whichever ancestry, maybe some Anatolian language long extinct) and expanded with both main European Neolithic waves: Balcano-Danubian and Mediterranean-Atlantic. But this expansion was only mediated by gene flow from the Balcans at low (but observable) frequencies: Western Paleolithic peoples absorbed the new language and made it their own, exactly as they absorbed Indoeuropean later, with only occasional and quite minor gene flow associated with it. Similarly when you compare Taiwanese Aboriginals with Malayo-Indonesians you only see minor Taiwanese genetic influence in the latter, or you see other Austronesians in the Pacific with very strong non-Taiwanese genetics (Melanesian instead), even Polynesian main Y-DNA lineage C2a is of clear Melanesian origin. Yet they all speak a Taiwan-derived family of languages.

      I can only imagine that the new Neolithic languages (i.e. Vasconic) may have absorbed a substrate from previous linguistic layers but that discerning it is a most challenging task, especially with only one language surviving to present day to compare with. I'll leave that to some genius philologist - beats me.

  12. I correct myself. There may have been migrations of farmers, in particular in LBK area, but it is not so easy to distinguish the ydna of the Neolithic newcomers from that of earlier inhabitants, and on top of it, there have been continuous connections between Europe, on one hand, and Africa, Levant and Near-East, on the other, for example during the Hellenic/Hellenistic & Roman times, during the Arab & Ottoman empires etc. Farmers probably carried hgs J, E, G and R, but only a part of European J, E, G and R can be from these people. Perhaps mtdna is more informative in this respect? At least N1a and U3 seem very Near-eastern.
    I get back to that interesting linguistic part later on!

  13. Sorry, I correct myself again. I noticed that LBK people carried ydna F*. What if this European F* is from LBK people? That's again a surprise! There seems to be F* in Caucasus, United Arab Emirates, Jordan and Iraq.

    1. F* is very ambiguous. It's most common (and F overall most diverse) in South Asia but you can also find it in SE Asia and even Africa probably. Just minor lineages of F that are rare and therefore poorly researched: they could mean anything, so they actually mean nothing.

  14. My mtDNA is HV9 (European) and all of my family is from India for at least 4 generations (or so I thought) - does HV9 occur in India (southern)?

    1. I'm going to check if I find something but the info I have so far is almost nil.

      R0, which is the ancestor of HV (the other large clade being R0a), has probably been in West Eurasia since the earliest Upper Paleolithic. Its origin is clearly West Asia and that's probably also the origin of HV and maybe of H even (less clear). My notes on R0 mentioned: "almost exclusive of West Eurasia and North Africa. Low presence in South Asia". So it's not unheard of in the subcontinent but much less common than farther West. But we didn't document its subclades (not even H nor V) other than in mere annotation (that's probably because we were more focused back then in understanding the global structure and these are quite derived branches, although quite important numerically, less relevant to understand the settlement of Eurasia).

      I would say that you should expect West Asian lineages in South Asia, not just male ones like R1a or J2 (very common) but also female ones. These surely arrived with Neolithic (Iran → Baluchistan → NW South Asia → rest of South Asia), at least in most cases, and should therefore be associated with the so-called ANI (Ancient North Indian) autosomal component, which is closely allied to the Zagros-Caucasus one and therefore clearly intrusive (even if old enough to be considered "native" but definitely not "aboriginal").

      I'll see what I can find about your specific HV9 lineage but I don't expect much info to be available, really. In any case there's no reason for the haplogroup, in principle, not to have been in South Asia since the Neolithic. Also it's likely that, rather than "European", the line originates in West Asia, as happens with many (but not all) HV sublineages.

    2. An interesting study is this one but it's Europe-only, with focus on Italy. Interestingly HV9 (or more exactly: HV-16311, its ancestor) appears to be most common in Sicily (rare in the rest of Italy), which we know from other recent studies has a significant extra inflow from West Asia compared with any other European population. This is not conclusive but, while we keep missing comparable studies in West Asia, it may well indicate a West Asian centrality of the haplogroup - or, at the very least, we cannot exclude it.

      More interesting even is this database of HV9 mitogenomes I found as direct XLS download via Google (it's safe, as it's from PLoS ONE, but unsure of the related paper it seems to back up) because it lists HV9 in at least three West Asian individuals: Turkish, Bedouin (probably from Palestine) and, critically, Iran. You may want to compare your private signature with them. In any case it is clear that HV9 does exist in West Asia, Iran included, so my speculation seems a bit better supported with this dataset.

    3. Thank you - much appreciate your response.

  15. Thank you, great info on R, R0, HV, HV0 as I'm HV0f ...so will follow



