April 27, 2013

Brotherton 2013: cherry-picking the evidence for mtDNA H

Unlike the conceptually akin paper by Fu 2013 (PPV - discussed here), this one is very neatly explained and leaves no doubt about how they reached their conclusions. Whether the method is good enough to support any conclusions at all is another matter. It is still an interesting study on the evolution of mtDNA lineage H in the specific context of the Elbe-Saale region of Germany.

Paul Brotherton et al., Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans. Nature Communications 2013. Pay per view. [doi:10.1038/ncomms2656]


Haplogroup H dominates present-day Western European mitochondrial DNA variability (>40%), yet was less common (~19%) among Early Neolithic farmers (~5450 BC) and virtually absent in Mesolithic hunter-gatherers. Here we investigate this major component of the maternal population history of modern Europeans and sequence 39 complete haplogroup H mitochondrial genomes from ancient human remains. We then compare this ‘real-time’ genetic data with cultural changes taking place between the Early Neolithic (~5450 BC) and Bronze Age (~2200 BC) in Central Europe. Our results reveal that the current diversity and distribution of haplogroup H were largely established by the Mid Neolithic (~4000 BC), but with substantial genetic contributions from subsequent pan-European cultures such as the Bell Beakers expanding out of Iberia in the Late Neolithic (~2800 BC). Dated haplogroup H genomes allow us to reconstruct the recent evolutionary history of haplogroup H and reveal a mutation rate 45% higher than current estimates for human mitochondria.

Let's deal with the interesting part first and then with their impossible molecular clock speculations. 

All the samples used in this study belong to haplogroup H, as you can see in table 1. This does not allow us to consider the overall proportion of H in each population, for which we would need to go to the original studies. For example, in the region's LBK samples H was just some 20% of the total, which alone speaks of a population that was not at all like the modern one, never mind N1a. On the opposite side of the spectrum are the Bell Beaker (BBC) samples, where H made up 88% of the total (Adler 2012, discussed here), again non-modern but a possible source of the increase in H frequency.

We must keep in mind all the time that in this study only H is considered, with all the derived pros and cons. 

Maybe the most interesting result is therefore the comparison with modern populations done in fig. 2a:

Figure 2 | Population affinities of select Neolithic cultures. (a) PCA biplot based on the frequencies of 15 hg H sub-haplogroups (component loadings) from 37 present-day Western Eurasian and three ancient populations (light blue: Western Europe; dark blue: Central and Eastern Europe; orange: Near East, Caucasus and Anatolia; and pink: ancient samples). Populations are abbreviated as follows: GAL, Galicia; CNT, Cantabria; CAT, Catalonia; GAS, Galicia/Asturia; CAN, Cantabria2; POT, Potes; PAS, Pasiegos; VIZ, Vizcaya; GUI, Guipuzcoa; BMI, Basques; IPNE, Iberian Peninsula Northeast; TUR, Turkey; ARM, Armenia; GEO, Georgia; NWC, Northwest Caucasus; DAG, Dagestan; OSS, Ossetia; SYR, Syria; LBN, Lebanon; JOR, Jordan; ARB, Arabian Peninsula; ARE, Arabian Peninsula2; KBK, Karachay-Balkaria; MKD, Macedonia; VUR, Volga-Ural region; FIN, Finland; EST, Estonia; ESV, Eastern Slavs; SVK, Slovakia; FRA, France; BLK, Balkans; DEU, Germany; AUT, Austria; ROU, Romania; FRM, France Normandy; WIS, Western Isles; CZE, Czech Republic; LBK, Linear pottery culture; BBC, Bell Beaker culture; MNE, Middle Neolithic.

BBC (Bell Beaker) and LBK (Linear Pottery Culture) are clear-cut cultures in this graph. However, MNE (Middle Neolithic) is a pooled agglomeration of several not closely related cultures from the Late Neolithic and Early and Middle Chalcolithic. So, using the haplogroup vectors (grey), I remapped its unlikely components:

Fig. 2a annotated by Maju: green "MNE" cultures, grey: other cultures. Dotted circles just for reference.

Suddenly the mirage of modernity and homogeneity in MNE's H collapses, most especially for Salzmünde (2/2 H3) but really also for the other components of the MNE pool. Rössen (directly derived from LBK) appears here as Balkano-Estonian and similar to Bronze Age Sardinia. Schöningen (derived from Rössen) appears Norman French and close to the original LBK pool. The first Kurgan culture in Central Europe, Baalberge, is the only one really close to the MNE dot, but its closest modern relatives are NE Iberians (IPNE), while its successor Salzmünde is "hyper-Iberian", much as Bell Beaker after them. However, the intermediate Corded Ware (C.W.) leans back to the right and appears Catalan.

No conclusions can be inferred from this: for that we'd need to compare whole genetic pools and not just H, which is a minority in most ancient samples. But, for whatever it is worth, I made yet another annotated version of this graph:

Fig. 2a annotated by Maju: changes in Central European mtDNA H composition along time (arrows).

I considered here Rössen as different from Schöningen, as Rössen or Epi-Rössen persisted in much of Germany and nearby Alpine areas for long, but feel free to draw or imagine it differently.

Whatever the case, the appearance is of a gradual "modernization" or "Germanization" of haplogroup H culminating in Baalberge, followed by an "Iberization" of the haplogroup pool in the Middle and Late Chalcolithic, roughly coincident with the expansion of Megalithism and Bell Beaker and just mildly countered by Indo-European expansion from the East (Corded Ware, Unetice). Here they mention six Unetice H sequences but, judging by Adler 2012, H was very rare in this culture, at least in the Elbe-Saale area (1/31).

Beyond this I doubt that the paper can provide us with any more enlightenment.


It does provide some false leads, however.

The authors use this Elbe-Saale limited ancient mtDNA evidence to construct a "molecular clock":

Another major advantage of the temporal calibration points provided by ancient hg H mt genomes is that the data allow a relatively precise estimate of the evolutionary substitution rate for human mtDNA. The temporal dependency of evolutionary rates predicts that rate estimates measured over short timespans will be considerably higher than those using deep fossil calibrations, such as the human/chimpanzee split at ~6 million years.

6 million years?! Where have you been in the last five years, Paul? Ahem...

It doesn't really matter but it illustrates the reactionary scholastic inertia that plagues academia, most especially in the field of population genetics.

What matters is that they continue as follows:

(...) The rate calibrated by the Neolithic and Bronze Age sequences is 2.4 x 10⁻⁸ substitutions per site per year (1.7–3.2 x 10⁻⁸; 95% high posterior density) for the entire mt genome, which is 1.45 (44.5%) higher than current estimates based on the traditional human/chimp split (for example, 1.66 x 10⁻⁸ for the entire mt genome and 1.26 x 10⁻⁸ for the coding region). Consequently, the calibrated ‘Neolithic’ rate infers a considerably younger coalescence date for hg H (10.9–19.1 kya) than those previously reported (19.2–21.4 kya for HVSI, 15.7–22.5 kya for the mt coding region or 14.7–22.6 kya when corrected for purifying selection).
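
As a quick check of the arithmetic in that passage, here is a minimal Python sketch that uses only the two rounded whole-genome rates quoted above. The variable names are mine, chosen for illustration. With the rounded inputs the ratio comes out at about 1.45, in line with the figure the authors report.

```python
# Rates quoted in the passage above, in substitutions per site per year.
neolithic_rate = 2.4e-8    # calibrated with the Neolithic/Bronze Age genomes
split_based_rate = 1.66e-8  # based on the traditional human/chimp split

# How many times faster the 'Neolithic' rate is, and the same as a percentage.
ratio = neolithic_rate / split_based_rate
percent_higher = (ratio - 1.0) * 100.0

print(f"ratio: {ratio:.2f}")
print(f"percent higher: {percent_higher:.1f}%")
```

Since a faster rate packs the same number of mutations into less time, this ratio is what drives the younger coalescence dates they report for H.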

What matters is that by cherry-picking only some sequences of ancient mtDNA H, they are denying themselves (and the rest of us by extension) a realistic calibration of the haplogroup. What happened to the Cantabrian Magdalenian and Epipaleolithic Basque H? What happened to Epipaleolithic Karelian H? Never mind Sunghir's Gravettian H17'27 or Taforalt's massive pool of R*-CRS, most likely H1 (Kéfi 2005), which may be more questionable but should never be rejected without direct negative evidence.

In other words: they are cherry-picking the evidence. They could argue that the Elbe-Saale data were the only ones readily available for them to sequence in full or whatever, and that therefore the evidence was cherry-picked by Destiny... but that would not in any case justify the arrogance of their conclusions: they should have been much more humble and admitted that this evidence is only part of all the ancient mtDNA H (known or suspected), some of which is clearly much older and therefore much more relevant.

I illustrated this problem using their fig. 1a:

Fig. 1a, annotated by Maju.
(Note: one of the "Magdalenian" H* sequences from North Iberia is actually Epipaleolithic, my error)

In orange I have marked an alternative minimal "molecular clock" extrapolation using the La Chora H6 sequence (Hervella 2006, open access). This is minimal because I'm assuming this sequence to be underived H6; if it were derived (which I don't know), the estimate would be even larger.

I have annotated all the confirmed (unquestionable) ancient mtDNA H sequences I am aware of. There are many more that are very likely, and in many cases older (see maps), but not yet confirmed.

So well, molecular-clock-o-logical pseudoscience again. It's a pity that otherwise respectable scientists pay tribute to this academic fetish.

The molecular clock hypothesis has never been proven, being a mere statistical construct, and it has many problems, particularly in mitochondrial DNA, where branches are dramatically unequal, owing to either: (a) randomness, (b) differential adaptive fitness or (c) ancient population dynamics (variable drift results depending on population size). I discussed some of that here and also here.

I beg population geneticists here to be more serious and careful and not to try to push their ideas against the available evidence. That is not proper for scientists but belongs to the field of ideological propaganda.

Update: La Chora Magdalenian H6 is probably H6a1, with implications for the age estimate of H.

All known H6 of Iberia and all or most of Western Europe is H6a1, while the "famous" Central Asian H6 (very minor overall) is all H6(xH6a), which is also relatively important in Eastern Europe. See Álvarez Iglesias 2009 (open access), especially Supp. Table 3. H6a(xH6a1) has only been detected so far in Austria (oversampled - I miss data from France again).

Brotherton's only H6 sample (Corded Ware) is H6a1a. Álvarez Iglesias did not test for this phylogenetic level, hence it would show in his data as H6a1, but he did test for H6a1a1, found precisely only in Cantabria.

So the La Chora H6 Magdalenian sequence can be:
  • H6(xH6a): extremely rare in Western Europe modernly
  • H6a: reported in Austria only (modern sample)
  • H6a1: most common in Western Europe and especially North Iberia
  • H6a1a: like Brotherton's Corded Ware sequence
  • H6a1a1: found only in Cantabria modernly, it seems
  • etc. (PhyloTree allows for some other options)

I already discussed the possible age (using molecular clock theory, calibrated) of H if La Chora H6 were H6-root. But, considering that H6b and H6c seem to be Eastern European or Central Asian, it seems more reasonable to think it is H6a or downstream of it. What would be the age range of H for the other possible assignments of La Chora's H6, if it were tested for coding region mutations? Let's see:

  • If H6a-root: 47,500 to 24,500 years ago (median: 36,000 BP)
  • If H6a1: 73,200 to 34,800 years ago (median: 54,000 BP)

Of course I do not really think that the molecular clock can be easily applied, if at all, to mtDNA, because the rarity of accumulating mutations poses way too many challenges. But if it had to be applied, as Brotherton, Fu, their teams and some amateurs seem to think, then we'd have to test the La Chora and La Pasiega (and Sunghir and other) sequences for coding region mutations in order to have the most valid calibration points.

Otherwise it is like the blind man who touched the trunk of an elephant and imagined it was like a snake.


  1. I may be being dim here but my understanding of what they were saying was the prevalence of maternal H in Europe was caused by an expansion out of Iberia of a particularly H-heavy population. Are you saying you don't think that's what happened or simply that the evidence they have isn't enough to support the idea?

    1. Probably I did not explain properly. They are saying that mtDNA H is c. 14 Ka old (I'll edit the text to make that more clear):

      Consequently, the calibrated ‘Neolithic’ rate infers a considerably younger coalescence date for hg H (10.9–19.1 kya) than those previously reported (19.2–21.4 kya for HVSI, 15.7–22.5 kya for the mt coding region or 14.7–22.6 kya when corrected for purifying selection).

      I'm saying therefore that, by simply looking at ALL the confirmed evidence and not just some of it, you get again those "previously reported" ages, and that the door must remain open for even older ages as there is much other aDNA evidence which is less clear but may well be correct and won't ever be falsified nor confirmed looking only at the Neolithic of the Elbe-Saale region.

      It's like the blind man who touched the trunk of an elephant and said: it is like a snake... Deal with all the evidence before judging.

      I'll edit the article to make that more clear, thanks for the question, Grey.

      As for part of mtDNA H expanding from Iberia in a Chalcolithic time-frame, I believe I was the first person to ever suggest it, so I'm not in disagreement with that proposition. It's just that it's not the bulk of the paper nor what they emphasize (or even prove), although the data are somewhat supportive of such a model.

      The main novelty here is that the Salzmünde culture, which is of Megalithic time-frame but derived from a Kurgan culture (Baalberge) as far as I can tell, has all its mtDNA H of SW European affinity (H3), much as Bell Beaker later on (and to some extent Corded Ware in between both of them). But without knowing the whole samples, it's hard to judge what this means.

      It's plausible (and consistent with what I have been thinking on the matter) that mtDNA H expanded with the Megalithic phenomenon and that the 88% H we see in Elbe-Saale Bell Beaker is just derived from this previous Megalithic flow, for which we have only very limited aDNA evidence: some HVS-I from Neolithic Portugal (which appears to be very high in H), a single Megalithic French burial (much like LBK genetic pool) and a trio of sequences from Megalithic Southern Sweden (intermediate). Maybe add to that the Early Neolithic Basque pool which is also quite high in H but not as extreme as Portuguese Neolithic or German BB.

    2. Edited: I hope it's more clear now.

    3. kk ty

      "As for part of mtDNA H expanding from Iberia in a Chalcolithic time-frame, I believe I was the first person to ever suggest it, so I'm not in disagreement with that proposition."

      Yes, that's what confused me but i see what you meant now.

  2. I have some problems accepting that mtDNA H was present in the Iberian Upper Paleolithic based on current data. The Hervella et al. results aren't very convincing, and based on the sequences they might well be U. Indeed, is there any plausible reason why H6 would be found in Pre-Glacial Spain?

    The presence of H1 in Epipaleolithic North Africa (and thus probably also Iberia) sounds more reasonable, but again, those Kéfi 2005 samples need to be tested again at higher resolution to make sure.

    So at this stage, is H older than the Neolithic in Europe? Yeah, possibly. But is it older than the Ice Age? Possibly not.

    1. The methods used by Hervella are widely accepted and have never been questioned. They confirm with all certainty that it's H* and H6. Has a single case ever been found of R* testing negative for Alu I that is not H? Nope. We are no longer discussing HVS-I sequences alone but RFLP testing.

      Also I fail to see how an HVS-I sequence for H6 (which requires two transitions: T16362C A16482G) can be U of any sort or anything but H6. This is not anymore the sometimes confusing R-CRS but clearly unquestionable H6.

      But the key test is RFLP testing, about which I have never ever read a single criticism, even if today it has been generally replaced by coding region testing.

      "The presence of H1 in Epipaleolithic North Africa"...

      It is Upper Paleolithic. Taforalt remains are dated to 12,000 BP, contemporary with European Magdalenian. Furthermore the culture they belong to (Oranian or Iberomaurusian) began c. 20-22,000 BP, contemporary with and related to Solutrean. It has no relation whatsoever with Magdalenian or European Epipaleolithic.

      And let's not forget Sunghir's double and quite unmistakable H17'27, which is Gravettian. True that here there's only one transition but there are no other known alternatives, so, unless proven otherwise, it is H17'27.

      ... "is H older than the Neolithic in Europe? Yeah, possibly".

      For sure.

      And why "in Europe"? Isn't the paper discussing haplogroup H as a whole? Nowhere do I see that they restrict it to any region, which would in any case be the Elbe-Saale one, and not "Europe".


    2. ...

      "But is it older than the Ice Age?"

      Probably, but so far untested. While the Magdalenian samples in Cantabria are the oldest 100% confirmed H anywhere on Planet Earth, there are many more likely H from the LGM and even Gravettian times (there are almost no pre-Gravettian sapiens remains in Europe). The most notable case is Sunghir because it is two identical sequences (surely direct relatives) and all we know about that HVS-I marker is that it defines H17'27, no other haplogroup within R.

      But there are many more R* and R*-CRS which are no obvious known haplogroup judging by HVS-I. This excludes U1, U2, U3, U4, U5, U6, U7, U8a1a, U8b, K, U9a, R0a'b, HV0, HV1, HV4a, HV4b, HV6, HV7, HV8, HV12a, HV15, HV17, and essentially every other R (except some R8). So it can essentially be: extremely rare variants of U (probably within U2'3'4'7'8'9* or U8* - therefore not related to U5 nor U4), extremely rare variants of HV, R8 (too exotic to count) or H. Normally it is H, certainly in modern samples 99% of it is H.

      You have the right to doubt but unless H is disproved by direct testing, I have even more right to think that at least much of it is H. Molecular-clock-o-logy is evidence of nothing because I can create alternative models and estimates easily and they tend to suggest (never prove: MC can prove nothing at all) that H expanded with the early colonization of West Eurasia by Homo sapiens before the LGM.

      How? From H (node) to R there are 4 coding region mutations, from U5 (node) to R there are 6 cr mutations, from U2'3'4'7'8'9 to R there are 4 cr mutations. This means, everything else equal and assuming MCH parsimony, that H and U2'3'4'7'8'9 are of the same age, while U5 is 50% younger. Per Fu's only acceptable age estimates, U5 is c. 34 Ka old, while U overall is 56 Ka old (these fit well with the archaeologically expected time-frame, especially if U5 expanded with Gravettian): 56-34=22, 22/3=7.3, 56-7.3x2=41.4 Ka BP. This is the age estimate of U2'3'4'7'8'9, which must be roughly the same as that of H. Hence U2'3'4'7'8'9 and H probably expanded in Aurignacian times.
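
      The interpolation in the paragraph above can be sketched in a few lines of Python. This is only an illustration under the same strict-clock assumption: the two ages, the three equal steps between U and U5 and the factor of two are taken as given from the text, and the names are mine. Without intermediate rounding the result is about 41.3 Ka rather than 41.4 Ka.

```python
# Ages as cited in the comment above, in thousands of years (Ka).
AGE_U = 56.0
AGE_U5 = 34.0

# Assumptions taken as given from the comment: three equal steps separate
# U from U5, and the U2'3'4'7'8'9 node sits two such steps below U.
STEPS_U_TO_U5 = 3
STEPS_DOWN = 2


def interpolate_age(age_top, age_bottom, total_steps, steps_down):
    """Linearly interpolate a node age between two dated nodes."""
    per_step = (age_top - age_bottom) / total_steps
    return age_top - per_step * steps_down


per_step = (AGE_U - AGE_U5) / STEPS_U_TO_U5
age_estimate = interpolate_age(AGE_U, AGE_U5, STEPS_U_TO_U5, STEPS_DOWN)

print(f"Ka per step: {per_step:.1f}")
print(f"interpolated age: {age_estimate:.1f} Ka")
```

      Changing STEPS_DOWN shows how sensitive the estimate is to where the node is placed between the two dated ones.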

      Also we must understand that between Gravettian and the Neolithic there are no important genetic inflows to Europe, especially not from West Asia. Solutrean is local, Magdalenian is local, Ahrensburgian is local, Epipaleolithic is local and derived from Magdalenian (most), Gravettian (Italy, Eastern Europe) or Ahrensburgian (North Sea area). The only exceptions are possible minor gene backflow from North Africa (U6, L(xM,N), Y-DNA E1b-M81) to West Iberia within the Solutrean-Oranian interaction (which is mostly from SW Europe to North Africa in any case) and the likely Afrasian (Y-DNA E1b-V13, maybe others) influence in Epipaleolithic Greece (source of European Neolithic).

      There's really no window for African or West Asian significant influence that could be behind H between the LGM and Epipaleolithic. Also the highest diversity of H seems to be in Europe, at least judging by basal haplogroups and not just HVS-I (which is not reliable in general, much less for H).

      So for me H seems to be of Aurignacian origins, regardless that the Gravettian second wave and/or differential Aurignacian spread patterns to various parts of Europe may have allowed for a dominance of a secondary U5 or rather U*-CRS (U8?) layer, particularly in certain parts of the subcontinent like Central Europe.

    3. PS- I'm noticing that there's no known U5 in Europe other than Nerja (Andalusia, HVS-I) until the Epipaleolithic. Similarly there's no known U4 in Europe until Epipaleolithic as well, although one of the Taforalt (Rif, North Africa) sequences seems to be U4 again by HVS-I. All the U known in Europe north and east of Iberia in the Upper Paleolithic (by all means) is either U*, U2'3'4'7'8'9* or U2. Unless I'm missing something of course.

      What do you make of that? U5 is not really related to any of those rare U sequences. Did Epipaleolithic flows (from France?) cause that spread of U5 and maybe also U4, even though they probably existed in previous periods (Solutrean Nerja and Oranian Taforalt stand as the sole direct evidence, but that suffices for me)?

    4. Yeah, true. Maybe we need a few more samples though? Only a handful of UP remains have been tested thoroughly to date, so it's difficult to base any theories on that.

      Next-generation sequencing should solve all these issues, and if H was more than just an oddity in Pre-Glacial Europe, then I'm sure we'll see that very soon.

    5. That's part of the problem: we have too little direct data, often sequenced only with inconclusive methods. The other part of the problem is ignoring the data we have, even if incomplete or inconclusive, and pretending to "solve" all the puzzle with a single brute force attempt. Scientists must remain open, especially when they know that the data they are managing is incomplete and the methods (MC modeling and such) hardly conclusive. Instead they cheat (because it is cheating!) and pretend to be conclusive when all they are doing is at best a partial, inconclusive attempt.

      "Next-generation sequencing should solve all these issues"...

      I hope you are right. But I also hope they pay attention to other regions than Central Europe, which may well be an exceptional case.

      BTW, what about U5? Was U5 "more than just an oddity" in all Upper Paleolithic Europe? I mean: some people talk of U as if it were a homogeneous thing but it's actually quite heterogeneous and U5 is not there until the Epipaleolithic (Nerja excepted).

  3. U5 and U4 were probably somewhere in Europe before the Ice Age, and I think they will turn up in European UP samples at some stage. But of course I can't be 100% sure of that until they do turn up.

    As for the cherry picking criticism of this study, I think you're right that the scope of the sampling was very narrow. But it seems to me the authors wanted to focus on full Neolithic genomes, which aren’t readily available.

  4. Hey, scroll down to the bottom of the post and take a look at the 29/04/2013 update.

    What do you think of those quotes from the new R1b website pertaining to R1b subclades and the Bell Beakers?


    1. I can't see much on that R1b site and I'm using an updated Firefox browser on an updated Linux desktop. I unblocked all javascript but nothing.

      I do see that they are using one of my maps; I can only hope they give me credit. I can also see that they seem to be considering only Neolithic hypotheses and IMO that's a wrong approach.

      As I discussed recently, when looking at more complete Y-chromosome data from the 1000 Genomes Project (credit: T.D. Robb) and calibrating the resulting tree with the most plausible OoA dates (i.e. age(CF)=80 Ka BP), R is c. 40 Ka old and the fraction of R1b included in the TGP (probably all European, hence very incomplete) appears to be 25 Ka BP. So...

      Anyhow, I took the occasion of reading your updates to discuss H6 at your blog: Central Asian H6 is not H6a and Iberian H6 is all H6a1, Iberia also being the region where H6a overall is most frequent (oversampled Austria however has greater apparent diversity... but then it's n~2200 in Austria vs n~500 in Iberia and n=0 in France, so beware). Western European H6 seems all or nearly all to be H6a, which is rare in the East.

      This seems to imply that the La Chora sample is H6a and possibly H6a1 (both defined only by c.r. mutations and hence not detectable via HVS-I + RFLP). H6a1 is three c.r. mutations downstream of H6 and in the same line as the Corded Ware sequence of Brotherton (H6a1a). It's even possible that both are nearly identical, as H6a1a1 (G5460A under the C.W. sequence) is found in modern Cantabria (and nowhere else per Álvarez Iglesias 2009). But whatever we assume re. the La Chora H6 sequence, it's clear that the timeline to the Brotherton CW sequence shortens and the timeline to H-root lengthens, which implies older and older age estimates for H overall, using this sequence as reference.

      I'll update with this issue.

    2. Here's a direct link to the R1b.org article about R1b and the Bell Beaker Phenomenon. This should work on your system.


      I can't find the article where your map is being used. I can only see it in the flash animation at the start. But apparently it might be up soon.

    3. The map is only in the flash animation and the menu by its right but apparently the article "L51 and Z2103 (sic): where West meets East" is not up yet.

      I haven't read that Lee 2012 study but I can only imagine that the article synthesizes it correctly. There's no surprise in finding R1b in Chalcolithic Germany anyhow because all theories I know of (Paleolithic, Neolithic in various flavors and even the Indoeuropean hypothesis) support that presence and being the most common lineage today, it's only natural that it shows up in both samples.

      The issue is what happened before. In this sense I am a bit surprised by the high STR diversity of R1b in Iberia, when people are claiming that all the Iberian R1b is within a single subclade of a subclade of a subclade, etc. (i.e. low phylogenetic diversity, which is what matters in the end). I am not certain if that claim is correct (you are into forums, I believe, so maybe you know more details) but if it is, then Iberia would not be the likely urheimat of R1b-S116 (much less of higher phylogenetic levels). However, judging by the phylogenetic diversity, France, probably towards the South, should be the urheimat of R1b-S116. In principle this fits well only with the Paleolithic theory, because after that period the territory of today's France does not seem to have been central to any phenomenon spanning all Western Europe other than Napoleon's short-lived empire.

      But let the data flow! It's always good to find new pieces of such a complicated puzzle.

    4. I have a feeling that the high STR variance of Iberian R1b is due to multiple and significant population movements into the peninsula after the Neolithic. Something similar probably happened in South Asia, where the R1a falls under one subclade, R-Z93, but shows unusually high STR diversity.

      Having said that, Iberia is still severely undersampled compared to Northwestern Europe, and we know from archeological, autosomal and now mtDNA data that there was a major out of Iberia episode which affected the genetic structure of all extant Europeans. Indeed, there's really nothing preventing a scenario in which R1b initially expanded out of Iberia as L11/S127, like with the Bell Beakers, diversified in Western and Central Europe, and then migrated back during the Iron Age as mainly S116 and/or DF27. The late and massive expansion of DF27 across Iberia might have swamped the earlier R1b SNP diversity there.

      If so, I think only heavy sampling of modern Spanish and Portuguese populations, as well as ancient DNA from late Neolithic Iberia, both at a high resolution, will be able to tell us how it all went down.

  5. Really good work, Maju. I'm glad somebody finally gave a review of this paper.

    I'll probably give some comment on it once I carefully read this post. Just out of curiosity, what do you think it means when German BB match modern Spanish? Does that mean the BB came from Spain but got diluted in Germany, or does it mean the modern Spanish descend partly from German BB and the Germans themselves have changed?

    1. Portuguese actually.

      The only thing I can say is that such very high levels of H were only detected previously in Neolithic Portugal (Chandler 2005), but it's an HVS-I only study. I'm fairly certain that part of it is H1b. Also by Kéfi in Taforalt (Morocco, c. 12,000 years ago).

      I'll add that the Southern half of Portugal, along with some border districts of what is now Spain, is clearly at the origin of the Megalithic phenomenon, pre-dating by more than a thousand years the secondary center of Brittany and parts of West France. In the Chalcolithic, soon after Megalithism began expanding along the Atlantic and other routes, Southern Portugal, especially the culture of Vila Nova de São Pedro (or Castro de Zambujal, which bears an uncanny resemblance in some details to Plato's Atlantis), was also a major civilization center, which would also be central in the secondary Bell Beaker phenomenon, trading with regions as distant as the Baltic and West Asia. It is a civilization very little known to the public but almost without doubt the most important one in prehistoric Europe (only partly overshadowed by SE Spanish Los Millares first and later El Argar, in obvious dealings with Mycenaean Greece and probably related to the legends of Herakles in the Hesperides).

      There used to be several theories about the origin of Bell Beaker and the matter is not fully clarified yet. The mainstream one advocates a Bohemian origin, another, not too popular, advocated a Dutch origin and yet another claimed an Iberian origin. The presence of what seems to be abundant H in Iberia prior to the Chalcolithic seems to support (but does not prove on its own) the Iberian origin model. Whatever the overall origin, it's clear that both Zambujal/VNSP and Bohemia were centers of this phenomenon. The most widespread type of beaker, the "International" or "Maritime" type, is clearly of Portuguese origin, while the more rustic "Corded" one may well have originated in Bohemia instead (there are other minor types, especially in the late Beaker period, but they are localized).

    2. "I'll add that the Southern half of Portugal, along with some border districts of what is now Spain, is clearly at the origin of the Megalithic phenomenon"

      I think this will turn out to be the key piece in figuring out the puzzle.

    3. I'm very surprised. Maybe Megalithism in South Portugal predated Megalithism in West France, but not by more than a thousand years.
      Several French monuments were dated to about 4500 BC and there are some earlier dates, such as 4785 BC in a Bougon tumulus.

      Moreover, the collective tombs of the Final Mesolithic are sometimes considered a possible precursor, such as the 9 tombs of Hoedic island (5500-5000 BC), which was not an island at the time.

    4. Depends who you read and whether the dates are calibrated or what. I used to think in the 90s, ~4800 BCE for Portugal and ~3800 BCE for Brittany, while the rest would be from ~3500 BCE and later. However the calibrations of everything have been revised in the last decade and also overall all ages seem a bit older everywhere. Whatever the case there seems to be little doubt about Dolmenic Megalithism (rather than just "Megalithism" in all forms, Nabta Playa may be older but it's not dolmenic at all) being first developed in Portugal, just after the very first Neolithic but IDK if what you say about Brittany and West France could really challenge this view or not.

      In principle collective burials in shell middens are not the same as Megalithism and are found elsewhere if I'm correct, as well as later collective burials in caves, co-existing in time and sometimes space with dolmenic rituals. However the idea that they could be somehow precursors of Megalithism is suggestive, assuming that we accept the notion that first dolmens had tumuli on them, rather than the also heard opinion that they were naked and tumuli were only built (and not always) as a long secular process of ritually dropping stones across generations (in some parts of the Basque Country it was still done less than a century ago, it seems).

      Brittany, or rather Armorica, has other issues because its Megalithism seems to be rather elitist while in other places (except Great Britain) Megalithism seems a more generalized, popular phenomenon. Also, while Finistère has rather high frequencies of H (50%), Morbihan seems to have considerably lower ones (35%, lowest in France per Dubut 2004). Also the only megalithic monument of France studied for aDNA produced a very "exotic" mtDNA pool (X2, N1a, U5a), more similar to what is found in German LBK than what is found in Portuguese or Basque Neolithic.

      It's clear that Armorican Megalithism, jointly with North French "Danubian" Neolithic was crucial in British and Irish Neolithic but elsewhere I don't really see a clear signature of its influence. As we enter in the Middle Chalcolithic, we see how this kind of elitist (proto-druidic?) Megalithism collapsed vs. the Danubian pressure from Seine-Oise-Maine, which was only countered by the advance of Artenac culture from further South, under which a more standard, plain or "popular" Megalithism was established in all Western France and Belgium. Meanwhile, a more urban kind of Megalithic society developed in Portugal, with less impressive monuments but very notable towns instead (Zambujal especially but also many other walled towns), a trait missing in the spectacular Megalithism of "both Britains", which seems quite rural.

    5. "Meanwhile, a more urban kind of Megalithic society developed in Portugal, with less impressive monuments but very notable towns instead (Zambujal especially but also many other walled towns), a trait missing in the spectacular Megalithism of "both Britains", which seems quite rural."

      Speculation of course, but I think the urban Portuguese centre was at or near the edge of the Mediterranean ecological zone and the extensions further along the Atlantic coast were part of a different Atlantic ecological zone.

    6. Not sure what you mean, but the case is that Portugal is mostly transitional between Mediterranean and Atlantic climate zones, with the North already fully in the Atlantic area. Here it rains more than in much of Britain and in Galicia/Northern Portugal it rains even more than here (example map).

      That said, nobody is suggesting that Neolithic went from Portugal northwards (except surely to Galicia): Megalithism is mostly a Late Neolithic phenomenon: first farming, then megaliths. It'd be if anything a secondary wave (but it's not clear enough yet, just a working hypothesis).

    7. I was thinking settlement pattern might have determined the megalith pattern.

      If Southern Portugal was the edge of a zone where the original neolithic farming package was viable and thus able to support a more urban population density then that may have led to the form of their megaliths e.g. clan or family stones within a town.

      If the secondary extension along the Atlantic zone had a lower population density with a more scattered settlement pattern (because the Neolithic package couldn't support urban population densities in the Atlantic zone) then that might have led to the larger shared megalith structures as a unifying mechanism.

      Just a thought.

    8. The urbanization in Southern Iberia happens only since the Chalcolithic, a few centuries after it is generally considered to begin. Chalcolithic in Iberia (soft metal metallurgy, long distance trade routes, probable social stratification, some "neo-megalithic" burials like tholoi or artificial caves) begins c. 3000 BCE, while urbanization starts c. 2600 BCE if I'm correct.

      Megalithism is older, not just in Iberia but also in many other places (although in many areas really begins only nearing or at this very period). And Neolithic even older. Roughly in Southern Iberia we can talk of the following chronology:

      1. Since c. 5500 BCE or even earlier: Neolithic, followed quickly by the first Megalithism (dolmen burials) in the SW.
      2. Since c. 3000 BCE: Chalcolithic (before and after this date: expansion of Megalithism but Center-East of the peninsula remain non-Megalithic mostly)
      3. Since c. 2600 BCE: Urbanization (notably Los Millares I, VNSP I)
      4. Since c. 2200 BCE: Bell Beaker (within the wider pre-existent cultural substrate: Los Millares II, VNSP II)
      5. c. 2100-1900 BCE: Apogee of Zambujal-VNSP with "International" Bell Beaker, Palmela points, etc.
      6. c. 1900-1700 BCE: Final Bell Beaker (regionalization), beginnings of Bronze Age (initial El Argar A)
      7. c. 1850-1500 BCE: Early Bronze Age: El Argar A, beginnings of SW Bronze (vanishing of towns in southernmost Portugal, replaced by "barbaric" Bronze horizons that gradually expand northwards - proto-Tartessic?)
      8. c. 1500-1300 BCE: Middle Bronze Age: El Argar B (Greek burial practices in pithoi but otherwise continuity)
      9. c. 1300-1000 BCE: Late Bronze Age: El Argar state (?) disintegrates into many city states (post-Argaric culture), arrival of Urnfields (proto-Celts) to the NE, end of Zambujal-VNSP as the 10 km long canal is silted (tsunami?)
      10. c. 1000-750 BCE: Early Iron Age (notably expansion of Urnfields up the Ebro River), late Atlantic Bronze in the West (connections with the British Isles, Cyprus, etc.), Tartessian culture in Andalusia. Establishment of Phoenician colonies.
      11. c. 750-650 BCE: Important transition: Urnfields → Hallstatt → "Celtic" expansion to Central and West Iberia, followed by Iberization of Catalonia and the sub-Pyrenean area (Massilian influence surely).
      12. c. 650 BCE on: Late Iron Age: proto-historical period: Iberian civilization, Carthaginian expansion → II Punic War...

    9. ah ty, will look a bit deeper

  6. Maju, have you seen this? It just came out, but I don't know how useful it is? Looks kind of outdated at first glance.


    1. Had not seen it before, thanks. If it did not clearly read "2013" at the top of the first page, I'd imagine the paper was from several years ago. It's only "somewhat" interesting and one wonders why the Institute of Molecular Biology of Paris is so out of touch. In the bibliography itself, excepting a self-reference dated to 2009 and the almost routine mention of Karafet 2008, everything is from the 1990s or early 2000s (2006 at the latest), with even some 1980s references!

  7. "From H (node) to R there are 4 coding region mutations, from U5 (node) to R there are 6 cr mutations, from U2'3'4'7'8'9 to R there are 4 cr mutations. This means, everything else equal and assuming MCH parsimony, that H and U2'3'4'7'8'9 are of the same age, while U5 is 50% younger."

    Given the highly variable rate of accumulation of mtDNA mutations, this is not a reliable approach for estimating ages of haplogroups. It is possible that 4 cr mutations could accumulate in a few thousand years or more than 20,000 years. You need a very large number of samples to calculate an average mutation rate for a haplogroup.
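    The spread in waiting times can be made concrete with a rough sketch. It assumes mutations arrive as a simple Poisson process, and the rate used (one coding-region mutation per 5,000 years) is only a placeholder for illustration, not a value taken from this paper or from any calibration; the function name is likewise hypothetical.

```python
import math


def prob_k_mutations_within(k, rate_per_year, years):
    """Probability that at least k mutations have accumulated after `years`,
    assuming mutations arrive as a Poisson process with a constant rate."""
    mean = rate_per_year * years
    # P(N >= k) = 1 - P(N <= k - 1) for a Poisson-distributed count N
    fewer_than_k = sum(
        math.exp(-mean) * mean ** n / math.factorial(n) for n in range(k)
    )
    return 1.0 - fewer_than_k


# ASSUMPTION: placeholder rate of one coding-region mutation per 5,000 years.
RATE = 1 / 5000

for years in (5_000, 10_000, 20_000, 40_000):
    p = prob_k_mutations_within(4, RATE, years)
    print(f"{years:>6} years: P(at least 4 mutations) = {p:.3f}")
```

    Under that placeholder rate the average wait for 4 mutations is 20,000 years, yet a single lineage has a real chance of reaching them in half that time or of still lacking them after 20,000 years, which is why one lineage's mutation count is a weak age estimate.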

    The two Dolni Vestonice samples (14 and 15) dated at 31,155 BP are confirmed U5 based on full genome sequencing. You could call them pre-U5, given that they lack the 3 cr U5 mutations. But this confirms that U5 was present in Europe well before the LGM. I would be pleased to see pre-glacial H samples confirmed by full genome sequencing, as it would indicate a more complex migration history. But it looks increasingly unlikely that they will be confirmed. Hopefully there are labs working on some of the samples that you cite.

    1. "The two Dolni Vestonice samples (14 and 15) dated at 31,155 BP are confirmed U5 based on full genome sequencing".

      Sorry? I just read and discussed Fu et al 2013 two weeks ago and it's: U*, U* and U8. That's what they say, not I. No idea where you get the notion that "full genome" sequencing, which means the whole nuclear genome (in addition to mtDNA) could in any way improve that. Whatever the case, it's not something I got from a magic hat... but something reported in a scientific paper (a bit obscure admittedly but quite clear in this part).

      "You could call them pre-U5, given that they lack the 3 cr U5 mutations".

      That's precisely not U5 but U-other. If it's underived U-root, i.e. without belonging to any other branch, it'd be as much pre-U5 as pre-U6, pre-U2'3'4'7'8'9 and pre-U1. But I do not know this for a fact. Most likely, like the U* from Magdalenian Swabia (U*-CRS, certainly NOT pre-U5 because it had already evolved in a different direction), they are in a distinct U branch, now rare enough not to have a name (i.e. U*).

      "... this confirms that U5 was present in Europe well before the LGM".

      Not at all: it is NOT U5, nor do we have any molecular reason to imagine it on the way to U5 (i.e. your unfounded claim of "pre-U5" in DV). The only known near-LGM case of U5 is a sample from Nerja (Andalusia), with a Solutrean context (HVS-I only). All the rest is from the very late UP (Oberkassel, 14-13 Ka BP) onwards, whatever the reason.

      "I would be pleased to see pre-glacial H samples confirmed"...

      It'd be interesting to know: there are a lot of suspects.

    2. "Given the highly variable rate of accumulation of mtDNA mutations, this is not a reliable approach for estimating ages of haplogroups".

      There's no way to estimate with any certainty the age of mtDNA haplogroups. But within my main hypothesis it is more likely for a large haplogroup to "freeze" for long because of drift than for a small one. Hence it's more convenient to look at the mutations from the common root, in this case R.

      H is the second largest haplogroup by number of basal mutations after M, that implies an explosive expansion at its origin (and at the formation of some derived branches like H1). The star-like structure of H tells us that it was not a minor lineage infiltrating behind the lines, so to say, but that it was very dominant when it coalesced instead. It can only be compared in all the human genome to the explosion of M, which represents most likely the arrival of Homo sapiens to South Asia.

      Considering that undetermined H (and even H6) were present in Magdalenian contexts in Cantabria, that it is almost certain that H17'27 was present in a Gravettian context in Russia, that there is a lot of undetermined (HVS-I) R*, often R*-CRS, which tends to be H in 99% of modern cases in modern Europe, and that there were no known meaningful immigrations to Europe after Gravettian, the most parsimonious scenario is that H or its HV seed (pre-H) arrived to Europe with either the Aurignacian or Gravettian immigration waves. There is no other chance for such a flow until Neolithic but H was 100% certain in this continent long before that, so nope.

      IMO H expanded with Aurignacian while it's possible that U variants did with Gravettian. It could also be the other way around or a more complex scenario but something like that. Aurignacian colonization offered H the kind of open expansive scenario that can justify its hyper-star-like structure. However Gravettian was also very vigorous, so I remain cautious on that.

    3. PS- I wish to correct the first comment in this row: there's also U5 in Magdalenian Basque Country. So what we know empirically about U5 via aDNA is:

      1. Solutrean Andalusia (HVS-I - but for U5 that's quite safe)
      2. Magdalenian Basque Country and Elbe
      3. Epipaleolithic Iberia (W and N), West Germany, Lithuania, Lower Volga
      4. Neolithic: pushed towards the North and West.
      5. Chalcolithic: only detected in Northern Europe.

      This scenario fits well with a hypothetical Gravettian expansion and later re-expansions (Solutrean, Magdalenian) and recessions (Neolithic, etc.) However we are most likely to remain without knowing about Aurignacian genetics forever because Aurignacian has left nearly no human remains (it's plausible that they did not bury their dead but used some other ritual like leaving the dead to the animals, as did Native Americans in many cases or the Hadza, etc.)

    4. PS to the PS: it also fits well with lower frequencies of U5 towards the SW. In Iberia it is as follows:

      1. Solutrean (of Gravettian substrate): 50% U5 (1/2)
      2. Magdalenian: 20% U5 (1/5), all the rest is certainly H (2/5) or is possibly H (2/5 R*, both in Nerja)
      3. Epipaleolithic: 25% U5 (3/12). The rest is: 25% safe H (3/12), 17% Lx(M,N) (2/12), 25% possible H (R*) and 8% U4 (1/12).

      It's reasonable to estimate that U5 was c. 25% all the time in UP Iberia, maybe with some local variations, maybe higher (50%?) in the Franco-Cantabrian Region (judging from the Cantabrian strip only). Whatever the case nothing like the dominance of various U subclades apparent in Central Europe.
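      As a quick check of the arithmetic, here is a minimal sketch that recomputes the U5 percentages from the counts listed above and pools them across the three periods. The variable and function names are made up for illustration, and pooling periods this way is only a rough summary given such tiny samples.

```python
# U5 hits and total samples per period, as listed in the comment above.
u5_counts = {
    "Solutrean": (1, 2),
    "Magdalenian": (1, 5),
    "Epipaleolithic": (3, 12),
}


def percent(hits, total):
    """Frequency as a whole-number percentage."""
    return round(100 * hits / total)


for period, (hits, total) in u5_counts.items():
    print(f"{period}: {percent(hits, total)}% U5 ({hits}/{total})")

# Pooling all periods gives a crude overall figure for UP/Epipaleolithic Iberia.
pooled_hits = sum(hits for hits, _ in u5_counts.values())
pooled_total = sum(total for _, total in u5_counts.values())
print(f"Pooled: {percent(pooled_hits, pooled_total)}% U5 ({pooled_hits}/{pooled_total})")
```

      With only 19 samples in total, a single additional result would move the pooled figure by several points, so "c. 25%" should be read as a ballpark.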

  8. We could speculate that these people who had mainly U5 were culturally cold-adapted and after the ice age preferred the cold areas and went to the North. I would suggest that there were people who were used to milder climates wandering into Europe and introducing new haplogroups with them and this would have happened without any direct link with farming, which obviously happened later on. You said that there were no new archaeological cultures detectable in Europe at that time frame. But perhaps these cultures remained in Eastern areas and reached Western Europe only later on.
    However, I think that in particular mt lines can spread from one group to another without any real migration. In this way, I think that, with time, these haplogroups can move thousands of kilometres without any real migration of the original group. I think that it was quite usual to look for a wife among the women of the neighboring groups.

    1. Then how do you explain that the oldest known U5 is found in southern Iberia (Nerja)?

      I'd rather explain it (always very tentatively) within the context of two successive waves into Europe (or possibly even more, as Aurignacian "sensu lato" is complex), rather than appealing to climate adaptations that actually do not fit well with the data. It is possible that in some areas (Germany, Czechia) H was "drifted out" within the Gravettian context, but we do not see any U5 before Magdalenian, so this lineage was probably not the one responsible but rather other U (U*, U8). U5 may have re-expanded from France before H did (but lacking enough data on the most important areas of Upper Paleolithic Europe, namely SW France, it's very hard to say).

      "You said that there were no new archaeological cultures detectable in Europe at that time frame. But perhaps these cultures remained in Eastern areas and reached Western Europe only later on".

      If you can't propose a specific model with archaeological references, I cannot really discuss it. In principle in SW Europe there is demic continuity between Gravettian and Epipaleolithic and this applies to Central Europe also at least since Magdalenian. There are some Epipaleolithic and other flows but all of them are within this area of Western and Central Europe, strictly internal. The only exception could be some backflow from North Africa into Western Iberia but that should be at the origin of U6 and L(xM,N), not H (which is quite clearly derived from Europe in North Africa).

      "I think that in particular mt lines can spread from one group to another without any real migration"...

      But how would they become dominant? Drift statistically favors the already established, more numerous lineages, so it tends to keep things stable... unless numbers are very low, in which case new mutations (or new minor arrivals) get much better odds.

    2. Also, I only remembered now, in Eastern Europe H is found (it seems) in the North (Sunghir), while U variants are only found in the South (Kostenki, Epipaleolithic Volga). So I'd rather think of it all as a somewhat chaotic, mixed situation, which the Central European data is not helping enough to explain, but rather confusing us.

  9. Yes, at the moment we do not know the exact distribution of H and U in the Paleolithic period but I hope that the picture gets clearer with time.
    You are much better at archaeology and I am happy to learn from you in this respect.
    The Nerja cave remains are dated to 20,000-17,000 ybp; they are very early indeed. Gravettian lasted until 22,000 years ago, so there seems to be enough time for U5 to end up in Southern Spain even if it was introduced with the Gravettian culture. It may have ended up there, as I explained above, even without the migration of Gravettian men.
    I understand your criticism when you ask how these lines became dominant, but I am not necessarily talking about dominant lines such as H and U, and you may agree that for example in the Americas the mtd lines A,B,C and D were at least not all in Central Asia with Q carrying men. I think that H and U are both good candidates for having a European(or Northern-Eurasian) origin. That both these lines correspond to two fundamental cultures in Europe (Aurignacian and Gravettian) is an appealing theory. However, many minor haplogroups may have moved in Europe without any bigger-scale migration, but of course not exclusively. By contrast, LBK, Unetice and Corded Ware were cultures that had a strong demic component in them and it is easy to believe it when you look at the mtDNA haplogroups they carried. It seems that hgs such as V,J,T,U3,I,W and X or certain subclades of H started expanding c. 5000 ybp and they may have moved in Europe in the way I explained above, at least in part.
    When you say that H and U correspond to Aurignacian and Gravettian, what is the respective Ydna. Is it in your opinion already R1b, R1a and I?

    1. Southern Iberian Solutrean had a strong cultural melting with local Gravettian. In fact, only two nearby sites near Valencia are strictly Solutrean, so we may better talk of Gravetto-Solutrean and there is very clear Gravettian-to-Solutrean demic continuity in any case. Probably a few immigrants with Solutrean culture arrived from France to those two caves and then the cultures melted (and later also received some influences from North Africa probably, like the concept of winged and stemmed points, which may well have been inspired by Aterian designs).

      "... you may agree that for example in the Americas the mtd lines A,B,C and D were at least not all in Central Asia with Q carrying men".

      Of course, but it may not be a good comparison because the population densities of Siberia were surely very low (allowing for easier dramatic changes). But, well, we just can't say much about the male lines of UP Europe at the moment.

      "When you say that H and U correspond to Aurignacian and Gravettian, what is the respective Ydna. Is it in your opinion already R1b, R1a and I?"

      I'm inclined to think that they were R1b (West) and I (East) but without any empirical certainty yet. In general, it seems reasonable, based on modern distribution and basal diversity, to think that R1b-S116 coalesced in the Franco-Cantabrian region, while I did in the Pontic one.

      I'm uncertain about R1b. It could be there or it could have arrived somehow in the Epipaleolithic/Neolithic from West or Central Asia, expanding essentially with the Kurgan phenomenon (Indoeuropeans).

      ... "European(or Northern-Eurasian)"...

      Europe and North Eurasia don't seem too related at first sight. Most of the parts of Northern Eurasia in Europe were completely under the ice sheet and therefore uninhabitable; the rest seems more related to West Asia, which is rather part of Southern Eurasia. Northern Eurasia (Siberia and such) does not seem to have played any major role, except for the original settlement of America and the Uralic ethnogenesis later on. Only since the Chalcolithic, with the domestication of the horse, does the steppe corridor seem to have become a more dynamic area (Indoeuropeans first, Altaic peoples later). Otherwise the main link of Europe with the rest of the World was the Balkan Peninsula and Anatolia, then surely a single land corridor. We can't exclude some flows between Eastern Europe and Central Asia (also through the Caucasus with West Asia) but I would not expect any game-changer from them because Eastern Europe itself was not too influential in the UP.

    2. Northern-Eurasian was in fact a wrong word, because I did not mean Siberia. When I now look at the map, I must have meant Europe.

      In your opinion, if H originated in Central Europe or even Germany or France, how did it come to the Volga-Ural area? Were there migrations from Europe or was it just in the way that I proposed above, through "migration of wives"? If H is 40,000 years old, it could have come to both areas with Aurignacian flows (not necessarily as far North as the Urals but somewhere South of them), but I do not know if it is really possible.

      It is very annoying that the mtdna seems to be older than the ydna, as if the original ydna has been almost entirely replaced by more vigorous and younger ydna lines. I would prefer that the age estimates are under estimated, but who knows. For example, native American ydna might almost disappear in 100 years' time while the mtdna will flourish.

    3. Both Aurignacian and Gravettian were very influential in Eastern Europe AFAIK. After them we can hardly talk of any single pan-European culture before Medieval Christianity or maybe the Kurgan-IE phenomenon as a whole (i.e. spanning several millennia: from Chalcolithic to the Roman Empire). Magdalenian or Megalithism/Bell Beaker were restricted to certain areas of Europe for example.

      "I would prefer that the age estimates are under estimated"...

      That's surely the case in my opinion: STR-based age estimates are surely all wrong and very much so. We haven't seen any actual Neolithic Y-DNA that is not what we would have labeled "Neolithic" a decade ago without so much noise from confusing and pseudoscientific guesstimates.

      "For example, native American ydna might almost disappear in 100 years' time while the mtdna will flourish".

      I most strongly doubt it will disappear altogether. Similarly I do not believe that European Paleolithic Y-DNA has completely vanished: that's nonsense.

  10. When you say that H is found in Sungir, I am not that surprised when I think that there is really a lot of H among the Finns (and most H clades seem to have come from the East) and other Volga-Ural people:
    Mari 40% of H, 11% HV0, 14% of U5, 14% of U4
    Mordvins 42% of H, 5% of HV0, 16% of U5
    Udmurts 22% of H, 9% of U5, 10% of U2
    Karelians 47% of H, 6.5% of HV0, 17% of U5
    Age estimates in Tatars are: U4 c. 17 ka ago, V c. 12 ka ago and H c. 18 ka ago

  11. I just was able to read the paper. Maju, another interpretation of your plot with segregated MNE sub-populations is that the sample sizes simply aren't large enough. Don't forget that even some of the modern populations match poorly (e.g., Czech, Romanian, Slovakia, NE Iberia). That way, it makes sense that the pooled population maps well.

    Another interesting aspect is that almost all BB haplotypes can be viewed as derived from local post-LBK (network Fig. 1). There are only a few exceptions: the basal H1, H5, and H13. Basal H1 IMO really doesn't tell us anything; a derived H5 also occurs in Rössen (and H5 is mostly eastern, anyway), and H13 below the highly-derived A13542G seems to be Caucasian, today. This is completely unexpected to me. I would have thought that an Iberian population would have a very different signature, one based on LGM H groups and Cardium - which should be completely different from post-LBK CE.

    This means that either SW Europe got a complete make-over similar to the post-LBK discontinuity and by the same folks; post-LBK CE is largely of SW European origin even before BB; or BB mtDNA in CE is of local CE origin after all, and its position on the map is just chance because of the small sample size. What are your ideas on this?

    (Also, I don't know how they tweaked the extreme western BB map position given the Eastern nature of some of its haplogroups - something does not add up. Perhaps it's an artifact of the basal H1, which IMO doesn't say much).

    1. Of course their pooling of "MNE" samples is a more or less comprehensible choice but it also hides the differences between the pooled populations, which in some cases are very marked. Personally I would have pooled, if anything, Schöningen and Rössen with LBK on one side (Danubian cultures), and Baalberge, Salzmünde and Corded Ware on the other (Kurgan cultures) - but again it is a debatable choice. So I decided to show the complexity hidden in the arbitrary "MNE" pooling instead.

      "Another interesting aspect is that almost all BB haplotypes can be viewed as derived from local post-LBK"...

      Only 1/9 LBK and 1/9 BBC sequences would fit your claim. And even then it does not necessarily imply the relation you argue for but just a coincidence with the general structure of haplogroup H1a, which could well exist long before the Neolithic. Neither Schöningen nor Baalberge, which also have sequences in this H1a set, seem ancestral to the Bell Beaker one, so... very speculative.

      "There are only a few exceptions"...

      Not at all: 89% are "exceptions" and the only LBK-BBC pair that fits your claim can well be seen as a coincidence.

      "I would have thought that an Iberian population would have a very different signature, one based on LGM H groups and Cardium - which should be completely different from post-LBK CE".

      I don't know at this moment if the excess H of BBC comes from Iberia or some other place (no idea which one however) but what I know is that the LBK H pool is very different from the H pools of later populations.

      The LBK, Rössen and Schöningen H pools are full of rare H subclades, occupying all the lower left third of the haplotype tree, plus also rare H1 subclades (and one basal H1 in the Rössen group). The only ones that are somewhat coherent with the rest are those within H1a, amounting to exactly two sequences: one H1a in LBK (11%) and one H1a7 within Schöningen (50%).

      "... either SW Europe got a complete make-over similar to the post-LBK discontinuity"...

      That is very possible at least in some areas, like Portugal, although not as radical in any case: the frequency of H in Portuguese Neolithic was very close to that of BB in the Elbe later on (>80%) but today Portugal is the region of Western Europe with the lowest frequencies of H (<50%). This could have been caused by the Hallstatt era invasion of Celtic and other Indoeuropean peoples (Lusitani) or maybe other reasons I can't really explain well.

      "... or BB mtDNA in CE is of local CE origin"...

      It's possible but where from? Austrian LBK has 100% H but it was just one individual, so it can well be a fluke. East Austria, West Hungary and Moravia together make up the real core area of LBK (later Lengyel, later Baden) and it has been poorly studied so far. Just a suggestion, as the Bohemian origin of BB still has the best archaeological support.

      "I don't know how they tweaked the extreme western BB map position given the Eastern nature of some of its haplogroups"...

      You seem to be right. I did not realize that: the actual position should be rather near the POT or even IPNE/CW dot. Probably this is because they generated the population dots and haplogroup vectors separately and, due to some error or algorithmic inadequacy, they don't fit too well. Maybe there are hidden vectors under the few simplified haplogroup vectors - can't say.

    2. Yeah, Baalberge can generally be lumped in with Corded Ware, Single Grave, Battle Axe, Unetice etc. What's the bet all of those will end up heavy in R1a?

      Interestingly, H7 has now shown up in the Baalberge and Unetice samples. So I'm wondering whether my H7 is a match for one of those. In any case, it now looks like my R1a goes well with my mtDNA. Muahaha...

    3. H7 seems to have three main centers, at least by frequency (map): France, the southern Balkans and the Volga area. It is also found in North Africa (plausibly from France via Iberia in the Solutrean).

      If you're right in the IE-Kurgan connection of your and Central European ancient H7, then I'd imagine that it should be most closely related to that of the Volga basin.

  12. "Only 1/9 LBK and 1/9 BBC sequences would fit your claim."

    Maju, I think you must have misunderstood what I was trying to say. Taking Figure 1 at face value (given that there is little other ancient full mtDNA analysis), almost all of BB is closely related here to the local post-LBK samples. Let's look at the network clockwise from the top:

    1. As I already mentioned, H1 of course doesn't say much. Still, one Rössen sample is also H1, and a Schöningen and a Baalberge sample are derived from the same node as the sub-H1 BB.
    2. H4 is the same as a Corded Ware sample (with a Unetice sample below).
    3. H13 below A13542G appears to be Caucasian, today. So while an exception and not local, it does not seem to relate to Iberia - quite the opposite.
    4. Two found below H3, while Salzmünde is H3 (and another one below together with Unetice).
    5. Two below H5, while Rössen is also below H5.

    So, from that perspective (and mostly ignoring today's distributions of known sub-groups, because who knows what happened), the BB samples here look very much local or locally derived, and don't show any exotic signature that one would expect from contemporaneous Iberia.

    1. There are two BBC H1 haplotypes at the root of the H1 tree. There's also a Rössen H1 there, but it's not ancestral to the BBC lineages. Most of the other BBC haplotypes come straight off the ancestral H node. That doesn't bode well for the BBC being a genetic subset of early and middle Neolithic Central European populations. Just based on those results, the BBC look like a group with old and diverse H, which is very interesting indeed.

      Now, one of the BBC H4 haplotypes is apparently sitting next to a CWC H4. There are also a couple of H1 and H3 BBC haplotypes that come off LBK and Salzmünde haplotypes.

      But I wouldn't read too much into any of this. That's because the BBC practiced female exogamy with CWC and probably Unetice groups, so this would easily explain any western haplotypes in the CWC/Unetice samples and eastern haplotypes in the BBC samples. Moreover, there were expansions from the Atlantic Fringe and the Mediterranean Basin before the BBC Phenomenon, and these could have brought H1, H3 and H4 to Central Europe, where such western lineages subsequently became part of the LBK and Middle Neolithic gene pools.

    2. "... and a Schöningen and a Baalberge sample are derived from the same node as the sub-H1 BB"...

      Which is no other than H1a (see PhyloTree for reference).

      "H4 is the same as a Corded Ware sample (with a Unetice sample below)".

      You are talking again of the overall structure of mtDNA H (H4 in this particular case), of which that graph is just a version. H4 is part of the seemingly Iberian Solutrean legacy in North Africa (along with H1, H3 and H7, this last one probably from France, and maybe V, together adding to ~30% of North African mtDNA), so it has nothing to do with the narrow Chalcolithic or Neolithic-Chalcolithic timeframe alleged here. At least I fail to see how it'd be possible.

      Whatever the case, no LBK here.

      "H13 below A13542G appears to be Caucasian, today".

      I have no idea, I'll take your word for it.

      " Two found below H3, while Salzmünde is H3 (and another one below together with Unetice)".

      But no LBK/Rössen/Schöningen again. H3 is found (twice) in early Neolithic Basques (just for the record).

      I agree that Salzmünde looks intriguingly "iberoid", even more than BB.

      "Two below H5, while Rössen is also below H5".

      But different branches.

      You are also here cherry-picking the data in my understanding.

      I don't think I misunderstood you: I just happen to disagree radically and see it in a very different light. The "similitudes" you see are mere (and quite exceptional) coincidences within the wider H tree, which obviously did not originate in that area nor in that timeframe. In all that generic H data you are cherry-picking a few that seem (only seem, it's a clear mirage) to satisfy your criterion.

      All I can do is beg you to change your "rosy-colored" glasses and try to see the matter objectively. In particular if we compare LBK with BB, they can hardly be more different.

  13. Moreover, there were expansions from the Atlantic Fringe and the Mediterranean Basin before the BBC Phenomenon, and these could have brought H1, H3 and H4 to Central Europe, where such western lineages subsequently became part of the LBK and Middle Neolithic gene pools.

    Given the ancient DNA evidence, I have a problem with this except perhaps H4.

    I have often mentioned how Hoguette may have brought R1b north to the Rhine area, with subsequent diffusion eastward. Still makes the post-LBK mtDNA lineages suspiciously similar to BB - much more so than we would expect from "exotic" Portugal. That was my point.

    1. Well I don't have a problem with it, because I don't think H1 originated in Iberia. I think it arrived there from somewhere else near the Mediterranean, and if so, there's no reason why it couldn't have made its way to Central Europe at about the same time.

      Anyway, the evidence for an "exotic" southern origin of the BBC is now quite solid and varied, and what this means is that a major portion of the genetic structure in Central Europe arrived there from the west relatively late.

      In other words, first there were Pitted Ware-like hunter gatherers running around Germany. Then came the West Asian-like LBK farmers via the Danube route. But then the Pitted Ware-like genes re-emerged, at least in part, thanks to the Corded Ware, Single Grave, Battle Axe, Unetice, etc. Kurgan group of cultures (most likely from the forest steppe of Poland, Belarus, Russia and Ukraine, but I know the steppe will always be a favorite for some). Last but not least came the Basque-like BBC from the Atlantic Fringe, adding Mediterranean/Atlantic admix to the Central European melting pot.

    2. Whoops, I forgot about the Megalithic Funnelbeaker (TRB) groups. They would've brought Med/Atlantic admix to the coastal areas of North-Central Europe before the BBC, including possibly mtDNA H1, H3 and/or H4. From memory, Gok4 belonged to some sort of H.

    3. For whatever it's worth, I think it very reasonable to imagine the main expansion of H and many of its sub-lineages (notably H1, probably also others) as happening in the early UP (Aurignacian and/or Gravettian contexts, the only prehistoric pan-European cultures ever). Other lineages (within U for example) were surely also involved in the same processes, in ways hard to discern. Only that way can we explain that they are found, in ancient and modern samples alike, in places as distant as Russia, Karelia or the Caucasus on one side and Iberia, France, etc. on the other (also Central Europe, but here it may be more confusing).

      When we look at the fine print, as happens with H6, we often see different subclades in Eastern and Western Europe (in the H6 case: H6a is Western, while H6b and H6c are Eastern and Central Asian). Ancient DNA allows us to discard the only possible "recent" wave of homogenization, which would be the IE-Kurgan phenomenon in all its extension (from Chalcolithic to the Roman Empire), so the only remaining candidates for shared Eastern and Western European lineages like H1, H6, etc. are the early Upper Paleolithic pan-European cultures: Aurignacian and Gravettian.

    4. "from "exotic" Portugal"

      I don't think it's as exotic as it may appear.

      If there were two ecological zones the first comprising Anatolia, the med coast, Greece, Southern Italy, Southern Portugal etc where the neolithic farming package worked without any adjustment then the first farmers could spread very rapidly east to west and settle those areas relatively densely. However given the distances involved if the centre of gravity remained in the eastern med then the western med branch centred on southern portugal although connected by trade might develop in a semi-detached way out on the fringe beyond the pillars of hercules (with all the fun stuff that might imply).

      So the story about what happened in the atlantic-west ecozone to the north of the med coastal zone would have multiple start-points: Greece/Thessaly, Southern Italy, Southern France *and* Southern Portugal i.e. i think southern portugal may be a bit of a missing link.

    5. While I agree with you in the broad sense, Grey, I need to object to the phrase "although connected by trade". Trade was not relevant at all in the early Neolithic or the Neolithic proper. It only became important later, with the Chalcolithic (known as "Late Neolithic" in Britain), i.e. when social complexity increased. Early Neolithic peoples, even if maybe mobile (especially the Mediterranean branch with their deep sea mariner skills), were not really into trade other than maybe local trade. There was not yet really much of anything to trade other than, say, variscite or high quality flintstone (but even these only become discernible as trading items towards the Chalcolithic) or some animals and excess crops. Communities were essentially self-sufficient and not markedly hierarchical. Cardium Pottery communities, for example, were probably isolated from each other as soon as they migrated elsewhere. Maybe they kept some relation with their parent community but really nothing pan-cultural. That's why expansion (in Cardium as in LBK, etc.) is followed by regionalization.

      But surely Portugal was indeed the place where Mediterranean crops could acclimate to Atlantic conditions the easiest. Western France (incl. Brittany, Gascony) may have been another such place (influences from SE French Cardial, Atlantic but mild climate). For LBK crops it could be the area around Belgium and Paris, where, as in SW Europe, there was cultural complexity, with many early Neolithic groups that were not LBK but eventually merged with them.

  14. As an addendum to the discussion on Megalithism, I was these days re-reading what is AFAIK the only mtDNA study of France on a somewhat regionalized basis: Dubut 2004. I found it very interesting that U6a1, a North African lineage also found in the Western Iberian Third but otherwise extremely rare in Europe, exists at frequencies of 5% in the Finistère department (Brittany). As we mentioned before, Brittany-Armorica is, after Portugal, the second oldest Megalithic area of Europe, and to my eye this finding suggests some Western Iberian genetic influence in the Breton peninsula.

    1. PS- Finistère also has the highest frequencies of H (50%) and V (9%) in this study, as well as intriguingly high frequencies of J(xJ1,J2) (9%), whatever that means. The first two could also be supportive of an Iberian Neolithic flow, considering the very high frequencies of H in Neolithic Portugal (almost double today's), per Chandler 2005, and also the presence of V.

      Another curious detail of this old paper, although apparently unrelated to this debate, is the finding of U2 in Normandy and Perigord-Limousin at relatively high frequencies (4-5%). Also notice the existence of 8% U8(xK) in Var department (Provence).

    2. What sort of R1b is found in Finistère?

    3. This comment has been removed by the author.

    4. This comment has been removed by the author.

    5. Following Myres 2010, France is in general dominated by R1b-S116 (South clade). The closest sample, however, from the area of Nantes (La Rochelle?; not too informative, considering that the Dubut Finistère and Morbihan samples are quite different from each other), is 100% R1b-S116 (= 0% other R1b), of which 2/5 are R1b* (i.e. xM529,U152), 2/5 are U152 ("Gallic" or "Alpine" clade) and 1/5 is R1b-M529(xM222) (the "Atlantic" or "Irish" clade).

      That's all I can say on the matter. Try looking in FTDNA databases...

    6. North-African input can also be the result of Roman soldiers of North-African ancestry.

    7. Not on the maternal side anyway (e.g. haplogroup mtDNA U6a1).

      mtDNA U6 is found in Brittany, as mentioned, and it's a typical North African female lineage.

    8. Indeed, not on the maternal side. Mitochondrial lineages must reflect some other sort of population flows. In most cases mtDNA shows much stronger affinity with autosomal (overall) genetic affiliation than Y-DNA does; this is logical, as women normally migrate less over long distances unless it is via the slave trade or something like that. Most migration flows either were gender-balanced or were mostly males, who mixed with local women again and again (or something in between).

      MtDNA U6 or Y-DNA E-M81 in Atlantic Europe seems to me a clear sign of migrations from Western Iberia, where these North African lineages are most notable and whose archaeological record clearly allows it to be the origin of such demographic flows in the Chalcolithic ("Late Neolithic") period.

  15. Maju, you are still misunderstanding me. For the third time, I am looking at the distance or closeness between BB and post-LBK (and mostly post-Danubian) groups in CE. Not with LBK - after LBK. It is very obvious and undisputed that there is a strong discontinuity after LBK (and, to a lesser extent, also after Danubian in general).

    Also, we shouldn't forget that the data are not from all of LBK, but just the Elbe-Saale subset, which was surely much smaller and had its own drift history and peculiarities, as indicated by the many finds one or two or three mutations away from H that are now dead. If after the local demise the region got re-settled with a slightly different Danubian folk from a wider area, that alone would suggest a different subset of H haplogroups - as is apparent in the Rössen and Schöningen data.

    The paper deals with a ~4,000 year time period, so one has to be careful as it is easy to mix up space and time. IMO, there were too many migrations and too much diffusion since then to reliably use extant populations to discern origins. CE (especially when including the Balkans) is pretty large and diverse and harbored a significantly-sized population; so, many of the downstream H subgroups could have arisen locally there (before and after LBK). Except for the known neolithic migration routes, mobility was certainly orders of magnitude lower before BB and the copper age/ bronze age transition. With the lack of widespread European ancient mtDNA coverage, I find it reasonable and practical to look at the statistics of how many mutations different groups are apart, regardless of what location we associate those particular haplogroups with, today. In that analysis, BB is unexpectedly close to Salzmünde and Corded ware, and also to Rössen. As I mentioned above, much closer than what I would expect from an Iberian origin. That is all. It simply looks more CE or even EE (or W Asian in H13) than expected.*

    Maju, you make it sound like all these subgroups predate LBK and the neolithic and none of what we see in the network is associated with ~then-occurring mutations, but rather just with diffusion and migration. How can you be so sure of the antiquity of all these mutations and subgroups? Obviously, H and H1 and several others had ancient and widespread pools of large population. However, something like H88 or H89 may very well be a (more or less) contemporaneous, local mutation. The point is, without more ancient DNA you really (mostly) don't know which subgroups are from far-away or local. If they are not local mutations but rather very old, then they might as well be local (from a long time ago), anyway. See my point? The only hard population distance data that don't rely on speculation of the pre-neolithic distribution of H subgroups are mutation counts.

    I am not questioning whether there is a strong general connection between early BB and Portugal, which there unquestionably is. I am just debating whether this paper shows that there is such a connection in the presented Elbe-Saale genetic data, which I dispute. In fact, I think the paper's F_ST numbers actually agree with me.
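
    As a rough illustration of the mutation-count comparison argued for above, a minimal sketch follows. It assumes each haplotype is described as the set of mutations separating it from the H root; the function name and the labels m1..m4 are hypothetical placeholders, not real mtDNA positions or data from the paper.

    ```python
    # Toy haplotypes: each is the set of mutations separating it from the H root.
    # The labels m1..m4 are placeholders, not real mtDNA positions.
    def mutation_distance(a, b):
        """Number of mutations present in one haplotype but not in the other."""
        return len(a ^ b)

    hap_x = frozenset({"m1", "m2"})
    hap_y = frozenset({"m1", "m3", "m4"})
    print(mutation_distance(hap_x, hap_y))  # 3
    ```

    The count depends only on the mutations themselves, not on where each sub-group is found today, which is the point of using it.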

    * Here is the paper that - to me - suggests H13 below A13542G to have Caucasian affinity:
    U. Roostalu et al., Origin and expansion of haplogroup H, the dominant human mitochondrial DNA lineage in West Eurasia: the Near Eastern and Caucasian perspective

    See also this link, in which most European (Italian) surrounding connections are far downstream:

    Italy, of course, after the southern Balkans has the most West Asian influence in Europe.

    1. Now that I re-read your original comment, I admit that I did misinterpret you, as you clearly said "post-LBK" and not "LBK". My apologies for that.

      "Maju, you make it sound like all these subgroups predate LBK and the neolithic and none of what we see in the network is associated with ~then-occurring mutations, but rather just with diffusion and migration."

      Or randomness too... why not?

      We cannot look at this dataset in isolation unless we are ready to abandon sanity altogether: that haplotype tree is nothing but a locally-relevant version of the H tree.

      As I pointed out in the article and in the annotations to figure 1, there are, in Northern Iberia alone, many examples of similar clades, which is evidence that a localized interpretation, as the authors or you attempt, is totally wrong.

      Those mutations did not (nor could in any sensible molecular-clock-o-logic interpretation) accumulate that fast: I estimate that each mtDNA c.r. mutation takes an average of c. 2,500 years to coalesce, rather more in the case of H, which has shorter stems. So in all that time you can expect maybe one mutation to happen per branch at most, probably zero in most branches. The alternative amounts to saying that the HV-H transition (2 c.r. mutations) was frozen for 40,000 years and then suddenly all those mutations happened and accumulated in just a few thousand years, maybe centuries only.

      And never mind what happened elsewhere, for example in Iberia, where we see at least some of those clades (H1b, H6, H3...) already present much earlier.

      "How can you be so sure of the antiquity of all these mutations and subgroups?"

      Because I look beyond the Saale-Elbe particular case. Nothing more nothing less. I am not sure about many of the lineages but some of them were present elsewhere in times contemporary with the earliest LBK or much older (Magdalenian, Sunghir, etc.)

      Also, there's no archaeological way that those lineages would have spread from LBK (or post-LBK) to all of Europe, West and East equally. You need pan-European processes there and LBK was regionally restricted. As I said before: only Aurignacian and Gravettian fulfill the requirements.


    2. ...

      "Obviously, H and H1 and several others had ancient and widespread pools of large population. However, something like H88 or H89 may very well be a (more or less) contemporaneous, local mutation".

      I'm not really concerned about H88 or H89, they don't seem too important in the overall discussion. Still H89 is three c.r. mutations away from H-root and in my MC book that means some 15,000 estimated years of evolution maybe, at least 8,000, too long for the mutation to have happened locally from basal H.

      In other words: I can take up to one mutation in all that period as possible (not demonstrated) local evolution but not the many that accumulate in the long branches, much less if the same lineages are found elsewhere in that same or an earlier time-frame (as is the case).

      I'm trying to look at the whole picture, not just this particular snapshot. I'm trying to see the forest and not just this particular tree.
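
      The clock arithmetic behind these figures can be sketched as follows. This is only a back-of-the-envelope illustration: the rate of c. 2,500 years per c.r. mutation is my own rough estimate stated above, not a published calibration, the function names are made up for the example, and the 3,250-year window is simply the span between the ~5450 BC and ~2200 BC dates given in the paper's abstract.

      ```python
      # Rough molecular-clock arithmetic; the rate is the commenter's own estimate.
      YEARS_PER_CR_MUTATION = 2500  # average years per c.r. mutation (assumed)

      def min_age_years(cr_mutations, years_per_mutation=YEARS_PER_CR_MUTATION):
          """Lower-bound age of a branch with the given number of c.r. mutations."""
          return cr_mutations * years_per_mutation

      def expected_mutations(window_years, years_per_mutation=YEARS_PER_CR_MUTATION):
          """Average number of c.r. mutations expected per branch in a time window."""
          return window_years / years_per_mutation

      print(min_age_years(3))                 # H89, three c.r. mutations from H root: 7500
      print(expected_mutations(5450 - 2200))  # Early Neolithic to Bronze Age: 1.3
      ```

      The first figure is the floor behind "at least 8,000" years; the second is why I accept about one mutation per branch as possible local evolution, but not three or more.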

      "I am not questioning whether there is a strong general connection between early BB and Portugal, which there unquestionably is. I am just debating whether this paper shows that there is such a connection in the presented Elbe-Saale genetic data, which I dispute".

      Well, I do not know if the Portuguese connection is the real explanation or if we should try to figure out other options (Austria?). The H3 of Salzmünde is quite striking, but being just two samples it could be a fluke (then again, they are not the same haplotype, so certainly not close relatives, which would indeed be a huge coincidence), and the 88% H of BB can theoretically have other sources. What I underline is that we know very few populations of those times with such high levels of H: Neolithic Portugal is one, Austrian LBK might be another (just one sample, not enough data) and there could also be others not yet studied... but it's almost certainly not local (too little H in German LBK) nor Eastern European (too little H in Eastern Europe today, except for some Finnic and Caucasian populations, and the CW H6 is clearly the Western variant).


    3. ...

      The Roostalu paper you mention and the Ian Logan graph seem to support your notion of H13 being related to Caucasus (+Italy+unknown).

      But regardless, the European sample used there is from Loogväli 2004, which is essentially Eastern European with a single Western exception: France. That way they are able to evaluate that haplogroups common in France like H1, H3 or H7 are of apparent European origin, but with haplogroups like H4 or H6a, which are rare in France but common in Iberia (unsampled), they fail miserably.

      Overall I imagine H expanding from Europe, maybe the Balkans or the Danube, long ago. Just like Europe has West Asian influences, West Asia also has European influences (which can be tentatively tracked to some Paleolithic flows, for example, even via North Africa: Lower Egypt is shockingly "European" in autosomal data, not just West Asian + African).

      However these old studies, without enough Western European data, fail miserably at capturing the real diversity and local depth of H.

  16. Maju,

    Just to be clear, I generally share your conundrum of H-haplogroup sub-type antiquity and origin.

    H13 is as weird as or even more so than many others.

    Sometimes, it seems we just need to be patient.

    1. @Eurologist:

      "H13 below A13542G appears to be Caucasian, today. So while an exception and not local, it does not seem to relate to Iberia - quite the opposite." / "H13 is as weird as or even more so than many others."

      I think it's worth recalling that in Gamba et al. 2011, "Ancient DNA from an Early Neolithic Iberian population supports a pioneer colonization by first farmers", there was an mtDNA H20 (H20a IIRC) in the samples, a haplogroup which is mainly found in the Caucasus.


      "These haplogroups are both found in the Caucasus region.[15] H20 also appears at low levels in the Iberian Peninsula (less than 1%), Arabian Peninsula (1%) and Near East (2%)."

    2. @ Eurologist:

      In the recently published "Hughey et al. 2013" (mtDNA from Minoan Crete) they also found a mtDNA H13a1a.

  17. Wagg,

    Yes, I saw that. However, the one in question is much more specific: it is (presumably) earlier, yet downstream of H13a1a2. And Crete is still the far SE, in Europe.

