May 31, 2013

Italian complex ancestry

This paper is probably the most detailed study of the haploid genetics of Italy to date, considering both Y-DNA and mtDNA.

Alessio Boattini, Begoña Martínez Cruz et al., Uniparental Markers in Italy Reveal a Sex-Biased Genetic Structure and Different Historical Strata. PLoS ONE 2013. Open access [doi:10.1371/journal.pone.0065441]

The study contains very ample data for both uniparental lineages and confirms that the origins of Italians are very complex. However, their conclusions on the alleged sex bias rest entirely on the very unreliable "molecular clock" methodology, which I will ignore in this review, focusing instead on regional affinities and similar groupings.


After toying a bit with table S1 for easier visualization, I took the following snapshot:

NW (I): Piedmont, Liguria, Lombardy
NE (II): Veneto, Friuli-Venezia Giulia
BOL (III): Bologna (or Emilia-Romagna if you dare to generalize from a single sampling point)
TUS (IV): Tuscany
C (Central, V): Lazio, Umbria, Marche
S (South, VI): Campania, Basilicata, Apulia, Abruzzi, Molise
SIC (VII): Sicily
SAR (VIII): Sardinia

I replaced the cryptic Roman numerals with region names. Frequencies are highlighted if >2.5% overall or >5% regionally. All the rest is the same.

In order to more easily visualize the data, I made the following synthesis:

Labels for R1b follow my previous analysis of Myres 2010 (quick map link).

Most Italian R1b (27% of all patrilineal ancestry) belongs to the Southwestern clade, dominant (within R1b) in Iberia, France, Switzerland, Ireland... and Italy, and also very important in Great Britain, West and Southern Germany and Scandinavia. In Italy (as in Switzerland and Croatia), this clade is dominated by R1b-U152 (Alpine clade, sometimes also dubbed "Celtic"), which is also common in France and other places. Much less important is the "Irish" clade R1b-L21 (again common in France, as well as in Great Britain), which however has a notable peak in Bologna (10%). The presence of the Pyrenean clade R1b-SRY2627 is rather anecdotal (somewhat more common in NW and Sardinia). This grouping clearly has its strongest influence (almost 50%) in the Northwestern arc (NW, Bologna and Tuscany), with much lower frequencies elsewhere. This distribution does not look too "Celtic" to my eyes, I must say.

Second in importance within R1b is what I labeled as "Euro-root", most of which (6.9% of all patrilineages) belongs to R1b-M269(xP311). This paragroup connects more clearly with the Balkans and maybe West Asia, and is (coherently) somewhat more common in the South and less so in the NW.

Other R1b variants, which are likely to be mostly R1b-V88, are rare except to some extent (3.7%) in Sardinia, where this haplogroup was first identified. 

The allegedly Indoeuropean haplogroup R1a1a displays a very strange pattern for such attribution, being completely absent in the Northeast (NE, BOL), where we would have expected it to be common, as it is for example in nearby Slovenia. Instead the greatest frequencies are in the South and Center of Italy, which suggests that there is still a lot to understand about the origin and dispersal of this lineage.

Also notable is the presence of I(xI2a), which I labeled "other NE European", although maybe "North, Eastern and SE European" would have been more correct. Within it, the allegedly "Nordic" haplogroup I1 (very common in Sweden) reaches c. 10% in NE Italy (NE, Bologna), again raising questions about the origin of this lineage as well as of all I (which I tend to consider of Ukrainian/Romanian Paleolithic origin).

The other half of the Italian Y-DNA should be of Eastern Mediterranean origin, whether in West Asia or the Balkans. I have divided this group into two categories: on one side what I label "Cardium Neolithic", all three haplogroups being attested in ancient DNA of this culture in Mediterranean Iberia/France, and on the other the rest, which is not attested but should also have arrived from the same broader region, either in the Neolithic wave or in later ones (Bronze, etc.).

All three "Cardium Neolithic" clades are well represented in Italy, the most notable being G2a (11.1%), followed by E1b-V13 (7.8%) and then I2a (only 4.1% overall but a bulging 39% in Sardinia, which also has the greatest I2b share: 2.4%). The most plausible origins of these three Neolithic lineages are respectively Anatolia (G2a), Greece-Albania (E1b-V13) and the former Yugoslavian Adriatic regions (I2). Italy surely acted as a springboard for their expansion westward some 7500 years ago.

The "Other West Asian" category includes all other E1b-M78, E1b-M123 (both with ultimate origins in NE Africa but arriving in Europe almost necessarily via West Asia and the Southern Balkans), other G, as well as all J, L and T. The most notable of these lineages is J2a (11.4%, with strongest impact in Sicily, Central and NE Italy), followed by E1b-M123, which made an impact especially in Sardinia (6.1%) and L (major in NE Italy: 8.2%). They may all be localized Neolithic founder effects, but that is uncertain. Of this group only J2 (J2a?) made some impact further West, reaching >5% in some parts of Iberia.

Overall, African lineages (the rest of E) seem to have had their most notable impact in Sicily (6.4% overall); however, the characteristic NW African E1b-M81 also left some mark in Bologna (3.4%).

The rare F* also deserves some mention: it has a rather Northern distribution in Italy, quite similar to that of R1b-SW.

Figure 1. Spatial Principal Component Analysis (sPCA) based on frequencies of Y-chromosome haplogroups.
The first two global components, sPC1 (a) and sPC2 (b), are depicted. Positive values are represented by black squares; negative values are represented by white squares; the size of the square is proportional to the absolute value of sPC scores.

Mitochondrial DNA

Table S7, which neatly displays the mtDNA data, is too large and detailed for a snapshot. The most notable lineages, in any case, are the following:
  • HV*: 4.1% (notable in NW: 6.8%)
  • H*: 11.1% (widely distributed)
  • H1*: 10.4% (common except in NE, highest in Sardinia: 18.6%)
  • H1a (5.7% in Bologna)
  • H2 (7.7% in Tuscany)
  • H3: 3.9% (10% in Sardinia, 8.6% in Bologna)
  • H5: 4.3% (more notable in NW, Tuscany, Center)
  • T1a: 3.4% (9.3% in NE)
  • T2b: 3.4% (8.6% in Sardinia)
  • J1c: 3.9% (6.2% in NW, 14.3% in Bologna)
  • J2a (5.1% in Sicily)
  • J2b (7.1% in Sardinia)
  • U5a: 3.7% (most important in Central region, NE and Bologna)
  • U5b (7.1% in Sardinia)
  • K1a: 4.4% (most important in NE, Bologna, Tuscany and Center)

I also attempted a synthesis here, although some may disagree with my labels (I'm a bit in doubt myself in some particular cases, admittedly):

Let me explain the reasoning behind the labels and groupings:
  • Paleo 1 corresponds to what some extremists consider the only valid Paleolithic lineages in Europe, i.e. those sequenced in Central and Eastern European "foragers" (excluding Sunghir's H17'27). I'm particularly uncertain about U8b: U8 has been sequenced in Paleolithic Europeans but U8b is closest to K and both are found also in West Asia.
  • Paleo 2 corresponds to the lineages that appear to spread, at least partly, from SW Europe, some of which (H6, H1b, H*) have been sequenced among pre-Neolithic hunter-gatherers.
  • Paleo/Neo is a category of lineages I am uncertain about: 
    • HV* has been sequenced in Italian foragers but some of it may also have arrived with the Neolithic.
    • V appears to have similar origins to the SW European H lineages but it has only been sequenced in aDNA since Neolithic, so... 
    • Other H: I was simply unwilling to ponder each of the many small lineages' possible origins.
  • Neo is the category of lineages most likely of Neolithic or post-Neolithic arrival. I have doubts especially about K, which is first sequenced in aDNA in Neolithic Syria/Kurdistan and clearly spread within Neolithic flows; however, its phylogenetic connection with U8 makes me unsure about its ultimate origins and flows.
  • Exotic includes those clades of quite clear origin outside West Eurasia/Mediterranean basin (mostly Siberian lineages): they are quite rare even considered together*.
  • The categories in italics are just groupings of the previous ones, as per description.

One of the aims of these groupings was to check whether the molecular-clock-o-logical claims of the paper made any sense. It seems not. Italian mtDNA, like the Y-DNA, seems split about half and half between likely Paleolithic European clades (of possible post-Paleolithic arrival to Italy in many cases) and likely Neolithic ones. Regional variation does exist but it's not too remarkable. For example, if we take the Neo row, it seems that the South of the Peninsula (S) was a bit more influenced by Neolithic or post-Neolithic flows, but the difference from the least influenced area (NW) is just some 12 percentage points. This pattern is mirrored in reverse by the Paleo 1+2 row.

However, if we take the Paleo 1 row, we see a pattern which does not seem consistent with Paleolithic continuity, at least to my eyes, with the highest frequency in the NE (open to migrations from the Balkans and Central Europe), followed by the Central region and Sardinia. It rather seems to correspond, at least in part, to migrations from those regions: the Balkans and Central Europe.

But, as always, your take.

Figure 3. Spatial Principal Component Analysis (sPCA) based on frequencies of mtDNA haplogroups.
The first two global components sPC1 (a) and sPC2 (b) are depicted. Positive values are represented by black squares; negative values are represented by white squares; the size of the square is proportional to the absolute value of sPC scores.

* On second thought (mini-update), the overall frequencies of "Siberian" lineages are not so negligible in two regions: Sicily and Central Italy, where they amount to >3% taken together. I'm wondering if this may be symptomatic of the Roman slave trade, which is known to have had Eastern Europe as its main source of slaves after Rome's consolidation as an Empire (also in the Middle Ages).


  1. Replies
    1. Never heard of the famed "Italans"? ;-)

      Fixed thanks.

    2. There were a few other typos, hopefully I corrected them all now.

  2. Replies
    1. R1b-M269(xU106,S116), see map, where it's labeled as "ancestral clade" and "transitional clade". In the case of Italy it's mostly R1b-M269(xL23). It basically does not seem to have a Western origin (but a Balkan or West Asian one) but is not well studied phylogenetically, so it probably hides some interesting substructure in the area of its presence.

    2. I'm still waiting for more studies on R1b-Euro-root. The last thing I saw, besides Myres, was Morelli 2010, who appeared to show with STR haplotypes that Anatolian R1b-M269 is essentially a single sublineage with apparently Balkan roots. In fact you can also see that in Balaresque 2009 if you can ignore her speculations and dissect her data calmly.

      So "root" here just means "near the root" relative to Western European clades. It's not the root as such.

    3. You might not love Family Tree references, but, anyway, the Scandinavia R project shows many R1b1a2 matches (almost half of all R1b), and it seems to me that it is mainly M269(xL23), as the western European mutation types seem to be indicated. More than half of the Finnish R1b is also marked only with M269, so it also seems to fall in this category.

    4. That does not match Myres' 2010 data, which has zero M269(xL23) in Scandinavia and Finland. That paragroup (when tested for SNPs) is only found at relatively high frequencies within R1b between the Balkans and Iran (which is the most likely area of origin of R1b-M269 overall, with the Balkans emphasized maybe - Morelli 2010).

      I have no idea why FTDNA produces those "results" but it may well be because STR-based haplotypes are only somewhat informative and can well be misleading in many cases.

    5. PS- other areas with less numerically important R1b-M269 are Central Europe, Italy and SW Europe.

    6. Thank you for this Myres paper! I opened it and found out that Finns are not included in the study and Swedes are only Malmö Swedes, but I agree that Finnish frequencies should be similar to Estonian and Karelian frequencies. In this Myres study, in Northern areas, M269(xL23) is found only in Northern Russians with a frequency of 0,0079, which is similar to the Italian frequency of 0,008. The Poles have a frequency of 0,024 and the Bashkirs 0,025, which is comparable to the maximum German frequency of 0,053.

      However, L23(xM412) is present and it seems to be more frequent in Denmark and Sweden (0,043) than in Spain or France. Also Estonians and Karelians have a low frequency of L23(xM412) of 0,005 and Northern Russians a clearly higher frequency of 0,0236 and Komis even 0,115. On the basis of this, I would expect this Finnish R1b to be, apart from these known western European types, in small part M269(xL23) and possibly a yet undefined European clade, but it is not impossible that there are sporadic occurrences of this Northern Russian M269(xL23) in Scandinavia.
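The frequencies in this comment are written as fractions with a decimal comma. As a reading aid only, here is a minimal Python sketch that converts some of them to percentages. The helper name `parse_freq` and the dictionary labels are made up for illustration; the numbers are copied as quoted in the comment, not re-checked against the Myres tables.

```python
def parse_freq(text: str) -> float:
    """Parse a fraction written with a decimal comma, e.g. '0,0079' -> 0.0079."""
    return float(text.replace(",", "."))

# M269(xL23) fractions exactly as quoted in the comment above.
quoted = {
    "Northern Russians": "0,0079",
    "Italians": "0,008",
    "Poles": "0,024",
    "Bashkirs": "0,025",
    "Germans (maximum)": "0,053",
}

# Convert each fraction to a percentage, rounded to two decimals.
percent = {pop: round(parse_freq(v) * 100, 2) for pop, v in quoted.items()}

for pop, pct in percent.items():
    print(f"{pop}: {pct}%")
```

Read this way, the Northern Russian and Italian values are both below 1%, while the Polish and Bashkir values are around 2.5%.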

    7. Oops, you're right: no Finns in the data but Estonians. I just quickly looked at the reference map and mistook one for the other.

      "However, L23(xM412) is present and it seems to be more frequent in Denmark and Sweden (0,043) than in Spain or France".

      In relation to the overall R1b-M269 yes, however R1b is more common in the latter than in Northern Europe. Not sure which is your point anyhow, it's obvious that R1b-M269 did not originate in Northern Europe in any case.

      As I pointed out in my original review, it's quite apparent (map) that R1b-M269 must have coalesced towards West Asia (maybe in the Balkans) with spread to Central Europe and Italy followed by founder effects cum expansion in two areas: (1) the Franco-Cantabrian region (R1b-S116) and (2) NW Europe, possibly Doggerland (R1b-U106). The timing of these expansions is debatable but the geography of the latter two suggests a Late UP process (otherwise it's nearly impossible to fit France or the FCR as origin for the main subhaplogroup S116, and there's no alternative scenario to that origin in Southern France for it).

    8. I am not at all claiming a northern origin for R1b, but if R1b-M269 coalesced in the Balkans, there must have been a flow of M269(xL23) to Northern Russia through Ukraine and excluding Belarus. This might then be a Neolithic/Palaeolithic, pre-Indo-European event. The other possibility is that R1b-M269 coalesced in West Asia and spread from there to Northern Russia. I might think that this is a more plausible route and the fact that M269(xL23) is more frequent in Northern Russia than in Ukraine and seems to be absent in South/Central Russia and Belarus could be due to the expansion of Indo-European R1a. Do you prefer a late Indo-European flow from the Balkans?

    9. Sorry, there was an error in my reply at 8:35 AM. I did not mean M269(xL23), but L23(xM412), so
      "On the basis of this, I would expect this Finnish R1b be, apart from these known western European types, in small part L23(xM412) ..."

    10. R1b-M269 in NE Europe only seems notable in the "Circum-Uralic region", almost only among Bashkirs (0-74% of all patrilineages, depending on locality). They also have M269(xL23) at higher frequencies than any other nearby population (2.5% vs <1% in West Ukraine and Northern Russians and 0% elsewhere), so the source is probably the Bashkirs.

      The main sublineage among Bashkirs is anyhow R1b-U152 (Alpine clade, 71% in SW Bashkirs) followed by the "ancestral clade 2" L23(xM412) (3-32%), which may give us a clue on their ultimate origin, which could well be where these two clades are found at similar relative proportions, such as Italy or Croatia (rather the latter in order to incorporate more easily the traces of M269(xL23)).

      How and when did they arrive in Bashkortostan? No idea, sincerely (although I'd speculate on some undocumented Neolithic migration first of all) but it would seem a very peculiar exceptional case: a founder effect specific to that people and almost nowhere else in the neighborhood (except minor influence of Bashkir ancestry in Northern Russians possibly).

      "... could be due to the expansion of Indo-European R1a".

      I would not think so, because it would imply a nearly total eradication of R1b in the area. I rather imagine that R1b was never important in Eastern Europe to begin with and that the Bashkir exception is just that: a peculiar anomaly that asks for an explanation we can only speculate about right now.

      "Do you prefer a late Indo-European flow from the Balkans?"

      No. I don't identify IEs with any expansion from the Balkans and I don't see any expansion from the Balkans into Eastern Europe at any time in Prehistory (nor History), at least since Gravettian times. Eastern Europe must have remained pretty much on its own with limited exceptions (Epipaleolithic expansion in the area around Lithuania, Corded Ware expansion) since Gravettian times, at least regarding other parts of Europe.

      I must insist that this seems just a peculiar anomaly in the proto-Bashkir ethnogenesis and nothing else.

    11. PS- One interesting piece of information circulating about Bashkir prehistory is the presence of stone circles, some comparable in size to Stonehenge, some 5000 years ago. I wonder if they may be a clue. They seem related to Abashevo (← Corded Ware) but might have been brought by some peculiar subpopulation, maybe a dissident exiled tribe of some sort. Stone circles certainly are not expected within Corded Ware nor other Kurgan-derived cultures, so something peculiar is going on here.

    12. My comment "could be due to the expansion of Indo-European R1a" would be possible if M269(xL23) had spread to Russia at an early date before later European R1b clades arrived, but this may, in fact, not be the case.

      I would not say that Bashkirs are the whole story of R1b in Russia, as they have M269(xL23), L23(xM412) and U152, but their main clade U152 is not found in Chuvashes, Komis, Udmurts, Kazan or Bashkortostan Tatars or even Northern Russians, although they all share L23(xM412). Moreover, the frequency of another R1b clade, U106(xU198), in Tatars is 0,025 and in the Chuvashes 0,009 but at maximum 0,003 in Bashkirs. R1b clades in the Bashkirs, Northern Russians and Circum-Uralic people are not symmetric. Circum-Uralic people tend to have just those subclades that are missing in the Bashkirs (or which they have clearly less of). Circum-Uralic people have S116*(xM529xU152) and U106(xU198), and Bashkirs have U152.

      Italians seem to harbour all the above mentioned R1b subclades and in frequencies that are higher than in Russia. It is interesting that also Poles have the same clades that are found in Russia. It is true that M269(xL23) in Northern Russians could have originated in the Bashkirs. It looks as if M269(xL23) and L23(xM412) spread to the Ural area from Poland (originally from Italy?), possibly through Belarus. This same wave might also have reached Baraba Steppe during the late Bronze Age, c. 1000 BC. Instead, U106(xU198) and S116*(xM529xU152) might have arrived in the Ural area and Ukraine with the Viking traders. Bashkir U152 is strange because it is found in Ukraine (West, Central and East), Central Russia, Poland and even in Karelia and Estonia, but not in Chuvashes, Komis, Udmurts, Kazan or Bashkortostan Tatars, Belarussians, Southern or Northern Russians.

      So, I did some googling. The distribution of different clades in Bashkirs is very heterogeneous:
      Group                  n*    M73   M269(xL23)   L23(xM412)   U152
      South-East Bashkirs    329   77    8            106          2
      West Bashkirs          54    0     0            0            0
      South Bashkirs         79    0     2            9            1
      North Bashkirs         70    1     0            2            50
      South-West Bashkirs    51    1     0            0            0
      Total                  586   79    10           126          53
      The amount of U152 is bloated only in North Bashkirs and only South-East Bashkirs have high amounts of M73 and L23(xM412).
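To make the heterogeneity easier to see, the counts above can be turned into per-group frequencies. The following is a minimal Python sketch, assuming that the n* column is the number of men sampled in each group (the comment does not say so explicitly). The counts are copied exactly as given in the comment, the Total row is left out, and the names `COLUMNS`, `counts` and `frequencies` are only illustrative.

```python
COLUMNS = ["M73", "M269(xL23)", "L23(xM412)", "U152"]

# group -> (n, counts per clade), copied from the table in the comment.
# Assumption: n is the sample size of each Bashkir group.
counts = {
    "South-East Bashkirs": (329, [77, 8, 106, 2]),
    "West Bashkirs": (54, [0, 0, 0, 0]),
    "South Bashkirs": (79, [0, 2, 9, 1]),
    "North Bashkirs": (70, [1, 0, 2, 50]),
    "South-West Bashkirs": (51, [1, 0, 0, 0]),
}

def frequencies(n, row):
    """Return clade -> fraction of the group's sample."""
    return {clade: k / n for clade, k in zip(COLUMNS, row)}

freqs = {group: frequencies(n, row) for group, (n, row) in counts.items()}

for group, f in freqs.items():
    cells = ", ".join(f"{clade} {100 * v:.1f}%" for clade, v in f.items())
    print(f"{group}: {cells}")
```

With these counts, U152 is about 71% in the group labeled North Bashkirs and under 2% in every other group, while M73 and L23(xM412) are concentrated in the South-East group.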

      Then I found this comment by Bahadir (no idea who he is, but he seems to be a Turkish speaker), with a Google translation: "(They found) in the tribe Gaina, who lives in the Perm region of Russia "Western European" subclade R1b-U152. This tribe before the migration to the South Urals (8-9 century) and the occurrence of the Bashkirs lived in the steppes near the Black and Azov Seas. Presumably, Gaina tribe - the descendants of the Goths. This is indicated by the name of the tribe "Gaina". This ethnonym was distributed among the Goths. So famous Gothic commander Gainas, who lived in the 5th century AD."

      So, could this Bashkir U152 have a Gothic origin?

    13. In Finnish, “Russia” is “Venäjä” and it comes from Winedas, probably the same as Venethi of Tacitus, and was originally referred to Corded Ware people from Poland. Google gives the following translation for Veneti (Italian Wikipedia): “In the Bronze Age between 1350 and 1150. a.C. terramaricoli the villages of the low plains of Veneto come in large commercial circuits involving and coasts of the Baltic, the Danube-Carpathian area, the Aegean and the Eastern Mediterranean.”

      The 8 most frequent mtdna haplogroups of the Slovenians and Poles are the following:
      H, J, U5a, pre-V, U4, T, W, K
      H, T, J, U5a, U4, pre-V, W, K, U5b
      Baraba late Bronze Age Western haplogroups are U5a, T1, and Chicha haplogroups J, H, U5b, 3xU4, U1a, 3xK, U3, W, H6a1. In comparison, new Western haplogroups of Mansi people (different from Eastern and Ust Tartas haplogroups) are V (also found in Finland), H1b (frequent in Lithuania), H*, H2 (frequent in Finnic people and in Eastern Slavs), H3 (quite frequent around the Baltic Sea), J1b1 (frequent in Estonia), T1 (quite frequent in Lithuania), T2 (very frequent around the Baltic Sea, even 11 in Estonia and 7.5 in Poland), N1a (frequent in LBK).

      On the basis of this, this ancient trading area stretching from the Adriatic Sea to the Balticum and all the way to the Ural might really be the source of M269(xL23) and L23(xM412) in Siberia. I am still wondering about the ergative in Hanti. If this Baltic Corded Ware Indo-European language cannot be of ergative type, it must be a feature of the language of ydna Q carrying people, for example Ket language, although hg Q is not even noted in my Mansi ydna chart.

      In this model, the stone circles should go back in time at least to the split between M73 and M269.

    14. For the sake of completeness, here you have the most frequent mtdna haplogroups of the Bashkirs and the Chuvash:
      Bashkirs: U5 (14%), U4, H, C (12%), D, F, G, T1, J, HV0 (3%)
      Chuvash: H (27%), U4, U5, HV0, K (7%), J, D, T1 (3.6%)

    15. One of the issues about Bashkir R1b-M269 is that U152 is only "their main clade" for SW Bashkirs, while the other Bashkir populations have a very small fraction or nothing at all of it. So maybe my earlier idea of a "Croatian" link was wrong to begin with, anyhow.

      Reformulating: Bashkirs have plenty of R1b-M269, including both "ancestral" clade layers, as well as U152 (Alpine clade), but the Alpine clade is not as overwhelming as we first thought (except in the SW: a local founder effect, no doubt). Therefore a possible origin could be around the Middle Danube in a context related to Corded Ware (because of the unusual Abashevo relation of stone circles in Bashkortostan), because that area has those clades in roughly the right proportions (but also others absent or low among Bashkirs).

      On U106 you're probably right: it does not seem Bashkir-centered. But it is anyhow in all cases very small (2.6% at most) and could originate in Russia or the Baltic. So it may have a different origin but, well, having such a tenuous presence in general... who cares?

      ... "also Poles have the same clades that are found in Russia".

      Not really: Russians and Central Ukrainians have some "notable" (for their low levels of R1b) R1b-S116*, which is absent in Poland.

      "The distribution of different clades in Bashkirs is very heterogeneous"...

      Yes. Well, somewhat heterogeneous. They share in most cases high levels of R1b and, within it, a relatively high presence of said "ancestral" clades. The main exception here is the Western Bashkirs, who actually lack any R1b whatsoever.

      "So, could this Bashkir U152 have a Gothic origin?"

      I don't think the Goths had any major genetic impact anywhere. At least not in Spain nor Italy. Also Goths should have a Swedish genetic pool of some sort (with lots of I1 and R1a and little R1b), which does not correlate well here.

      "In Finnish, “Russia” is “Venäjä” and it comes from Winedas, probably the same as Venethi of Tacitus, and was originally referred to Corded Ware people from Poland".

      Actually an Iron Age people from Lithuania and Eastern Prussia. Roman sources are very clear about the Veneti living East of the Vistula, maybe Baltic Peoples of some sort. However in the supine ignorance and sloppiness of the early Middle Ages, often Roman geographical and ethnic names were displaced or confused, examples are Aquitania, Cantabria, Vascones, etc. Even Goths and Getae were confused. So Veneti or Vends became a loose synonym for Slavs in Germany - probably from Germany the ethnonym migrated to Scandinavia and from there to Finland.

      There were some other peoples named Veneti in antiquity but they do not seem related at all: the Veneti of NE Italy, of Italo-Celtic affinity, and the Veneti of what is now the Western part of Brittany, of unknown ethnicity. There's speculation on a common Indoeuropean root for all the names akin to "win" (i.e. the victorious ones) but who knows!

      I refuse to speculate about any correlation with the Baraba region, of which we only know mtDNA, which, considering the relative European homogeneity on this matter, can eventually be linked to anywhere you wish.


    16. ...

      "I am still wondering about the ergative in Hanti."

      No idea what you mean: what is "Hanti", and what kind of ergative do you mean?

      If you mean ergative-absolutive languages the Eurasian list seems very limited:

      1. Basque
      2. Caucasus and Zagros area languages (all three Caucasian families, some Iranian languages like Kurdish, Hurro-Urartian and Sumerian).
      3. Tibetan, Eskimo-Aleut.

      I don't see anything about Ket being ergative-absolutive, nor does that seem to be the case with its relatives of the Na-Dené family.

      "In this model, the stone circles should go back in time at least to the split between M73 and M269".

      Uh? Not in my book: M269 diversity (R1b in that graph) seems to span to c. 25-30 Ka BP, so the M73 split must be older - but not older than 40-50 Ka BP, which is the attributed age for the R1a/R1b split. Anyhow I'm thinking of pushing those dates a bit into the past because I should have better calibrated the F node (and not CF) to c. 80 Ka BP.

    17. I am just wondering why Khanty (in Finnish Hanti) language has ergative. And sorry, it is not Ket but Chukchi language that is of ergative-absolutive type (and of course all the languages on your list) and Chukchi have 13% of hg Q and 21% of M45(xM17).

      With this split I refer to the Malyarchuk et al. paper, in which they say that the coalescence age of M73 in Siberia is 18 Ka (on the basis of the comments I received I had the impression that 18 Ka is too old to be true), but the age of subclusters A and B is only 4.4 Ka and 5.6 Ka. And there was this back migration of these subclusters from Siberia to Europe and the Caucasus, and I thought that the age of the subclusters above is not that far from the age of Stonehenge. However, it is true that my wording was wrong.

    18. According to the Myres paper, S116*(xM529xU152) is not absent in Poland, it is found in Southwest (Wroclaw) Poles with a frequency of 0,022. Three out of five Bashkir groups lack M269(xL23), three out of five Bashkirs groups share M73 and four out of five groups share L23(xM412).
      I just thought that the big presence of U152 in one small Bashkir group could be explained with a local event such as the presence of Goths, but it is true that only one Bashkir group (Zapadnovo Orenburja - Western group) has a small amount, 2%, of I*-M170(P37), and they do not have U152.
      So, we are still looking for an explanation for U152 in Bashkirs. However, the Ukrainians have a frequency of 4.8% of I1a and 0.5% of I1c (old nomenclature).

    19. "... 18Ka is too old to be true"...

      Probably too recent IMO but I don't have enough data to be sure about this particular. There's nothing like a "genetic radiocarbon" anyhow, so we should never think in terms of time but of structure (phylogeny and geography). Time is not told by the genetic data directly (at least not at the present level of knowledge).

      ... "we are still looking for an explanation for U152 in Bashkirs"...

      It is too small and irrelevant to matter? Wasting neurons on peripheral noise does not improve understanding.

    20. I am still thinking about this ergative-absolutive pattern of Khanty language, and it must be linked to metallurgy and this considerable share of R1b in them. I checked the chronology of bronze metallurgy:
      3700-2500 BC Maikop
      c. 3300 BC Indus Valley
      3150 BC Egypt
      3100-2700 BC Majiayao, China
      2000-1500 BC Seima Turbino

      In addition to their Western M269, Bashkirs seem to have a strong Siberian component:

      Zapadnovo Orenburja (I wonder if this corresponds to Bashkirs South-east in the Myres paper)
      R-SRY10831.2 40%
      R-M269 23%
      C-M48 12%
      N-Tat 7%
      E-M35 7%
      C-M130(xM48) 5%

      Vostochnovo Orenburja (Eastern Orenburja)
      N-Tat 65%
      R-SRY10831.2 18%
      R-M269 9%
      O-M175 6%
      I-P37 2%

      Permskije (I wonder if this corresponds to Bashkirs North in the Myres paper)
      R-M269 84%
      R-SRY10831.2 9%
      N-Tat 3%
      R-M73 2%
      G-P15 2%
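A quick sanity check on the three lists above is to sum the listed percentages per group and see how much is left unlisted. This is a minimal Python sketch with the values copied from the comment; whether the lists are meant to be complete is not stated, so any remainder is simply reported as "unlisted". The names `listed`, `totals` and `unlisted` are illustrative.

```python
# Percentages exactly as listed in the comment above.
listed = {
    "Zapadnovo Orenburja": {
        "R-SRY10831.2": 40, "R-M269": 23, "C-M48": 12,
        "N-Tat": 7, "E-M35": 7, "C-M130(xM48)": 5,
    },
    "Vostochnovo Orenburja": {
        "N-Tat": 65, "R-SRY10831.2": 18, "R-M269": 9,
        "O-M175": 6, "I-P37": 2,
    },
    "Permskije": {
        "R-M269": 84, "R-SRY10831.2": 9, "N-Tat": 3,
        "R-M73": 2, "G-P15": 2,
    },
}

# Sum of the listed haplogroups and the share not accounted for.
totals = {group: sum(hgs.values()) for group, hgs in listed.items()}
unlisted = {group: 100 - total for group, total in totals.items()}

for group in listed:
    print(f"{group}: listed {totals[group]}%, unlisted {unlisted[group]}%")
```

The two last groups add up to 100%, while the Zapadnovo Orenburja list adds up to 94%, so some 6% of that sample is either in haplogroups not listed or lost to rounding.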

      Then it seems that, on one hand, Kabardians share M73 subcluster b with the Bashkirs and, on the other hand, M73 subcluster c is found in Balkars and Megrels (Mingrelians?), and these haplotypes must have come from Siberia. I think now that Bashkirs did not speak a Turkic language c. 1000 BC but a Caucasian type of language with an ergative-accusative pattern, and changed their language to a Turkic language with the growing power of Turkic and Mongolic speaking groups, including the Mongols, and that Khanty has a substrate from this earlier type of language.

      Judging from their ydna composition there must have been close contacts between Bashkirs and Uralic people and it seems to me that they were very much involved in this metallurgy business of the Ural area that became closely connected with the Baltic area. Wikipedia says about Fatyanovo–Balanovo culture that Fatyanovo migrations correspond to regions with hydronyms of a Baltic language dialect mapped by linguists as far as the Oka river and the upper Volga. Spreading eastward down the Volga they discovered the copper ores of the western Ural foothills. The Balanovo culture occupied the region of the Kama–Vyatka–Vetluga interfluves where metal resources of the region were exploited. It does not seem to represent a northern extension of the Indo-European Yamna culture horizon further south.

      It is interesting that the Maikop people may also have spoken an ergative language. I think that their language was of Hurro-Urartian type, i.e. ergative and agglutinative language. However, I do not know if this has anything to do with ydna R1b. Do you think that this putative ergative language of the Bashkirs could be related to Hurro-Urartian type of languages?

    21. Finally I found something about "ergativeness" in Khanty: it is not an ergative-absolutive language but an ergative-accusative or tripartite language, which is a different kind of grammar. Ergative-accusative languages are Khanty, Ainu, Nez-Percé (not part of Na-Dené-Yenisean but of the Plateau-Penutian family) and at least two Australian Aboriginal languages.

      ... "the Maikop people may also have spoken an ergative language".

      IMO Indoeuropean (ancestral to Anatolian?). Anyhow, Caucasian languages (all of them) are not ergative-accusative but ergative-absolutive.

      "The Balanovo culture occupied the region of the Kama–Vyatka–Vetluga interfluves where metal resources of the region were exploited. It does not seem to represent a northern extension of the Indo-European Yamna culture horizon further south".

      Fatyanovo-Balanovo is generally considered a Kurgan derivative. They are not Yamna-derived however but from Corded Ware instead.

      From Wikipedia: "The Fatyanovo–Balanovo culture, 3200 BC-2300 BC, is an eastern extension of the Corded Ware culture into Russia".

      In my understanding Yamna would be proto-Indo-Iranian, while Corded Ware would be proto-Western Indoeuropean. Yamna is not the origin of Kurgans or PIE but only a branch of it. The origin of PIE is at Samara Valley culture. Yamna is too recent and derived to be the origin of IEs overall. It's just a major branch instead.

    22. This ergative-accusative pattern could, in fact, be the result of a mixture between two different systems, ergative-absolutive and nominative-accusative patterns.

      "Fatyanovo-Balanovo is generally considered a Kurgan derivative. They are not Yamna-derived however but from Corded Ware instead."

      Yes, indeed! The core area of Corded Ware was in Poland, cf. "Corded Ware ceramic forms in single graves develop earlier in Poland than in western and southern Central Europe." (Wikipedia) I would say that Fatyanovo people spoke a Baltic type of archaic Indo-European. Corded Ware Indo-European language must have been, as you said, proto-Western-Indoeuropean.

      In my view, during the Bronze Age, we have this thriving trade route from the Adriatic Sea up to the Baltic Sea and from there all the way to the Ural Mountains. I think that this is one plausible corridor for R1b to spread from Italy to the North-East.

    23. "This ergative-accusative pattern could, in fact, be the result of a mixture between two different systems, ergative-absolutive and nominative-accusative patterns".

      No idea: beats me.

      "I would say that Fatyanovo people spoke a Baltic type of archaic Indo-European".

      We do not understand well (at least I do not) the genesis of Balto-Slavic, and hence Baltic. It's clear that Balto-Slavic is derived from Western IE, as are Celtic, Germanic, Italic, etc. but it's also probable that it had many pre-historical branches that have not survived.

      So do you think that a direct line can be traced between Fatyanovo-Balanovo and the Bronze and Iron Age cultures at the origin of Balto-Slavic? If so, could you outline a synthesis of this process?

      Anyhow, I find it hard to imagine the spread of R1b to Bashkortostan merely by trade routes. It is too important and localized and I'd rather think of a founder effect colonization in the earliest Bronze Age (i.e. Fatyanovo-Balanovo with stone rings), possibly with a military design or product of a forced exile, or something like that.

    24. Not necessarily a direct line. I said only "Baltic type of archaic Indo-European", because they say that "Fatyanovo migrations correspond to regions with hydronyms of a Baltic language dialect". I cannot be more precise than that. However, it seems to me that Slavic languages developed in a more southern or southeastern location.

      In my view, in particular North Bashkirs should have a different origin. If these North Bashkirs (Bashkortostan, Russia) correspond to the Permskije Bashkirs of the Lobov paper, their mtDNA frequencies differ from the mtDNA of other groups in that they have less H and C, but more U5, F and Y and less T2 (zero). It is interesting that they have high frequencies of N1a and M1 compared to others!
      Paternally, as we already know, they do have some M73, they do not have M269(xL23) and they have clearly less L23(xM412) than South and South-West Bashkirs and a really extreme amount of U152. The presence of N1a and M1 might give us a clue of their origin. It is of course possible that, in general, Bashkir R1b has a different origin from that of Tatars, Northern Russians and Circum-Uralic people. For the latter, the origin could then be this Fatyanovo culture and for the Bashkirs the source could be somewhere Southwest of the Urals (Caucasus?), as indicated by the presence of M1 and N1a.

      R1a (R-SRY10831.2) is found in all Bashkir groups at quite similar frequencies as R1b, but R1b is lacking in one group, so R1a should also be considered a founder haplogroup.

    25. R1a or variants of it are supposed to have originated not much further West or SW than Bashkiria, in the Samara Valley, at least if we can still correlate them with Indoeuropean expansion. So their presence is not anomalous at all (unlike that of R1b-M269 at high frequencies).

      Anyhow you should consider that in all Northern Europe, agriculture was relatively poor before the Middle Ages because the heavy plow had not yet been invented and the production with old-style ("Roman") plows was limited. It was only since the introduction (from China?) of the heavy plow and horse collar in the Middle Ages that the agricultural productivity really exploded in such deep soils, allowing for dramatic demographic growth and the gradual shift in economic and political relevance from the Mediterranean to the North.

      With this I mean to emphasize that, even in relatively low latitudes, population density was relatively low in the Northern half of Europe before the High Middle Ages. And, naturally, it was even lower further North. This allowed for some dramatic founder effects and drift-caused long and narrow "bottlenecks" in the more marginal regions, as could be the Finno-Ugric belt or also the Bashkir mountain areas.

      It's possible that, say, a thousand men founded the Bashkir nation altogether (Y-DNA) in the early Bronze Age. I am not sure of the exact figure, it could be lower or larger but I would not be surprised if it was rather small.

  3. "However if we take the Paleo 1 row, we see a pattern which does not seem consistent with Paleolithic continuity, at least to my eyes, with the highest frequency in the NE (open to migrations from Balcans and Central Europe)"

    Just a general opinion - which may relate to the low countries population structure in the previous post also - but i think foragers are likely to have survived best in areas which provided relatively high forager population densities which were at the same time not particularly suitable to early agriculture. I'd suggest wetlands and deltas as examples of those kind of areas - lots of fish and game so a large healthy population of foragers but poor for farming (at that time).

    I don't know much about NE Italy apart from the piratical Veneti and later Venice - how swampy was that region?

    If so invaders coming from the Slovenia / Danube direction may have bypassed the swampy bit.

    1. What you say (in general) makes some sense but always within speculative terms, because the factors were no doubt many and agriculturalists in the Mediterranean do not seem to have any such radical advantage over foragers other than their long-distance ships (more fishers than agriculturalists in fact). And many, many (not all) Cardium Pottery sites display apparent continuity with local Epipaleolithic toolkits, which indicates that local foragers were massively incorporated into the Neolithic society and economy time and again in this area, in Italy as in Iberia (possibly helped by the relative semi-forager nature of Cardium-Impressed Neolithic).

      But whatever the case, NE Italy is not any "swamp" but a semi-flat corridor widely open to the Balkans and the Danubian central plain. Sure: there are/were swamps along the coast all the way to the Po Delta but that does not define the region except marginally (those areas were seldom inhabited other than by fishermen and later also traders). Part of my family is from that area and in all my visits I have never seen any swamp except for the Venetian lagoon. See for example this old map, which shows the ancient coastal swamps but also the wide flatlands between them and the Alps.

    2. Yes, i didn't mean the whole region just a strip along the coast. Looking at your map people coming from the NE might then bypass the coastal wetlands leading to the possibility of them becoming a refugium for older DNA.

      It's just a possible explanation for the higher percentage in the northeast.

      "Part of my family is from that area and in all my visits I have never seen any swamp except for the Venetian lagoon."

      Wouldn't a lot of what is the most well-irrigated territory in the world today e.g. Nile valley, Wei valley etc once have been giant (and relatively densely populated) wetlands before irrigation?

    3. "It's just a possible explanation for the higher percentage in the northeast."

      should be

      It's just a possible explanation for the higher possibly paleo 1 percentage in the northeast.

    4. A more reasonable explanation IMO is that the "paleo 1" category is based only on Central and Eastern European data and that region is closer and open (the only access to Italy not blocked by sea or the Alps) to Central and Eastern Europe. In any case, most people no doubt lived in the plains and the subalpine hill country: Venice is a historical exception founded on trade and naval might, not agriculture.

      The Northern Italian plain was in fact "neolithized" by a wave originating in the Western Balkans, which, unlike most of their Impressed-Cardium Pottery relatives, who traveled essentially by sea, walked into it by that very corridor.

      "Wouldn't a lot of what is the most well-irrigated territory in the world today e.g. Nile valley, Wei valley etc once have been giant (and relatively densely populated) wetlands before irrigation?"

      No. Irrigation did not shape the floodplains in most cases, just maybe managed some of those naturally-occurring floods. Egypt's irrigation was mostly natural before the Aswan Dam: the Nile flooded the fields, which could be worked after the waters receded. The pharaohs did build a canal (emptying into a lake) to distribute some of that water over time but it was not essential to the natural fertility of the Nile basin, which was actually based on seasonal flooding.

      Nothing of this seems to apply to Northern Italy anyhow. Sure: the Po river floods now and then, as does the Danube and other major European rivers, but the effects are limited in time and "irrigation" seems mostly unrelated to that - some modern canalization engineering maybe but not irrigation, which is the process of distributing river water to otherwise dry or semi-dry lands.


    "Tell me, when did they imagine that the Prince who cleaned up the Venetian lagoons of Lombardo-Veneto, who drained the swamps to arrest the spread of malaria.."


    "The area was never very anthropized as the whole territory behind the coast (Pinetus Maior and Minor Pinetus), the ancient lagoon, was transformed over the centuries, thanks to the contribution of rivers, in alluvial soils and unhealthy swamp.

    There were however people living in that areas over time as evidenced by James Filiasi in the book "Historical Memories of Veneti First And Seconds" of year 1797: "... in that time, we saw many ruins around the Caorle shores, and this was the end of the fifteenth century. They say it is still possible to find traces at Lido Altanea, ... ".

    There was also the port of Bishops of Ceneda (the ancient name of Vittorio Veneto) that built there in the sixteenth century, also a church (made by the Bishop Girolamo Righetto). The current extreme boundary of the diocese of Vittorio Veneto still comes down to the sea where once stood its port in the Altanea Valley.

    The work of draining the swamp began in these areas just before the First World War, continued between the wars and was completed, thanks to the work of Dr Giorgio Romiati between 1950 and 1965. After the draining of the 500 hectares of Altanea valley, started the initial works of reclamation of the land (1964) and were made the first crops."


    "The sestiere developed from a series of islets along the edge of the Grand Canal.
    It was a swampy marshland, drained only from the 10th century. At that time traffic developed in a two-way direction: from San Toma' to San Silvestro via San Polo and from Rialto to San Cassiano.
    In the 11th and 12th centuries the whole area was drained and a wide network of roads and canals was created."


    "The fourth itinerary is characterized by the drainage’s landscapes; from Conche di Cavallino it reaches Caorle, the Pearl of Adriatic. The street permits you to see the Sile, the near beach of Lido di Jesolo, the pinewood of Eraclea Mare and the way of the draining pumps."


    "At the beginning of the twentieth
    century, the Eastern Veneto area underwent
    radical changes. Banks were built
    to curb water courses, lagoons and malaria filled
    swamplands were reclaimed and woods covering
    vast areas were destroyed to make way for farmland."


    I think that coastal area was swampy wetlands in ancient times hence why the Veneti were boatmen and pirates and why they weren't over-run by the Celtic invasion - and why that stretch of coastline could possibly have been a refugium.

    1. But the area was not "anthropized" and very scarcely inhabited. It was considered hostile for human habitation, most people lived further inland, in the plains and hill country. My own ancestor participated in the partial desiccation of the "paludi di Comacchio" in Ferrara, further South - nowadays it'd be considered an ecological crime but then it was just "progress", fighting malaria and putting "useless" lands to work.

      The swampy area is just a narrow strip near the coast anyhow, it's largely irrelevant because the vast majority of people did not live in it. It's like trying to explain the origins of the English appealing to Sherwood forest... it's not reasonable.

      ... "why they weren't over-run by the Celtic invasion"...

      They were at some point in fact largely over-run by the Celtic invasion, which took over the Northern Italian plains:

      The Periplus of Pseudo-Scylax, traditionally attributed to Scylax of Caryanda, a Greek traveler and geographer active between 522 and 485 B.C., mentions the presence of Celtic-speaking peoples settled in northeast Italy.

      However these were surely related in culture to other Western IEs like Veneti and other Italics who had preceded them in the Urnfields period, so there was surely room for coexistence, much as it happened in Iberia, where the Lusitani are believed to be non-Celtic (but related).

      The Veneti weren't aboriginal in any case but a product of Western IE expansion since the Urnfields period into, initially, Northern Italy (Canegrate, Golasecca and Hallstatt cultures), illustrating again my point about the area being exposed to invasions from the Danubian basin (although these did cross the Alps).

    2. Fair enough.

      If there ever was a partial at least refugium along that coast explaining a higher than average surviving amount of potentially paleo 1 DNA - 11% i think you mentioned in NE Italy compared to lower levels elsewhere - then i'd expect it would correlate with some local malaria protection adaptation like thalassemia among that 11%


      page 3

      "In Europe, Silvestroni and Bianco, to mention only two of the more active hematologists engaged in studies of this disease, made careful surveys of the Italian mainland and islands and noted a remarkable prevalence in the Po Delta area with an incidence of as high as 20 per cent in some small communities near Ferrara."


      The idea is that some population structure might be the result of *past* physical geography, like swamps, as well as still existing physical geography, like mountains.

      Just a thought.

    3. Well, adaption to malaria has some major side effects, so it's a clear case of dynamic equilibrium, which should be subject to strong negative adaptive pressure in the mid run if you remove malaria-causing factors by means of migration. Also, Pleistocene temperatures were considerably colder, probably reducing the impact of malaria in Europe to near zero. I would think that the alleles of malaria adaption in Southern Europe have been borrowed from Africa in "recent" times, being either selected for or against locally depending on the health needs.

      Not sure if it's this what you imply or not.
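      The "dynamic equilibrium" argument in the comment above can be illustrated with the textbook heterozygote-advantage model. This is only a sketch under stated assumptions: the fitness values (a 10% cost for non-carriers under malaria, an 80% cost for homozygous carriers) are hypothetical numbers chosen for illustration, not estimates from the paper or from any dataset, and the function names are made up for this example.

```python
def next_freq(q, w_aa, w_as, w_ss):
    """One generation of selection on the protective allele S (frequency q).

    w_aa, w_as, w_ss are relative fitnesses of non-carriers, heterozygous
    carriers and homozygous carriers (random mating, no drift, no migration).
    """
    p = 1.0 - q
    mean_w = p * p * w_aa + 2 * p * q * w_as + q * q * w_ss
    return (p * q * w_as + q * q * w_ss) / mean_w


def iterate(q, generations, w_aa, w_as, w_ss):
    """Apply next_freq repeatedly and return the final allele frequency."""
    for _ in range(generations):
        q = next_freq(q, w_aa, w_as, w_ss)
    return q


# Hypothetical selection coefficients (assumptions, for illustration only).
S_MALARIA = 0.1   # fitness cost of NOT carrying the allele where malaria is present
T_DISEASE = 0.8   # fitness cost of carrying two copies (the "side effect")

# With malaria: carriers of one copy are fittest, so the allele settles
# at the balanced frequency s / (s + t).
q_with_malaria = iterate(0.01, 500, 1 - S_MALARIA, 1.0, 1 - T_DISEASE)
q_predicted = S_MALARIA / (S_MALARIA + T_DISEASE)

# Malaria removed: non-carriers lose their disadvantage, only the cost remains,
# so the allele frequency declines from the old equilibrium.
q_after_removal = iterate(q_with_malaria, 100, 1.0, 1.0, 1 - T_DISEASE)

if __name__ == "__main__":
    print("equilibrium with malaria:", round(q_with_malaria, 4))
    print("predicted s/(s+t):       ", round(q_predicted, 4))
    print("100 generations later, no malaria:", round(q_after_removal, 4))
```

      In this toy model the allele is held at an intermediate frequency only while the malaria pressure lasts; once it is removed the frequency can only fall, although the decline slows down as the allele gets rarer.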

    4. I've read that Neanderthal DNA may have survived in modern humans because it was useful DNA i.e. it had some useful adaptions - no idea how likely that is but it just occurred to me the same argument might apply to paleolithic DNA i.e. surviving more in regions where it conferred an advantage e.g. malarial protection. If so then maybe the frequency of your paleo 1 category might coincide with thalassemia frequency i.e. regions that used to be very swampy.

      Just an idea.

    5. There are many factors in adaption:

      1. Of course the allele(s) must be useful, especially life-saving (although there's always a trade-off, so it only works for as long the adaptive pressure is active... unless fixation occurs).

      2. Population dynamics must be taken into account: a selective pressure on a population of size Ne=10 is not the same as that pressure on a population measured in thousands or even millions. In the first case probably only one allele will survive (fixation, which may be caused by random drift alone), unless extinction happens first. In the latter, diversity will almost necessarily be retained and the adaptive allele will have a hard time becoming prevalent, unless it's a radical life-or-death issue, such as non-carriers always or almost always dying.

      So we can hardly compare what happened in small-sized populations like those of OoA SW Asia, and maybe other small Paleolithic ones (especially in marginal environments like arid or subarctic climates), with what happens in the much larger populations of most Neolithics. It's not as simple as "an allele is useful, so it almost automatically spreads to everyone": utility is relative, there are trade-offs and there are complex population dynamics involved as well, all of which must be taken into account.
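      The contrast between small and large populations described above can be sketched with a minimal Wright-Fisher simulation (random sampling of gene copies each generation, plus a mild selective advantage). Everything numeric here is an assumption for illustration: the 2% advantage, the 20% starting frequency, the population sizes and the function names are hypothetical and are not taken from the paper.

```python
import random


def wright_fisher(n_copies, s, p0, generations, rng):
    """Return the final frequency of an allele with relative fitness 1 + s.

    n_copies is the number of gene copies sampled each generation
    (a haploid stand-in for effective population size).
    """
    p = p0
    for _ in range(generations):
        if p == 0.0 or p == 1.0:
            break  # allele lost or fixed: nothing can change without mutation
        # Selection shifts the expected frequency...
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # ...and drift is the binomial sampling of the next generation.
        count = sum(1 for _ in range(n_copies) if rng.random() < p_sel)
        p = count / n_copies
    return p


def absorbed_fraction(n_copies, s=0.02, p0=0.2, generations=100, runs=40, seed=1):
    """Fraction of independent runs that ended with the allele fixed or lost."""
    rng = random.Random(seed)
    finals = [wright_fisher(n_copies, s, p0, generations, rng) for _ in range(runs)]
    return sum(1 for f in finals if f in (0.0, 1.0)) / runs


if __name__ == "__main__":
    print("Ne=10:   fraction of runs fixed or lost =", absorbed_fraction(10))
    print("Ne=1000: fraction of runs fixed or lost =", absorbed_fraction(1000))
```

      In the tiny population almost every run ends with the allele either fixed or lost regardless of its small advantage, while in the larger one both alleles typically coexist over the same time span, which is the point being made about diversity being retained.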

      You already mentioned that this issue of malaria/thalassemia affected only "some small communities", which is consistent with what I was saying before about the vast majority of people living outside the swamps.

      Also the thalassemia alleles are clearly African-imported in Europe, so they must have arrived with other African-originated markers like haplogroup E1b, etc., not really the kind of stuff I'd associate with "Paleolithic" European haplogroups primarily - although, of course, in the long run this E1b association must have broken to at least some extent.

  5. "I've read that Neanderthal DNA may have survived in modern humans because it was useful DNA i.e. it had some useful adaptions - no idea how likely that is but it just occurred to me the same argument might apply to paleolithic DNA i.e. surviving more in regions where it conferred an advantage e.g. malarial protection."

    Most Neanderthal DNA has frequencies consistent with being ancestry informative and fitness neutral. A small fraction of Neanderthal DNA loci are present at elevated levels in West Eurasians and/or East Eurasians that are more consistent with some sort of fitness enhancing effects and this includes HLA (one kind of immunity related) genes at least in East Asians. John Hawks, whose group does original research in that area, has done some posts with charts that illustrate the point nicely. The purposes of most of the loci showing indications of fitness-based selection based on their frequency are not well understood, although they almost certainly do not include pigmentation phenotype genes - the light pigmentation genes in modern humans in Europe are not present in the Neanderthal genome.

    In general, populations that have Neanderthal admixture are more vulnerable to malaria than African populations with very little or no Neanderthal admixture, so this is an unlikely source for malarial resistance genes.

    1. "In general, populations that have Neanderthal admixture are more vulnerable to malaria than African populations with very little or no Neanderthal admixture, so this is an unlikely source for malarial resistance genes."

      Yes i wasn't suggesting they were neanderthal but that they might be following a similar pattern of adaptations in an earlier population being passed on in a later admixed one.

      However if they're African-imported that doesn't fit.

