This paper is probably the most detailed study of the haploid genetics of Italy to date, considering both Y-DNA and mtDNA.
Alessio Boattini, Begoña Martínez-Cruz et al., Uniparental Markers in Italy Reveal a Sex-Biased Genetic Structure and Different Historical Strata. PLoS ONE 2013. Open access [doi:10.1371/journal.pone.0065441]
The study contains ample data for both uniparental lineages and confirms that the origins of Italians are very complex. However, its conclusions on the alleged sex bias rest entirely on the very unreliable "molecular clock" methodology, which I will ignore in this review, focusing instead on regional affinities and similar groupings.
Y-DNA
After toying a bit with table S1 for easier visualization, I took the following snapshot:
I changed the names of the regions from cryptic Roman numerals. Frequencies are highlighted if >2.5% overall or >5% regionally. All the rest is the same.
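The highlighting rule just described can be written down as a small check. This is only a sketch of my reading of it: the function name and the idea of passing a flag for the "overall" column are assumptions made for illustration, not anything taken from the paper or its table S1.

```python
# Thresholds as stated in the text: >2.5% overall, >5% regionally.
OVERALL_THRESHOLD = 2.5
REGIONAL_THRESHOLD = 5.0


def is_highlighted(freq_percent: float, is_overall_column: bool) -> bool:
    """Return True if a frequency (in %) would be highlighted.

    `is_overall_column` is a hypothetical flag: True for the all-Italy
    column, False for any single-region column.
    """
    threshold = OVERALL_THRESHOLD if is_overall_column else REGIONAL_THRESHOLD
    # Strict comparison, because the rule says "greater than".
    return freq_percent > threshold


# A 4.1% overall value is highlighted; a 3.4% regional value is not.
print(is_highlighted(4.1, is_overall_column=True))
print(is_highlighted(3.4, is_overall_column=False))
```

Under this reading a lineage can be highlighted in one region while remaining unmarked in the overall column, and vice versa.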
In order to more easily visualize the data, I made the following synthesis:
Labels for R1b follow a previous analysis based on Myres 2010 (quick map link).
Most Italian R1b (27% of all patrilineal ancestry) belongs to the Southwestern clade, dominant (within R1b) in Iberia, France, Switzerland, Ireland... and Italy, and also very important in Great Britain, West and Southern Germany and Scandinavia. In Italy (as in Switzerland and Croatia), this clade is dominated by R1b-U152 (Alpine clade, sometimes also dubbed "Celtic"), which is also common in France and other places. Much less important is the "Irish" clade R1b-L21 (again common in France, as well as in Great Britain), which nevertheless has a notable peak in Bologna (10%). The presence of the Pyrenean clade R1b-SRY2627 is rather anecdotal (somewhat more common in the NW and Sardinia). This grouping is clearly strongest (almost 50%) in the Northwestern arc (NW, Bologna and Tuscany), with much lower frequencies elsewhere. This distribution does not look too "Celtic" to my eyes, I must say.
Second in importance within R1b is what I labeled as "Euro-root", most of which (6.9% of all patrilineages) belongs to R1b-M269(xP311). This paragroup connects more clearly with the Balkans and maybe West Asia, and is (consistently with that) somewhat more common in the South and less so in the NW.
Other R1b variants, which are likely to be mostly R1b-V88, are rare except to some extent (3.7%) in Sardinia, where this haplogroup was first identified.
The allegedly Indo-European haplogroup R1a1a displays a very strange pattern for such an attribution, being completely absent in the Northeast (NE, BOL), where we would have expected it to be common, as it is for example in nearby Slovenia. Instead the greatest frequencies are in the South and Center of Italy, which suggests that there is still a lot to understand about the origin and dispersal of this lineage.
Also notable is the presence of I(xI2a), which I labeled "other NE European", although maybe "North, Eastern and SE European" would have been more correct. Within it, the allegedly "Nordic" haplogroup I1 (very common in Sweden) reaches c. 10% in NE Italy (NE, Bologna), again raising questions about the origin of this lineage as well as of all I (which I tend to consider of Ukrainian/Romanian Paleolithic origin).
The other half of the Italian Y-DNA should be of Eastern Mediterranean origin, whether in West Asia or the Balkans. I have divided this group into two categories: on one side what I label "Cardium Neolithic", all three haplogroups being attested in ancient DNA of this culture in Mediterranean Iberia/France, and on the other the rest, which is not attested but should also have arrived from the same broader region, either in the Neolithic wave or in later ones (Bronze Age, etc.).
All three "Cardium Neolithic" clades are well represented in Italy, the most notable being G2a (11.1%), followed by E1b-V13 (7.8%) and then I2a (only 4.1% overall but a bulging 39% in Sardinia, which also has the greatest share of I2b: 2.4%). The most plausible origins of these three Neolithic lineages are respectively Anatolia (G2a), Greece-Albania (E1b-V13) and the former Yugoslavian Adriatic regions (I2). Italy surely acted as a springboard for their expansion westward some 7500 years ago.
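As a quick arithmetic check, the overall share of the "Cardium Neolithic" group can be added up from the three figures quoted in the paragraph above. This is a minimal sketch: the dictionary and function names are mine, chosen for illustration, and only the three overall percentages come from the text.

```python
# Overall (all-Italy) frequencies in %, as quoted in the text.
cardium_neolithic = {
    "G2a": 11.1,
    "E1b-V13": 7.8,
    "I2a": 4.1,
}


def group_total(freqs: dict[str, float]) -> float:
    """Sum the percentages of a haplogroup grouping, rounded to one decimal."""
    return round(sum(freqs.values()), 1)


cardium_total = group_total(cardium_neolithic)
print(f"Cardium Neolithic overall: {cardium_total}%")  # 23.0%
```

Taken at face value, these three clades together make up roughly 23% of Italian patrilineages, or a bit under half of the "Eastern Mediterranean" half.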
The "Other West Asian" category includes all other E1b-M78, E1b-M123 (both with ultimate origins in NE Africa but arriving in Europe almost necessarily via West Asia and the Southern Balkans), other G, as well as all J, L and T. The most notable of these lineages is J2a (11.4%, with strongest impact in Sicily, Central and NE Italy), followed by E1b-M123, which made an impact especially in Sardinia (6.1%), and L (major in NE Italy: 8.2%). They may all be localized Neolithic founder effects, but that is uncertain. Of this group only J2 (J2a?) made some impact further West, reaching >5% in some parts of Iberia.
Overall, African lineages (the rest of E) seem to have had the greatest impact in Sicily (6.4% overall). However, the characteristic NW African E1b-M81 also left some mark in Bologna (3.4%).
The rare F* also deserves some mention: it has a rather Northern distribution in Italy, quite similar to that of R1b-SW.
Mitochondrial DNA
Table S7, which neatly displays the mtDNA data, is too large and detailed for a snapshot, so I did not take one. The most notable lineages, in any case, are the following:
- HV*: 4.1% (notable in NW: 6.8%)
- H*: 11.1% (widely distributed)
- H1*: 10.4% (common except in NE, highest in Sardinia: 18.6%)
- H1a (5.7% in Bologna)
- H2 (7.7% in Tuscany)
- H3: 3.9% (10% in Sardinia, 8.6% in Bologna)
- H5: 4.3% (more notable in NW, Tuscany, Center)
- T1a: 3.4% (9.3% in NE)
- T2b: 3.4% (8.6% in Sardinia)
- J1c: 3.9% (6.2% in NW, 14.3% in Bologna)
- J2a (5.1% in Sicily)
- J2b (7.1% in Sardinia)
- U5a: 3.7% (most important in Central region, NE and Bologna)
- U5b (7.1% in Sardinia)
- K1a: 4.4% (most important in NE, Bologna, Tuscany and Center)
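To get a sense of how much of the total these lineages cover, the overall percentages in the list above can simply be added up. The sketch below uses only the entries that carry an overall figure (H1a, H2, J2a, J2b and U5b are listed with regional values only, so they are left out); the variable names are mine.

```python
# Overall (all-Italy) mtDNA frequencies in %, copied from the list above.
# Lineages quoted only with a regional figure are deliberately omitted.
mtdna_overall = {
    "HV*": 4.1,
    "H*": 11.1,
    "H1*": 10.4,
    "H3": 3.9,
    "H5": 4.3,
    "T1a": 3.4,
    "T2b": 3.4,
    "J1c": 3.9,
    "U5a": 3.7,
    "K1a": 4.4,
}

# Share of all Italian mtDNA covered by these ten lineages.
covered = round(sum(mtdna_overall.values()), 1)
print(f"Listed lineages with an overall figure: {covered}%")  # 52.6%

# The same entries ranked from most to least frequent.
ranked = sorted(mtdna_overall.items(), key=lambda item: item[1], reverse=True)
for lineage, pct in ranked:
    print(f"{lineage}: {pct}%")
```

So the ten lineages with an overall figure account for a bit more than half of the sample, with the remainder spread across many smaller clades.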
I also attempted a synthesis here, although some may disagree with my labels (I'm a bit in doubt myself in some particular cases, admittedly):
Let me explain the reasoning behind the labels and groupings:
- Paleo 1 corresponds to what some extremists consider the only valid Paleolithic lineages in Europe, i.e. those sequenced in Central and Eastern European "foragers" (excluding Sunghir's H17'27). I'm particularly uncertain about U8b: U8 has been sequenced in Paleolithic Europeans but U8b is closest to K and both are found also in West Asia.
- Paleo 2 corresponds to the lineages that appear to spread, at least partly, from SW Europe, some of which (H6, H1b, H*) have been sequenced among pre-Neolithic hunter-gatherers.
- Paleo/Neo is a category of lineages I am uncertain about:
  - HV* has been sequenced in Italian foragers but some of it may also have arrived with the Neolithic.
  - V appears to have similar origins to the SW European H lineages but it has only been sequenced in aDNA since the Neolithic, so...
  - Other H: I was simply unwilling to ponder the possible origins of each of the many small lineages.
- Neo is the category of lineages most likely of Neolithic or post-Neolithic arrival. I have doubts especially about K, which is first sequenced in aDNA in Neolithic Syria/Kurdistan and clearly spread within Neolithic flows; however, its phylogenetic connection with U8 makes me unsure about its ultimate origins and flows.
- Exotic includes those clades of quite clear origin outside West Eurasia/Mediterranean basin (mostly Siberian lineages): they are quite rare even considered together*.
- The categories in italics are just groupings of the previous ones, as per their description.
One of the aims of these groupings was to check if the molecular-clock-o-logical claims of the paper made any sense. It seems not. Italian mtDNA, like the Y-DNA, seems split about half and half between likely Paleolithic European clades (of possible post-Paleolithic arrival in Italy in many cases) and likely Neolithic ones. Regional variation does exist but it's not too remarkable. For example, if we take the Neo row, it seems that the South of the Peninsula (S) was a bit more influenced by Neolithic or post-Neolithic flows, but the difference from the least influenced area (NW) is just some 12 percentage points. This pattern is mirrored in reverse by the Paleo 1+2 row.
However, if we take the Paleo 1 row, we see a pattern which does not seem consistent with Paleolithic continuity, at least to my eyes, with the highest frequency in the NE (open to migrations from the Balkans and Central Europe), followed by the Central region and Sardinia. It rather seems to correspond, at least in part, to migrations from those regions: the Balkans and Central Europe.
But, as always, your take.
_____________________________
* On second thought (mini-update), the overall frequencies of "Siberian" lineages are not so negligible in two regions: Sicily and Central Italy, where they amount to >3% taken together. I wonder if this may be a symptom of the Roman slave trade, which is known to have had Eastern Europe as its main source of slaves after Rome's consolidation as an Empire (as was also the case in the Middle Ages).