June 21, 2015

Some improved knowledge of major R1b sublineage S116

While it is far from being the last word on the matter, the new study by researchers of the University of the Basque Country adds some important information to our knowledge of the main European R1b subhaplogroup S116, which dominates much of the continent with a south-western centrality, spanning from Ireland to Italy and from Iberia to Germany.

Laura Valverde, María José Illescas et al., New clues to the evolutionary history of the main European paternal lineage M269: dissection of the Y-SNP S116 in Atlantic Europe and Iberia. European Journal of Human Genetics, 2015. Pay per view (as usual supp. materials are freely accessible) → LINK [doi: 10.1038/ejhg.2015.114]

Preliminary status of the research

The improvement of the knowledge of this major European lineage had been stuck, as far as I know, since the various studies published in 2010, notably Myres et al. (discussed HERE). At risk of repeating myself, I will again display here some of the maps derived from the data of that key paper, as they are very useful references for discussing the new one:

Frequency of R1b subclades relative to overall R1b (per Myres 2010)
note: M529 is wrongly labeled M259
Composite image showing the overall frequency of R1b-S116 (red) and R1b-U106 (blue)
according to Myres et al. 2010

The new data

The new study is not as comprehensive in its sampling as that of Myres, so it relies heavily on previously published data, which is enough for the already studied subclades. The resulting maps are however somewhat different from those of Myres, because a lot of newer Basque and Iberian data is present here (Myres did not sample Basques, whose frequencies and diversity for S116 are outstanding):

Fig. S1

They do however consider a third major sublineage of S116, defined by the mutation DF27, which is strongest among Basques and other SW Europeans. They also considered a sublineage described by the SNP L238 but could only find a single individual carrying it (a Breton from Brest), so it should be considered as part of the wider S116* paragroup and not relevant on its own.

These are the results for DF27 and S116*, i.e. S116(xM529,U152,DF27):

An important detail is that, after excluding the three major subhaplogroups, the remaining S116* seems concentrated in Ireland and the Basque Country. However we should await new information that could come from France, Southern Germany or even parts of England in the future (these areas showed some notable S116* in Myres 2010 but DF27 was not excluded then).

In this sense it must be mentioned that a sublineage of DF27 (SRY2627) has been known for more than a decade, since even before the modern nomenclature arose in 2001 (Rosser 2000 called it Hg22), and was indeed spotted not just among Basques but also among some Bavarians. So I would not dare to exclude at least some presence of DF27 further northeast than what this study shows. However it is indeed clear that its primary distribution is in Iberia and particularly among Basques.
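For readers not used to the notation, S116(xM529,U152,DF27) means "carries S116 but none of the markers listed after the x". A minimal sketch of that bookkeeping in Python may help. The marker names are the ones discussed in this entry, but the function name and the data layout are hypothetical, chosen only for illustration, and have nothing to do with how the paper's authors actually processed their samples.

```python
# Illustrative only: the three major S116 subclades considered in the study.
MAJOR_S116_SUBCLADES = ("M529", "U152", "DF27")


def classify_s116(derived_markers):
    """Assign a sample to an S116 category from its derived (positive) SNPs.

    Returns one of "M529", "U152", "DF27", the paragroup "S116*",
    or None when the sample does not carry S116 at all.
    """
    markers = set(derived_markers)
    if "S116" not in markers:
        return None  # outside S116 (e.g. a U106 carrier)
    for snp in MAJOR_S116_SUBCLADES:
        if snp in markers:
            return snp  # deeper markers such as SRY2627 stay inside DF27
    # S116 positive but negative for all three major subclades
    return "S116*"


print(classify_s116({"S116", "DF27", "SRY2627"}))  # a DF27 carrier
print(classify_s116({"S116", "L238"}))             # counted within S116*
print(classify_s116({"U106"}))                     # not S116
```

Under this scheme the single L238 carrier falls into S116*, which is how it is treated above.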

It is also important to underline that the maps may be a bit misleading because of the existence of two different Basque samples: a rural one (mostly with Basque surnames), with overall R1b and also S116* frequencies similar to the Irish, and an urban one (of more mixed ancestry) with somewhat lower frequencies.

From the supplementary material I gather that Basques have the following frequencies (first figure: rural Basques, second figure: urban Basques):
  • S116*: 16%, 8%
  • DF27: 71%, 51%
  • M529: 2%, 3%
  • U152: 3%, 1%
Otherwise the frequency of S116* is most notable among the Irish (18%) and Central-East Iberians (8-12%) but lower in Brittany (6%), Cantabria (6%) and Portugal (4%), being absent in Galicia and Asturias.

The frequency of DF27 is highest among Basques (63% on average, 71% for the rural sample) and then similarly high across Iberia (40-48%). It reaches 17% among Bretons and just 1% among the Irish. It must be noted that when you apportion DF27 to S116 (DF27/S116), the result is similar throughout Iberia (72-80%), Basques included (hat tip to Jean).
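The apportioning is simple arithmetic and can be checked with the Basque figures listed above. This is only a rough sketch: it assumes that total S116 equals the sum of the four listed categories, and it works from percentages that are already rounded, so the results are approximate. The names `BASQUE_FREQS` and `share_of_s116` are made up for the example.

```python
# Rounded percentages of the whole sample, as listed above for Basques.
BASQUE_FREQS = {
    "rural": {"S116*": 16, "DF27": 71, "M529": 2, "U152": 3},
    "urban": {"S116*": 8, "DF27": 51, "M529": 3, "U152": 1},
}


def share_of_s116(freqs, clade):
    """Percentage of all S116 carriers that belong to `clade`.

    Assumption: total S116 is the sum of the four listed categories.
    """
    total_s116 = sum(freqs.values())
    return 100.0 * freqs[clade] / total_s116


for sample, freqs in BASQUE_FREQS.items():
    df27 = round(share_of_s116(freqs, "DF27"), 1)
    para = round(share_of_s116(freqs, "S116*"), 1)
    print(f"{sample}: DF27/S116 = {df27}%, S116*/S116 = {para}%")
```

With these inputs DF27/S116 comes out at roughly 77% for rural Basques and 81% for urban ones, in spite of the 20-point gap in absolute DF27 frequency between the two samples.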

The frequency of M529 is very high among Irish (54%) and Bretons (52%) but under 5% everywhere else, except the following: 6% in both Asturias and in Cantabria, 7% among Galicians. The lineage is present in all sampled populations except Alicante and Andalusia. Therefore, inside Iberia, it shows some NW-SE clinality.

The frequency of U152 is under 10% across the board but also found in all sampled populations. The lowest ones are urban Basques (1%) and the highest ones Galicians and Asturians (9% and 8% respectively).

Note: figures above corrected (Jun 22) because there was a confusion between Cantabrians and Galicians in the first version of this entry. Thanks to Cousso for noticing.

Some conclusions

In the original entry I wrote here that I had found a very striking and sharp contrast between Basques, on one side, and Cantabrians and Asturians on the other. This was wrong because I made an error in parsing my notes and therefore confused Cantabrians with Galicians. Hence the "sharp contrast" at the Western edge of the Basque Country is not so sharp after all, because Cantabrians act as a buffer. There is still a curious contrast between Basques and the old Gallaecia province (later also Suebian Kingdom and Kingdom of Asturias-León), that is: Galicians+Asturians. These show the highest peninsular frequencies of M529 and U152, while Basques only have low frequencies of them; maybe even more significantly, Basques have one of the highest S116* frequencies in Iberia (just below the Irish in the overall sample), while Galicians and Asturians have none of it. In any case all this is reminiscent of the overall genetic contrast between West Iberia and the rest of the peninsula mentioned on other occasions, a contrast that affects many haplogroups but not all.
[Paragraph edited: Jun 22].

The other conclusion is that, while we must await further data, particularly from the French state and also from Southern Germany, the combination of this new data with that of Myres 2010 only confirms my previous conclusions, which are:
  • R1b overall originated in West Asia, expanding in several directions from that region.
  • R1b-M269 (the main West Eurasian subclade) expanded from either the Balkans or Highland West Asia (Iran?)
  • The subsequent expansion in Europe (L23, M412 and L11 stages) is not too clear but could well have followed a double route, Central European and Mediterranean. More research is needed for these transitional stages between the Balkan/West Asian phase and the Western European ones.
  • Once in Western Europe, two L11 sublineages experienced parallel expansions:
    • U106 probably expanded from the Netherlands or Frisia (or maybe Doggerland in a Paleolithic scenario). Detailed research awaits however.
    • S116 surely expanded from somewhere in what is now France (possibly towards the Atlantic, judging by where S116* is most common), with three main subclades, each one following its own pattern of expansion:
      • M529 towards the Northwest (Brittany, Britain, Ireland...)
      • U152 towards the East (most notable in Switzerland and Italy, but also important in France, Germany and Britain, with offshoots of plausible Celtic transport in the Balkans and even Anatolia).
      • DF27 mostly to the South, peaking among Basques but also important in much of Iberia. It remains to be discerned how important it is in other European regions.
I put these notions on a map. It must be considered a rough sketch, a working hypothesis, because there is not enough data to be reasonably certain about all the details:

I would not dare to give tempos here. The sketched pattern of expansion can be equally consistent with a Neolithic or a Paleolithic model. The important pivotal role of France and the Netherlands could weigh in favor of a Paleolithic model, but it is true that aDNA and certain prehistoric reconstructions could allow for the French role (at least) to fit within an Atlantic Neolithic (Megalithism + Bell Beaker) theory for the expansion of S116, and I see no reason why the Netherlands could not have also played a similar role in NW Europe.

Thanks to Jean and Mike for the heads up.

Update (Jul 4): two "forgotten" papers of relevance:

1. George B.J. Busby et al., The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269. Royal Society Proceedings B, 2011 → LINK.

Must read: demolishing (and well deserved) criticism of Balaresque 2010 and to some extent also of Myres 2010. They totally dismiss STR-based age estimates as wrong and misleading.

2. Rosa Fregel et al., Demographic history of Canary Islands male gene-pool: replacement of native lineages by European. BMC Evolutionary Biology, 2009 → LINK.

The ancient Guanche mummies' Y-DNA pool includes 10% R1b-M269. Considering that the islands were colonized c. 1000 BCE, I can only imagine that the Steppe Horde will find some way to blame the forgotten squire of Herakles for that. Or something...

Thanks to Georg for mentioning: that's the kind of feedback I love.


  1. Sorry for posting again, but I really think that you are mixing up the data of Galicia and Cantabria: Galicia and Asturias have 0% S116*, but Cantabria has 6%. In fact the data of Galicia and Asturias are very similar, having relatively high U152 and M529, with the exception of Galicia's 4% U106 (Suebi). The frequencies in Cantabria are somewhere midway between Basques and Asturians and Galicians.

    1. You're welcome and you're right. Thanks for the correction: I'll edit accordingly.

    2. Corrected. I think it will now read fine.

      One question: why do you say "sorry for posting again"? Did you have any problem with the automated spam filter or something? I'm pretty sure you are not blacklisted here.

  2. Yep. Spam filter. Or else I didn't hit the publish button :-)

    1. Probably the latter because I checked the spam folder and there was nothing in there from you (or anything else that shouldn't be there).

  3. Interesting to see DF27 concentrated so heavily where it is. I've been surfing around sailing forms (no pun intended) and looking at the bay maps for the blue water and the currents. Not a great place to be an inexperienced sailor. The area between Bilbao and Bayonne is kind of a unique area.

    1. Notice please that the greater absolute frequency of DF27 in the Basque Country is due ONLY to the greater frequency of R1b. When DF27 is apportioned to R1b (or S116, which is the vast majority of R1b in all the studied regions), the result is very similar in all the peninsula (72-80%). Jean Lohizun pointed this out to me and I did mention it in the text above.

      Something similar happens when we apportion S116* to S116: rural Basques are highest (18%) but only slightly above Barcelona or Andalusia (15%), or other samples like Alicante (14%), urban Basques (13%), Madrid (12%) or Cantabria (10%). Only Galicia and Asturias are clearly below because they totally lack the S116* paragroup. Instead the Irish have an enhanced frequency of S116*/S116: 24%. Again this data is from Jean L. (didn't bother checking myself).

      You have me intrigued anyhow about what you say of the currents: I was not aware of that. Anyhow, Basque difference is probably much more articulated historically (and proto-historically) around the mountain axis than the sea itself. Also, for all I know, there's nothing distinctively Basque before the Celtic invasions (Urnfields to La Tène) cut the pre-IE (Vasconic?) area into pieces. Much as with Ligurians further East, the Basque genesis surely owes more to the need of uniting against the common enemy than to a pre-existent unity. In both cases the mountains acted as backbone of that resistance: a place that the invaders would normally not dare to venture into or would pay a big price for doing so (ask Roland and his relative Charlemagne).

      So we have to look for other cues in order to understand the spread of S116 and these, IMO, should be further North, in Gascony, Occitania and France.

    2. "I've been surfing around sailing forms"

      I keep meaning to do that. If Atlantic Megalith was at least partly related to the amber trade, with a center in Portugal, then I was thinking they might have a series of safe harbors along the route, so I was going to try and see if the safest sailing route mapped onto the Atlantic Megalith sites.

      never got round to it though

    3. Well, Grey, instead of speculating freely, what about looking at the actual archaeological data, for example this map of the Bell Beaker presence in Iberia. It's incomplete and anachronous (it covers several BB phases and styles, which probably should be considered on their own merits) but it also has some correlations with other maps from the same book and for the same general Chalcolithic period that I'm parsing as I write this.

      The most different one is that of Megalithism (dolmens), which bypasses most of the Central Plateau and also seems minimal in most of the Eastern coast. However it must be said that "collective" (clannic?) burial in caves was common in many of those areas, so not that different in concept.

      More directly related to Bell Beaker are the following specific items:

      1. Palmela points, surely made in VNSP (Portugal) and scattered mostly to Galicia, much of the plateau, following Tagus river, "Silver Road" and "St. James Way", as well as another route from (roughly) 'Madrid to Bilbao'. Also very important in West Andalusia, coastal Galicia and then some scatter in the Eastern coast, as well as in Southern Brittany.

      2. Four types of V-perforated buttons, culturally related to Bell Beaker:
      2a. Turtle-shaped buttons seem to spread from Estremadura (Port.) to the Languedoc (Treilles group) and more rarely to Los Millares area, Barcelona province, plus three instances in the Bay of Biscay and Western Pyrenees.
      2b. Conical buttons seem to spread from Languedoc to Estremadura instead with a larger but overall similar scatter to the previous. There is some concentration in the area of Basque Upper Ebro as well (near La Hoya, modern Biasteri). They also spread up the Rhône and to Provence.
      2c. "Dufort" buttons seem most common in the Treilles (Languedoc) area with only some scatter towards the Upper Garonne and one instance in Navarre.
      2d. Pyramidal buttons are most common through Catalonia (probably a regional style) with some spread southwards along the coast to Los Millares and to the Balearic Is., two Western Pyrenean instances and one in Estremadura (Port.)

      Overall I would think that the Southern Basque Country played a minor role in Bell Beaker related trade. Instead the Treilles-VNSP connection seems very strong for some reason. I don't have any data indicating contacts with other BB provinces but that would be the SW province pretty much, Sardinia excepted.

      Anyhow I would think these peoples must have been good sailors, owing maybe to partial Cardial inheritance: the long distance trade, which extends up to the end of the Bronze Age (that ends with the Celtic "absorption" of the Plateau and invasion of the Western coast of Iberia) and the spread of Megalithism first and BB later largely by sea are witness to it. So maybe they did not need too many ports of call and sailed largely through the open sea, using their quite apparent astronomical knowledge.

    4. "Well, Grey, instead of speculating freely, what about looking at the actual archaeological data"

      Yeah the idea was to see if local sailing knowledge correlated with the known settlements. I've done a little sailing and know local knowledge of tides, winds etc is critical - but so far too lazy.

      "So maybe they did not need too many ports of call and sailed largely through the open sea, using their quite apparent astronomical knowledge."

      Yes that was the question I wondered about. I recall reading long ago about Roman era sailing in the Med. where it was said the standard practice was to keep close to the coast and beach the ship every night.

      If it was the same with the Atlantic Megalith people you might expect settlements to follow a chain of safe harbors roughly one day's sailing apart. On the other hand they might be as you say and the harbors much further apart - just curiosity though - it wouldn't affect anything significant I don't think.

    5. → https://www.academia.edu/4226082/Ancient_sailing-routes_and_trade_patterns_the_impact_of_human_factors

      Homer already contrasted the primitive navigation skills of Achaeans to those of "modern" ships of his time (800 BCE?), which crossed the open seas. We also find that Ulysses was allegedly taught by Calypso to sail the open seas by night. So it seems that the limitations attributed by some to ancient sailing are not too realistic.

      Anyhow the long distance trade that converged in Southern Iberia in the Chalcolithic (amber from the Baltic, ivory from both Syria and Africa, ostrich eggshells even!) almost necessarily implies navigational capacities at least as advanced in general terms as those of ancient Phoenicians, if not better (the Atlantic is a much harder sea than the Mediterranean after all).

      For example if I were an amber trader from Zambujal, I'd take to the open sea in Galicia and sail directly to the British Isles, then maybe coast along the Low Countries to Frisia, where I'd trade. I'd totally avoid the Bay of Biscay, which does not mean that other less ambitious traders would not go along its coasts in search of other types of trade.

      If I were a trader between Greece and Magna Grecia, I'd also take the shortest route across the Ionian Sea and not risk going all the way to modern Albania only to suffer Illyrian piracy. I'm pretty sure that direct open seas navigation between two known ports was common.

      I would think that a chain of harbors may have existed between Estremadura and Languedoc but I don't see clear evidence further North in the Atlantic. Also notice that certain key locations like Sardinia were not accessible via coasting and actually do require, like it or not, open sea navigation to be reached. In other cases open sea "shortcuts" would make sailing not just much more efficient but also generally safer (reefs are a much worse danger than open sea gales and let's not forget about piracy, more likely to happen near the coasts).

    6. "I'd totally avoid the Bay of Biscay"

      Yes, that's the kind of thing I mean. If a sailor was asked what was the safest or fastest (might be different, might be the same) route from a to b along the Atlantic coast what would they say?

      I tend to agree with your view - various hazards forced them to be open sea sailors - and I think that being open sea sailors may have influenced the pattern of settlement but it would be interesting to see a collection of old fishermen from Spain, France, Britain, Holland, Denmark etc explain their patch of sea in the context of the route from Portugal to Sweden to confirm or deny it.

    7. But we can't ignore, I believe, the lesser routes.

      On one side there is the outstanding relevance (especially within the BB context) of the VNSP-Treilles interaction, which may have partly gone via the Mediterranean coast but also, it seems to me, via the Basque Country by partly or totally land routes. This lesser route probably had port connections in the Bay of Biscay, looking to Galicia and VNSP but probably also to the North (lacking data right now).

      On the other side there is the importance of the Plateau and the Upper Ebro (both prime agricultural areas), which are connected to both the Basque coast and to the VNSP region (via the rivers in the BB era, probably St. James Way+Silver Road in the pre-BB era).

      Finally I would not dismiss out of mere lack of data the probable relevance that Greater Aquitaine acquired in the Artenac period. The fact that Artenacians were able to impose themselves over long-established cultures like SOM and Armorican Megalithism suggests that they must have been somehow important in the order of things of the time. In the West we do not see any other such serious "political" change before the Celtic invasions, except maybe the later evolution of the Iberian Plateau (formation of Cogotas culture, military colonization of La Mancha with the "motillas") but that's already in the Bronze Age.

    8. "Finally I would not dismiss out of mere lack of data the probable relevance that Greater Aquitaine acquired in the Artenac period."

      I agree with that. The later shrinking of Aquitania has hidden something significant imo.

  4. @Bell Beaker Blogger: as you are reasonably knowledgeable on the archaeology surrounding Bell Beaker culture, I would like to ask you for your opinion on the issue of Artenac culture (culture artenacienne), its pre-BB importance, its connection with Iberian and Occitan cultures and its possible role in the development and spread of Bell Beaker, if any at all.

    This is in relation with the issue of Basque origins that you raised above. I have the strong intuition that Artenac, which began in the area of Perigord (Dordogne) and Angoumois (Charente), and vigorously expanded northwards with archery warfare, taking over Seine-Oise-Marne and consolidating the border with the expanding Corded Ware Indoeuropeans near the Rhine, is a key actor of some sort. It must be in relation with Basque-Aquitanian genesis, no doubt, but I am not sure how important it is in relation to the overall "geopolitics" of the time and their long term cultural implications. Their early abundant usage of archery parallels that of VNSP/Zambujal and relates to what you mentioned elsewhere about Bell Beaker being largely characterized by love of archery (in contrast to the "war axe people", as Corded Ware used to be misnamed particularly - axes may have been more ritual and pre-CW but still no strong signs of archery).

    One of the reasons I raise this is because I once and again found myself heading to Angoulême when researching Artenac culture (not too much online, mostly in French and nothing really comprehensive I can spot) and I recall, some time ago, discussing at Heraus' blog how strikingly Basque-looking the Angoumois were for such a northerly and non-Gascon people (in particular there's one guy, in the center of this photo, that looks totally like my Basque-purebred grandpa, totally "Caristian" in ancestry; just minor details like a broader face and different eye color, as well as thicker eyebrows, make him different).

    My overall impression is that in Western France, much as in the Basque Country itself, while there was some BB, the overall pattern is of Megalithic continuity (of less grandiose design since Artenac, less aristocratic or not at all than the Armorican/British variant), gradually decaying into amorphous "individualist" burial tendencies as we get into the Bronze Age. That BB may not have been as important as such (although maybe it was more of a change in some areas, hard to judge except case by case) but that the overall Atlantic (Vasconic?) culture of Megalithic roots was, with BB being rather an epiphenomenon of this.

  5. I'm searching online for more info on the Artenacian. Now in Spanish (after more or less exhausting English and French language searches), with maybe more success. For example:

    → http://dadun.unav.edu/bitstream/10171/8423/1/CA_09_02.pdf (direct download)

    This paper discusses Bell Beaker in Iberia and to some extent France. It has some interesting maps, for example the one on page 162, which shows Maritime and Mixed style Bell Beakers in Iberia and Western France. It evidences the greatest concentration of these in Estremadura (VNSP), Southern Brittany and near modern Seville. But there is also an important scatter in the Plateau, the Upper Ebro, the Western Basque Country, Gascony and Perigord-Angoumois (i.e. the core Artenac area), suggesting some sort of inland connection between the Bay of Biscay and the civilizational centers of Estremadura, Languedoc and of course Brittany. The inland route by the Plateau, which quite apparently goes through Madrid province, may have been the precursor of the later Ciempozuelos style, which is in turn the precursor of the formation of Cogotas culture.

    On page 159 there is another map showing the scatter of laminar appliances on gold, in which again the Basque area seems important (6/26 sites). Other concentrations are in Catalonia (6) and Brittany (5).

    The text also seems to emphasize the role played by the Basque Country in the spread of Iberian types of products. For example the famous Palmela points are distributed outside Iberia almost only in Western France and the Channel Islands, a scatter that is attributed to transport via the Basque Country.

    Regarding Aquitaine, Alday considers it a knot of trade and influences, with access to the following routes:
    → Atlantic facade (reaching up to the British Islands)
    → Garonne and Aude riverine routes to the Rhône and Alps and, from there, to Central Europe
    → Another via the Basque Country to the Iberian interior (Plateau, Upper Ebro)
    → A route to Languedoc via the "Naurouze corridor" (roughly the modern Canal du Midi, which follows the lowest possible route between the Garonne and the Mediterranean), that "leaks" to the Mediterranean regions via Languedoc (Treilles group)

    So overall it would seem that the Pyrenean Isthmus, especially its northern side, was rather important in trade and cultural interaction in the Chalcolithic, which in turn gives increased relevance to the Bay of Biscay and therefore the Aquitanian and Basque connection.

    1. Erratum "page 162" should read "page 152".

  6. Linguistic data points to a strong early layer of Gaelic in South Slavic languages. I believe that this indicates an early presence of R1b people in the Balkans. How early? This is tricky. We definitely have a strong presence of Celts (Scordisci) in the central Balkans (Serbia) in the 3rd and 2nd century BC and later. Even today the percentage of the R1b haplogroup in Serbia and Bulgaria is significant, indicating that this Celtic population merged into the South Slavic population. But I believe that Gaelic speaking people lived in the Balkan region much earlier, possibly even in Neolithic times. Again this is based on linguistic analysis and cultural analysis but also on archaeological data which points to the possibility that the builders of Newgrange came from the Balkans and that they were cultural descendants of Vinca culture. But then that could also have been people of I2a haplogroup... :)

    1. Languages and genes are not too related. People learn new languages out of necessity, convenience and imposition, and lose old ones out of becoming "useless" (languages are "tools" after all) and sometimes also because of repression.

      Anyhow the Celts clearly invaded Pannonia and Upper Moesia in the 4th century BCE, so no mystery here:
      → https://en.wikipedia.org/wiki/Celtic_settlement_of_Eastern_Europe ←
      → https://en.wikipedia.org/wiki/Scordisci ←

      "I believe that Gaelic speaking people lived in the Balkan region much earlier, possibly even in Neolithic times".

      That is not likely at all. Actually there is hardly any chance that Gaelic, Celtic or even Proto-Indoeuropean itself had formed so early in time. Judging by the early divergence of Celtiberian, for example, one would think that the Celtic linguistic genesis dates most likely to Hallstatt culture time.

      "the possibility that the builders of Newgrange came from the Balkans"

      I wonder why would you think so... It's totally within the context of Atlantic Megalithism, even if somewhat peculiar in dimension and functions.

      Vinca culture was erased by IE invasions (Vucedol) and all that remained was the related Dimini area, which in the Bronze Age bears the name of Rakhmani culture and which was anyhow overrun by Mycenaean Greeks (probably the very same Vucedol people, although they may have also arrived from some other nearby culture like Cotofeni). Vinca-Dimini as such did not have major impact outside the Balkans (other than some mixed penetration into Hungary) and some cultural influence in Chalcolithic Italy (but surely not implying migration, because bronze tech was not yet transplanted nor are the similarities so major). Other cultures of the same "Highland West Asia" or "post-Halafian" area may have been more influential however, and so we see some elements reminiscent of Cyprus or the Cyclades (Anatolia-linked) in Iberia and Occitania (again major migration is unlikely but minor one is not impossible).

      That said, I think I recognize some Vasconic elements in the Balkans, for example:
      → Ibar river in Kosovo, where "ibar" in Basque means "river bank" or "river shore" and is etymologically related to "ibai" (river), "ibon" (creek) and possibly "ibili" (to walk). Probably related: Iber(-us) river (modern Ebro) that gives name to Iberia, Tiber (T-Iber?) river in Italy and Hevros (Maritza) river in Bulgaria and Greece (assuming it has suffered the same Iber(-us/-os) → Ebro deformation through Indoeuropean love of suppressing vowels into consonant clusters (-ber- → -br-, for example), which are unnatural in Basque (and Iberian, and hence Vasconic)).
      → "Gore" in Serbocroat means up(-wards), exactly the same as Basque "gora". However Basque "gora" has a native etymology (goi-ra, where "goi" is "high" and "-ra" is "to", directional locative suffix: "where to?" = "nora?", "to Malaga" = "Malagara"). This might also apply to the more generic Slavic "gora" for mountain too but, if so, it should be unrelated to the Balkans and a much older linguistic introgression in Central Europe.

      So IMO the linguistic layers of a country like Serbia (can't be generalized to all the Balkans) could well be:
      1. Vasconic (first Neolithic of Starcevo)
      2. "Pelasgian" (second Neolithic of Vinca)
      3. Some unclear early Indoeuropean, maybe proto-Greek (Vucedol)
      4. Dacian and Celtic in the Iron Age
      5. Latin and Greek in the Roman era, even if probably never fully displacing older local languages
      6. Slavic

  7. There is a huge Gaelic linguistic layer in South Slavic languages. My research was originally concentrated on this area only, but eventually expanded to old European cultures in general. There are hundreds of common Irish-Serbian words, and there is common grammar as well. There is a huge overlap in cultural, pre-Christian beliefs, some of which are only found in Serbian and Irish tradition. Interestingly some of these words and cultural traits have their root in Gaelic language and culture and some have their root in Serbian and wider Slavic languages and culture. This would suggest a complex long term mixing, probably in multiple locations. You have plenty of examples on my blog, but this is a good example of what I am talking about. This is a collection of rude words in Serbian and Irish. Most of them are not found in any other language group except Celtic and Slavic...


    I didn't spend any time looking at Basque language, but I have read articles written by other people listing common Serbian and Basque words. I will try to dig these out if you are interested. The reason why we have so many mixed linguistic elements in the Balkans is because we have so many mixed genes in the Balkans which means that the current Balkan cultures are mixes of many cultures, not just the last layer in replacement cultural cake...

    1. In general terms it seems coherent (big facepalm exception: beggar comes from beg+-er, not ubog; you mention beg in another part of the text) but is it South Slavic or just Serbocroat? In order to be general South Slavic it should also be present in Bulgarian/Macedonian and Slovenian.

      Have you made a contrasting comparison with Brythonic? Because if the words are specifically Gaelic and not Brythonic, then it tell us something about the distribution of these two main known dialectal groups of Celtic.

      As for Celts being in Turkey, yes: look up Galatians in Wikipedia. Mercenary Celts who were given a border province, named Galatia (there's even a Biblical "epistle to the Galatians") and founded Ancyra, modern Ankara. They were surely closely related to Moesian Scordisci, if not the very same people.

      As for Basque-like words in Serbocroat, I'm interested indeed.

  8. Last year I did my DNA analysis with the Genographic project. Being a Basque named Fernandez, with a non-Basque-speaking great-greatfather, and assuming instead that our matrilineal heritage was totally local, Basque-speaking and of rural extraction, I thought I might get an "exotic" Y-haplogroup result and a more typically Basque mtDNA haplogroup.

    Well, it turned out that my father passed R1b-DF27 to us :-) Of course, that might still be of Iberian extraction rather than natively Basque, but it feels just ironic.

    The mtDNA haplogroup info was also curious: X2c. A minority of Basques carry it, yet it seems to have been present here since ancient times. Mom's matrilineal heritage is, therefore, not mainstream but looks Basque ;)

    I've written about these analyses in Basque on my blog. I guess I should make a summary in English one of these days: http://eibar.org/blogak/luistxo/r1b-haplotaldearen-misterioa-euskaldunen-jatorrian

    1. Hi, namesake. My official name is also Luis. Well, I'll write in English so that other people can understand.

      I reckon that the "inversion" of haplogroups is somewhat funny and paradoxical but absolutely normal. As we have discussed above, R1b-DF27 is a common haplogroup through Iberia and also surely found, at least at low frequencies, in other parts of Europe. So it is absolutely normal to have DF27 with origins anywhere in Iberia or nearby parts of the French state. There's a sublineage that is more specifically Basque (SRY2627) but even that one is occasionally found in places like Bavaria. Basques may be different but we are still Western Europeans, not Martians.

      As for mtDNA X, it has been among Basques since the Neolithic (see linked data sheet). That is true also for other Europeans (and also for lineages W, J and T). These lineages almost certainly arrived from West Asia at the foundation of European Neolithic some 8500 years ago.

  9. The relict non-palatalized tsakavism of the Adriatic islanders is not a speech defect (as was suggested in Yugoslavia) but a remarkable phonetic trait of Adriatic island populations of pre-Slavic genetic origin that have been neither Slavicized nor palatalized yet, while the typical Chakavian dialect of the eastern Adriatic belongs to related populations of semi-Slavic origin that are now Slavicized and palatalized. The current delicate position of the islanders' tsakavism, and chiefly of the archaic Gan-Veyãn in Krk, is comparable with that of Welsh and Manx in Great Britain; in the Balkans, however, nobody takes care of them as is done elsewhere in Europe, their speakers have no cultural rights, and their only prospect in present-day Croatia is to be quickly assimilated and disappear. The extreme insular tsakavism of Komiža on Vis island and of Baška on Krk diverges from the majority of other Serbo-Croat idioms of the N.E. Adriatic. The tsakavism of Rab and of Vis is the richest in Croatia in Romance words, which make up half of its vocabulary. The main sites of tsakavism speakers are the coastal towns of Labin, Rabac and Trogir, and the islands of Cres, Lošinj, Baška, Rab, Pag, Brač, Hvar, Vis, Biševo, etc.

    gora (= gora-brdski lanac),
    goratasun (gorštakinja-brdjanka),
    goritu (gorjeti: čakav.goriti),
    jasta (jesti),
    kutxa (kuća),
    prestatu (prestati),
    puztatu (pustiti),
    umetoki (umetak-uložak),
    uxaratu (usrati),
    xuko (suho: čak. šuhò),
    zapo (žaba),
    zitu (žito: čak. zyto),
    zoritu (do-zoriti)
    xilo (šilo-špica)
    askatu (tražiti: čakavsko-kajkavski: iskati) ask
    jabetu (silovati: cak. yebát),
    makilatu (batinati: č.-k. makljati),
    ufatu (nadati se: č.-k. ufati),
    xilatu (ušiljiti: č. šilat),
    zabartu (zaboraviti: č.-k. zabiti),
    zuri (suri-sivi)
    broska (kupus: bodul. broskva),
    gorahani (planinski: bodul. gorynji),
    kraska (krš-kamenjara: bod. kràsa),
    makilada (batinanje: bod. makjâda),
    tripontzi (iznutrice-fileki: bod. trìpice),
    triskatu (mlatiti-lupati: bod. triskàt),
    xirula (sviraljka: bod. šurla),
    zoritze (zrelost: bod. zrilòšt)
    gòrika (odozgo: bask. goraki),
    gorÿnji (planinski: bas. gorahani),
    kargát (tovariti: bas. kargatu),
    makjàt (batinati: bas. makilatu),
    pùska (općina: bas. puska),
    seupàt (nadati se: bas. ufatu),
    šilàt (naoštriti: bas. xilatu),
    šuhàt (osušiti: bas. xukatu),
    šurìt (posijediti: bas. zuritu),
    šušûr (šapat: bas. xuxurla),
    šušurlàt (šaptati: bas. xuxurlatu),
    triskàvac (gromovnik: bas. triskantza),
    ûla (pčela: bas. uli),
    ulîšće (košnica: bas. ulitxa),
    ulenjâk (saće-pčelinjak: bas. ulixetan),
    ûša (guska: bas. usoa),
    ÿšt-yâl (jesti-jeo: bas. jasta-jan),
    zilokân (nagrižen-rastočen: bas. ziloka),
    zilokàt (rastočiti-nagristi: zilokatu),
    zincàt (cvrčati-zujati: bas. zintzatu)

    1. Is that the list of Basque-like words in Serbocroat I told you I'd be interested in? I have yet to double check meanings and a handful are obvious Latin borrowings (prestatu, tripontzi) but it does look promising, thank you.

    2. Checked all the translations I could find (most couldn't). This is my diagnostic:

      Plausible cognates:
      jasta = to eat = [eu]jan
      žaba = frog, [eu]zapo = toad (variant?, more commonly "apo"). Actually Ibero-romance "sapo" (toad) is closer and has unknown origins, which could well be pre-IE (Iberian is speculated but unattested).
      žito = wheat (Czech: rye), [eu]zitu = cereal, harvest (very interesting one!)
      triskàt = bang, slam, [eu]triskatu (to smash, break up; to dance), triska (clacking the fingers while dancing, jump, dance). Very unclear; notice the unnatural TR cluster, suggesting a possible *tiriska root.
      šurla = (elephant's) trunk, [eu]xirula = flute. Possible if it evolved from originally meaning an air instrument like a flute or trumpet, as happens with Spanish "trompa" (rel. trumpet, trombone, etc. but also meaning elephant's trunk).

      Problem words:
      gora = mountain. The problem is that it is proto-Slavic (and has relations with Indo-Aryan); it could still be related to [eu]gora (up, upwards) but that is obscure and complicated. Much clearer to me is gore = up, upwards, which is exclusive to Serbo-Croat.

      False cognates:
      kuća = house =/= [eu]kutxa (box). Most likely the Serbocroat word comes from Lat. casa = house or is related to it via some other language like Illyrian or Celtic (?)
      prestati = to stop, cease =/= prestatu (to prepare, get ready), prest (ready) ← [es/it]presto (ready) ← Lat. praestus
      pustiti = to let go, release =/= puztatu (to squish), probably more related to [en] "push", both surely from French pousser or other similar Romance, ultimately from Lat. pulsare
      iskati = ask, seek, search =/= askatu (to free), aske (free), askatasun (freedom). The Serbocroat word is almost certainly IE and related to English "ask" instead.
      ufati = to hope =/= ufatu (to blow, to turn off), ufa (void, empty), uf! (oof!)
      zabiti = to pound, score, forget =/= zabartu (to behave lazily or depravedly), zabar (indolent, lazy)

      Notice anyhow that P/F are not genuine Basque phonemes but pronunciation variants of others (B→P→F typically) or IE loans, so if you see them in Basque words, it's good to search for a relative with more genuine phonetics. Similarly BR, KR, PR, TR, etc. are not natural Basque clusters: a vowel is missing between the two consonants. The only natural double consonant clusters in Basque are TS, TZ and TX (tsh = Spanish or English "ch").

      Can't find translations:
      usrati, suho, do-zoriti, šilo-špica, yebát, makljati, šilat, suri-sivi, broskva, kràsa, gorynji, makjâda, trìpice, zrilòšt, gòrika, gorÿnji, kargát (anyhow [eu]kargatu is Romance-derived: Sp. cargar, Lat. carricare), makjàt, pùska, seupàt, šilàt, šuhàt, šurìt, šušûr, šušurlàt, triskàvac, ûla, ulîšće, ulenjâk, ûša, ÿšt-yâl, zilokân, zilokàt, zincàt

      Anyhow, judging from the few ones I could find, I'd say that ~50% of the alleged cognates are plausible.

  10. This is just the Croatian list, and not the complete one. I am trying to find the complete Croatian list and the Serbian list.

  11. Nice info there.

    It sounds like you are already aware of this one:


    Btw, R1b1b2-M269 was found in Guanche remains as well.

    1. I was not aware of Busby's paper, so thank you very much, because they raise pretty much the same criticisms I did against Balaresque's extremist and prejudiced oversimplification, as well as against STR age estimates in general. It is a great paper that I really regret having been unaware of until now. Thank you very much.

      "Btw, R1b1b2-M269 was found in Guanche remains as well".

      Indeed: 3/30 (10%). Also I-M170 (2/30 = 7%), J1-M267 (5/30 = 17%, so it is pre-Islamic) and some K* (10%) and P* (3%) as well. However the first settlement of the Canary Islands seems to date from c. 1000 BCE, and the mummies are surely from much more recent times, so it is not too conclusive about deep antiquity unless we get other references. But it is a good lead, of course. [Ref. http://www.biomedcentral.com/content/pdf/1471-2148-9-181.pdf]
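
      The percentages follow directly from the quoted counts over the 30 typed remains. A minimal sketch of that arithmetic (the dictionary labels and the `percent` helper are just illustrative names; K* and P* are left out because only their percentages are quoted, not their counts):

```python
# Sample counts quoted in the comment, out of 30 typed remains.
# K* and P* are omitted: only their percentages are given, not their counts.
TOTAL = 30
counts = {"R1b-M269": 3, "I": 2, "J1-M267": 5}


def percent(count, total):
    """Share of the sample as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {name: percent(n, TOTAL) for name, n in counts.items()}
for name, share in shares.items():
    print(f"{name}: {share}%")
```

      With a sample this small each individual is worth about 3.3 percentage points, which is one more reason to treat these frequencies as rough indications.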

    2. Entry updated with both references. Thanks again.

  12. That Busby thing is pretty clear.

  13. If we use Occam's razor, the most plausible explanation:
    -For High frequencies subclades R1b but with a few subclades in the West
    -For The population vacuum of II thousand BC in Iberia W and other areas of Western Europe
    -For lactose tolerance gradient from N to S Europe
    -For Unetice-Tumulus-Urnfield-Late Bronze sequence
    -For The prevalence of maternal lineages H in the West
    is that there was a sudden expansion of settlers arrived from Central Europe during the Bronze Age who, thanks to their technological and military superiority, displaced most of the local male population and took the local women as wives, sometimes in a regime of polygamy (the reason for the predominance of non-Indo-European languages in pre-Roman Iberia), which would have helped natural selection for lactose tolerance among the surviving local male population. Later, the differences in R1b frequency between the regions of Spain would have been shaped by historical population movements (Phoenicians, Romans, Carthaginians, Arabs, Berbers, etc.), so that the original mixture from that bottleneck (R1b + H1) would remain more intact in remote, backward and inbred areas.

    1. "High frequencies subclades R1b but with a few subclades in the West".

      Don't think R1b, think R-L11 or R-S116 and R-U106. There's no more point of talking of R1b as there is of talking of P, K or F.

      "The population vacuum of II thousand BC in Iberia W and other areas of Western Europe"

      I don't see any "population vacuum". On the contrary: 2000 BCE was a very dynamic period.

      "lactose tolerance gradient from N to S Europe"

      There is no such cline. The European LP allele is concentrated in three focuses: Britain/Ireland, the Basque Country and parts of Scandinavia and dilutes outwards from them (→ map).


      "the most plausible explanation (...) is that there was a sudden expansion of settlers arrived from Central Europe during the Bronze Age".

      Obviously there were invasions (Celts, Romans) in the latest Bronze Age and Iron Age. That's undeniable on the basis of archaeology and historically known linguistic replacement.

      But we can see in the ancient mtDNA how Basques continue pretty much the same through those convoluted times, while probably Portuguese instead suffer dramatic change which I only dare to attribute to the Celtic invasion. This Celtic invasion from Central Europe (via NE and Central Iberia in a process that took centuries) seems to do exactly the opposite of what you say:
      → lowers mtDNA H frequencies, from c. 80% to c. 40%
      → probably also lowered LP and R1b frequencies

      Although I know that the data from that study of mine is quickly being overwhelmed by new data, the substance stands and we see:

      1. Unetice (East Germany): high JT+N(xR)
      2. Neo-/Chalcolithic Portuguese: very high H
      3. Modern Portuguese: roughly =(1+2)/X. Portuguese lose lots of H and gain JT+N(xR). (Basques instead remain mostly static, although a lesser increase in JT is also apparent.)

      My impression is that in Portugal there was around 50% of Unetice-like Celtic (and others?) immigration, which probably amounted to a massive genocide and contributed something like 50% of the modern gene pool... but with the exact opposite traits to those you claim:
      → dramatic reduction of mtDNA H, increase of "Neolithic" mtDNA lineages (JT and N(xR))
      → probably also meant reduction in LP, Rh⁻ and R1b frequencies but this is not directly documented AFAIK

    2. Erratum: "roughly =(1+2)/X". Not my best math formula. Actually it should be [IE blood]/x+[Neolithic blood]/y, just that the impact of the IE invasion seems (in mtDNA at least) so dramatic that x=y, which seems to imply (with the available data on hand) a massive genocide, i.e. a 50% replacement. However the Y-DNA is mostly Iberian, which is quite odd. The matter definitely needs more research but mtDNA-wise, the impression is that of a massive demic replacement from an Unetice-like or similar {low H & high "Neolithic" clades} source.
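
      The x=y reading can be checked with a simple two-source mixture model, after = (1 - m) × before + m × source, where m is the replacement fraction. What follows is only a sketch under strong assumptions (a single admixture pulse, no drift or selection afterwards); the function names are illustrative, and the 80% and 40% figures are the rough mtDNA H frequencies mentioned earlier in this thread, not fresh data.

```python
def implied_source_frequency(p_before, p_after, replacement):
    """Lineage frequency the incoming population must have had for the local
    frequency to move from p_before to p_after, given the replacement fraction."""
    if not 0 < replacement <= 1:
        raise ValueError("replacement must be in (0, 1]")
    return (p_after - (1 - replacement) * p_before) / replacement


def implied_replacement(p_before, p_after, p_source):
    """Replacement fraction needed to move from p_before to p_after
    when the incoming population carries the lineage at p_source."""
    if p_before == p_source:
        raise ValueError("source and local frequencies are identical")
    return (p_before - p_after) / (p_before - p_source)


# Rough figures from the thread: mtDNA H drops from c. 80% to c. 40%.
H_BEFORE, H_AFTER = 0.80, 0.40

# With a 50% replacement (x = y), how much H could the source have carried?
source_h = implied_source_frequency(H_BEFORE, H_AFTER, 0.5)

# Conversely, the replacement implied by a source with that much H.
replacement = implied_replacement(H_BEFORE, H_AFTER, source_h)

print(f"implied source H: {source_h:.2f}, implied replacement: {replacement:.2f}")
```

      Taken at face value, halving H with exactly 50% replacement requires a source with almost no H at all; any source that did carry some H would need a replacement above 50% to produce the same drop.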

  14. Hi Maju;
    Yesterday, by chance, I found this very interesting study: "The fine-scale genetic structure of the French population".
    I failed to email it to you, so I am posting it here.

    1. It is indeed a very interesting study, thank you very much Jean. Particularly because there is not much data coming from France (except some samples in wider contexts) and this is the first such autosomal analysis of the Hexagon I have ever seen.

      It is particularly notable that the most differentiated region is the SW (SO), and that is surely attributable to the Basque/Gascon difference. The region includes some non-Gascon areas however, like the Toulousain and Dordogne, and it is interesting in this regard that much of the overlap is with the Mediterranean (Provence-Languedoc-Corsica) region rather than with others, all of which makes good sense to me.

      While PC1 is very clear in this aspect, PC2 and PC3 are rather amorphous, indicating that the rest of France or "France proper" can well be considered a single population region. PC2 however does suggest that some areas (although impossible to discern on plain sight) deviate from the norm, notably: (1a) a GE (Grand Est?, NE) cluster, which may be rather German-like, (1b) a GO (Grand Ouest?, Mid-West) cluster, which probably indicates a Breton exception, which may be somewhat British-like, and (2) a more amorphous GE, GO and IDF (Paris), which deviates in the opposite direction but not in the Mediterranean nor Basque/Gascon one, so it may be a group of "purer proto-French" maybe. PC3 only shows a weak GE deviation that may again correspond to Bretons.

      I'm a bit surprised not to see any Corsican difference, really (the samples are all from people born in the 1930s, which means that Corsica was still not as heavily resettled as it is now). Probably this owes to the PCA not being able to differentiate such a small sample, and I would really like to see the Admixture graph for contrast and complementary info. Do you think it is in the supp. materials? I need a link to the main publication page.

    2. According to this: http://audesp.free.fr/Publications.html , it is a pre-print, so the supp. materials are probably not yet published anywhere.

      Anyhow, I'll write something on it later because I think it is very important to peek at the "French" genetic structure and understand it as much as possible for the comprehension of all-European genetics and origins.

    3. Most of the Corsicans living in metropolitan France are located in PACA (the Provence-Alpes-Côte d'Azur region) and the Paris area.
      The samples are from the "Three-City study" and the three cities involved are Bordeaux, Dijon and Montpellier. They are not part of PACA or the Paris area.
      I guess this is the main reason for the absence of Corsican samples.

      I am eager to read your article on this study...

    4. I'll try to write something by the weekend but I probably won't say anything I haven't said in the comments above, mostly because the data is what it is and, other than underlining the apparent regional or non-regional differences that appear in the PCA, there's little more to say without the fine detail that the supp. materials could maybe give.

  15. I am confirmed R1b1a2a1a2 S116 and have entire lists of country locations of genetic distances where my DNA is located, to include 10 villages where upwards of 92% of R1b1a2a1a2 exist in Spain et al, would you be able to ascertain migrations from this type of data? And would it be useful to your studies?

    btw - Thank you for your incredibly in-depth information!


    The terminal Y-DNA SNP marker for Shane _____________ is S116. Population studies to date have found that the S116 marker which defines Y-DNA Subclade R1b1a2a1a2 is found in the highest concentration in South/Western Gipuzkoa, Spain. The distribution of the S116 marker and its ancestors is as follows:

    South/Western Gipuzkoa, Spain 92.99% > Bizkaia, Spain 89.47% > Gipuzkoa, Spain 87.24% > Ireland South 87% > Western Bizkaia, Spain 84.21% > Central/Western Navarre, Spain 83.34% > Roncal, Spain 83.01% > North/Western Navarre, Spain 82.35%

    The studies were conducted by sampling the DNA of indigenous populations and determining the percentage of each indigenous population which is positive for the SNP marker S116 and its direct ancestors.

    Thank you!

    Colorado, USA

    1. R1b-S116 is the most common haplogroup in Western Europe. That's what we can know without further downstream testing. You probably want to try comparing STR haplotypes in commercial databases for possible distant relatives. If your surname is, say, English, then it probably comes from England, if Irish from Ireland, if Basque from the Basque Country, if Spanish from Spain, if French from France, if Italian from Italy, if German from Germany. It's not helping you much, I know, but that's what there is with private DNA genealogy testing, which can well be described, if not as "fraud", at least as making unrealistic promises. Anyhow, keep searching, you may find something.

  16. My data is juxtaposed against an incredibly large database, thus the defined subclade geo-distribution(s) and concentrations as illustrated. One would generally 'assume' such surname lineages, but as Celts migrated, the surname assumption might not be a probability but rather a possibility.

    However the surname Dáibhís:

    Mac Dháibhis → Ó Dáibhidh · Ó Dáibhis · Davis · Mac Dháibhidh (last name Davis):

    Would topically appear to be Welsh and or Irish. My subclade is of Proto Italo Celtic origin. More specifically the Lepontii Tribe. So my surname perplexes me as far as a surname origin and migratory evolution from known DNA archaeological ancestry.

    You can see my crux I hope. There's far too much historical information missing to assume I believe.



  18. Hi have any info on haplogroup Y16018 which is from D27?

    1. Hi John. Sorry for the belated reply but seems I missed your comment back in the day.

      I have to answer with a question however: what the heck are you talking about? To be more specific, where did you get that nomenclature from? Notice that there is not always a standard nomenclature, particularly if you're getting your data (and possibly mistyping it?) from private DNA testing companies. They may have some info on alleged novel lineages of their own but I generally wait till it is confirmed by academic research or something. Often private lineages are not perfectly identified and they may be just one among many hanging under the same DF27 node, so you would want to ask those who gave you the info and nomenclature first.

