I've spent the last two weeks or so chewing on this pre-pub, and there comes a point when no more chewing seems useful. So let's discuss it as well as possible.
Clare Bycroft et al. Patterns of genetic differentiation and the footprints of historical migrations in the Iberian Peninsula. bioRxiv 2018 (pre-pub). DOI:10.1101/250191
The key finding is the clustering of the populations of the Iberian Peninsula, as in this map (locations for the Spanish state are precise; for Portugal they are unknown and placed at random, and the shading for Portugal is uniform across the whole country):
|Supp. Figure 1a|
The weirdest thing for me is that the Catalan-Alacant and Seville-León-Asturias clusters are strongly related in the cladogram. I'll discuss this below.
Another very weird feature is the presence of a group in Pontevedra province (Galicia) that is the most divergent of all, even more distinct than the Basques. It is composed of many small, highly endogamous subgroups. Honestly, I do not have any explanation for this at the moment.
External influences: mostly "French"
When modelled as a mixture of external populations, Iberians are mostly French (or something that approaches that label), although "mostly" varies from c. 60% in the West to c. 90% in Gipuzkoa. This pattern of "Frenchness" is reminiscent of the distribution of R1b-S116. Correlation is not causation, but it is still correlation, all the more so when R1b-S116 seems to stem from somewhere in France and to have arrived in the Peninsula at least as early as the Bronze Age (or maybe earlier but still undetected; terminus ante quem at Los Lagos, as discussed recently).
|Supp. Fig 5a|
The population most affected by this French influence is Basque1, which shows no significant contribution from any other source (only a very small one from Italy1 and a tiny one from North Morocco, see supp. fig. 7), but the authors say (supp. info.):
Notably, the Basque-centred cluster has a markedly different profile from the rest. Firstly, it has much lower, or zero contributions from donor groups that contribute to all other clusters: Italy, NorthMorocco, and WesternSahara, and a very large contribution of 91% (88-93) from France. Additionally, the model fit for this cluster is strikingly less good than that for the other clusters (Supplementary Figure 4d), suggesting that Basque-like DNA is less well captured by the mixture of donor groups in this data set. Specifically the Basque share even more DNA with the French group than predicted by their mixture representation, which might reflect, for example, that the DNA the Basque share with present-day French is only a subset of modern French ancestry. This pattern is seen for other Spanish groups also, but to a much lesser extent.
|Area that demands urgent genetic research|
So it seems we may be dealing with some sort of "paleo-French" rather than modern Indoeuropeanized French.
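To make the notions of a "mixture of donor groups" and of a poor "model fit" a bit more concrete, here is a minimal sketch in Python. It is not the pipeline used in the paper: the donor names (`donor_A`, `donor_B`, `donor_C`), the four-value profiles and the helper functions `mix` and `fit_mixture` are all hypothetical, invented only for illustration, and the fitting method (exponentiated-gradient descent on the squared error, which keeps the weights non-negative and summing to 1) is just one simple choice. The idea it illustrates is that a target profile is approximated by a weighted blend of donor profiles, and that a large leftover error means the target is poorly captured by the available donors.

```python
import math


def mix(weights, donors):
    """Weighted sum of donor profiles, coordinate by coordinate."""
    length = len(next(iter(donors.values())))
    return [
        sum(weights[name] * profile[i] for name, profile in donors.items())
        for i in range(length)
    ]


def fit_mixture(target, donors, steps=5000, eta=0.5):
    """Estimate mixture weights (non-negative, summing to 1) for `target`.

    Uses exponentiated-gradient descent on the squared error. Returns the
    fitted weights and the residual sum of squares (lower = better fit).
    """
    names = list(donors)
    weights = {name: 1.0 / len(names) for name in names}  # start uniform
    for _ in range(steps):
        residual = [p - t for p, t in zip(mix(weights, donors), target)]
        # Gradient of the squared error with respect to each weight.
        grads = {
            name: 2.0 * sum(r * x for r, x in zip(residual, donors[name]))
            for name in names
        }
        # Multiplicative update, then renormalise so the weights sum to 1.
        weights = {n: weights[n] * math.exp(-eta * grads[n]) for n in names}
        total = sum(weights.values())
        weights = {n: w / total for n, w in weights.items()}
    rss = sum((p - t) ** 2 for p, t in zip(mix(weights, donors), target))
    return weights, rss


# Toy donor profiles (invented numbers, NOT data from the paper).
donors = {
    "donor_A": [0.7, 0.1, 0.1, 0.1],
    "donor_B": [0.1, 0.7, 0.1, 0.1],
    "donor_C": [0.1, 0.1, 0.7, 0.1],
}

# A target that really is a blend of the donors: it should fit well.
true_weights = {"donor_A": 0.8, "donor_B": 0.15, "donor_C": 0.05}
well_captured = mix(true_weights, donors)

# A target dominated by something no donor has much of: it should fit badly.
poorly_captured = [0.1, 0.1, 0.1, 0.7]

good_weights, good_rss = fit_mixture(well_captured, donors)
bad_weights, bad_rss = fit_mixture(poorly_captured, donors)

print("well captured:  ", good_weights, good_rss)
print("poorly captured:", bad_weights, bad_rss)
```

In this toy setting the second target still gets weights that sum to 1, but its residual error stays large, which is the kind of signal the authors describe for the Basque-centred cluster: the best available blend of donors does not reproduce it well.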
All genetic roads lead to France, at least in Western Europe: it also happens in Great Britain and Ireland, and it is very apparent in the geographically sorted phylogeny of R1b-S116. And it is also in this area that we see the earliest signs of mitochondrial DNA "modernity": in Paternabidea (Navarre) and Gurgy (Burgundy), an area that demands much greater attention from genetic and archaeogenetic research than it has received to this day.
The other major contributors are: Italy (mostly Italy1), with peaks of c. 20% and influencing mostly the South and Center; North Morocco, with a peak of c. 10% in Portugal and a Western and Southern distribution; and Ireland, with a peak of c. 6% in Eastern Asturias and a broadly Western distribution.
|Italian contribution (Italy1)|
|North Morocco contribution|
What exactly do these contributor components mean? Hard to say, although part of the Italian and North Moroccan elements could well be related to historical episodes such as the Roman and Muslim conquests. But only partly so, because the North African element in Galicia just cannot be that high as the result of a Muslim conquest that was very limited in time there, nor should we expect that much "Muslim" or "Roman" input in the remote and largely ignored area of modern Portugal: there must be more ancient origins, probably dating to the Neolithic, Chalcolithic or Bronze Age.
|minor West Sahara contribution|
And in the case of the North African component we may have a guide in a minor West Saharan contribution (at right), which may well reflect an older and "purer" form of North Africanness and which is again concentrated in Portugal and Galicia, extending to parts of the Central Plateau, but does not affect the South, the area where we should expect most of the Muslim period's genetic influence.
We cannot trace a line in Portugal because of the uncertainty about the geographic origins of the samples, but we can do it within the boundaries of Spain, and that line suggests that the Muslim genetic influence could be intense in the Southern third, and maybe all the way to Zamora in the West, but should not be relevant in Galicia, Asturias or (inferred, uncertain) much of Portugal. In these areas the North African element is peculiar and looks older than the Emirate/Caliphate of Cordoba.
Speculating on the possible origins of the Iberian clusters
This part has given me a true headache. It is very hard to understand how these clusters formed and I will not pretend here that I have all the answers. The strangest thing of all is the affiliation of the Central-West and Eastern clusters.
Not the least of the problems is the highly implausible relation between Asturias-León and West Andalusia, which the authors seem to believe to be the product of historical colonization at the time of the Reconquista (13th century). That makes no sense whatsoever, because the Kingdom of Seville was never part of the barely autonomous Kingdom of León but an administrative division of Castile (of which León was by then just a dependency). We should thus see at least some important influence of the Central (yellow triangles) cluster, which is dominant in Valladolid, Madrid and even the city (but not the countryside) of Burgos, and we do not see anything like that.
One possibility is of course that the components, or some of them, are not that real, but I do not see any indication of that in the study, so, pending independent replication, I'll take them at face value.
So why then? I've been scratching my head until I could not think any further, I swear.
And this is my hypothesis, risky as it may be:
1. The essence of the split between the related Spanish components (excluding Galicians and Basques) and the Portuguese-Galician component could date to the Early Neolithic. When I mask in the components map the areas that were not affected, or only weakly affected, by the Earliest Neolithic, I get this:
... which seems to correspond oddly well to the first major split in the cladogram, between the Portuguese-Galician (purple) component and the rest.
2. The expansion inland may correlate with Chalcolithic and Bronze Age processes, which seem to have been hugely important everywhere, Iberia included. So I used the Bell Beaker map I copied from Harrison (see here) as the cutoff for another mask (radii are relative to the Bell Beaker density circles in the reference map):
If so, the split between the Central (yellow) and East (orange) groups (to which the brown, red and other groups are closely affiliated) could be related to this Bell Beaker period and the Bronze Age cultures derived from it. The yellow or Central component could originate in Los Millares (Almería province) and spread northwards to Ciempozuelos (Madrid province), and from there to other areas with the Cogotas I culture of the Bronze Age.
The Purple (Western) component should be somehow related to the Zambujal or Vila Nova de São Pedro (VNSP) culture of Portuguese Estremadura, and may have spread northwards to tin-rich Galicia with the Montelavar group, maybe already in the Bronze Age.
The mysterious Red (Central-West) component could be related to some colonization of that area from the Bell Beaker dense areas of Catalonia or the Denia district, or maybe even to an older colonization; it is hard to say. What I know of that area in late Prehistory is that it is ill-defined, partly for lack of research in the heavily farmed alluvial plain, and that it correlates with Southern Portugal, but not fully, always showing some character of its own until it becomes clearly distinct in the Tartessian period, already in the Iron Age. It is also clear that the so-called Silver Road runs straight through that cluster and that it was important, and increasingly so, in Late Prehistory, having both commercial and religious significance and being clearly the main path by which Phoenician influences penetrated inland, already in the proto-historical period.
While still held together with flimsy pins, this speculative Silver Road explanation seems to make much better sense than the Reconquista hypothesis that the paper appears to espouse, which I find nonsensical because the patterns observed are not what we would expect from it.
But of course it is always up to you to make up your own mind. I'm just offering some alternative considerations that make some sense to me but are by no means a well-finished theory either, just better than the simplistic historical interpretation, which does not fit the facts too well.