A new and quite interesting study finds strong support for Upper Paleolithic (~ LSA) Eurasian inflows into the Horn of Africa and confirms that most of the populations of that region are in essence an ancient mix of West Eurasian and African ancestries.
Jason A. Hodgson et al., Early Back-to-Africa Migration into the Horn of Africa. PLoS Genetics 2014. Open access [doi:10.1371/journal.pgen.1004393]
Genetic studies have identified substantial non-African admixture in the Horn of Africa (HOA). In the most recent genomic studies, this non-African ancestry has been attributed to admixture with Middle Eastern populations during the last few thousand years. However, mitochondrial and Y chromosome data are suggestive of earlier episodes of admixture. To investigate this further, we generated new genome-wide SNP data for a Yemeni population sample and merged these new data with published genome-wide genetic data from the HOA and a broad selection of surrounding populations. We used multidimensional scaling and ADMIXTURE methods in an exploratory data analysis to develop hypotheses on admixture and population structure in HOA populations. These analyses suggested that there might be distinct, differentiated African and non-African ancestries in the HOA. After partitioning the SNP data into African and non-African origin chromosome segments, we found support for a distinct African (Ethiopic) ancestry and a distinct non-African (Ethio-Somali) ancestry in HOA populations. The African Ethiopic ancestry is tightly restricted to HOA populations and likely represents an autochthonous HOA population. The non-African ancestry in the HOA, which is primarily attributed to a novel Ethio-Somali inferred ancestry component, is significantly differentiated from all neighboring non-African ancestries in North Africa, the Levant, and Arabia. The Ethio-Somali ancestry is found in all admixed HOA ethnic groups, shows little inter-individual variance within these ethnic groups, is estimated to have diverged from all other non-African ancestries by at least 23 ka, and does not carry the unique Arabian lactase persistence allele that arose about 4 ka. Taking into account published mitochondrial, Y chromosome, paleoclimate, and archaeological data, we find that the time of the Ethio-Somali back-to-Africa migration is most likely pre-agricultural.
The study runs three different formal admixture tests (f3, ALDER and D-statistics), as well as a ROLLOFF analysis, in order to confirm these findings. This part is quite technical, so I am not going to discuss it further. Feel free to explore the extensive supplemental materials.
I will instead focus on what I know better, ADMIXTURE and FST distances, which are easier to read visually and ultimately tell the same story.
|Figure 2. Population structure of Horn of Africa populations in a broad context.|
ADMIXTURE analysis reveals both well-established and novel ancestry components in HOA populations. We used a cross-validation procedure to estimate the best value for the parameter for the number of assigned ancestral populations (K) and found that values from 9 to 14 had the lowest and similar cross-validation errors (Figure S2). (A) The differences in inferred ancestry from K = 9–14 are most pronounced in the HOA for K = 10–12, where two ancestry components that are largely restricted to the HOA appear (the dark purple and dark green components). (B) Surface interpolation of the geographic distribution of eight inferred ancestry components that are relatively unchanging and common to the ADMIXTURE results from K = 10–12. (C) Individual ancestry estimation for HOA populations (with language groups indicated) and surface plots of the changing distributions of the Nilo-Saharan (light blue) and Arabian (brown) ancestry components for K = 10–12. At K = 11, a new HOA-specific ancestry component that we call Ethiopic appears (dark purple) and at K = 12 a second new ancestry component that we call Ethio-Somali (dark green) appears with its highest frequencies in the HOA.
Above we have the original presentation of ADMIXTURE results for K=10-12. It must be said that the cross-validation error is lowest (optimal) for K=12, but this value is only slightly smaller than the others in the K=9-14 range, which together form a plateau (fig. S2).
Therefore their use of K=10 and K=11 is justified, particularly because it is also interesting to switch off the HOA-specific components that reflect the ancient admixture, Ethiopic (Ari, Wolayta) and Ethio-Somali (Cushitic, Ethiopian Semitic), which is done by using K=10 instead of the optimal K=12.
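The plateau argument can be made concrete with a small sketch. The function below picks the K with the lowest cross-validation error and lists every K whose error is within a chosen tolerance of that minimum. Note that the function name, the tolerance and above all the error values are my own placeholders, invented only to mimic the shape described above; they are not the numbers from fig. S2.

```python
def best_k_and_plateau(cv_errors, tolerance):
    """Return (best_k, plateau).

    best_k  : the K with the lowest cross-validation error.
    plateau : sorted list of every K whose error is within
              `tolerance` of that minimum.
    """
    best_k = min(cv_errors, key=cv_errors.get)
    floor = cv_errors[best_k]
    plateau = sorted(k for k, err in cv_errors.items() if err - floor <= tolerance)
    return best_k, plateau


# HYPOTHETICAL placeholder errors, NOT taken from the paper (fig. S2).
hypothetical_cv = {
    8: 0.5600, 9: 0.5520, 10: 0.5510, 11: 0.5505,
    12: 0.5500, 13: 0.5508, 14: 0.5515, 15: 0.5580,
}

best_k, plateau = best_k_and_plateau(hypothetical_cv, tolerance=0.003)
print("lowest CV error at K =", best_k)
print("K values on the plateau:", plateau)
```

With a shape like this one, any K on the plateau is about as well supported as the nominal optimum, which is why looking at K=10 and K=11 alongside K=12 is reasonable.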
This issue is best perceived in the FST distances table (within text S1), which I include here with some convenient annotations:
The red-orange colored frames (as well as the red notes on the components) in the table above were added by me to better illustrate the meaning of these FST values:
- The red frames capture two groups of components with very low differences (<50): West Asia-Europe and West-East Africa.
- The dark orange frames indicate two other groups with quite low distances (<70): South-Central Asian and the West Eurasian core.
- The lighter orange frames indicate large clusters of middling distances (<125) of continental nature: Eurasian and African.
- Intercontinental FST scores are systematically larger: for example, European-West African is 176, while European-East African ("Nilo-Saharan") is 172, only slightly smaller.
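The banding I used for the frames can be written down as a tiny classifier. This is only a sketch of my own annotation scheme, not anything from the paper: the cut-offs (50, 70, 125) are the ones listed above, the function name `fst_band` is made up, and the only pairwise values included are the two quoted in the text. I am assuming the table reports FST multiplied by 1000.

```python
# Upper bounds of each annotation band, in the same units as the table
# (assumed here to be FST x 1000).
BANDS = [
    (50, "very low (red frame)"),
    (70, "quite low (dark orange frame)"),
    (125, "middling, continental (light orange frame)"),
]


def fst_band(fst):
    """Return the annotation band that a table FST value falls into."""
    for upper, label in BANDS:
        if fst < upper:
            return label
    return "intercontinental"


# The two pairwise values quoted in the text above.
quoted = {
    ("European", "West African"): 176,
    ("European", "East African (Nilo-Saharan)"): 172,
}

for (a, b), value in quoted.items():
    print(f"{a} / {b}: {value} -> {fst_band(value)}")
```

Both quoted pairs land well above the continental band, which is the pattern the admixed components then break.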
It is quite apparent that there are three components that overflow these continental boundaries:
- The so-called Maghrebi (North African) component has some extra affinity with the Ethiopic (Omotic) component, and vice versa. These two components otherwise fall within my approximate continental boxes, but they still show lower scores with all the components of the other "box". This is consistent with their nature as Afro-Eurasian admixed components, each with its own proportions.
- The Ethio-Somali (Cushitic?) component is actually more intermediate than the previous ones: although its strongest affiliation is towards Eurasia and particularly with the North African and Arabian components, it also shows strong affinity with the core African components (East and West African, i.e. Nilo-Saharan and Niger-Congo). This is consistent with the other evidence in this study that reveals it as an ancient Afro-Asian mix.
I must mention here that some of the labels used by the authors are not at all the ones I would have chosen. This is particularly true of the Nilo-Saharan (light blue) component, which peaks among the Sandawe (Aboriginal East Africans from central Tanzania, speaking a click language), the Anuak (Nilo-Saharan Ethiopians) and the Gumuz (other Ethiopians, of quite dubious Nilo-Saharan linguistic affiliation). Hence I prefer to call it East African or East African 1.
The authors conclude with the following remarks (emphasis mine):
We find that most of the non-African ancestry in the HOA can be assigned to a distinct non-African origin Ethio-Somali ancestry component, which is found at its highest frequencies in Cushitic and Semitic speaking HOA populations (Table 2, Figure 2). In addition to verifying that most HOA populations have substantial non-African ancestry, which is not controversial, we argue that the non-African origin Ethio-Somali ancestry in the HOA is most likely pre-agricultural. In combination with the genomic evidence for a pre-agricultural back-to-Africa migration into North Africa and inference of pre-agricultural migrations in and out-of-Africa from mitochondrial and Y chromosome data, these results contribute to a growing body of evidence for migrations of human populations in and out of Africa throughout prehistory and suggests that human hunter-gatherer populations were much more dynamic than commonly assumed.
We close with a provisional linguistic hypothesis. The proto-Afro-Asiatic speakers are thought to have lived either in the area of the Levant or in east/northeast Africa. Proponents of the Levantine origin of Afro-Asiatic tie the dispersal and differentiation of this language group to the development of agriculture in the Levant beginning around 12 ka. In the African-origins model, the original diversification of the Afro-Asiatic languages is pre-agricultural, with the source population living in the central Nile valley, the African Red Sea hills, or the HOA. In this model, later diversification and expansion within particular Afro-Asiatic language groups may be associated with agricultural expansions and transmissions, but the deep diversification of the group is pre-agricultural. We hypothesize that a population with substantial Ethio-Somali ancestry could be the proto-Afro-Asiatic speakers. A later migration of a subset of this population back to the Levant before 6 ka would account for a Levantine origin of the Semitic languages and the relatively even distribution of around 7% Ethio-Somali ancestry in all sampled Levantine populations (Table S6). Later migration from Arabia into the HOA beginning around 3 ka would explain the origin of the Ethiosemitic languages at this time, the presence of greater Arabian and Eurasian ancestry in the Semitic speaking populations of the HOA (Table 2, S6), and ROLLOFF/ALDER estimates of admixture in HOA populations between 1–5 ka (Table 1).
|K=12 detail for a fraction of the Horn of Africa and distribution of the four main components|