December 10, 2012

Romani autosomal genetics

French Gitanes (Roma)
CC by Fiore S. Barbato
A few days ago I mentioned the study by Rai et al. of Romani Y-DNA, which locates their origins with great certainty in the NW reaches of the Indian subcontinent, specifically among the lower castes. Now I must echo this other study, still in pre-publication stage, which deals with the autosomal genetics of the same European minority.

Priya Moorjani et al., Reconstructing Roma history from genome-wide data. arXiv 2012. Freely accessible [ref. arXiv:1212.1696]

The authors studied the nuclear genome of 27 Romani individuals from six populations of four European states: Hungary (three different populations), Romania, Slovakia and Spain. 

A reasonable complaint at this stage could be that the sample is small and, above all, too concentrated in one specific area: the Middle and Lower Danube region. But, well, let's assume that is not too important.

The authors appear to confirm the NW Indian ancestral affinities of the Roma; however, it seems obvious that they have been heavily admixed with Europeans since their migration a thousand years ago.

The tests performed in this regard find greater affinity to Romanians than to other Europeans, but no other Balkan nor West Asian peoples were tested for, so some question marks remain open. Certainly it is a bit puzzling that, with all the worldwide comparisons performed in this paper, not a single West Asian population was included.

There are hence some shortcomings in the sampling and analysis strategy (why compare with tropical Africans but not with Iranians, Turks, Egyptians or Arabs?) but the study still deserves a mention.

Principal component analysis:

STRUCTURE analysis:


  1. This is an interesting global dataset. Notice that the Maasai belong to an almost completely separate cluster; that is because no other population in the cline that exists between the Maasai and the Europeans is included in the global dataset. This is evident in the global PCA posted (note that the PCA you are showing here does not match the STRUCTURE run, since the PCA you are showing only includes Europeans, East Asians, South Asians and Roma).

    In any event, it shows what type of impact the non-sampling of clinal populations, like Arabs and North Africans in this case, has on the outcome of STRUCTURE/ADMIXTURE results. They show K2-K7 results also....

    1. ... "note that the PCA you are showing here does not match the STRUCTURE run, since the PCA you are showing only includes Europeans, East Asians, South Asians and Roma"...

      That was an intentional choice because I found the global comparisons frustratingly pointless when we already know that Roma are an Asia-Europe mix. The more distinct populations you put in a PCA the less useful it is, because there are only two dimensions to it after all.

      The PCA and the STRUCTURE run match each other reasonably well for the Eurasian portion. I am not even sure why the authors would include so many Africans and East Asians and instead totally ignore West Asia, which plays an important role in Roma history, or include so few South Asian populations.

      Of course, if your interest is the Maasai and not the Roma, then it's totally different.

      ... "notice that the Maasai belong to an almost completely separate cluster"...

      That also happens in other studies (Henn, Wagh); the differences are really very minor. They can indeed be caused by sampling strategies, but also by a different algorithm (notice that this study uses STRUCTURE instead of the usual ADMIXTURE) or even by random fluctuations.

      Actually, if my eyes are correct, we get a lower appearance of Eurasian admixture among the Maasai in the studies of Henn (K=6 and K=8) or Wagh (2% at K=6).

      Instead, Alkorta 2012, who also used STRUCTURE, got a greater appearance of Eurasian admixture, but at a K-depth where the Maasai/East African cluster was not defined yet. I can only imagine that you have this paper in mind.

    2. Wagh link is broken:

    3. I hate to sidetrack the subject of this post (which is about the Romani) any further, but the Wagh and Henn papers did not have an extensive global sample: the extremes were limited to West Asia and Europe, unlike this and the Alkorta paper, which went all the way to the extremes, i.e. East Asia + Amerinds.

    4. Don't worry about off-topic as long as it is interesting and not abusive.

      Anyhow, I do not think that there is any meaningful difference for Maasai alignment if they include East Asians or not, what matters is whether they include enough directly connected references which are mostly in Africa and (for the minority element) in West Eurasia.

    5. "I do not think that there is any meaningful difference for Maasai alignment if they include East Asians or not"

      Of course there is. I don't know if you forgot, but I already tried different Eurasian proxies on the all-African dataset, and in all cases East Africans have different outcomes in terms of composition. Also, it is not just the inclusion of East Asians that would change the composition of the Maasai but also what type of West Eurasian you use: using a French sample would have a drastically different component outcome on the Maasai than using a Bedouin one, for obvious reasons.....

      Lastly, if you pay close attention to the purple cluster in the Alkorta paper, notice its distribution and how it overflows well into Southern Europe.

    6. But what you did is very different: you replaced French or Palestinians (i.e. a representative of West Eurasia) by Japanese (a mostly unrelated population from the East).

      The Japanese can only be a proxy for West Eurasian admixture in Africa up to a point, and that's why the non-African component is reduced to about 1/3 when you use Japanese instead of West Eurasians.

      But here East and West Eurasians are both present and therefore you will not see any appearance (illusion) of admixture with East Asians at all.

      The key to your results is not presence (of Japanese) but absence or good representative of the source of admixture (West Eurasia).

    7. Erratum: "but absence or good representative" should read "but absence of good representatives".
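
The point argued in the thread above, that an apparent admixture fraction shrinks when the true source is replaced by a more distant proxy, can be illustrated with a toy calculation. This is only a sketch under loud assumptions: it is not what STRUCTURE or ADMIXTURE do, it uses a plain least-squares projection on simulated allele frequencies, and every name, drift value and the 30% mixing fraction are hypothetical choices made for illustration, not data from any of the papers mentioned.

```python
import random


def estimate_admixture(mixed, base, source):
    """Least-squares share of `source` ancestry in `mixed`, relative to `base`.

    Projects (mixed - base) onto (source - base) across all SNPs.
    """
    num = sum((m - b) * (s - b) for m, b, s in zip(mixed, base, source))
    den = sum((s - b) ** 2 for b, s in zip(base, source))
    return num / den


def simulate(n_snps=20000, alpha=0.3, seed=1):
    """Toy allele frequencies for four hypothetical populations.

    The drift sizes are arbitrary assumptions. Frequencies are not clipped
    to [0, 1], to keep the linear toy model exact.
    """
    rng = random.Random(seed)
    african, west, east, mixed = [], [], [], []
    for _ in range(n_snps):
        p0 = rng.uniform(0.3, 0.7)          # ancestral allele frequency
        p_af = p0 + rng.gauss(0, 0.05)      # drift on the African branch
        p_ooa = p0 + rng.gauss(0, 0.05)     # drift shared by all Eurasians
        p_we = p_ooa + rng.gauss(0, 0.03)   # West Eurasian private drift
        p_ea = p_ooa + rng.gauss(0, 0.08)   # East Asian private drift
        african.append(p_af)
        west.append(p_we)
        east.append(p_ea)
        # The admixed population truly has `alpha` West Eurasian ancestry.
        mixed.append((1 - alpha) * p_af + alpha * p_we)
    return african, west, east, mixed


african, west, east, mixed = simulate()
est_true_source = estimate_admixture(mixed, african, west)
est_proxy_source = estimate_admixture(mixed, african, east)
print("with West Eurasian reference:", round(est_true_source, 3))
print("with East Asian proxy only:  ", round(est_proxy_source, 3))
```

In this toy model the estimate with the true source recovers the simulated fraction, while the estimate with the proxy comes out lower, because the proxy shares only part of its drift with the real source. How much lower depends entirely on the assumed drift values, so the numbers here say nothing about the actual Maasai or Roma figures.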


