January 19, 2013

Y-DNA of Moldovans

Moldova is the easternmost generally recognized sovereign state in Europe speaking a Romance language. After achieving independence from the Soviet Union in 1991, Moldova saw part of its territory break away as the mostly unrecognized Republic of Transnistria (multiethnic, under military control of Russia). There was some talk about joining the related Republic of Romania but this plan seems to have been abandoned for now.

Alexander Varzari et al., Paleo-Balkan and Slavic Contributions to the Genetic Pool of Moldavians: Insights from the Y Chromosome. PLoS ONE 2013. Open access [doi:10.1371/journal.pone.0053731]


Moldova has a rich historical and cultural heritage, which may be reflected in the current genetic makeup of its population. To date, no comprehensive studies exist about the population genetic structure of modern Moldavians. To bridge this gap with respect to paternal lineages, we analyzed 37 binary and 17 multiallelic (STRs) polymorphisms on the non-recombining portion of the Y chromosome in 125 Moldavian males. In addition, 53 Ukrainians from eastern Moldova and 54 Romanians from the neighboring eastern Romania were typed using the same set of markers. In Moldavians, 19 Y chromosome haplogroups were identified, the most common being I-M423 (20.8%), R-M17* (17.6%), R-M458 (12.8%), E-v13 (8.8%), R-M269* and R-M412* (both 7.2%). In Romanians, 14 haplogroups were found including I-M423 (40.7%), R-M17* (16.7%), R-M405 (7.4%), E-v13 and R-M412* (both 5.6%). In Ukrainians, 13 haplogroups were identified including R-M17 (34.0%), I-M423 (20.8%), R-M269* (9.4%), N-M178, R-M458 and R-M73 (each 5.7%). Our results show that a significant majority of the Moldavian paternal gene pool belongs to eastern/central European and Balkan/eastern Mediterranean Y lineages. Phylogenetic and AMOVA analyses based on Y-STR loci also revealed that Moldavians are close to both eastern/central European and Balkan-Carpathian populations. The data correlate well with historical accounts and geographical location of the region and thus allow to hypothesize that extant Moldavian paternal genetic lineages arose from extensive recent admixture between genetically autochthonous populations of the Balkan-Carpathian zone and neighboring Slavic groups.
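With samples of 125, 54 and 53 men, the percentages quoted in the abstract carry sizable sampling uncertainty. The following is only a rough sketch, not part of the paper: it reconstructs the counts from the reported percentages (an assumption, since the abstract gives no raw counts) and computes a 95% Wilson score interval for a few of the frequencies. The function and variable names are illustrative.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# (sample size, {haplogroup: percentage reported in the abstract})
samples = {
    "Moldavians": (125, {"I-M423": 20.8, "R-M17*": 17.6, "R-M458": 12.8}),
    "Romanians": (54, {"I-M423": 40.7, "R-M17*": 16.7}),
    "Ukrainians": (53, {"I-M423": 20.8, "R-M17": 34.0}),
}

def intervals(samples):
    """Return (population, haplogroup, count, n, low, high) rows."""
    rows = []
    for pop, (n, freqs) in samples.items():
        for hg, pct in freqs.items():
            # Assumption: counts are recovered by rounding percentage * n.
            count = round(pct * n / 100)
            low, high = wilson_interval(count, n)
            rows.append((pop, hg, count, n, low, high))
    return rows

for pop, hg, count, n, low, high in intervals(samples):
    print(f"{pop:11s} {hg:7s} {count:3d}/{n:<3d} 95% CI {low:.1%} to {high:.1%}")
```

The intervals are wide for the two smaller samples, which is worth keeping in mind when comparing single haplogroup frequencies between populations.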

Most interesting is without doubt the list of haplogroups:

Table 2 - Karahasani is located in the south and Sofia in the north; the Romanian and Ukrainian samples are both from nearby regions (Romanian Moldavia and Transnistrian Ukrainians).

My notes (see ISOGG for nomenclature):
  • The high diversity of haplogroup I (also in nearby Romania and Ukraine), including I1-M253, I2a1b-M423 and I2a2-M223, is consistent with the wider region being, arguably, ancestral to this lineage. However "Low Germanic" I2b does not show up, and neither do "West Mediterranean" I2a1a or Anatolian-Caucasian I2a*.
  • Among Neolithic-specific inputs, which are particularly important in the Balkans, Moldovans show a notable (13%) presence of E1b1b1a1-M78 variants, especially the well-studied E1b1b1a1b-V13, now linked even by ancient DNA to European Neolithic flows. They also have some (2%) E1b1b1b2a-M123, an Eastern and NE African lineage found at low frequencies in Southern Europe.
  • Another clearly Neolithic lineage is G2a-P15, found among Moldovans only at very low frequencies (more common in the context of the Mediterranean Neolithic it seems).
  • Not yet documented by aDNA but also likely Neolithic in Europe is haplogroup J, found in Europe mostly as J2 (originating in Highland West Asia) but also, mostly in the Balkans, as J1 (originating maybe in Palestine). Moldovans show both at low frequencies (4% each).
  • Almost all the rest belongs to the largest European clade, R, mostly its Eastern variant R1a1a-M17 (30%). Western R1b1a2-M269 makes up 16%. 
  • Minor clades are H (Romani), T (South and West Asian, with extensions into East Africa and, thinly, in Europe), N (NE European and North Asian) and Q (probably from West Asia).
In the PC analysis (fig. 2, not shown) Moldovans appear intermediate between Balkan and Central-Eastern European populations but rather leaning towards the latter.

In spite of their historical and ethno-linguistic connection Romanians and Moldavians do not appear to be particularly related in the genetic aspect:

The genetic relationship between Moldavians and Romanians deserves special attention, since these two groups speak practically the same language and share many cultural features. It is reasonable to assume that Moldavians and Romanians inherited genetic lineages, shared with other Balkan populations, from Vlachs who, in turn, received them from Paleo-Balkan tribes. However, Moldavians and Romanians do not form a cluster that would have separated them from the neighboring populations. Indeed, in the space of multi-dimensional scaling based on the RST distances between STR haplotypes, Romanian populations appeared scattered among the Balkan populations and did not cluster with the Moldavians (Figure 3). According to the AMOVA analysis, the degree of within-group differentiation among Moldavian and Romanian populations was significantly greater than genetic differences between either Romanians or Moldavians and the group comprised of the Balkan populations (Table 3). Moldavians and Romanians also appear dissimilar on the diagram of binary lineages (PC plot, Figure 2). Thus, sharing nearly the same language is not accompanied by specific genetic similarity between Moldavians and Romanians. Furthermore, Italian populations that share the Romance/Latin language with Moldavians and Romanians, show little genetic similarity with them. These results agree with previous genetic studies suggesting that the genetic landscape of southeast Europe had been formed long before the modern linguistic/ethnic landscape was shaped [16], [48].
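As a much cruder check than the paper's AMOVA and RST analyses, one can ask whether a single haplogroup frequency differs between the two samples by more than sampling noise would explain. The sketch below is not the authors' method: it runs a two-sided Fisher exact test on I-M423 in Moldavians versus Romanians, using counts reconstructed from the abstract's percentages (26 of 125 and 22 of 54, an assumption). The function name is illustrative.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Assumed counts: I-M423 carriers vs. all other lineages.
moldavians = (26, 125 - 26)
romanians = (22, 54 - 22)
p_value = fisher_exact_two_sided(*moldavians, *romanians)
print(f"I-M423, Moldavians vs Romanians: p = {p_value:.4f}")
```

A small p-value would only say that this one lineage differs in frequency between these particular samples; it says nothing about overall population structure, which is what the multi-locus analyses in the paper address.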

Instead the genetic affinities of Moldovans lean strongly towards their Slavic neighbors from Eastern and Central Europe:

In contrast to Romanians and most other Balkan populations, Moldavians show a clear genetic similarity to western and eastern Slavs. This is strongly implied by haplogroup R-M17, which dominates the paternal lineages of the Slavs and is broadly represented in Moldavians. (...)
The noteworthy domination of R-M17 chromosomes in Moldavians compared to Romanians is due to the R-M458 subclade. Haplogroup R-M458 likely has its roots in western/northern Poland, where it has its greatest modern concentration and microsatellite diversity [49].

This supports my impression that R1a1a1b1a1-M458 was not spread by Slavic migrations (it is very rare among Balkan Slavs but has a notable presence in Greek Macedonia instead) but much earlier, plausibly by Indo-European migrations, which had a major sub-center in Poland in the Chalcolithic period.


  1. A clarification should be made whether the Moldovan samples were Romanian speakers only, or rather a cross-section of Romanian/Russian speakers.

    It would also be interesting to compare this Y-DNA analysis to an mtDNA analysis on the same samples - this may explain the language affinity between Romanians and Moldovans.

    1. If you read the original study, it's clear that they are ethnic Moldovans:

      The sample set comprised self-designated Moldavians from the northern settlement of Sofia and southeastern settlement of Karahasani, as well as Ukrainians from the eastern village of Rashkov in Transnistria (Republic of Moldova) and Romanians from the towns of Piatra-Neamt and Buhusi from eastern Romania (Figure 1).

      The Moldovan samples are from the North and South of the Republic of Moldova, the Ukrainian sample is from Transnistria (North) and the Romanian samples are from the region of Moldavia, near the Carpathian mountains.

      "It would also be interesting to compare this Y-DNA analysis to an mtDNA analysis on the same samples - this may explain the language affinity between Romanians and Moldovans".

      No luck, the study is focused only on this Y-DNA issue. Moldovans are anyhow somewhat intermediate between Eastern Europe and the Balkans. Also, there often does not seem to be any particular correlation between genes and language. People can switch language almost as easily as they can switch clothes, political fealty or religion but they cannot change their genetics. As far as I know, the persistence of Romance in Romania is in itself a kind of "miraculous" matter, considering the country was part of the Roman Empire for less than two centuries (and only part of it for most of this time: West Wallachia, the Banat and Transylvania), and more so considering the many foreign invasions and the fact that all their Latin or Greek speaking neighbors switched language to Slavic, etc.

      All this probably owes not to any genetic cause but rather to ethno-political ones.

      One reason for these differences, considering that the end of the Chalcolithic and beginning of the Bronze Age seems to be the timeline of near-stabilization of the genetic landscape in other regions (Germany notably), could be that while the Eastern areas of the Cucuteni culture were conquered and plundered by early Indo-European invasions (various Kurgan-derived cultures), Romanian Moldavia was not, being one of the last areas where the Danubian-derived culture of Neolithic Central Europe survived (Foltesti culture). I'm not sure however, and it'd also be interesting to contrast Moldavians with Wallachians and Transylvanians, as well as different populations within these regions (maybe mountain Moldavians are more Transylvanian-like, no idea). But again this belongs to some other study, yet to be made.

    2. "Ethnic Moldovans" is a political term invented in the 20th century. At best, an "ethnic Moldovan" is a Vlah(Romanian)/Slavic mix. It's not clear if the study labels the Romanian/Vlah population as Moldovan, or rather those that speak "Moldovan" (another political invention to denominate the Romanian speakers - "Moldovan" and Romanian languages are virtually identical). To conclude, the study does not even mention the syntagm you used - "ethnic Moldovan" - less so, what that would mean. It could be that they used individuals of Moldovan citizenship, which in essence makes my point.
      Regarding language affinity, there is a well-rooted theory which holds that children in mixed marriages learn and propagate the mother's tongue. Many times the invasions led to interbreeding of male invaders with local women. Their sons obviously carried the father's Y chromosome and the mother's X chromosome. So what were they? Invaders, or locals? Obviously, any genetic research that delves into the topics addressed by this study is incomplete until mtDNA research on the same sample is conducted.
      My concern is that these studies are based on incomplete statistical samples (read "non-significant cross-samples") just to support certain political agendas.
      Regarding the aforementioned language continuity theory, it may explain why Romanians kept their Latin-based language - along with other reasons, which require a lot more study in order to better understand the "miraculous" persistence of Romanian language.
      I'll end with a metaphor: when pre-historic people saw the first solar eclipse they thought it was a miracle.

    3. From what I understand from the paper or what I can read in other sources like Wikipedia, the concept of ethnic Moldovan means a native Moldavian (Romanian) speaker from the Republic of Moldova (or nearby regions of Ukraine, or the diaspora), as opposed to other ethnic groups like Ukrainians, Gagauzes, Russians, Roma, etc. They make up 70% or 72% of the population of the Rep. of Moldova (a mere 32% in the secessionist Transnistria however).

      "the study does not even mention the syntagm you used - "ethnic Moldovan""...

      I'm pretty sure they mean that. The sampled individuals self-identified as "Moldovan" within the Republic of Moldova (not Transnistria nor Gagauzia, where the minorities are concentrated).

      In this map both Karahasani and Sofia are nearly 100% Moldovan, so there should be no doubt. At first I thought that you could know something I would not but it seems not to be the case: these people are self-defined ethnic Moldovans in nearly 100% ethnic Moldovan towns. There's no room for error.

      "My concern is that these studies are based on incomplete statistical samples"...

      Well, possibly so. It would have been interesting to have some more samples, as I said before.

      "... just to support certain political agendas".

      It would not be the first case but I do not see that clearly in this one. Certainly you are wrong in your doubts about the "Moldavianness" of the two samples, which are separated by a wide geographic stretch, so taken together they probably represent Moldovans as a whole well.


    4. ...

      You may want anyhow to take a look at fig. 3 (possibly a better choice of image than fig. 2, now that I think of it), where M-Sofia clusters (barely) with Balkan peoples (closest to Romanians-Buhusi) and M-Karahasani clusters (again borderline) with Central-East European peoples (closest to Slovakians?). It is a different way of analysis, where some level of distinctiveness between Romanians is also found, with Moldovans and Ro-Buhusi clustering rather towards Central-East Europe and other Romanians, including the sample from Piatra-Neamt, rather on the opposite side of the graph (between Serbians, Croatians and Macedonians).

      This may well mean that within the wider Romanian People there are some differences, with some groups being more strictly "Balkan" and some other groups (incl. Moldovans) less strictly so and tending to Central-East Europe instead. This kind of variability should not surprise anyone, at least it does not surprise me in the least, considering the troubled history and prehistory of the area, which you yourself acknowledge, and that ethno-linguistic identity is not necessarily linked to genetic homogeneity, at least in uniparental lineages, which may have different sources, often older than the ethnogenesis itself (in this case prior to the romanization of Dacia, for example).

      Notice that the genetic pool of ethnic Romanians may be more homogeneous in the nuclear (diploid) genome because people marry much more often within their ethnicity than with others. But the Y-DNA (and possibly also the mtDNA) is in most cases probably from times long before Romanians existed as an ethnicity, and surely even before Dacians. It should be obvious that at some point people living in what is now Romania and Moldova changed language to Latin (Vulgar Latin). But they were in essence the same people as when they spoke Dacian (regardless of a few possible colonists). The same probably happened before when they became Dacian-speakers and maybe several other times in the deeper past.

      The fact that they have dual (and locally different to some extent) Balkan and Eastern-Central European affinities fits well with the Neolithic and Chalcolithic prehistory of the region, with the main source being Balkan (Neolithic) and a secondary source being East-Central European (later Neolithic, Chalcolithic and Metal Ages). These differences may well be deeply rooted and, as you suggest, gender-biased. Sadly we do not have mtDNA data here to make any such fact-based deduction.

    5. Maju, your logic is reasonable - your choice of words may be sometimes inappropriate. The problem with your approach is that you make assumptions and extrapolations which don't hold from a pure formal logic perspective.

      The conclusion you're drawing is that PERHAPS the Moldovans have those genetic traits, when in fact all this study says is "the specific 125 Moldovan males have these specific genetic traits when compared to the specific 53 Ukrainians and specific 54 Romanians". This is basic statistics, when non-representative sampling is used.

      Unless you're "Moldovan" or Romanian, you simply can't know more than me on this subject. The fact that you think Dacians "changed" their language to Latin shows you're definitely neither.

      I don't want to get into a sterile debate, especially I don't want to counter argue statements like " I thought that you could know something I would not but it seems not to be the case" and "you are wrong in your doubts".

      I hope at least I gave you an angle that's worth reflecting on. I will continue to check your site - for which you deserve congratulations - with the hope that in the future more meaningful studies will emerge - I know it will take time, as genetic tests are expensive in most parts of the world and it also takes some educating and buy-in. Best wishes.

    6. Of course I'm not Romanian nor Moldavian. It's in my profile. But being something does not automatically grant anyone super-knowledge on that ethnicity's history. A lot of people have assumptions about their own ethnicity's genesis (or other aspects) that are incorrect. Daily first-hand street knowledge is no guarantee of knowledge of the past.

      So rather than claiming that I know a lot about Romania, which I do not, I am doubting (to some extent at least) your quality as an "authority" on the matter, especially in the historical and prehistorical aspects, and also about general population genetics.

      Living in, say, Bucharest surely grants you direct knowledge of its streets, shops, people, language and media, much more than I can get from reading online or my only brief stay in the transit area of the airport in the Ceaușescu era (delicious pastries but horrible ersatz chocolate). But it does not automatically grant you any super-knowledge on Dacians or Roman Dacia, or even the macabre details of the life of Vlad the Impaler. That depends on what you (or I or whoever) have studied/researched on the matter.

      I will admit that Dacians and Romanians are not my specialty but at least I know something about the Neolithic and Chalcolithic of the region, and I am sort of a little "authority" on population genetics, especially on European ones. Instead I do not see that you seem to know too much on anything, just that you have precautions and also some wrong ideas on how ethnogenesis happens (in most cases at least it's a matter of survival: genes and lives are more important than languages and banners, so people switch sides again and again; some don't, but martyrs don't usually leave a genetic legacy).

    7. "I don't want to get into a sterile debate"...

      Me neither: I love productive debates but sterile ones are not even worth the name "debate".

      One of the problems may be that what we could well call "the Balkan ideology" on nations is about the (largely false) idea of biological inheritance. I may not know Romania but I've spent some time in other parts of the Balkans (fractured Yugoslavia, also Greece) and this kind of thought was clearly behind the ideas of ethnic cleansing in Croatia and Bosnia and of the Slavic-Albanian conflict. It's probably not exclusive to the Balkans but it's certainly a bit more exaggerated over there. And it's also a very non-Latin way of thinking: surely the parts of Europe where that kind of ethnic ideology is weaker (by a lot) are the Latin countries (not sure about Romania but Italy, Spain, France... are all assimilationist and not genocidal in their ethnic ideology).

      In practical terms the genocidal idea of ethnogenesis is almost always wrong (with some exceptions, some obvious, others probably more obscure). Especially in the Metal Ages (including Antiquity and the Middle Ages) the issue was to obtain serfs or slaves for the aristocrats, not killing everyone (who would serve the victory banquet then?). So assimilation by elite dominance was the rule. That's why Balkan Slavs are absolutely closer to Romanians or Greeks than to other Slavs (and that's something that appears in many other studies, not just this one). There was never a mass migration of Slavs to the Balkans and genetics proves it: some warlords moved, maybe associated with localized real migrations, but the bulk of the people remained the same more or less.

      In modern times the "Germanic" idea of "essential peoples", "races", became more dominant in many parts of Europe but the reality is that Germans themselves are a complex mix rooted not in the Iron Age "migrations" but in much older times. What links them is a language and an idea of common identity rather than just blood ties, which in some cases may be closer to Polish, to French or to Danish, for example.

      "I hope at least I gave you an angle that's worth reflecting on".

      Sadly you have not given me any specific data but mostly subjective opinions. Opinions are legit but must be founded on something not just thoughts ("thin air" as they say). I hope you improve in this aspect and your future comments are more enriching.

      Sorry for being so blunt but guess what: it's "national character": "things clear and chocolate dense", they say. Or maybe it's just a poor excuse. Whatever.

      Anyhow, I'm glad that you found my site interesting in spite of all.

  2. I don't want to look as accusatory as @Cabirul, but if the study was meant to evaluate cross-border similarities, I believe it is somewhat preferable to consider the last few hundred years for accuracy, when sampling. What I mean is it is important to look into Romanian migration due to political circumstances. Therefore the Moldovan Romanians living in Piatra-Neamț might have the greater differences (as stated) from the ones sampled, because the core (of the perpetuated population of the Moldovan Principality at least) is not (historically) perpetuating in the studied region of the Republic of Moldova. The ones who were sampled in Rep. of Moldova are (most probably) descended from Romanian speakers in the Nistru (Dniester) basin in greater (also involuntary) genetic contact with the Slavic populations. To explain myself – sampling DNA from individuals with a strong lineage in the Center of Rep. of Moldova (heavily forested area in the past) might render a closer genetic relationship with the Romanians in the west since they were able to better isolate themselves from non-Romanians in comparison to the ones living in the Dniester basin.
    I am a Romanian from Rep. of Moldova and I acknowledge the fact that, overall, the eastern Romanians have more genetic similarities with eastern Slavs than other Romanians do; one could note the complexion and hair color differences. Yet at the same time, the population in Rep. of Moldova has a very complicated background, ranging from non-Romanian speakers with Romanian lineage to fervent Romanians/Moldovans with Turkic/Slav origin (shifts in less than a hundred years). Therefore self-identification should have been replaced by the lineage criterion. One way to simplify it would be surname study for the yDNA test. Here’s the tricky part – a lot of Romanian speakers in the Southern part of Rep. of Moldova have Gagauz/Turkic surnames, similarly some Northerners bear Romanized Slavic surnames. Regardless of the declared native language, surnames ending in any “u” (but avoid any “ncu”, “nco”), or modified surnames with Romanian meaning “ceban”, “cebanov” etc (shepherd = “cioban(u)”), “mocan(enco)” and other Soviet crimes against surnames, might qualify as samples (avoid “oglu” as Turkic). Please take into consideration that Sovietization led to a great number of individuals from the current ethnic “minorities” having both of their parents as native Romanian speakers. For the sake of history – a certain number of Romanian speakers were relocated from the left bank of the Dniester (between Dniester and Bug), which was never governed (in recorded history) by a Romanian state/authority, and they might have a strong genetic relationship with Slavs.
    Nevertheless I respect the study and I can still see valuable data there.

  3. What those above are trying to observe is that it isn't correct to assume that Moldovans are different from Romanians because they cluster closer to others, since that ignores a few important historical aspects. There is a large number of minorities in the Moldovan Republic - 30%, mostly placed there during the 200 years of Russian and Soviet occupation of the area, while at least a third of the indigenous Moldovan population has been moved all over the Soviet Union. There are also mixed marriages that occurred in the last 200 years of Russian occupation, facts that led to an obviously high R-M17, compared to the Moldovans from Romania.
    Anyone knowing the historical facts should have noticed that the highest haplogroup in Moldovans from the 3 areas is the Proto-European I-M423, which is in its place of origin.
    It is also clear that R-M458 isn't Slavic but Central Eastern European, predating Slavic migration, while R1b was also in the Indo-European mix of the area, predating Slavic and Celto-Germanic arrivals.
    It is correct to state that Moldovans are genetically close to West Balkan and Central East Europeans, rather than South and West Slavs, because the populations in the areas such as parts of Ukraine, Slovakia, ex Yugoslavia, Bulgaria and Macedonia, at least in the last 1000 to 3000 years, were Thracian and Geto-Dacian.
    As you well observed the large Slavic migration didn't in fact occur, but warlords moved along with their subjects and the enslaved populations from the Ukraine and South Poland areas, over a majority of indigenous people. In some cases even overlapping already assimilated Getae populations from the North and North West of Carpathian (such as 6th century Iranian White Croats over Carpi and Costoboci) resettled in NW Balkans over autochthons of same or similar origin. Such is the case of displaced Dacian Carpi, Costoboci, TyraGetae, ThyssaGetae, Samo-Getae, Daco-Bastarne, Celto-Dacians of Central East Europe, Getian-Illyrian tribes of West and North West Balkans, Balkan tribes of Daco-Moesian and Thracians, including the well known tribe of Bessi, recorded for centuries in Central Balkans, that also gave the name to what is today Moldovan Republic and is known by Romanians (and Moldovans alike) as region in the larger Moldova, under the name of Bessarabia = land of the Bessi. Tribe identified by all historical sources as Dacians, Vlachs and between themselves as Rumans.
    Populations stated above by the 5th century started to be named by foreigners as Vlachs, Vlachos, Volohi, Vlasi, Blachi, Lah, Olah, Ulak, Kara-Ulak, Mauro-Vlachs and as the newly settled populations advanced into their lands, withdrew mostly in the mountainous areas of the Carpathian Mountains and the Balkans.
    History and genetics come to confirm that current Moldovans are one of the indigenous populations that inhabited the area and the fact that a romance language continued to be spoken far north and east of the Roman-Dacian borders, as Daco-Romanian, as well as similar tongues spoken South of the Danube by Vlachs, the genetic similarity of mostly Proto-European and Neolithic Balkans (aside of the newly settled 15-30% Slavs), many cognates with other romance languages that aren't to be found in Latin, stand proof that Daco-Romanian is a Getian, Thracian language and Latin is also a Thracian Language.

