For what they were... we are: On the plausible ages of human Y-DNA haplogroups

March 3, 2013

On the plausible ages of human Y-DNA haplogroups

When I discussed the new A00 Y-DNA lineage yesterday, you may have found it strange that I accepted the rather extreme age estimates, when I am usually very critical of "molecular clock" age estimates.

The reason is that my main criticism targets the all-too-common estimates (for Y-DNA) based on a handful of STR markers, which seem to be no more than a pointer and not the real stuff. The real stuff is in the SNPs instead, but the problem is that, right now, we know only a few of the many that must be lurking in the actual Y chromosomes that all men carry in their cells. As the exact number of SNPs in any given lineage is unknown, there is no way of counting them and thereby establishing the relative chronology of the Y-DNA phylogeny.

However, this will change soon, if it is not already doing so, because full chromosome sequencing gets cheaper every day, and that is what has made possible, for example, the 1000 Genomes Project. A few weeks ago, an open access paper by A. Van Geylesteen et al. emphasized this immediate future in which our knowledge of the Y-DNA tree will almost literally explode.

But the fact is that there is at least one person who has already been working in the open with that perspective in mind. Terry D. Robb, at his site mostly dedicated to haplogroup I1, published some months ago a (partial) Y-DNA tree of Humankind (update 10 or this PDF) that calculates relative haplogroup ages based on nucleotide differences between the Y-chromosome sequences of the 1000 Genomes Project.

It is still not rocket science but it is getting quite close. A problem may be that some haplogroups are too thinly represented (actually only R1b, O3 and E1b have samples large enough for us to be fairly safe about their internal structure). But while this may be a problem for coalescence ages (because key sublineages may not be represented), it should not be one at all for divergence ages, i.e. the age of the node where a haplogroup separates from its closest relatives. So we cannot be certain that the apparent relative coalescence age of, for example, haplogroup D (n=17) is correct, but we can be quite certain that the relative age of its divergence from its "brother" E at the DE node is good.

The other and main problem is calibration. And this is the only (albeit important) aspect where I disagree with Robb's method. He insists on scholastically using academic references from only the population genetics literature (which systematically produces too recent "ages") and ignores archaeological references altogether. Therefore I have taken his graph and modified that part as follows:

[Figure: Terry D. Robb's Y-DNA tree with the modified age calibration]

As you can see, I calibrated by equating age(CF) = 80,000 years ago. This is based on Petraglia 2007 and other materials that establish a modern human presence in South Asia from c. 80,000 BP, before and after the Toba ash layer (74 Ka BP). A minor doubt is whether that date would correspond better with Y-DNA F (which in my calibration above shows up right after the Toba event), but that would have made all ages even older (not really a problem for me, but I prefer this calibration).
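In practice this calibration is just a rescaling: every relative age in the tree is multiplied by the same factor, chosen so that the CF node lands at 80,000 years. Here is a minimal sketch in Python. The relative ages are hypothetical placeholders, not Robb's actual figures; they are chosen only so that the ratio between A0 and CF matches the ages quoted in this post.

```python
# Hypothetical relative node ages in arbitrary units (e.g. counted SNP differences).
# Placeholders for illustration only, NOT values taken from Robb's tree.
relative_age = {"A0": 1060.0, "CF": 320.0}


def calibrate(relative_age, anchor_node, anchor_years):
    """Scale all relative ages so that anchor_node falls at anchor_years."""
    years_per_unit = anchor_years / relative_age[anchor_node]
    return {node: units * years_per_unit for node, units in relative_age.items()}


# Anchor the CF node at 80,000 years, as in the calibration described above.
ages = calibrate(relative_age, "CF", 80_000)
print(ages)
```

Anchoring the same 80,000 years on a younger node such as F would give a larger number of years per unit, and therefore older ages throughout the tree.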

The result is somewhat shocking because it pushes the age of A0 to c. 265,000 years ago, making it effectively pre-Sapiens. Relatedly, it would push the age of A00 (assuming everything else in the Méndez paper is correct) to c. 450,000 years ago. But after the initial surprise... why not? After all, most of us have Neanderthal admixture and they must have diverged a million years ago or earlier. And some peoples even have minor Homo erectus admixture most likely, diverging some 1.8 Ma years ago probably.

So, well, these are my reasons and this tree by Terry D. Robb, with my own chronology as above, is probably the best and most realistic estimate around of Y-DNA haplogroup ages.

Enjoy.

19 comments:

Mike Keesey, March 3, 2013 at 10:28 PM
"And some peoples even have minor Homo erectus admixture most likely, diverging some 1.8 Ma years ago probably."

Eh???

Oh, are you referring to the hypothesis that Denisovans had some H. erectus admixture (notably in the mitochondrial lineage)? Did any of that survive into living humans? (I know the mitochondrial lineage didn't.)

kalupitero, March 3, 2013 at 10:36 PM
The 1000Genomes Project used 1x coverage. Usually, the sequencing is done at 30x coverage, keeping in mind I barely know what I'm talking about here. I think each "coverage" misses a lot of SNPs, maybe half of them, that's why 30x coverage is necessary to ensure reduction of errors to a minimum acceptable level. I presume 1x coverage is vastly cheaper than 30x coverage and is what allowed the Project to sequence a whopping 1000 chromosomes, while other studies struggle to fully sequence the 23 chromosomes of just 1 person. The conclusion is that the 1000Genomes Project must have missed a huge amount of SNPs. It would be best to select a few carefully chosen y-chromosomes and then do a proper 30x or 60x coverage of them.

Dienekes also made some calculations when the 1000Genomes data came out last year. Using a mutation rate of 1.25 x 10^-8 per 25-year generation, he got a date of 6400 years for the separation of R1b-U106 and R1b-P312. He picked the mutation rate from a study by Kong about mutation rate variation in fathers of different ages. It was an important study, because the just released study of y-dna A00 which prompted you to write this post also stated they used the Kong mutation rate. But looking anew at the Kong study, I think the mutation rate is actually slower than 1.25 x 10^-8 per 25-year generation. The study says "average father’s age of 29.7, the mutation rate is 1.20 × 10−8 per generation", which I believe would mean the 25-year generation rate would be just 1.00 x 10^-8. If true, Dienekes' original estimate of 6400 years would be more like 8000 years.
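
The arithmetic in this comment can be checked with a short sketch. It assumes, as the comment does, that the per-generation rate scales linearly with generation length and that an age estimate is inversely proportional to the assumed rate. Both are simplifications, and the function name is made up for illustration.

```python
def rescale_rate(rate_per_generation, observed_generation_years, target_generation_years):
    """Linearly rescale a per-generation mutation rate to another generation length."""
    return rate_per_generation * target_generation_years / observed_generation_years


# Rate quoted from the Kong study: 1.20e-8 per generation at a mean paternal age of 29.7.
kong_rate = 1.20e-8
rate_25 = rescale_rate(kong_rate, 29.7, 25.0)  # roughly 1.0e-8 per 25-year generation

# Re-derive the age estimate under the slower rate (age is proportional to 1 / rate).
old_rate, old_age = 1.25e-8, 6400
new_age = old_age * old_rate / rate_25

print(f"rate per 25-year generation: {rate_25:.3e}")
print(f"revised age estimate: {new_age:.0f} years")
```

With the unrounded rate the revised age comes out slightly under the 8000 years obtained from the rounded figure of 1.00 x 10^-8.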

kalupitero, March 3, 2013 at 10:55 PM
The A00 lineage is extremely easy to identify by haplotype, having many STR values that are almost "illegal" in all other haplogroups. Maju pointed out that it's only been found in 2 people, one from FTDNA where the existence of the A00 lineage was first detected and another from coastal Cameroon. I presume this other sample is from the smgf.org database, because there are in fact 9 samples there, all from Cameroon, who are unquestionably members of this clade. I'd like to know more about how they detected this other sample and where they got it from.

Finally, there's yet a 3rd sample that hasn't been noticed yet. It's from yhrd.org. The sample is from "France, Paris [French]". About half of these samples are North Africans and sub-Saharan Africans, so it makes perfect sense. So unfortunately, we have at least 3 occurrences, but only 1 of them has a precise regional location in Africa.

Regarding the extreme rarity of A00, after looking at what must have been about 15,000 sub-Saharan y-dna samples, from yhrd.org, smgf.org, and FTDNA, and finding only the above noted 3 occurrences (I still have to see what's the deal with those 9 Cameroonian samples: are they all relatives, are they all from a single village?), A00 has a frequency of 1 in 5,000! But actually, that's not so exceptional; I have seen many clades in Europe with a frequency of just 1 in 3,000 or less, such as the rarer R1a* haplogroups, for instance.

andrew, March 4, 2013 at 1:18 AM
A fairly accurate archaeologically calibrated chronological date of 265,000 years ago is close enough to the archaeologically earliest date for H. Sapiens to be plausible. A date of 450,000 years ago really points strongly to archaic introgression, which, if accurate, would be the only time ever that it has been documented in a uniparental line. Given the suggestion of archaic admixture from both 13,000-year-old possibly hybrid skeletal remains and from traces of possibly even a couple of archaic admixture episodes in Africa in autosomal genetics based on linkage disequilibrium analysis, one in the region where we have samples, and the other in Southern Africa, at about the same time frame, the case that this is introgression, while not rock solid, is fairly plausible.

Václav Hrdonka, March 4, 2013 at 2:53 PM
Can anyone tell me why the generation length is always 25 years? Did any of you make your own family tree? I have 10 generations in my family tree and the average generation is 33 years. You can say that I am strange, but please, take a look at the Confucius family tree: it has 83 generations, and its average generation length is also over 30 years. So my opinion is that the age of the Y chromosome tree is underestimated due to the underestimation of the average generation length.
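
The size of this effect can be sketched in a few lines. The sketch assumes an estimate expressed as a fixed number of generations, so the age in years is simply generations times generation length. The generation count is a hypothetical placeholder, not a figure from any study.

```python
def years_from_generations(generations, generation_years):
    """Convert an age counted in generations into years."""
    return generations * generation_years


generations = 1000  # hypothetical count, for illustration only

age_25 = years_from_generations(generations, 25)  # conventional 25-year generations
age_33 = years_from_generations(generations, 33)  # 33-year generations, as in the family tree above

# The same number of generations spans 33/25 = 1.32 times as many years.
ratio = age_33 / age_25
print(age_25, age_33, ratio)
```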

terryt, March 5, 2013 at 4:41 AM
"When a non-sapiens gene enters the modern human gene pool via an interspecies hybrid individual who has an archaic hominin father and a H. Sapiens mother, this is introgression"

I think something that apparently separated from the 'human' line a mere half million years ago can hardly be regarded as a separate 'species'. 'Subspecies' at most. And I mentioned in the other post that I think the modern human Y-DNA originated in West Africa in the first place. The A00 represents exactly what it looks like, an ancestral form of the modern Y-DNA.

Kristiina, March 10, 2013 at 6:14 PM
Terry, may I ask if you are the same Terry D. Robb who has been working on haplogroup I1 or just someone else with the same name? If you are the author, I would like to ask you a question, if you don't mind.