You aren’t related to most of your ancestors.
It’s counterintuitive and shocking, but it’s true. The human genome is finite, and each parent passes along only half of their genetic material to their offspring. The number of ancestors doubles with every generation going backward, but the genome does not double in size to make room for them, so, absent inbreeding, the share of your genome that any one ancestor can contribute necessarily halves with every generation.
Parents pass along whole chromosomes, half of them in fact. Grandparents are represented by only parts of chromosomes, adding up to about 25% of all genes apiece. Great-grandparents pass along 12.5% on average, and their parents are down to 6.25%. This contribution keeps decreasing by half, so pretty quickly non-repeated ancestors have very small contributions.
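The halving is easy to tabulate. Here is a minimal sketch, assuming no inbreeding and treating each figure as an average expectation rather than an exact amount; the function name `expected_share` is just my own label for illustration.

```python
def expected_share(generations_back: int) -> float:
    """Expected fraction of your genome from ONE non-repeated ancestor.

    Assumes no inbreeding, so the ancestor sits on exactly one line.
    """
    return 0.5 ** generations_back


labels = ["parent", "grandparent", "great-grandparent", "great-great-grandparent"]
for generation, label in enumerate(labels, start=1):
    # Prints 50.00%, 25.00%, 12.50%, 6.25%, matching the figures above.
    print(f"{label}: {expected_share(generation):.2%}")
```

Push `generations_back` out to 10 and the expected share is already under a tenth of a percent.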
There are 3 billion base pairs in the human genome, which means that at most 3 billion ancestors could contribute to your DNA, if it were possible for each of them to give you but a single base pair. This is impossible because genetic material is not chopped so finely when it’s passed along: genes travel in paragraphs and chapters, not in words, and especially not a single letter at a time. But even if it were possible, it would give you only about 31 generations’ worth of ancestors. That would place us at the High Middle Ages (1000-1300 AD), when the total global population wasn’t above 1 billion people, let alone 3 billion, so all human pedigrees experience pedigree collapse.
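The generation count comes straight from a base-2 logarithm. A rough sketch of the arithmetic, using the round 3 billion figure from above and assuming every slot in the pedigree is a distinct person (which is exactly the assumption that pedigree collapse breaks):

```python
import math

GENOME_BASE_PAIRS = 3_000_000_000  # round figure used in the post

# Number of doublings needed to reach 3 billion ancestral slots.
generations = math.log2(GENOME_BASE_PAIRS)

# First generation whose slot count (2**g) exceeds the number of base pairs.
first_exceeding = next(g for g in range(1, 64) if 2 ** g > GENOME_BASE_PAIRS)

print(f"log2(3 billion) is about {generations:.1f}")
print(f"2**{first_exceeding} = {2 ** first_exceeding:,} slots")
```

Generation 31 still fits (about 2.1 billion slots); generation 32 does not (about 4.3 billion).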
This question came up in a recent conversation with Dave from Prickeared, and I went searching for an answer on just how beefy the chunks of DNA are when they are passed along and what we could estimate using this information to assess when ancestors start dropping off as actual genetic contributors. John Hawks posts an answer:
In practice, even though we have billions of nucleotides, our DNA cannot follow billions of genealogical lines. Recombination over 30 — 40 generations does not divide chromosomes down to individual nucleotides. In the medium term, most human DNA is separated by recombination hotspots into lengths of around 50 kilobases. Across very short spans of 30 generations, DNA is for the most part inherited in chunks of hundreds of kilobases or longer. So dividing six billion nucleotides by 50 kilobases yields a number of around 120,000 ancestral lines at most from which any individual inherits his or her DNA. Recombination will increase this number somewhat further and further back in time, but not nearly so fast as the doubling of possible ancestral lines in every generation. This means that the vast majority of your ancestral lines more than around 17 generations ago have left no DNA to you whatsoever.
Hawks uses 6 billion nucleotides instead of 3 billion base pairs. Because nucleotide pairing is deterministic (G always pairs with C, and A with T), the two nucleotides in a pair cannot be inherited independently. This difference in mental model doesn’t change the result, because he treats “kilobases” as the base unit and counts individual bases rather than base pairs; you’d get the same result by dividing 3 billion nucleotide pairs by 25 kilo-pairs.
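Hawks’s numbers can be checked in a few lines. This is only a sketch of the back-of-the-envelope arithmetic, using his round figures; the helper `max_fraction_contributing` is my own hypothetical name, and it gives an upper bound, not a measured value.

```python
import math

NUCLEOTIDES = 6_000_000_000   # Hawks's figure (both strands counted)
CHUNK_SIZE = 50_000           # about 50 kilobases between hotspots

MAX_LINES = NUCLEOTIDES // CHUNK_SIZE        # 120,000 ancestral lines at most
SAME_IN_PAIRS = 3_000_000_000 // 25_000      # the base-pair version, also 120,000

# Generation at which the number of pedigree slots catches up with MAX_LINES.
crossover = math.log2(MAX_LINES)             # a bit under 17


def max_fraction_contributing(generations_back: int) -> float:
    """Upper bound on the share of pedigree slots that can leave any DNA."""
    return min(1.0, MAX_LINES / 2 ** generations_back)


print(MAX_LINES, SAME_IN_PAIRS, round(crossover, 2))
for g in (10, 17, 20, 30):
    print(g, f"{max_fraction_contributing(g):.4%}")
```

At 17 generations there are 131,072 slots for at most 120,000 chunks, which is where the "around 17 generations" in the quote comes from.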
In humans, 17-30 generations takes us back much further than we can hope to document fully, but that’s not the case in dogs. I can trace my dogs back this many generations, and the very first Border Collies in the ISDS stud book can be reached, coincidentally, between 15 and 30 generations back. Pedigree collapse in dogs, though, is much more significant than in humans: those foundational dogs don’t appear on only one line; the pedigrees go back to them thousands of times.
For example, Dublin goes back to Old Hemp in between 14 and 64 generations, and 2.5% of his DNA is expected to be the same as Hemp’s. This is because there are over 1.8 million paths from Dublin back to Old Hemp instead of just one. All of this in just over 100 years.
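To show how many paths add up to one expected percentage, here is a toy sketch. The pedigree below is entirely made up (it is not Dublin’s), and the method is the standard one of summing one half to the power of the path length over every path back to the ancestor; I am assuming that is the kind of calculation behind figures like the 2.5% above.

```python
from functools import lru_cache

# Hypothetical toy pedigree: dog -> (sire, dam). Dogs missing from the
# dict are treated as founders with unknown parents.
PEDIGREE = {
    "pup": ("sire", "dam"),
    "sire": ("founder", "a"),
    "dam": ("b", "c"),
    "b": ("founder", "d"),
}
ANCESTOR = "founder"


@lru_cache(maxsize=None)
def paths(dog: str) -> int:
    """Number of distinct pedigree paths from dog back to ANCESTOR."""
    if dog == ANCESTOR:
        return 1
    return sum(paths(parent) for parent in PEDIGREE.get(dog, ()))


@lru_cache(maxsize=None)
def share(dog: str) -> float:
    """Expected fraction of dog's DNA from ANCESTOR, halving at each step."""
    if dog == ANCESTOR:
        return 1.0
    return sum(0.5 * share(parent) for parent in PEDIGREE.get(dog, ()))


# Two paths (lengths 2 and 3): 0.25 + 0.125 = 0.375
print(paths("pup"), share("pup"))
```

The caching matters on a real pedigree: with millions of paths you only want to visit each dog once, not once per path.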
* * *
Interesting, but I’m confused. How do hot spots and ~50 kilobase pair chunks relate to alleles and genes? Of the ~50k base pairs, how much is junk?
Also, the significance of pedigree collapse eludes me. I’d think that, statistically at least, drop out of some ancestors has no effect on the genotype . . . so long as the drop out is random (Law of Large Numbers).
Think of a roll of two-ply toilet paper. Individual sheets are genes, made of two pieces of paper, and each of those is an allele. The dotted lines between sheets are the hot spots… places where the chain of genes is easy to break (and recombine), so you’re swapping groups of genes instead of breaking the genes on the ends of these chunks in half or something. That wouldn’t be nice.
Pedigree collapse guarantees inbreeding, which in the case of disappearing ancestors means that it takes more generations for inbred-upon ancestors to disappear. Disappearing ancestors is much like genetic drift where some alleles disappear from a population while others become fixed.
I love the toilet paper analogy, but it doesn’t address the source of confusion. Are you saying the dotted lines are both the hot spots AND the place where genes (two sheets together) and alleles (on the single sheet) separate? Or are some alleles/genes bound together on the same square of TP? In my very crude understanding of genetics, I understand that the latter is, in many cases, true within the MHC. No?
As for the disappearance, I’d guess the probability of disappearance is proportional to the fractional contribution . . . so a dog with 1/1024’th fractional contribution is 8 times more likely to be dropped than one with 8/1024’th fractional contribution. But this dog is still represented in the population, despite being collapsed out of 7/8th of the pedigrees. As I understand probability and statistics, that matters at the phenotype level, but not at the genotype level (unless the population is small).
Yes, the dotted lines are the points of separation between groups of genes, and those points are called “hot spots.” Some genes ARE “bound” together in clumps; they very often travel together.
This is a decent explanation:
http://www.jax.org/news/archives/2010/hotspot.html
I do believe you are correct that the MHC has large, contiguous regions on the same chromosome, meaning that it often travels together in a big clump. That is probably why allergies and autoimmune disorders are so heritable and so often run in families.