Finding Cousins (DNA)


Autosomal DNA testing is the “family finding” DNA method.  In my case, as an adoptee, it’s a search for close relatives.  However, for anyone who wants to flesh out their family tree, this testing will identify cousins near and far based on genetic matches.  Should those cousins either post their family tree or otherwise be willing to discuss their heredity, then blanks can be filled in.  Add on the research tools from Ancestry.com and a number of other genealogical web sites, and quite specific details can be added (births, deaths, marriages, census data for occupations, draft registrations, obituaries, etc.).

To understand the challenge, let us begin with the 22 pairs of autosomal chromosomes that are passed from parents to their children, 50% from each parent, who in turn had 50% from each of their parents. Viewed another way, a child has roughly 25% of the DNA from each grandparent, or 12.5% from each great grandparent, and so on. So, somewhere in the past is an ancestor whose genetic material has been gradually diluted through the generations and mixed with that of others as each descendant had children with a partner. All of this provides a mathematical model for predicting a relationship based on the percentage of DNA shared with another person. That sounds easy, but the number of matching DNA segments, the total length of those segments (measured in centimorgans, cM), and the longest shared segment all factor into the algorithms that try to predict the number of generations from one cousin to a common ancestor with another. Still, the following chart shows the straightforward percentages of DNA that various relatives share with you.

[Chart: cousin tree with genetic kinship]
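The halving described above can be written out as a quick sketch. It assumes the idealized model in which each generation passes on exactly half, which real inheritance only approximates, and the function name is my own, purely for illustration.

```python
def expected_ancestor_share(generations_back: int) -> float:
    """Idealized fraction of your DNA that came from one ancestor
    who sits `generations_back` generations above you (1 = parent)."""
    if generations_back < 1:
        raise ValueError("generations_back must be 1 or greater")
    # Each generation halves the expected contribution.
    return 0.5 ** generations_back


for label, n in [("parent", 1), ("grandparent", 2), ("great grandparent", 3)]:
    print(f"{label}: {expected_ancestor_share(n):.2%}")
```

These are expected values only; the amount actually inherited from any one ancestor drifts away from them with each generation.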

Numerically, we would share 25% of our DNA with an aunt (or uncle), and less than 1% with a third cousin. That’s a tiny bit, so you have to account for laboratory margins of error as well, not to mention whose protocol is followed regarding the minimum segment length worth counting. That’s all done by the DNA company, and they do state their base criteria. So, let’s see. Ancestry.com has identified 6 persons who are estimated to be potential 3rd cousins, another 497 who are 4th cousins, and another 14,000 who are more distant. Only, as you may suspect, none of that is exact. A third cousin might be a second cousin or a fourth cousin, and any of those fourth cousins might be sixth cousins. It’s a game of probabilities, hampered by genetic mutations and by variation in how much DNA is actually passed on from each ancestor. Despite the simple explanatory math, the sharing does not come in the even, predictable increments the chart would indicate.

Family Tree DNA (FTDNA) hedges its bets more than Ancestry.com. It seems some people do not understand the uncertainties or get angry when the predicted relationship is demonstrably inaccurate. FTDNA indicates 115 matches regarded as 2nd-4th cousins (see how they avoid saying 3rd cousin?), and another 1900 as 3rd-5th or more distantly related cousins.

Databases vary as well. There are some people who have tested at both of these companies or who have shared results from Ancestry on the FTDNA website, as I did. There are also other testing sites (23andme.com, dna.land) as well as one unaffiliated with a commercial interest (gedmatch.com) where data can be uploaded by the user for comparison against whoever else chooses to do likewise. Across all these databases, that’s a lot of cousins. They’re all blood relatives; it’s just that the relationship is a mystery, even if one is not adopted. But it does get confusing when you find ancestors of cousins, in about the era where you’re hoping to find a common match, and the people live in: VA or NC (where I wish they all resided), TN, KY, IL, WI, WV, SC, GA, MS, TX, IN… All that just means that the relationship goes even farther back, to whichever sorry, undetermined bloke of an ancestor gathered the family in the wagon and headed west. Or south. Or north.

The third cousin statistic of 0.781% equates to 53.13 centimorgans (cM). I have zillions of matches who share about that amount. However, the variance around that total shared length is not a narrow one. For example, second cousins tend to share 212.5 cM, but in extreme cases can actually share as little as 47 cM or as much as 760 cM. So, the math is only helpful to a point.
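For the curious, here is a rough sketch of where those expected figures come from. It assumes a comparison total of 6,800 cM, a value I am inferring from the numbers above (53.13 cM at 0.781%) rather than one quoted from any testing company, and it uses the idealized rule that full nth cousins share (1/2)^(2n+1) of their DNA. The names are mine, for illustration only.

```python
# Assumption: total centimorgans compared, inferred from 53.13 cM at 0.781%.
TOTAL_CM = 6800.0


def cousin_fraction(degree: int) -> float:
    """Idealized fraction of DNA shared by full cousins of the given degree
    (1 = first cousins, 2 = second cousins, and so on)."""
    if degree < 1:
        raise ValueError("degree must be 1 or greater")
    return 0.5 ** (2 * degree + 1)


def expected_cm(degree: int, total_cm: float = TOTAL_CM) -> float:
    """Expected shared centimorgans under the idealized model."""
    return cousin_fraction(degree) * total_cm


for degree in (1, 2, 3, 4):
    print(f"cousin degree {degree}: {cousin_fraction(degree):.3%} "
          f"is about {expected_cm(degree):.2f} cM")
```

The sketch yields averages only. It says nothing about the wide spread around them, which is the real difficulty.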

Here’s a look at the exercise before me. DNA suggests that my closest match is a second cousin, once removed. The “removed” part of it means that the person is one generation closer to, or further from, our common ancestor than I am. I have determined that this cousin is older and therefore belongs in the right hand column of the chart below, rather than the adjacent one. That would mean that her great grandparents are my great great grandparents (which I’ll denote as ggg).

 

Easy! Up the ladder and right back down. Only… each generation has two parents who each have two parents… so that’s actually 4 pairs of potential ggg’s. This goes back to the 1800s, so it shouldn’t be any surprise that they were farmers. Based on census records, it seems they each averaged about 8 kids (because farms need farmhands), who each had 8 kids… and, it’s a nightmare of tracking people. If eight were assumed per generation over the three generations down to my parents, those 4 couples yield 2,048 possible parents. Oh, and then there’s that statistical margin of error which may actually place the relationship at least one generation further back, so multiply by 8 again. I would much prefer that a second cousin or closer just happen to take the DNA test and show up on my relative list.
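The arithmetic behind that headache, as a small sketch. It assumes a flat 8 children per family in every generation, which is my rough census-based guess and not a measured figure, and the function name is made up for illustration.

```python
def candidate_count(ancestor_pairs: int, children_per_family: int,
                    generations_down: int) -> int:
    """Number of descendants in the target generation, assuming every
    family in every generation has the same number of children."""
    return ancestor_pairs * children_per_family ** generations_down


# 4 candidate couples, 8 children each, three generations down to my parents.
print(candidate_count(4, 8, 3))

# The same estimate if the common ancestors sit one generation further back.
print(candidate_count(4, 8, 4))
```

The first call gives the 2,048 figure; the second shows what "multiply by 8 again" does to the search.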

It’s helpful that so many people have publicly viewable family trees, and my closest match, although her tree is not public, shared a good bit of information with me. That helps. It does not help, however, that my parents were most likely born in the 1940s. Census data is very helpful in showing members of households (as I’d like to pin down candidates based on my original birth surname and the county in which I was adopted), but the 1940 data is the most recent available, and census records are not made public until 72 years later. So, in 2022, I’ll get some more clues from the 1950 census, perhaps. The other difficulty is that the common convention in family trees is to remove the names of anyone still alive. Privacy. Makes sense. But it leaves (living) dead ends in the search, unless an obituary of an elder family member can be found, or relationships are noted on sites that catalogue cemetery tombstones. Either of these can lead to the names of surviving family members. Otherwise, there is a host of varied websites made for family research. And, generally, Google.

Thus begins the process of triangulation: looking at the family trees of related cousins in the hope of finding a common ancestor, usually done with a 3rd cousin or closer, as records are fairly good through the last 200 years. The DNA websites help as well. Not only will they tell you who your cousins are, but also which of those cousins share DNA with each other. You don’t know whether they link to your maternal or paternal side, but it’s a starting point if they happen to have family trees available. Given the masses from which to choose, this is at least a smarter approach to finding that ancestral starting point for the time-consuming process of finding descendants. This task isn’t hard, per se. It just consumes a lot of time, and it ends either in finding a person or in not finding one. If not, there is not much else to do except note the dead ends and hope for better results elsewhere.

There is also the “just ask them” approach.  Some will respond, and of those, some will respond helpfully.  But no one has the time to sift their own tree in your interest, really, so you have to present them with a starting point, such as surnames you’ve frequently come across. 
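One way to come up with that starting point is simply to tally which surnames turn up in more than one match's tree. The sketch below assumes you have typed the surnames in by hand; the match labels and surnames are placeholders, not real data, and the function name is hypothetical.

```python
from collections import Counter


def common_surnames(trees: dict[str, set[str]],
                    min_trees: int = 2) -> list[tuple[str, int]]:
    """Return (surname, tree count) pairs for surnames that appear in at
    least `min_trees` different trees, most frequent first."""
    counts: Counter[str] = Counter()
    for surnames in trees.values():
        # Normalize spacing and capitalization so each tree counts once.
        counts.update({name.strip().title() for name in surnames})
    return [(name, n) for name, n in counts.most_common() if n >= min_trees]


# Placeholder data: one set of ancestral surnames per DNA match.
example_trees = {
    "match_1": {"Surname A", "Surname B", "Surname C"},
    "match_2": {"surname b", "Surname D"},
    "match_3": {"Surname B", "Surname C"},
}

print(common_surnames(example_trees))
```

A shared surname is only a hint, of course. Common names will rise to the top whether or not they are related to you.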

So, if someone were to ask me, “Why aren’t you blogging more frequently?”, this is one of those amusements which consume my time. And movies. And games. And TV shows. And music. And work. And sleep. And a brewery visit.
