Sunday, October 21, 2012

Understanding Autosomal DNA Testing


       Good day, everyone. How is everyone doing? In this document, I am going to provide an introduction to autosomal DNA testing. Of the three basic genealogical DNA tests, the autosomal DNA test is by far the most popular. The purpose of this document is to provide a clear and easy understanding of an autosomal DNA test. Before taking an autosomal DNA test, one should understand the nature of DNA. Your DNA is you. It doesn't change. You can change your name, your address, and so on. With DNA, that's not the case. Your DNA will tell on you. In other words, an autosomal DNA test will reveal the truth. For some people, that could be good. For others, it could put them in an uncomfortable position. Secrets, which could be damaging, may inadvertently get revealed. In an age where it's simple to send off a DNA sample and get results back quickly, it's important to understand this. So please keep that in mind with a test of this nature.

Let's begin with two important basic principles.


Basic Principles
  1. The first principle is that when two or more people share or match significant regions of DNA, they share a common ancestor in their past. It is from that common ancestor that the shared DNA segments or regions are inherited. Since this is an autosomal DNA test, the common ancestor could be a male, a female, or a pair of ancestors such as one's parents.
  2. The second principle is that the more DNA you share with someone, the more closely you are related to that person. This means your shared common ancestor(s) lived in a more recent time. For example, a brother and sister's last common ancestor is their mother. On the other hand, two first cousins' last common ancestor would be their grandmother. As we are going to see, this is going to be important.
Science and Autosomal DNA Test Basics
An autosomal DNA test is a DNA test that is designed to discover and identify relatives and ancestors that are or were living within a genealogical time period. By genealogical, we mean within the last 100 to 300 years. The reason the test can only go back that far has to do with a natural process called recombination. (Recombination will be explained in a separate document.) There are three basic autosomal DNA tests on the market. The first is Family Finder, which is managed by Family Tree DNA. The second is DNA Relatives (previously called Relative Finder), which is managed by 23andMe. The third is AncestryDNA, which is managed by Ancestry.com.

Humans have 46 chromosomes. The first 44 chromosomes (22 pairs) are called the autosomes. An autosomal DNA test looks at these first 44 chromosomes. The test works by identifying linked DNA segments along any of the first 44 chromosomes. These linked DNA segments are then compared to those of other individuals. If two or more individuals share the same linked DNA segment, then they are declared a "match". The linked DNA segments are composed of DNA markers known as SNPs (single nucleotide polymorphisms, pronounced "snips").

Let's take a look.

DNA is composed of four bases called A, T, C, and G. A basic DNA segment would be something like this -> "CATG".  Now suppose a DNA sequence changes from CATG -> CATA. In this case, a base "G" changed to a base "A". This can happen if DNA copies itself and a mistake occurs. The base A is what is referred to as a SNP. SNPs are the foundation of an autosomal DNA test.
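
To make the idea concrete, here is a small Python sketch that finds where two short DNA sequences differ. The function name `find_snp_positions` is made up for this illustration, and comparing plain strings is only a toy picture; a real test reads SNPs at known positions rather than comparing letters like this.

```python
def find_snp_positions(original, changed):
    """Return (position, old_base, new_base) for every base that differs.

    Positions are counted from 0, so the fourth base is position 3.
    """
    if len(original) != len(changed):
        raise ValueError("sequences must be the same length")
    return [
        (i, old, new)
        for i, (old, new) in enumerate(zip(original, changed))
        if old != new
    ]


# The example from the text: CATG -> CATA.
print(find_snp_positions("CATG", "CATA"))  # [(3, 'G', 'A')]
```

The single difference reported at the last position is the G -> A change described above.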

Let's see why and how!!!!


Methodology 
An autosomal DNA test works by identifying a consecutive number of shared and linked SNPs that lie in a row on any of the 44 chromosomes. SNPs are powerful. They are used because they change very slowly over time. In other words, when you inherit your DNA from each of your parents, the SNPs are generally passed to you unchanged. Because of this slow change, when you and another person both share a number of SNPs on the same chromosome, then that DNA segment must have been inherited from a single source, a common ancestor.

Because each of us has two parents, we each receive a single SNP from our mother and a single SNP from our father, like this -> AT. This pair of SNPs is generally associated with a number called a Reference SNP ID. For example, rs1234 -> AT. The "rs1234" is the reference SNP ID. Now remember that of the 44 autosomal chromosomes you have, you get 22 chromosomes from your mother and 22 chromosomes from your father. Each chromosome you inherit from one parent sits in a pair with the matching chromosome from the other parent, with a SNP sitting at the same position on each chromosome in the pair. You can see this below.


Chromosome Pair Number 1

Chrom1-> CCCCCCCCCCCCCA
Chrom2-> AAAAAAAAAAAAAT


The reference SNP ID actually is used to reference the position of the two SNPs on each chromosome. That position will be the same. As you can see, the SNP pair -> AT sits at the end of each chromosome.

Moving forward with rs1234 -> AT, the SNP "A" could have come from your mother on, say, chromosome 1, and the SNP "T" could have come from your father on, say, chromosome 2. In other words, chromosome 1 = "A" & chromosome 2 = "T".

Let's put this in a table to make it easier to see.

Line  Ref ID  Chrom  Child  Mom  Dad
1     rs1234  1,2    AT     AG   CT

To make things easier, the SNP and chromosome the child received from the mother are listed first: the first letter in the Child column and the first chromosome number. The way to read the above table in respect to the child is "On line 1, we have a child with reference SNP ID of rs1234 on chromosome 1 with a SNP value of A and on chromosome 2 with a SNP value of T."

In other words, the child has received a SNP value of "A" from mom, and a SNP value of "T" from dad.

In this example, basically the child has inherited its chromosome 1 from its mom and chromosome 2 from its father. Remember that the child's parents also have two SNPs as well. Mom and Dad each inherited a SNP from their respective parents.

Now let's look a number of SNPs on each of the chromosomes.

Line  Ref ID  Chrom  Child  Mom  Dad
1     rs1234  1,2    AT     AG   CT
2     rs3454  1,2    TC     TG   CA
3     rs5674  1,2    CC     CG   CT
4     rs6745  1,2    TA     TG   AA
5     rs4688  1,2    GC     GT   CA

In the above table, we see a number of reference SNP IDs, 5 to be exact. A table like this yields a consecutive number of SNPs arranged on each of the child's unique chromosomes 1 and 2. For example, let's start at line 1 and go through line 5, recording the SNP values the child has received from its mother (the first letter of each pair in the Child column). Line 1 -> A, Line 2 -> T, Line 3 -> C, Line 4 -> T, Line 5 -> G.

This basically means chromosome 1 of the child has SNPs -> "ATCTG".

By the same token, from the father, chromosome 2 of the child has SNPs -> "TCCAC".
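
The same bookkeeping can be sketched in a few lines of Python. This is only an illustration using the five rows from the table; the function name `split_by_parent` is made up, and it assumes each row has exactly one way to assign the child's two letters to mom and dad, which happens to be true here but is not true in general.

```python
# Rows copied from the table above: (reference SNP ID, child, mom, dad).
ROWS = [
    ("rs1234", "AT", "AG", "CT"),
    ("rs3454", "TC", "TG", "CA"),
    ("rs5674", "CC", "CG", "CT"),
    ("rs6745", "TA", "TG", "AA"),
    ("rs4688", "GC", "GT", "CA"),
]


def split_by_parent(child, mom, dad):
    """Return (value_from_mom, value_from_dad) for one SNP.

    Tries both orderings of the child's two letters and keeps the first
    one where one letter is found in mom and the other in dad.
    If both orderings would work, the row is ambiguous; this toy
    version simply returns the first ordering.
    """
    first, second = child
    if first in mom and second in dad:
        return first, second
    if second in mom and first in dad:
        return second, first
    raise ValueError(f"{child} cannot come from {mom} and {dad}")


pairs = [split_by_parent(child, mom, dad) for _, child, mom, dad in ROWS]
from_mom = "".join(m for m, _ in pairs)
from_dad = "".join(d for _, d in pairs)

print(from_mom)  # ATCTG
print(from_dad)  # TCCAC
```

Note that the sketch needs both parents' results to sort the letters out.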

The child has received five SNPs from mom and five SNPs from dad. Our total is 10 SNPs from both parents. Tests like Family Finder or Relative Finder work with tables like the one shown above.

Now let's assume there is a woman named Alice Smith. Alice Smith has taken either the Family Finder or Relative Finder test. Alice Smith has a table similar to the one shown above.

Alice Smith has on her chromosome 1, the same sequence of SNPs -> "ATCTG".

An autosomal DNA test would then flag both the child and Alice Smith as a "match". In other words, both Alice and the child are related!!!  Both Alice and the child have inherited the sequence of SNPs from a single source, a common ancestor.
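
Here is a toy Python version of that comparison. It is a sketch only: the names `longest_shared_run` and `is_match` are invented for this example, and the threshold of 5 SNPs is chosen just to fit our tiny table. Real tests need far more SNPs in a row before declaring a match.

```python
def longest_shared_run(seq_a, seq_b):
    """Length of the longest run of consecutive positions where both
    sequences carry the same SNP value."""
    best = current = 0
    for a, b in zip(seq_a, seq_b):
        current = current + 1 if a == b else 0
        best = max(best, current)
    return best


def is_match(seq_a, seq_b, min_snps):
    """Declare a match when the shared run reaches the threshold."""
    return longest_shared_run(seq_a, seq_b) >= min_snps


child_chrom1 = "ATCTG"  # the child's SNPs from mom
alice_chrom1 = "ATCTG"  # Alice's SNPs on her chromosome 1

print(is_match(child_chrom1, alice_chrom1, min_snps=5))  # True
```

A single differing SNP in the middle would break the run into shorter pieces, and the toy threshold would no longer be met.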


Remember that the child inherited a total of 10 SNPs from both parents. Alice matched 5 out of the 10. In other words, Alice matched half of the total SNPs, the half that came from the mother's side. Because of this, an autosomal DNA test is also called a Half Identical By Descent (HIBD) or IBD test. This makes sense because a match (ignoring certain exceptions) is going to be related to you on only one side of your family. This means a match is going to have the same SNPs that either your mother or your father passed to you. That usually means half.

In addition, it was mentioned in the above example that the child inherited its chromosome 1 from its mother and its chromosome 2 from its father. In reality, an autosomal DNA test doesn't know which chromosome or SNP came from which parent. It has no way of knowing. The only thing the test can know is when two or more people have a number of "matching" SNPs. In order to know which side a match is on, you must test a parent, grandparent, or other close relative. If that ancestor or close relative matches as well, then you know which side of the family your match is on.
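
One simple way to picture matching without knowing which letter came from which parent is to ask, at each SNP, whether two people have at least one letter in common. The Python sketch below does that. It is my own simplified illustration, not any company's actual method; the function names are invented, and Alice's and the stranger's SNP pairs are hypothetical values made up for the example.

```python
def shares_a_letter(pair_a, pair_b):
    """True when two unordered SNP pairs have at least one value in common."""
    return bool(set(pair_a) & set(pair_b))


def half_matching_run(person_a, person_b):
    """Longest run of consecutive SNPs where the two people share a letter."""
    best = current = 0
    for pair_a, pair_b in zip(person_a, person_b):
        current = current + 1 if shares_a_letter(pair_a, pair_b) else 0
        best = max(best, current)
    return best


child = ["AT", "TC", "CC", "TA", "GC"]     # Child column of the table
alice = ["AG", "TT", "CA", "TG", "GG"]     # hypothetical pairs for Alice
stranger = ["GG", "TC", "AA", "CC", "GC"]  # hypothetical unrelated person

print(half_matching_run(child, alice))     # 5
print(half_matching_run(child, stranger))  # 1
```

Notice the comparison never needs to know whether a letter came from mom or dad, which is exactly why the test alone cannot tell you which side of the family a match is on.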

Also, we saw in the simple example that Alice matched the child on 5 SNPs. In reality, tests like Family Finder will declare a match only if two people have somewhere in the range of 500 to 700 SNPs in common. Of course, there are other factors, such as DNA segment length and noise, that an autosomal DNA test must consider as well. To make things easy, companies like FTDNA or 23andMe lump the numerous factors into a unit of measurement known as the centiMorgan.


CentiMorgan (cM) 
If there is anything to take from this document, then the centiMorgan is something you may want to focus on. Now the exact definition of the centiMorgan can be a little tricky and hard to understand. It requires knowing about recombination, and that's for another discussion. To make things easy to understand for everyone, let's look at the centiMorgan as a unit of measurement that represents DNA segment length, number of SNPs, and so on, all rolled into one. The centiMorgan basically gives us a way to compare apples to apples or oranges to oranges.

Based on current evidence and thinking, anything above 20cM is considered definitive evidence of common ancestry within a genealogical time frame. In other words, if you share more than 20cM with a person, then you are related to that person within a genealogical time frame. At FTDNA, the Family Finder test only reports matches above the 20cM level. Between 10cM and 20cM is considered probable evidence of common ancestry. 23andMe's Relative Finder appears to report above 7cM.

A good way to confirm if someone is related to you is to test multiple family members. This way you can know if low cM amounts such as 11cM or 7cM indicate a shared common ancestor.

From parent to around 2nd cousin once removed, there are a number of characteristic ranges of centiMorgans that ancestors and relatives will share with one another. For example, I personally share 3379cM of DNA with my mother, and 3362cM of DNA with my father. These amounts are fairly normal. Each represents about 50 percent of the studied SNPs across my autosomal chromosomes. If you do the math -> 3379 + 3362 = 6741. If we look at my mother's contribution -> 3379/6741 = 50.13%. My father's contribution -> 3362/6741 = 49.87%.

Since we all have four grandparents, we share about 25% of our DNA with a single grandparent. As an example, my paternal grandmother Juliette Turner shares 1763.32cM with me. If we do the math -> 1763.32/6740.46 = 26.16%. These numbers are fairly consistent. Here is an unofficial chart with all the cM listings.
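
If you want to check percentages like these yourself, a few lines of Python will do it. This is just the arithmetic from the text, rounded to two decimal places; the function name `percent_share` is made up for the example.

```python
def percent_share(shared_cm, total_cm):
    """Shared cM as a percentage of the total, rounded to 2 decimals."""
    return round(100 * shared_cm / total_cm, 2)


mother_cm = 3379
father_cm = 3362
total_cm = mother_cm + father_cm  # 6741

print(percent_share(mother_cm, total_cm))  # 50.13
print(percent_share(father_cm, total_cm))  # 49.87
print(percent_share(1763.32, 6740.46))     # 26.16 (paternal grandmother)
```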



It should also be mentioned that the centiMorgan numbers shown in this chart, starting at siblings down to cousin, represent FULL relatives. Full relatives share two of the same ancestors. Half relatives share a single parent, grandparent, ancestor, etc. This means that half relatives share half the amount of DNA that full relatives would share, so you would essentially take the cM numbers listed above and slice them in half. For example, you and your aunt should share roughly 1600cM to 1900cM of DNA. If it's discovered that you actually share, say, 700cM with your aunt, then your aunt is actually a half relative. This would mean your parent and your aunt are half siblings, sharing only one (not both) of their parents.
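
The "slice them in half" idea can be sketched in Python. Only the 1600cM to 1900cM aunt range comes from the text above; the function names and the "nearest midpoint" rule are my own simplification for illustration, not an established genetic standard.

```python
def half_range(full_range):
    """Halve a (low, high) cM range for the half-relative case."""
    low, high = full_range
    return (low / 2, high / 2)


def closer_to(shared_cm, full_range):
    """Return 'full' or 'half' depending on which range's midpoint
    the shared amount is nearer to (a crude rule of thumb)."""
    full_mid = sum(full_range) / 2
    half_mid = sum(half_range(full_range)) / 2
    if abs(shared_cm - full_mid) <= abs(shared_cm - half_mid):
        return "full"
    return "half"


AUNT_FULL_RANGE = (1600, 1900)  # full aunt, from the text

print(half_range(AUNT_FULL_RANGE))        # (800.0, 950.0)
print(closer_to(700, AUNT_FULL_RANGE))    # half
print(closer_to(1750, AUNT_FULL_RANGE))   # full
```

A real comparison should also look at other tested relatives before drawing a conclusion this sensitive.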

Sensitive information such as this is what an autosomal DNA test can reveal. Depending on the case, the family members may not have known that the aunt was a half relative. This is why tests of this nature should be firmly understood before being taken. The ramifications of newly discovered information such as this can be damaging.

Now let's look at one final property of an autosomal DNA test - coincidental matching!!!

Identity By State (What a coincidence!!!)
It's pretty clear that when two or more people share a significant amount of DNA, a relationship is revealed. That's the basic principle in all DNA tests. However, reality is not always as clear cut as that!! Within a population of people, two or more people may share amounts of DNA due to mere coincidence and chance. At very low levels of DNA (1cM for example), two or more people may randomly share DNA. Sometimes this can be attributed to the test itself. In this case, the term "noise" is used. The overall general term that is used is IBS.

     IBS stands for Identity By State. IBS is a term that refers to the matching of DNA via mere chance and coincidence and NOT common ancestry. In a population of people, two or more people will always match DNA via pure chance. IBS is what you want to eliminate from a DNA test. All DNA tests have to deal with IBS and take it into account.

     IBD stands for Identity By Descent. It refers to DNA inherited via common ancestry. IBD matches are real and that's what you want to focus on.

     Companies such as Family Tree DNA and 23andMe use thresholds to declare a match. The reason for this is so IBS matching can be eliminated. The problem is that at low cMs, there is no clear cut way of knowing what's actually IBS or IBD. A low cM amount such as 7cM or 8cM could be IBS (not real) or could be IBD (real). What is known is that the lower the cM amount, the more IBS comes into the picture.

     The current thinking and evidence show that anything greater than 20cM is definitive evidence of common ancestry. Between 10cM and 20cM is probable common ancestry, and lower than 10cM falls into the range of IBS.
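
As a quick rule-of-thumb helper, those ranges can be written as a small Python function. This is only a sketch of the thresholds described here; the function name is invented, and how to treat exactly 10cM or exactly 20cM is my own assumption (both are placed in the "probable" group).

```python
def classify_shared_cm(cm):
    """Sort a shared cM amount into the three ranges from the text.

    Assumption: exactly 10cM and exactly 20cM count as 'probable'.
    """
    if cm > 20:
        return "definitive common ancestry"
    if cm >= 10:
        return "probable common ancestry"
    return "may be IBS"


for amount in (21, 11, 7):
    print(amount, "->", classify_shared_cm(amount))
```

Remember that a low amount is not proof of chance matching; testing more family members is still the way to find out.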

     Well that's it for autosomal DNA testing. The important concept to remember is that autosomal DNA testing reflects relationships within a genealogical time frame.

As always, it has been a pleasure!!!!!!!!!!

Thanks
Steve




14 comments:

  1. All information on this blogger are enlightening Congratulations!

  2. Thanks Steve!! I've been trying to understand part of this by looking at a lot of websites that try to explain autosomal testing, and your's has been the easiest for me to follow. I'm sure I'm technically challenged regarding the autosomal DNA, so I appreciate your blog.

    I'm wondering if I'm thinking right? You said as follows: "In reality, tests like Family Finder will declare a match if there are in the range of 500 to 700 SNPs that a person has in common with another person. Of course there are other factors such DNA segment length, noise, and other factors that an autosomal DNA test must consider as well."

    Let's say I compare myself to two person's who show as second Cousins, and let's say they each have a segment over 10cM with adequate SNP's at Chromosome 5 and each have a segment over 10 cM with adequate SNP's at Chromosome 10, but the segments do not overlap. Am I right that the segments for the two should overlap for there to be a relationship between the 3 of us?

    Thanks!! Tom

    Replies
    1. Hi Tom. Sorry for the late response. All three could still share the same common ancestor even though all three don't overlap.

      The reason would be recombination, which chops up and shortens DNA segments. Triangulation is not a required property for common descent between 3 people.

      I have an example between my dad and two of his first cousins when you line both of them up against my dad's chromosomes.

      Hope this helps
      Steve

  3. Thanks! this is thoroughly enlightening

  4. Thanks, you simplified autosomal testing for me. The cM listing chart was very helpful.

  5. Wow your great! Thanks for posting those links in the group this is really helpful and clear I was about to give up on all this.

  6. Thanks so much for posting this, At least this was simple enough for a beginner to learn what they are doing which is me

  7. Would you be able to help me? My 1st cousin and I share 922cm with the longest being 90. Our mothers were half sisters, sharing the same mother. We want to know if we have the same father.

    Replies
    1. Hi chacha. Yes, first cousins sharing 922cM is consistent with full first cousins, which indicates the siblings are full siblings, meaning both had the same mother and father.

      So the answer to your question is yes - both sisters had the same father as well

      Hope that helps

      Steve

    2. Thank you Steve. What we're trying to figure out is if my cousin and I have the same father. I called a paternity DNA place and they said there's no way to tell which genes come from where since we're already genetically connected through our mothers. It's so complicated.
