Thursday, September 12, 2013

Procedure For Finding Shared Matches Using Excel

Finding Shared Matches In Excel.

Good Day Everyone,

   I wanted to present a simple way to use Microsoft Excel to compare and find shared matches between two or more people. The first step requires you to have Excel 2007 or 2010 on your Windows machine. Then you need to go to the Countries of Ancestry page on 23andMe and grab the Ancestry Finder csv files for any two people.

Here is how you get the Ancestry Finder csv file for a single person.
a) Log in to your 23andMe account
b) Then at the top - My Results -> Ancestry Tools -> Countries Of Ancestry
c) On the Countries Of Ancestry page - click the drop-down menu and select the person. Scroll down the page and, at the bottom, double click the blue button that says - "Download.........Ancestry Finder File"
d) Save the csv file to your computer

Here is how you create the spreadsheet
a) Double click the csv file for each person. Excel opens it.
b) Then copy the column that lists the matches for one person into a new spreadsheet. Do the same for the other person. The result should be a single spreadsheet with at least two columns to compare.

Here is how to run VBA code to compare columns
1) Open the new Excel spreadsheet with the names of the matches of two or more people.

2) In the new Excel spreadsheet - press the ALT and F11 keys together. This opens the Visual Basic editor, where you can run code.

3) In the Visual Basic editor - on the toolbar - look for the small green arrow pointing to the right. It looks like a small green triangle. Click this green triangle.

4) This opens a small window. Give your script a name and click the Create button.

5) Erase the code in the window and replace it with this code:

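Sub FindSharedMatches()
    ' Ask which two columns to compare and where to write the shared names.
    Dim colA As String, colB As String, colOut As String
    Dim toBeCompared As Range, compareRange As Range
    Dim x As Range, y As Range
    Dim i As Long

    colA = InputBox("Enter Column Name to be Compared")
    colB = InputBox("Enter Column Name to Compare")
    colOut = InputBox("Enter Column Name to put the Result")
    If colA = "" Or colB = "" Or colOut = "" Then Exit Sub

    ' Each range runs from row 1 down to the last filled cell in its column.
    Set toBeCompared = Range(colA & "1", Cells(Rows.Count, colA).End(xlUp))
    Set compareRange = Range(colB & "1", Cells(Rows.Count, colB).End(xlUp))

    ' Copy every name found in both columns into the result column.
    i = 1
    For Each x In toBeCompared
        If Not IsEmpty(x.Value) Then
            For Each y In compareRange
                If x.Value = y.Value Then
                    Range(colOut & i).Value = x.Value
                    i = i + 1
                    Exit For ' one hit is enough; avoids writing the name twice
                End If
            Next y
        End If
    Next x
End Sub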

6) Press the small green triangle button again. This runs the code, and you will be prompted to enter the letters of the two columns that you want to compare and the letter of the column in which to place the results.

7) The result is that your spreadsheet will have a new column with the shared matches.
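
For anyone who would rather skip the macro, the same comparison can be sketched outside Excel. The Python below is only a rough sketch under assumptions: the column header "MatchName" and the sample names are made up for illustration, and a real Ancestry Finder file will have its own headers and many more columns.

```python
import csv
import io

def read_column(csv_text, column):
    """Return the non-empty values of one named column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row.get(column) or "").strip()
            for row in reader
            if (row.get(column) or "").strip()]

def shared_matches(first, second):
    """Names that appear in both lists, in the order they occur in `first`."""
    lookup = set(second)
    seen = set()
    shared = []
    for name in first:
        if name in lookup and name not in seen:
            shared.append(name)
            seen.add(name)
    return shared

# Made-up sample data standing in for two downloaded csv files.
# "MatchName" is a hypothetical header, not the real 23andMe column name.
person_a = "MatchName,Segments\nAlice Smith,2\nBob Jones,1\nCarol White,3\n"
person_b = "MatchName,Segments\nBob Jones,1\nDana Reed,1\nCarol White,2\n"

shared = shared_matches(read_column(person_a, "MatchName"),
                        read_column(person_b, "MatchName"))
print(shared)  # ['Bob Jones', 'Carol White']
```

Looking names up in a set means each name is checked once, instead of being compared against every row of the other column as the macro does. For a few hundred matches either approach is fast enough.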

Here is the URL with the instructions, starting at the line that says: "Find duplicate values in two columns with VBA code"


Thanks
Steve

Wednesday, May 29, 2013

Understanding Correlations and Debunking Misconceptions In DNA Genealogy

     Good Day Everyone - How is everyone doing? Fine, I hope. In this tutorial, we are going to get a firm grasp on the basics of genetics as they relate to DNA genealogy. The reason for a return to the basics is to debunk certain misconceptions that seem to be currently floating around in the general public concerning DNA Genealogy.  Debunking a misconception is important. It's important because when a consumer purchases a product, he or she has expectations that the product will fulfill. When those expectations are fueled by misconceptions, the consumer may develop unrealistic expectations that naturally don't get fulfilled. It's sad and unfortunate, but it's a common practice to use a misconception to sell a product. In science, this is a common occurrence and is easily rectifiable with a return to the basics.

With that in mind - let's begin our discussion.

DNA - The Basics
DNA Genealogy is very popular today. In an age where technology allows a person to send in a small DNA sample and get results quickly, many expectations are formed. This is of course logical. If you pay for a product, you expect results. However, no matter how popular DNA Genealogy is, it is still based on a science. That science is Genetics. In genetics - the pot of gold is DNA.

DNA stands for deoxyribonucleic acid. DNA sits inside nearly all the cells of a living thing. DNA carries and transmits biological information from parents to offspring. This is the fundamental principle behind the science of genetics. In fact - in genetics there is a principle known as the Central Dogma of molecular biology, which this tutorial will call the Central Dogma Of Life

                             DNA -> RNA -> Protein
                             (Central Dogma Of Life)

The above flow of information is how all life proceeds. Since life has been here on the planet, this is how life, biologically speaking, works. When dealing with genetics, a feature, concept, or any defined entity must somehow fit into the above flow. Technically in genetics, something must be an attribute of a genetic mutation for it to have a basis in genetics.

For example, sickle-cell anemia is an inheritable disorder that can be passed along from a parent to its offspring. The reason is that a single mutation in a gene causes this disorder. In other words - sickle-cell anemia is an attribute of a genetic mutation. Sickle-cell anemia is an inheritable trait and it fits into the fundamental principle of genetics.

It's at this point that misconceptions can arise. What are those misconceptions? Let's take a look.

Misconceptions In DNA Genealogy
Misconceptions in DNA Genealogy generally stem from a misunderstanding of the basics of genetics. Many of the present misconceptions take the form of some defined construct that's NOT inheritable and NOT an attribute of a genetic mutation. The most popular misconceptions in the general public treat concepts such as race, religion, nationality, and ethnicity as having some genetic basis. For this tutorial - let's focus on ethnicity.

Ethnicity is a socially defined category based on common culture or nationality. For example - the term "African-American" - typically refers to a group of people whose ancestors were a part of the West African slave trade during the 1700s. You can expand the definition of ethnicity as you see fit, but no matter how you look at it, ethnicity is socially defined and self-determined. Ethnicity is a social construct devised by humans. If a person wants to place themselves in a different ethnic designation overnight, then he or she can do that.

However ethnicity is NOT an inheritable trait. In other words - ethnicity is not an attribute of a genetic mutation. Your ethnicity is NOT reflected in your DNA. There is simply no way that it can be reflected in your DNA. Long before humans came along and devised social constructs, life evolved the ability to transmit features from parent to offspring within the DNA molecule, not transmit social designations within the DNA molecule. You can change your ethnic or racial designation, name, location, and religious affiliation. However you can't change your DNA.

Another popular misconception deals with the word "Ancestry". Like so many words, it can have different meanings in different contexts. In genetics, ancestry means common descent: two or more individuals share a unique feature that's derived from a common ancestor. The problem occurs when the term ancestry is given a social tone and then used in a scientific and objective arena.

These misconceptions mentioned above are very popular within the general public. Why the popularity? Well that's where the term correlation comes in. Let's take a look.

Understanding Correlations: Beauty And The Beast
If you have taken statistics before, then you probably have come across the term "correlation". Correlations are like a double-edged sword. On one hand, a correlation can be useful. On the other hand, a correlation can be dangerous and misleading.

Simply put, a correlation is an associative relationship between two or more variables: they tend to change together, whether or not one causes the other. Correlations are a major reason for the popularity of many of the misconceptions that exist in genetics and thus DNA Genealogy. Many of the so-called BGA, Admixture, or Ethnic Population tests on the market are based on correlations. That's why those tests are very convincing.

A simple example of a correlation is between time and highway traffic. In many major metropolitan cities across the US, highway traffic tends to occur at specific times of the day. For example in Chicago, Illinois, Interstate 94 is a major highway that leads in and out of the downtown area. Interstate 94 experiences a consistent and heavy amount of traffic between 6am-9am in the morning and 4pm-7pm in the evening. This happens so regularly at the above times that the term "rush hour traffic" is used to label the phenomenon.

Rush hour traffic is a simple example of a correlation. Here we have a strong association between time (one variable) and traffic (a second variable). Correlations can be useful in certain situations. Let's read and find out why.

Understanding Correlations: The Beauty
A correlation can be useful because it can have strong predictive power in certain circumstances. In our simple example above, if highway traffic consistently occurs at 6am-9am every morning, then one can predict that the highway will be congested during those hours and plan to avoid it. Many of us subconsciously use correlations on a daily basis to make predictions in order to adjust our behavior accordingly.

However correlations can have a dangerous side as well. Let's see why!!!! 


Understanding Correlations: The Beast
Correlations can be very dangerous and misleading. This is especially true in science. If you have ever heard the phrase "Correlation Does Not Imply Causation", then you know why correlations can be dangerous. The danger from a correlation arises when the associative relationship between variables is perceived as a direct or cause-effect relationship.

Here is an example of some dangerous logic -> "The heavy rush hour traffic is caused by it being between 4pm-6pm."

Going back to our simple traffic correlation example, if heavy highway traffic consistently occurs at a specific time, then one may actually believe that time causes the traffic. This of course is not true. Time does NOT cause the traffic. The traffic is caused by the fact that most people have work hours that end at a time between 4pm - 6pm. The result is that many people head to the highway between those hours, which is what actually causes the congestion and traffic.

This is why correlations can be quite dangerous. A consistently confirmed prediction from a correlation can lead to a false belief that one variable is the result of another variable. When dealing with correlations - what you want is to identify the cause of the associative relationship between the variables. That's the key. It's important to understand that there is a big difference between an associative relationship and a direct, cause-effect relationship. For example, the relationship between salt and high blood pressure is generally regarded as a direct one. It is more than a mere correlation: a high salt intake can actually raise blood pressure.
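
The traffic example can be put into numbers. The sketch below uses made-up figures (200 background cars, 3000 commuters per rush hour) purely for illustration. The traffic is computed from commuters only, never from the clock, yet the clock and the traffic still come out strongly correlated on a workday. On a holiday with no commuters, the same hours produce no extra traffic at all.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

hours = list(range(24))
# 1 during the rush-hour windows from the text (6am-9am, 4pm-7pm), else 0.
rush_hour = [1 if 6 <= h < 9 or 16 <= h < 19 else 0 for h in hours]

def traffic(commuters_by_hour, background=200):
    """Cars on the road: depends only on commuters, never on the clock."""
    return [background + c for c in commuters_by_hour]

# Made-up figures: on a workday, commuters travel during the rush-hour windows.
workday_commuters = [3000 if r else 0 for r in rush_hour]
holiday_commuters = [0] * 24

workday = traffic(workday_commuters)
holiday = traffic(holiday_commuters)

# Clock and traffic move together on a workday...
print(pearson(rush_hour, workday))
# ...but with the real cause (commuters) removed, the same clock gives flat traffic.
print(max(holiday) - min(holiday))
```

The time of day predicts the traffic well, but only because it happens to line up with when people leave work. Take the commuters away and the prediction fails.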

Now that we have a solid understanding of correlations, let's turn our attention back to DNA Genealogy

Correlations In DNA Genealogy
If you are wondering whether correlations exist in DNA Genealogy - the answer is yes. In fact, correlations in genetics are a major reason for the spread and popularity of many of the misconceptions that were mentioned in this tutorial. In genetics, correlations take the form of known and studied DNA markers that are strongly associated with certain defined ethnic populations. This association is actually the basis for many of the so-called Admixture or Ethnic Population Tests such as Dodecad Admixture, and for companies such as DNATribes, African-Ancestry, etc to market a product.

A good example of a correlation in DNA Genealogy is between a haplogroup and an ethnic population. A haplogroup is a population of people that share a unique set of DNA markers on either the mtDNA or Y-chromosome. For example - the Y-DNA haplogroup known as Q-M3 has a strong association with the ethnic group known as Native Americans. In fact - the association is so strong that it can be used as a strong predictor in certain cases.

Another example of a correlation is between an AIM and a geographic region. AIM stands for Ancestry Informative Marker, which is basically a DNA marker that's present at a much higher frequency in one population than in others.  Certain AIMs are strongly associated with certain populations that have a known geographic origin. For instance - the Duffy Null allele is an AIM that has nearly a 100% frequency in Sub-Saharan Africans. AIMs are the basis for BGA tests such as Population Finder or Ancestry Composition.

With such powerful correlations in genetics, can someone's ethnicity, race, religion, or geographical point of origin be determined from their genetics?

Dangers Of Correlations in DNA Genealogy
The answer to the question above is a simple no. A correlation may seem very persuasive but it's nevertheless still a correlation. No matter how strong a correlation is - an associative relationship is NOT a direct relationship. Simply put, your ethnicity, race, religion, etc. is NOT a product of your genetics. In genetics, the golden rule is that a defined entity must be an attribute of a genetic mutation. If something doesn't meet the golden rule, then it doesn't hold water in genetics.

It's understandable why it can be hard for someone to separate their ethnicity or any social construct from their genetics. A correlation can generate an illusion of a direct relationship when there actually is NOT such a direct relationship. The situation is made worse when you have various companies advertising such fallacies. For example, the term "genetic ethnicity" is used by certain organizations in order to sell a product. However, one way of dealing with a correlation is to step back and understand the reason for the associative relationship.

In this case, why is there a strong association between certain genetic markers and certain ethnic populations?

The reason has to do with an ethnic population's marital and reproductive patterns, not their genetics. Let's assume a unique genetic marker arises in a population via a mutation. If the members of that population reproduce only with members of the same population for an extended amount of time, then the marker becomes associated with that population and a correlation forms. This is especially true if the population retains a small size over time.  An example of this is the ethnic group known as Native Americans. The Y-DNA haplogroup known as Q-M3 has a high frequency and strong association among Native American males. This is due to strict marital practices kept over a long period of time. Many Native American males mated only with Native American women and simply never deviated from that practice.

Another way to expose a correlation is at the prediction level. For example - the Y-DNA haplogroup known as E1b1a has a high frequency and strong association among African-American males. Going on correlation logic - a male who identifies himself as African-American should possess the E1b1a haplogroup. My 2nd cousin is Lewis Lamar. His ethnic designation is African-American and yet his Y-DNA haplogroup is R1b1a, so the prediction fails.

With that we will end our discussion on correlations and misconceptions in DNA Genealogy. I hope this tutorial has shed some light on the dangers of correlations in certain circumstances as well as demystifying some prevalent misconceptions. So the next time you purchase a genetic ethnicity test, tell them you want your money back LOL!!!!!!!!!!

Take Care
Steve Handy

Sunday, January 13, 2013

Handy And Curd Family Connection

It's said that good things come to those who wait. This may be true in DNA Genealogy, as I have stumbled upon another discovery on the Handy side of my family. This discovery was, not surprisingly, confirmed via DNA Genealogy. The difference in this case is that the new DNA Genealogical services of Ancestry.com lent a helping hand. Recently, Ancestry.com has entered the DNA Genealogical arms race with its new product - AncestryDNA. Let's take a look!!!!!

It appears that the Handys have Scottish ancestry.
Recently I took an interest in uncovering some of my surname ancestry. My last name is Handy. It was known that the Handy lineage from which I am descended hails from Nashville, Tennessee. The earliest male Handy that was known was William Henry Handy (1881-1947). Shown in the picture toward your right are my uncles, father, and grandfather - William Ernest Handy Sr (1921-1994), shown far right. William Sr's brother, Clarence Handy (1922-1992), is shown in the center with a tie. William Henry Handy was the father of both Clarence and William Sr.


William Henry Handy (1881-1947)
Shown toward your left is William Henry Handy (1881-1947). Not too much was known about Henry Handy. Henry was born in Nashville, Tennessee. Henry eventually migrated to Chicago, Illinois and worked for Chicago Steel Mills. Henry Handy met and married Alberta Woodard in 1918. Other than that, not too much was known about him. I was determined to gather information about Henry Handy's past. Therefore I turned to his SSN application.



SSN Application of William Henry Handy
The eFOIA act is a wonderful law. Called the Freedom Of Information Act - it ensures public access to government records. When a person dies, their SSN is released to the public. You can then order the deceased person's SSN application. The reason to do this is to learn the names of the deceased's parents. Toward the right is the SSN application of William Henry Handy. If you notice, Henry Handy gave the identities of his parents - Owen Handy (1862-1916) and Emma (1865 - ?).


Notice that Henry Handy didn't give the last name of his mother. This is likely due to the fact that Emma's last name wasn't known at the time. It's actually Emma, and her ancestry, that this blog article is about. Let's take a look!!!!

DC of Owen Handy (1862-1916)
The original goal was to uncover the strict paternal Handy ancestry. In other words, I was trying to discover the earliest known Handy male ancestor in my surname lineage. This has changed because currently, I don't have any information on Owen Handy's parents. That's okay because valuable information was learned in the process.  Shown toward your left is the DC (death certificate) of Owen Handy. The informant was his daughter - Hannah Handy-Hudgkins. Henry Handy apparently had siblings. There was Hannah (1892-1945), Ira (1896-1910), and Jim (1902-1944).




Marriage Cert of Owen Handy and Emma Lanius.
Determined to gather history on Owen Handy, I did a search for him on Ancestry.com. What I found was that there was only a single marriage certificate associated with Owen Handy. Owen Handy married a woman named Emma Lanius.

If you remember - on Henry Handy's SSN Application, Henry Handy apparently could not recall the last name of his mother Emma. I then came to the conclusion that the Emma mentioned in the SSN application and the Emma mentioned in the above marriage certificate were the same woman. (As a side note, on both Ira and Jim Handy's DCs - Emma's last name of Lanius is fully stated). As we are going to see, the DNA evidence is going to help confirm Emma Lanius as an ancestor. Now let's look at Emma Lanius and her ancestry.

Emma Lanius aged 16 and Family
Shown toward the left is Emma Lanius, her siblings, and her parents. This was shown in the 1880 Census. Emma was 16 years old. The snapshot photo was taken from Ancestry.com. The actual photo is information on Emma Lanius's mother - Jane Curd. More on that in a second. Not much is known on Emma Lanius outside of her marriage to Owen Handy in 1882 and the children she bore by Owen Handy. One interesting fact is that Emma Lanius's younger sister, Mary Lavinia Lanius, did meet and marry a man named William Bridge. They both migrated to Texas where their descendants reside today.  

Emma's parents were Matthew Lanius and Jane Curd. The maiden name of Curd is confirmed by the 1865 marriage certificate of Matthew Lanius and Jane Curd in the Wilson County area of Tennessee. (I will post it at the bottom of the blog)



Notice two things before we move on. First - Jane Curd-Lanius was designated as being mulatto. In the old days back in the south, the term "mulatto" loosely meant that your father was European and mother was Negro. Second - Jane Curd's father was born in Tennessee. We will see why that's important shortly. As a side note, Matthew Lanius's mother - Sallie Lanius (aged 50) is shown as well. 

Curd Relatives and Neighbors 

Before the DNA evidence came along, a valuable and commonly used trick was to investigate the neighbors that lived near known ancestors and relatives. In the old days, many relatives lived next door to each other. In the 1880 census photo shown toward the right, there is a James A. Curd (1809-1876) and his family living one door from Jane Curd-Lanius. This same James A. Curd is present in the 1870 Census, living a few doors from Jane Curd.

It turns out that James A Curd was a known slave owner in the Wilson area at that time. In fact, his brother Price Curd (1808-1883), was an even bigger slave owner. I was coming to the conclusion that Price Curd was the father of Jane Curd. James A. Curd could be ruled out because he was born in Virginia, whereas Price Curd was born in Tennessee. In addition, James Curd only has a record of owning two male slaves in the 1840 Census. (At one point - Price Curd owned over 19 slaves in one year)

If you remember from above, Jane Curd's father was noted as being born in Tennessee. Both James and Price Curd had siblings. Their sisters can easily be ruled out as a parent to Jane Curd. The younger brothers of James and Price Curd were either deceased before Jane Curd's birth in 1845 or much too young (born 1833) to be a parent. This leaves Price Curd as the likely parent of Jane Curd. In fact, let's take a look at the DNA evidence which confirms the connection between the Curd and Handy families.


DNA Evidence linking Curd and Handy Families
AncestryDNA is the newest autosomal DNA testing service that's currently on the market. It's owned by Ancestry.com. I submitted a sample of my DNA. AncestryDNA provides matches who are essentially cousins. One of my matches is a woman who goes by the username of MidgeEstes. Shown on the left is the DNA match. One of the nicest features of AncestryDNA is that you can link your DNA account to a pedigree tree.

As you can see, one of the shared common surnames is Curd.


One of MidgeEstes's ancestors was Elizabeth "Betsy" Curd (1738-1821). Elizabeth Curd was the great-grand aunt of Price Curd. Price Curd's great-grandfather, John Curd, and Elizabeth Curd were siblings. This means that their father - Edward Curd (born between 1650 and 1670) - is the common ancestor of MidgeEstes and myself.


Pedigree Of Edward Curd 
It appears that Edward Curd was born around 1650 in Scotland. He died in Henrico, Virginia in 1742. The amazing thing about this is that the autosomal DNA these tests look at generally isn't expected to retain DNA across a 400 year period!!! With each generation you go back, you lose a percentage of an ancestor's DNA due to a natural biological process called recombination.

For MidgeEstes and myself to possess these types of autosomal DNA segments, passed down from an ancestor across a 400 year period, is amazing.
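
As a rough illustration of why this is surprising: on average a person inherits half of their autosomal DNA from each parent, so the expected share from any one ancestor halves with every generation you go back. The sketch below just prints that halving. The generation counts are illustrative only and are not a count of the actual generations between Edward Curd and myself. Real inheritance is also uneven, so the actual share from a distant ancestor can be higher than the average or even zero.

```python
def expected_share(generations_back):
    """Expected fraction of autosomal DNA inherited from a single ancestor
    `generations_back` generations ago (1 = parent, 2 = grandparent, ...)."""
    return 0.5 ** generations_back

for g in range(1, 11):
    # Show the average share as a percentage for each generation.
    print(f"{g:2d} generations back: {expected_share(g):.4%}")
```

By ten generations back, the average share from one ancestor is already below a tenth of one percent, which is why a detectable shared segment from that far back is a lucky find.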



1860 Slave Census Record of Price Curd
Shown below and toward the right are slave census records of my presumed ancestor - Price Curd (1808-1883). It appears that Price Curd owned many slaves. In the Wilson District Area of Tennessee between the years of 1840-1880s, there were many recorded African-American Curds - of whom he and his brother James A Curd are likely the fathers.

In this photo shown toward the right, Price Curd owned 19 slaves alone.




1840 Slave Census Record of Price Curd
Shown toward the left is the 1840 Slave Census record of Price Curd. In this record is likely the mother of Jane Curd (1840 - ?). In this photo, there are two African-American females who are at or near the age of 23.

As a side note - Price Curd's in-laws were the Eatherlys. Price Curd's daughter - Emily Curd - married a James J Eatherly. On the 1882 marriage certificate of Owen Handy and Emma Lanius, there is a John Eatherly who married them and signed the certificate. It's very likely that James and John Eatherly were related.

As always - it has been a pleasure. Please leave all comments below. 


Marriage Certificate Of Matthew Lanius and Jane Curd

Thanks - Steve Handy