Family DNA Land
I loaded my children’s pedigree into DNA.LAND to get some better imputation (that is, taking hundreds of thousands of genotyped markers and “filling in” millions more based on known associations). Below are the new ancestry inferences for:
-My son
-My daughter
-Me
-My wife
-Son/daughter’s paternal grandfather
-Son/daughter’s paternal grandmother
-Son/daughter’s maternal grandfather
-Son/daughter’s maternal grandmother
That is very interesting.
Have you had whole genome sequencing for the members of your family, or only SNP tests?
Since you have a pedigree, have you looked for inheritance of insertions and deletions? I was very surprised to see a few very giant deletions of ‘very important’ genes within my family.
interesting to see the percentages varying between the kids
My wife is a full blooded North Korean, but DNA tests code her as having a minority of Japanese ancestry (possibly due to cryptic Japanese ancestry since few Koreans would openly admit to having Japanese ancestry, or possibly due to similarity of some North Korean populations to part of the Japanese genome and a lack of North Korean samples to distinguish the two).
But, while one of my children has roughly proportional Japanese-Korean proportions to my wife, the other rates as more Japanese than Korean according to the DNA tests.
I’m always suspicious of these claims about 2% of ancestry; I think these slivers should probably be labeled “ambiguous” instead. In particular, your daughter is listed as 2% Indus Valley and 1.2% Nganasan, while neither is listed for you. So it seems that the system is inconsistent. Do you think that these pieces were labeled WestEurasian:Ambiguous and EastAsian:Ambiguous for you?
If the system is inconsistent between you and your daughter, it is probably incorrect in one of the two places. I had been assuming that the small stretches were overconfident. However, both of your parents are ~4.5% Indus Valley, so maybe it does have enough information to label your daughter 2% Indus Valley and it was underconfident for you.
Rules of thumb:
1) mixed people sometimes get confusing results at lower fractions
2) be suspicious of low fractions
3) the genetic distance between constituent elements matters. 1% SS-African in a person of N. European background means a lot more than 1% Southern European in the same person
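One way to make the consistency argument concrete: a child receives half of its genome from each parent, so the share of any component in the child cannot exceed what the two parents could jointly transmit. The sketch below is a rough sanity check under that assumption, not anything DNA.LAND itself does. The function names and the tolerance value are hypothetical, and the mother's empty profile is a placeholder; only the daughter's 2% Indus Valley and 1.2% Nganasan figures (and their absence from the father's list) come from the comment above.

```python
def max_child_fraction(father_frac, mother_frac):
    """Upper bound on a component's share of the child's genome.

    Each parent passes on half of the child's genome, so a parent
    carrying fraction f of a component can contribute at most min(f, 0.5).
    """
    return min(father_frac, 0.5) + min(mother_frac, 0.5)


def flag_inconsistent(child, father, mother, tolerance=0.005):
    """Return components whose share in the child exceeds the parental bound.

    `tolerance` is an arbitrary allowance for estimation noise (an assumption).
    """
    flagged = []
    for component, child_frac in child.items():
        bound = max_child_fraction(father.get(component, 0.0),
                                   mother.get(component, 0.0))
        if child_frac > bound + tolerance:
            flagged.append(component)
    return sorted(flagged)


# Daughter's figures are from the comment; the mother's profile is a placeholder.
daughter = {"Indus Valley": 0.02, "Nganasan": 0.012}
father = {}   # neither component is listed for the father
mother = {}   # hypothetical: assumed to carry neither component
flagged = flag_inconsistent(daughter, father, mother)
```

A flagged component does not say which estimate is wrong, only that the labels for parent and child cannot both be right as reported. That fits the suggestion that the same segment may sit under an "ambiguous" label in the parent.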
I tried to load my genome data into DNA.LAND, but it was rejected with a message to the effect “file too large” (it’s a 146MB gzip; unzips to 448MB of text). Is there something I can do? Where would I go for an ancestry report?
Thin the SNPs in plink (for example, down to the SNPs in a 23andme file or something similar), then re-upload.
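For readers without plink at hand, the idea of thinning can be illustrated in plain Python. This is a stand-in for the plink step, not a reproduction of it, and it assumes the genome file is in the 23andMe-style raw text layout (comment lines starting with `#`, then tab-separated rsid, chromosome, position, genotype rows). The function name and the `keep_every` parameter are made up for this sketch, and whether the result is small enough for DNA.LAND is not something it can guarantee.

```python
def thin_snps(lines, keep_every=2):
    """Yield header/comment lines unchanged and every `keep_every`-th SNP row.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    seen = 0  # number of SNP rows encountered so far
    for line in lines:
        # Pass comments and blank lines through untouched.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        if seen % keep_every == 0:
            yield line
        seen += 1
```

For a gzipped file, the input can be opened with `gzip.open(path, "rt")` and the output written with `gzip.open(out_path, "wt")`, passing the thinned lines to `writelines`. Keeping every k-th row is the crudest possible scheme; restricting to the SNPs present on a genotyping chip, as the reply suggests, would need a list of rsids to keep instead.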
These results must be taken with a grain of salt. But they do make some sense. I uploaded my 23andme genome and found the following.
South Asian 68%: Dravidian 57%, Gujarati 11%
Central Asian 31%: Indus Valley 22%, Indo-Iranian: 9%
Dravidian is based on two Dravidian populations, Tamils from Sri Lanka living in the U.K. and Telugus from India living in the U.K. as well as Bengalis in Bangladesh
Since I am a Tamil Brahmin, it makes sense that I am more “Dravidian” than “Gujarati”. My affinity with “Indus Valley” rather than “Indo-Iranian” also makes sense.
“South Asian” covers the groups higher in ASI, found in what is now India. There is some difference between the people in the west and the people in the south and east.
“Central Asian” covers the people higher in ANI, from what is now Pakistan. Sindhis and Pathans (“Indus Valley”) are different from Balochis, Brahuis and Makranis (“Indo-Iranian”).
West Eurasian: 100%
1) South Asian: 54%
a) Dravidian: 34%
b) Gujarati: 20%
2) Central Asian: 39%
a) Indo Iranian: 25%
b) Indus Valley: 11%
c) Kalash: 2.4%
3) Mediterranean Islander : 6.9%
Between 23andme, Harappaworld, Gedmatch and now this, I have managed to totally confuse myself.
23andme shows 0.2 percent West African, which I inherited from my father. It makes sense, as there was probably an infusion of West African DNA via Makrani DNA in Sindhis. However, it does not show up in the above, as it is masked under Indo-Iranian. In Harappaworld I get 0 for West African and 1% for East African. I am guessing this is mislabeling.
23andme also shows 0.3 percent North European, with 0.2 percent Scandinavian, all of which I inherited from my mother’s side, who has it at 0.8 percent. I reasoned this is probably Yamnaya DNA which my mother’s Moghul/Turkic ancestors inherited. Harappaworld gives me 5% NE Euro, so that also kind of made sense. But now, if the above results from dna.land are correct, this all might be Kalash DNA.
23andme does not show anything related to Mediterranean. It does show a Middle Eastern component in my father’s and mother’s results, but I didn’t inherit any from either side (dodged the bullet?). Harappaworld also shows Med DNA at 3%, which makes me think the dna.land result showing 6.9% is credible. I had not paid any attention to this component earlier, but now I am curious. I have Y-haplogroup J2. Could this 3% (Harappa) or 6.9% (dna.land) Mediterranean DNA be something that came with the migration of J2 into the subcontinent? If I am not wrong, J2 is present in South India and North India as well. Would this Mediterranean component show up for some of those individuals as well?
P.S.: Razib or anyone who might know how to do this, is there any way for me to accurately determine the exact Y-haplogroup subclade of J2 using my genome file? 23andme only shows J2. Perhaps what I have is just the basal type, but I’d like to be sure.
My Father:
West Eurasian – 100%
-North/Central European – 92%
-Southwestern European – 5.6%
–Southwestern European – 4.4%
–Sardinian – 1.2%
-Kalash – 1.3%
-North Slavic – 1.3%
My Mother:
West Eurasian – 100%
-North/Central European – 77%
-Southwest European – 19%
-Northeast European – 3.5%
–North Slavic – 1.9%
–Finnish – 1.5%
Me:
West Eurasian – 100%
-North/Central European – 94%
-Southwestern European – 6.5%
Have you had the chance to form any opinion about the quality of the imputed data? I have to admit I have only limited confidence about how well one can impute human genomic data until we have many more humans fully sequenced across many ethnic origins. I could see these datasets ending up with better imputation if DNA.LAND were able to treat your children’s data as phased, given the availability of your, your wife’s and both sets of parents’ data.
Yeah, the 23andme Y-hg assignments are really general / dated. The best you can do right now with the raw data is to plug it into the ISOGG tree, which you can do using some free tools at http://www.y-str.org/p/tools-utilities.html
First, use the 23andme to YSNPs converter, and then put those results into the ISOGG Y-Tree addon for Google Chrome. Then in your case just go to their J page (http://isogg.org/tree/ISOGG_HapgrpJ.html) and you should see more granular results.
For example I went from G2a on 23andme to G2a2b1 (G-M406) which is much more meaningful, but of course not as good as getting the whole Y sequenced. 🙂
Here’s a few more.
Wife’s maternal grandmother – 3/4 West Sicilian + 1/4 Calabria:
West Eurasian – 100%
– Mediterranean Islander – 61%
– Sardinian – 13%
– Arab/Egyptian – 10%
– Central Indo-European – 8.2%
– North/Central European – 5.2%
– Italian – 1.6%
– Ambiguous – 1.2%
Wife’s mother – 1/2 West Sicilian + 1/4 German (Alsace) + 1/4 Irish:
West Eurasian – 100%
– Mediterranean Islander – 35%
– North/Central European – 32%
– Balkan – 29%
– North Slavic – 1.9%
– Indo-Iranian – 1.4%
– Ambiguous – 1.2%
Wife’s father – 1/2 West Sicilian + 1/2 Prussian/Polish:
West Eurasian – 98%
– South European – 48%
— Balkan – 47%
— Italian – 1.6%
– North Slavic – 24%
– Mediterranean Islander – 12%
– North/Central European – 8.6%
– Arab/Egyptian – 2.6%
– Gujarati – 1.7%
– Ambiguous – 1.0%
North African – 2.3%
Wife – 1/2 West Sicilian + 1/4 Prussian/Polish + 1/8 German + 1/8 Irish:
West Eurasian – 98%
– Balkan – 46%
– North Slavic – 23%
– Mediterranean Islander – 17%
– North/Central European – 11%
– Ambiguous – 1.3%
North African – 2.1%
Much appreciated! I will give it a try.
It’s very interesting to see the results for three generations of a family like this. It’s also clear that having the results for the grandparents available can shed some light on some of the more mysterious tiny percentage results. For example, look at Razib’s daughter. It lists her as having 1.3% Nganasan ancestry, (the Nganasan being an east Siberian Samoyed group) a result not shared with her parents or any other member of her family. What a random-looking result. But wait, Razib’s mother has 1.6% of ambiguous Northeast Asian ancestry, and she’s the only one with that result. What do you want to bet that the 1.3% “Nganasan” ancestry in Razib’s daughter is this exact same ancestry segment? Razib must carry the same segment somewhere in that 3% Ambiguous East Asian ancestry he has.
In light of that, a random-looking low percentage result doesn’t look quite as random; apparently, with such a small sliver of DNA, the program just has a hard time matching it up with a particular sample group.
West Eurasian 100%
• North/central European 62%
• North Slavic 28%
• Sardinian 4.3%
• Indo-Iranian 3.3%
• Ashkenazi 1.7%
• Ambiguous 1.2%
I’m skeptical. I’m not sure how they’re getting 3.3% Indo-Iranian. In 23&me I have no Ashkenazi (although my maternal grandfather had a name that is often Jewish), and the only things outside of Europe were .1% West African (small, but for some reason they were super confident, at least last time I checked) and .1% East Asian (not as confident, but then I had 23&me relatives on my mother’s side who were up to 1.6% Southeast Asian even on the conservative setting).
Razib, your results are very interesting for me. I am Assamese, and using my Geno 2.0 Next Generation/FTDNA raw data, these were my results:
West Eurasian 83%
South Asian 57%: Dravidian 53%, Gujarati 3.6%
Central Asian 21%: Indus Valley 18%, Indo-Iranian 3.1%
Northeast European 4.3%: Finnish 3.2%, North Slavic 1.1%
Mid Turkic: 1.5%
East Asian 16%
Southeast Asia – 6.6% (similar to Dai, Lahu, which explains my partly Ahom heritage)
East-Turkic – 5.2% (probably explains my Tibeto-Burmese or Mongolic-related heritage, possibly from the Sutiya community)
Central Chinese – 3.7%
Native Oceanian 1%
I was wondering whether you have Ahom genomic raw data from the 1000 genomes project as I am curious to compare with my results, including Geno 2.0 next generation. Also to add, my Y DNA haplogroup is R1a1 while my MTDNA haplogroup is M13C
I also would like to share my Geno 2.0 next generation results: South Asia – 42%; Central Asia – 26%; Southeast Asia – 21%; East Asia – 4%; Finland and Northern Siberia – 5%; North Africa – 2 %
I think checking the quality of imputation is trivial, since you can just mask for example 50% of your known data and see how well the imputation algorithms perform. Haven’t worked with imputation in years, but remember some supp materials that claimed imputation algorithms gave very good results for this kind of test. Not my field, but this kind of reasoning sounds very convincing to me.
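The masking test described here can be sketched in a few lines. This is an illustration of the evaluation idea only: the imputation step itself is left out, and the function names, the dictionary layout (rsid mapped to a two-letter genotype string), and the fixed random seed are assumptions made for the example rather than anything DNA.LAND or a particular imputation tool prescribes.

```python
import random


def mask_genotypes(genotypes, fraction=0.5, seed=0):
    """Split known genotypes into a visible set and a hidden (held-out) set.

    The visible set would be handed to the imputation tool; the hidden set
    is kept back as the truth to score against.
    """
    rng = random.Random(seed)
    ids = sorted(genotypes)
    hidden_ids = set(rng.sample(ids, int(len(ids) * fraction)))
    visible = {k: v for k, v in genotypes.items() if k not in hidden_ids}
    hidden = {k: v for k, v in genotypes.items() if k in hidden_ids}
    return visible, hidden


def concordance(truth, imputed):
    """Fraction of held-out SNPs whose imputed genotype matches the truth.

    Allele order is ignored, so "AG" and "GA" count as the same genotype.
    """
    shared = [k for k in truth if k in imputed]
    if not shared:
        raise ValueError("no SNPs in common between truth and imputed sets")
    matches = sum(sorted(truth[k]) == sorted(imputed[k]) for k in shared)
    return matches / len(shared)
```

In practice one would impute from `visible`, then call `concordance(hidden, imputed_result)`. A single overall number can hide the fact that rare variants, and populations poorly represented in the reference panel, tend to be imputed less accurately, which is the concern raised in the earlier question.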
I can only imagine my own DNA nightmare!
Having a South Asian background thanks to the British divide-and-conquer strategy in the Caribbean since the 1840s, with various elements of mestizaje/mulataje in the mix, and now in the USA looking like any other alleged 3rd World miscreant immigrant, has its advantages and disadvantages.
God Bless ‘Merica!
Razib, it seems your mother has more Ahom admixture than your father, as Southeast Asia is closer to Dai while Cambodian/Thai is more Austro-Asiatic as per the dna.land definitions. Even though it has not been scientifically proven, there is a belief in Assam that our Ahom ancestors originated in Mongolia and then moved to Yunnan, where they probably admixed with the Dai population and adopted their culture. This is still hearsay and needs to be proven or debunked based on scientific reasoning. My East Asian mix includes Daic as well as East Turkic, while in your mother’s case it is possibly Daic and Nganasan.
@Razib:
For Dravidian, they say that the reference populations are Telugus and Sri Lankan Tamils. Does it mean that for those groups, it will be 100% Dravidian?
Also, why is Gujarati a separate category?
Aren’t they Dravidian + Central Asian? For example, I got Dravidian + Central Asian, and within Central Asian mainly Indo-Iranian and Kalash.
In earlier tools like harappadna and the gedmatch tools, it was ASI + Central Asian + European + West Asian etc. Basically, they used ancient populations to deduce the admixture. This one uses current populations, even though the Dravidian reference population above is itself ASI + West Eurasian.
This is a little bit confusing unless they explain what they mean by each group.
Hi Razib, this is a little late to be posting to this thread, but I did DNA.land using my 23&me data and my results were:
100% West Eurasian:
70% Dravidian
24% Indus Valley
3.8% Balkan
2.2% ambiguous west eurasian
I’m a South Indian, from Kerala specifically, from the “Syrian Christian” ethno-religious group. I’m not surprised about the amount of Dravidian, being so far south, but I am surprised at the amount of “Indus Valley” and not even an iota of Gujarati; it almost looks like the northwestern Indian ancestors hopskipped over central India and went straight to the south. Is this typical of South Indians? I was also surprised at the minor “Balkan” component; being Syrian Christian, I was looking for a minor Southwest Asian component (some of our community believe we’re descended from Jewish converts along with local Dravidian converts, but I am looking more toward Assyrian/Aramaic input due to a Nestorian influx).
I just wanted to see if you or anyone might be able to shed light on the typical South Indian profiles, since I haven’t seen any other South Indian profiles mentioned aside from Iyer’s (thank you, by the way, for posting yours), and since he’s a Brahmin he would have had more northern origins….
This is very different from my 23andme profile, which didn’t seem to even try to differentiate the ancestries and just basically said I was South Asian (with a few minuscule other ancestries).