Best and worst of the worst: A survey of some ancestry testing companies. Introducing the term “parental privilege.” In all cases, emphasis added.
This is an opinion piece on some examples of the “state-of-the-art” (such as it is) in commercially available ancestry testing; this is not meant to be comprehensive. I’m not going to discuss online accusations that the companies fudge results to “stick it to racists.” For the most part I’ll discuss the actual product, with a few words here and there about certain other issues.
Before we begin, let’s take a look at “movement” Type I droolcup commentary:
I think that it is weird how some people clutch at microscopic bits of DNA and pretend that they are something they are not. My DNA is 90% Southern England, 5% German and 5% Norwegian. The two 5%s, is due to those Vikings raping and pillaging everywhere. Actually, at least 30% is down to the hunter gatherers who followed the retreat of the ice. I do not desire any exotic mixture
Alfred the Great Tostig
Your DNA is very similar to mine. The admixture that we have is from kindred, white races. So we are pure.
Yeah, Alfred. How about this instead – you derive from ethnies well represented in your testing company’s parental population database, ancestral components labeled as “European.” You are essentially being compared to yourself. Congratulations.
The perfect “historical” example of this was DNAPrint Genomics using Hapmap CEU – essentially Anglo-Mormons from Utah – as the parental population for “European,” followed by Pennsylvania German-Americans getting significant levels of “East Asian admixture.” By those standards, Mitt Romney was undoubtedly a pureblood. So, tell me, Alfie – if the PA Germans had been used to define “European” instead of Utah Mormons, what do you think would have happened to all that “East Asian admixture?”
I’m sure the Type I peanut gallery response to that question would be: (((((crickets)))))
For information aimed at Normies see the following:
Also, see Dienekes’ criticism of these tests and their use of parental reference populations for “training data.” Note that many companies include customer data as part of their parental populations, and that data is not verifiable to the extent data derived from academic publications are. Are customers’ reported ancestries accurate?
Please keep in mind I am focusing here on ancestry, not health data, and I am focusing on autosomal ancestry data, not NRY or mitochondrial DNA, single locus markers that I have zero interest in. If you are into health data, 23andMe provides some of that, although people have complained about the accuracy of such data (stories about that are found online), and several companies provide NRY and Mito data that seem reasonably accurate.
The following comments are based on my reading about, and analyzing, the tests and some online results, based on my own scientific knowledge, the population genetics literature, what is known about human population history, as well as logic and common sense. The viewpoint is informed by a concern for politically relevant EGI, rather than “movement” obsessions about “purity.”
I have long criticized 23andMe, which is an absolutely terrible test – it seems like the most popular such test among both Normies and Nutzis, it has majestic flaws, it is constantly misinterpreted, and in my opinion the company’s lack of transparency about certain realities borders on fundamental dishonesty. As one example of the latter, consider:
23andMe amusingly explains “unassigned” this way:
It is also possible to see a percentage of your DNA listed as “Unassigned.” There are two reasons why a piece of DNA might have unassigned ancestry:
The piece of DNA matches many different populations from around the world.
The piece of DNA does not match any of the reference populations very well.
That’s amusing because obviously the second explanation is what makes sense; the first is absurd. The whole purpose of the test is to identify and distinguish ancestral components. Since humans share 99+% of their DNA, excuse #1 should in theory apply to everyone equally. If the riposte is that we are talking about specific haplotypes stretching across chromosomal fragments, and those are not so widely shared, then how could it be possible for a stretch of DNA (chosen for examination for its value in assigning ancestry) to be so widely shared as to be “unassigned” to begin with? And why would most of this “unassigned” show up only at the highest, most conservative, confidence level? How come this “matches many populations” fragment is assigned to specific populations at the 50% interval? And why is it that the “unassigned” just so happens to show up for those individuals for whom parental population coverage is relatively lacking? Coincidence?
It’s obviously point #2 – the fragment doesn’t match the limited reference populations. At the most conservative setting they admit this is “unassigned,” but at the lower settings they just pick whatever population samples they have available at the time that may be slightly more similar to the fragment than are others. To say that 30% or 40% or 50% or whatever of someone’s genome, which is being used to distinguish ancestry in the first place, is common to many populations is ludicrous if you think about it – it is just coming from a population that may be more intermediate in the clinal genetic range than the parental populations they use. If the DNA fragment is one that is similar between “many populations” then how can they distinguish it in some people and not others? Simple answer – the “privilege” of being a member of a population well represented as a parental population. This is what I term “parental privilege” in ancestry testing – some people derive from ethnies well represented as parental populations, so those individuals get good matches and relatively “pure” results. For these lucky people, suddenly the “shared by many populations” (faux) problem no longer exists. This company makes it sound (for point 1) that, hey, it’s just a generalized piece of DNA, but then they ignore that it just so happens – a coincidence no doubt! – that people who derive from certain parental populations can have that fragment very easily assigned. It is “common to many populations” only when no matching parental population can be found in their database. “Parental privilege” is analogous to a form of affirmative action in ancestry testing.
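The threshold behavior described above can be sketched with a toy example. This is only an illustration under a stated assumption: that a segment is assigned to its best-matching reference population when the match probability reaches the chosen confidence level, and is otherwise left unassigned. The function name, the labels RefA/RefB/RefC, and all the probabilities are hypothetical; nothing here is confirmed to be the company's actual method.

```python
def call_segment(posteriors, threshold):
    """Return the best-matching reference label, or 'Unassigned' if the
    best match does not reach the confidence threshold (assumed rule)."""
    best = max(posteriors, key=posteriors.get)
    return best if posteriors[best] >= threshold else "Unassigned"

# Hypothetical match probabilities for one DNA segment in two customers.
# The first customer's ethny is in the reference panel; the second's is not.
well_covered = {"RefA": 0.97, "RefB": 0.02, "RefC": 0.01}
poorly_covered = {"RefA": 0.55, "RefB": 0.30, "RefC": 0.15}

for label, segment in [("well covered", well_covered),
                       ("poorly covered", poorly_covered)]:
    for threshold in (0.5, 0.9):
        print(label, threshold, call_segment(segment, threshold))
```

Under these made-up numbers the poorly covered segment is handed to the least-bad reference at the 50% level and only becomes “unassigned” at the 90% level, while the well covered segment is assigned at both.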
A perfect example of parental privilege can be read here. Note that, with the most conservative confidence level for 23andMe this person was getting only 0.3% unassigned ancestry. 0.3%! Meanwhile, other people, at the same confidence level, get in the range of 30-50% – two orders of magnitude higher! How can you compare the accuracy of those sets of results? It is absurdity. Someone who at the highest, most stringent, confidence level offered by the company, has only 0.3% of their ancestry unassigned is obviously getting much more dependable results than someone who has 30% or 40% or 50% or whatever high level of their ancestry as unassigned. Amusingly, this person is still not satisfied by the fact that they are essentially being compared to themselves; thus:
The lack of defined reference samples from specific countries within the British Isles also sometimes gives confusing results. I have one project member with seven of his eight great-grandparents born in Wales and one great-grandparent born in Devon. At AncestryDNA he comes out as 64% Irish and 12% Great Britain, 12% Scandinavia and 11% Trace Regions. At Family Tree DNA his ancestry is reported as being 97% from the British Isles and 3% from Finland and Siberia.
Well, others have it worse. And then read this, an excerpt:
One could thus reasonably infer that, rather than ancestry, commercial DNA test results represent current geographic distribution of various population groups living wherever they happened to be living when the companies collected their samples. A customer’s DNA matches inform them of not where their ancestors might have come from in the past, i.e., their ancestry, but rather current geographic distribution of similar patterns of DNA bits that each company happens to probe for. Is that DNA-based heritage or ancestry, i.e., hints of ancestral people customers may have descended from? Sounds far from it.
I agree and that is one of the points I make in this post; one flaw in these tests is using extant populations to model past “admixture” events, and this is particularly comical since even “pure” extant populations are mixtures of earlier groups. Further, to use extant populations in a fair manner you need broad parental populations, which as we have seen, and will see more below, do not exist.
Getting back to the “unassigned” issue, online comments by customers speculating that “unassigned” regions may be due to “admixed regions” also fail as a logical excuse. First, it doesn’t consistently fit with the idea of a fragment being read as shared by groups (excuse 1) – being shared by different groups is not necessarily the same as being a mixture of those groups. Second, on an even more fundamental logical level, it fails because all extant ethnies are mixtures of past groups, but are being definitively assigned as a specific ethny if that ethny is well represented as a parental population today. If an ethny is a parental population, then a similar fragment is assigned to that ethny, regardless of the ethnic history that created that stretch of gene sequences. So, this excuse basically conflates with the more realistic excuse 2 – insufficient coverage of parental populations. It’s not a mysterious piece of DNA widely shared but yet distinct for any group represented in the parental populations, and it cannot be shrugged off at the same time as admixture. It is poor coverage. Adding more parental populations will create matches to those regions even at a 90% (“conservative”) confidence level, and make all the results more accurate and realistic.
As stated above, using extant ethnies to estimate deep ancestry is not going to give consistent results, as we in fact observe. Then there is the problem that people take seriously the labels companies give to particular ancestral components – as if a name is more than just a convenient label and instead carries some deep and objective meaning about the underlying objective ancestry. If a company decided to label some type of ancestry “Martian” does that mean people with that ancestry are descended from little green men? True enough, the labels do have some meaning on the broad scale. No company is going to label Sino-Japanese ancestry as “European.” They have some standards. But as we dig deeper, the correlation between names and objective meaning starts to fall apart.
An example of the labeling problem is that some of these companies label Ashkenazi ancestry as “European” while ancestral components that are part of the European genepool (e.g., enriched in Southern Europe), entering from the Neolithic to Bronze Age to Classical Age (and to some extent possibly from more modern intrusive invasions and migrations) are labeled “Western Asian” or “Southwest Asian” or “Middle Eastern” or “North African” or “Turkish/Caucasus” due to similarity to gene sequences found in modern populations from that region. Greg Johnson’s admixture component, if real and not an artifact, is likely from such an ancient source.
Thus, a problem here is that Jewish genes are labelled “European” while parts of the European genepool are labeled as something else. Again, labels are not the things themselves; dependent on the biases and parent populations of a given company, a given ancestry can be assigned to different continental population groups. For example, why not label Ashkenazi as Middle Eastern? Why not invent a “Neolithic” or “Mediterranean” label (instead of “West Asian” or “Southwest Asian”) for component autosomal ancestries enriched in those regions where J2 NRY is common? One labeling scheme is as justified as the other. If the idea is “we label based on where the ancestries are most enriched in modern times,” then great – last time I looked Israel is not in Europe. Inconsistent much?
Also, if we were to assume some, any, or all, of this purported admixture is real (and only at the highest confidence levels should it really be possibly so considered), and if we note that current populations are being used as the parental populations, then it is clear that “Western Asian” or “Southwest Asian” almost certainly tracks with the dispersal of J2 NRY and would most likely reflect ancient Neolithic, Bronze Age, and perhaps Classical population movements. Later invasions would be Berber-Arab and would track with North African/Arabian ancestral components – although some of these can be ancient as well, particularly with contacts between Southern Europe (particularly Iberia) and North Africa in ancient times (as well as the modern Moorish intrusive elements). We should not conflate “West Asian” with “North African” – these are not the same racially or historically. Consistently across these tests, ancestral components like Anatolian genetics seem to track with J2 NRY, so it appears to be an ancient component that shows up in European populations because of poor parental population coverage.
And again we come back to the issue of parental populations. European ethny “X” – not a parental population – is characterized by a test as having some degree of “admixture” compared to the parental populations available. However, if “X” itself is used as a parental population, then individuals from “X” will see most (and in some cases all) of the “admixture” disappear, since they are being compared to the consensus of their own ethny.
The riposte to that would be that “that is an unfair obfuscation of the underlying genetic realities.” Perhaps. But why can’t the same be said about other groups used as parental populations? As noted above, when DNAPrint Genomics was using CEU Utah residents as the European parental population – basically using Anglo-American Mitt Romney types as the reference population for “European” – some German-Americans (*) were getting “East Asian admixture.” Most tests today have Germans as one of the parental populations, so few if any Germans are getting any such admixture. So, groups used as parental populations are “privileged” (see above) in the sense that members of such groups, or genetically closely related groups, are going to get minimal to no “admixture,” as they are being compared to their own ethny. The riposte to that would be that “well, we ‘know’ from ‘racial history’ that some groups are more admixed than others, so the choices of parental populations makes sense.” Perhaps, but that is mostly subjective, and when based on genetics data it is circular reasoning.
Objectively, we could just use raw genetics data for genetic kinship analysis – by its nature kinship analysis includes all sources of autosomal genetic variation, including admixture – but people seem not to want that and/or companies refuse to offer it. In any case, the companies can’t be so stupid as to not know that the choice of parental populations directly affects the results. They (with the one exception below) just don’t want to admit it.
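The dependence of results on the choice of parental populations can be sketched with a toy model. This is a minimal illustration, not any company's method: populations are reduced to made-up allele frequencies at six SNPs, and a person is fitted as the mixture of reference populations that minimizes squared error, by brute force over a coarse grid of weights. The populations A, B, and X, the frequencies, and the function name are all hypothetical.

```python
from itertools import product

def fit_mixture(individual, references, steps=20):
    """Brute-force least-squares fit of an individual as a weighted mix of
    reference populations. Weights are multiples of 1/steps and sum to 1."""
    names = list(references)
    best_err, best_weights = float("inf"), None
    for combo in product(range(steps + 1), repeat=len(names) - 1):
        if sum(combo) > steps:
            continue  # weights would exceed 100%
        weights = [c / steps for c in combo] + [(steps - sum(combo)) / steps]
        err = 0.0
        for i, x in enumerate(individual):
            predicted = sum(w * references[n][i] for w, n in zip(weights, names))
            err += (x - predicted) ** 2
        if err < best_err:
            best_err, best_weights = err, weights
    return dict(zip(names, best_weights))

# Hypothetical allele frequencies at six SNPs.
A = [0.9, 0.8, 0.1, 0.2, 0.7, 0.3]
B = [0.1, 0.2, 0.9, 0.8, 0.3, 0.7]
X = [0.70, 0.55, 0.40, 0.30, 0.65, 0.35]  # clinally intermediate ethny

person_from_x = X  # a person matching the consensus of ethny X

narrow_panel = fit_mixture(person_from_x, {"A": A, "B": B})
broad_panel = fit_mixture(person_from_x, {"A": A, "B": B, "X": X})
print(narrow_panel)
print(broad_panel)
```

With X absent from the panel, the same person is reported as roughly 70% A and 30% B; with X included, the “admixture” vanishes and the person is 100% X. The genotype is identical in both runs; only the panel changed.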
One point I’d like to make is that although 23andMe’s “chromosome painting” has some advantages if done correctly – identifying chromosome blocks and the timing of putative admixture events – the key point is “if done correctly.” In most cases, just looking at SNP frequency data is going to be much more dependable, because the higher-level analyses are increasingly dependent on proper parental population samples (as well as an overall proper methodology). If you misidentify several SNPs out of the many used, well, that’s not good but not “fatal” to reasonably accurate results, as long as the rest of the SNPs are more or less correctly characterized. But if you misidentify an entire chunk of someone’s chromosomes, then you are going to markedly alter their ancestral composition. I trust it is clear why this is so – it is the difference between misidentifying individual alleles vs. misidentifying haplotypes that cover significant portions of the genome. The latter situation amplifies the error because the error constitutes such a large percentage of the ancestral calculation, while the former error is relatively minuscule.
So, I’d trust the data based on SNP frequencies more given equal parental population representation. That doesn’t mean the SNP frequency data are correct – the company may have made errors in that as well – but we are talking about relative probabilities here.
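The difference in scale between the two kinds of error can be put in rough numbers. The sketch below assumes, purely for illustration, that each SNP contributes equally to a frequency-based estimate and that a painted block contributes in proportion to its genetic length. The figures (600,000 SNPs, 50 miscalled, a 3,400 cM genome, a 40 cM block) are hypothetical round numbers, not data from any company.

```python
def share_affected_by_snp_errors(total_snps, miscalled_snps):
    """Fraction of an estimate affected when individual SNPs are miscalled,
    assuming every SNP carries equal weight."""
    return miscalled_snps / total_snps

def share_affected_by_block_error(genome_length_cm, block_length_cm):
    """Fraction of an estimate affected when one whole haplotype block is
    assigned to the wrong population."""
    return block_length_cm / genome_length_cm

snp_share = share_affected_by_snp_errors(600_000, 50)
block_share = share_affected_by_block_error(3_400, 40)
print(f"scattered SNP errors: {snp_share:.4%} of the estimate")
print(f"one misassigned block: {block_share:.2%} of the estimate")
```

Under these assumptions a single misassigned block moves the result more than a hundred times as much as fifty scattered SNP errors.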
In summary, 23andMe gets a C for people who have good parental population coverage and an F for those who do not, so the overall grade for 23andMe is a D. And that’s not good. It’s terrible in fact. By comparison, evaluating DNAPrint’s test by today’s standards, it would be a D- or F, while by the standards of its own day it was maybe a C+ or B-. In a relative sense, 23andMe is far worse, and in a gross sense, it is at best only marginally better. It’s a disaster. I’m not impressed by DNATribes either – parental population coverage there is relatively good, but…STR analysis? An F for them. They’ve announced they are going out of business – should they be upgraded to an A for that? In my opinion, DNATribes is/was even worse than 23andMe, and we can only hope 23andMe follows DNATribes’ lead in closing up shop. And, I’ll give FamilyTree DNA an F – F for FBI. Genetic privacy matters. Enough said about that. What about other tests?
We can consider AncestryDNA, yet another substandard test. If you look at their website, they make it sound like they have a really large number of parental populations, for example “see all regions” at their website. However, when the customers get their results, it is the same old story with the standard reference populations. True enough, the company will tell customers, in a qualitative sense, where the more specific place of origin of their majority ancestry most likely is, but that’s it. The more specific subregions are not being used as reference (parental) populations, and they are not directly used in a quantitative sense to give the ancestry proportions. The company’s website is therefore in my opinion highly misleading.
As a positive, they give error bars, which is a plus; however, the range they give is sometimes extremely broad. Results can vary over a range of 10-20%, etc. That’s not very precise, and demonstrates why these tests cannot be used to determine exact cut-offs. A person “100% pure” may actually be, say, 85%, and a person “85%” may actually be 100%.
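The cut-off problem can be shown with a small check on overlapping ranges. This is a sketch with hypothetical numbers: two customers whose point estimates differ by 15 points but whose reported ranges overlap, so the test cannot actually rank them. The `Estimate` type and the figures are invented for illustration.

```python
from typing import NamedTuple

class Estimate(NamedTuple):
    point: float  # reported percentage
    low: float    # bottom of the reported range
    high: float   # top of the reported range

def distinguishable(a, b):
    """True only if the two reported ranges do not overlap at all."""
    return a.high < b.low or b.high < a.low

# Hypothetical results for the same ancestral component.
person_1 = Estimate(point=100.0, low=85.0, high=100.0)
person_2 = Estimate(point=85.0, low=70.0, high=100.0)

print(distinguishable(person_1, person_2))
```

Because the ranges overlap, a hard threshold placed anywhere between 85% and 100% would sort these two people on noise rather than on a measured difference.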
From an online forum about this company’s test:
September 13, 2018 at 10:00 am
Thank you for the kind words. I’d love to hear how your new results compare to your tree. Northern Europeans seem to be quite happy with the new estimates; southern Europeans less so.
I can’t say I’m surprised. Look at the reference populations. In general, the sample sizes for Southern Europe are less than that of Northern or Eastern Europe. Italy has the most at 1000, but that is less than France, “Germanic Europe,” England/Wales/”Northwestern Europe,” as well as “Eastern Europe”/Russia. Population genetics studies have shown greater genetic heterogeneity in Southern Europe than the North. So, good coverage is particularly important in the South. Consider that the “movement” likes to tell us how Northern and Southern Italy are radically different, racially speaking. If that is so, then those regions should have their own separate reference populations. Or are they really similar? You can’t have it both ways. If Lombards and Sicilians are less similar than are Norwegians and Swedes to each other (the company has Norwegians and Swedes as separate reference populations), then the different Italian subgroups should have their own reference (parental) populations. On the other hand, if those Italian groups are so similar that a general “Italy” category is sufficient, then all the fetishists should stop foaming at the mouth over intra-Italian differences. Again, you can’t have it both ways. In general though, a test that distinguishes Norway from Sweden, and England from Scotland, should probably break apart places like Italy into subregions – which would be more honest given how they advertise the test on their website.
Actually, even some of those well represented regions have problems. What is “Germanic Europe?” Why not Germany alone? Why not separate North and South Germany? Different regions of France? Separate Russia from other Eastern European nations? England vs. Wales vs. “Northwestern Europe?” And Ireland and Scotland combined? Why? Now, as I have said, areas with greater genetic heterogeneity require more coverage, but, still, e.g., the English and Welsh are not identical and should not be lumped together as such.
I also read that the newest version of the test (like 23andMe) uses haplotypes rather than individual SNPs. If you do that, you MUST have excellent coverage for your reference (parental) populations. An error in misidentifying an entire chromosome block is going to be a lot more damaging than getting scattered SNPs incorrect. That amplifies the problem of insufficient reference population coverage and is another explanation why Southern European results have gotten worse after the change.
So, AncestryDNA gets a D/D+ for overall results, which would have been upgraded to D+ for giving error bars (however broad), but because they are (in my opinion) misleading customers as to what the reference base actually is and how detailed it is for subregions, they get downgraded to a D.
Now we will consider another terribly flawed and incompetent test – the National Geographic Geno 2.0 (Helix) test, which uses Next Generation Sequencing and is purported to be designed to look at “deep ancestry,” but which makes the error, consistent with other companies, of using extant, narrow, parental populations as proxies for “deep” ancestry, which is a major flaw. Their “reference populations” are extremely limited (as usual – the typical “parental privilege”), the labels they give ancestral components are strange, and the website is reported by some customers to be difficult to use. We will consider the various versions of the Geno tests, of which Helix is one.
Putting aside this person’s (somewhat dated) opinions of the tests (keeping in mind she derives from populations that may have better parental coverage – even at that time – than others), I find it interesting that a person who is predominantly of Northern European heritage has a substantial contribution of “Mediterranean” and “Southwest Asian” ancestral components as measured by one (older?) version of the National Geographic “deep ancestry” test. Granted that there is an unknown component in her genealogical ancestry, still, I believe that these data – to the extent they are in any way meaningful – likely represent Neolithic (and perhaps Bronze Age) influences. In other words, these components – including “Southwest Asian” – are a natural part of the European genepool, albeit represented to different degrees in different parts of Europe. Of course, I disagree with their “Mediterranean” category that lumps together genetically and historically disparate groups; however, in that case, it may represent a common thread (Neolithic?) of these groups, with the rest of the total ancestry of these groups being different. In any case, once again, we see the danger of taking labels literally, and also the problem of using current extant parental populations to represent ancient ancestral components.
See this. We note several things here. There isn’t a good range of parental populations. We note that all European populations – including Northern European populations – are being represented as being composed of different ratios of Northern European, Mediterranean, and Southwest Asian ancestral components (with some populations having low levels of other ancestries). Thus, different ethnies are represented as diagnostic ancestral components. Also, some of these populations are considered by 23andMe as distinct, discrete “pure” populations but are here represented as mixes of various ancient ancestral components.
Here is yet another (“next generation”) characterization of reference populations with their respective ancestral components. We notice three crucially important things. First, many of the populations are the same as in the original list (discussed above) but the ancestral components are different. The same populations, with the same gene sequences, are being represented differently with alternate sets of ancestral components (each component given descriptive labels by the company). Thus, how a population’s ancestral components are represented, and how those components are labeled, can change over time, differing between various versions of a test and of course varying between different companies’ tests. Second, again we see that European populations are composed of different components; they are all “admixed” to some degree based on the ancient components identified by the test. Third – and this applies to both versions of the National Geographic reference populations – what is considered mixes here would be considered “pure” in 23andMe, demonstrating how concepts of “purity” differ with what reference populations are used, how companies decide how to represent those populations, and what labels are used for description. Thus, in 23andMe, “European” includes “Greek/Balkan” as a category, as that is represented as part of their parental population base. In theory, someone genetically similar to 23andMe’s Greek/Balkan reference population could be “100% Greek/Balkan” and hence “100% European” – while that same ancestry in the National Geographic test will be shown as a mix of different ancestries, mostly European but some non-European. It’s the same gene sequences, the same ancestry, but interpreted in widely divergent ways by the companies and the tests. What one company labels “pure” another company – digging deeper in the ancestral mix – considers to be “admixed.” It’s all relative, not something definitive and set in stone.
There’s nuance and interpretation, shades of gray, not black and white. And both Nutzi fetishists and Normie ignoramuses cannot understand this.
Ancestry results are not something that can be interpreted as absolutes; they are dependent upon methodology, parental populations, and labels given to ancestral components, all leading to whether the company is assaying more recent ancestry, or “deeper” ancient ancestry. The “purity” myth is on display here, since “100% pure” ancestries in one test will be represented as mixtures of components in a different test. Labels and interpretations are not the same as objective reality. And this is a crucially important point. The ancestral components themselves are certainly made up of mixtures of earlier population groups. For example, with respect to the “Eastern European” component, which most likely reflects Slavic ancestry, the company states (emphasis added):
The large Eastern European component is typical for the region, and is itself a genetic composite of years of migration through the region.
So, again, this is something “movement” fetishists don’t understand – the ancestral components that they perceive as “pure” are themselves mixtures from earlier times, mixtures containing components that may well trace from outside of Europe. That is the nature of human biological reality. There is no “purity.” Instead, there are greater or lesser degrees of genetic similarity and difference.
If “Eastern European” does in fact reflect a basic Slavic ancestry, and if these results can be trusted, then it is interesting that Balkan South Slavs like the Bulgarians are heavily Slavic, only a few percentage points less than Russians and Poles, and more than the Czechs, all groups typically considered “more Slavic” than are Balkan groups. So, there may well be evidence for a common Slavic ethnoracial foundation for all these groups. Also note that Romanians are more “Southern European” than are Bulgarians, despite the fact that Romania is just to the north of Bulgaria, and based on simple gene flow you’d expect the results to be the opposite. Maybe there is something to the idea that there is a significant “Latin” “Roman” component to the Romanian ethny in addition to Slavic and other elements. What about “Diaspora Jewish?” Described as a distinct category here (and in 23andMe more specifically as “Ashkenazi Jewish”), academic population genetics suggests that this is in actuality a combination of Middle Eastern and European genetics. Once again we see a category that is either a single distinct “pure” ancestral component, or a mixed component, dependent on how it is analyzed and interpreted.
What about statistical significance? Confidence levels? Error bars? And, more fundamentally, what was the reason for changing the ancestral components between the different versions of the tests? Whatever the reasons, there’s no explanation that I find satisfactory; the overall attitude of all these companies tends to be “trust us, we’re the experts,” and the customer base accepts that, with some grumbling from those more skeptical and better informed. None of these companies provide the nuanced interpretations and more detailed explanations that I am providing with this post.
The National Geographic test does tell customers the two groups they are most similar to. Fine, but not enough. There needs to be a complete list, with quantitative measurements of genetic kinship.
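A complete, quantitative list of the kind asked for here could look something like the following sketch. It is a simplification under stated assumptions: similarity is measured as plain Euclidean distance between allele-frequency vectors, which stands in for the more elaborate kinship or distance statistics used in the population genetics literature. The reference labels and every number are hypothetical.

```python
import math

def rank_by_distance(individual, references):
    """Return (population, distance) pairs for every reference population,
    sorted from most similar to least similar."""
    ranked = []
    for name, freqs in references.items():
        distance = math.sqrt(sum((x - f) ** 2 for x, f in zip(individual, freqs)))
        ranked.append((name, distance))
    return sorted(ranked, key=lambda pair: pair[1])

# Hypothetical allele frequencies at four SNPs.
references = {
    "RefA": [0.9, 0.8, 0.1, 0.2],
    "RefB": [0.1, 0.2, 0.9, 0.8],
    "RefC": [0.6, 0.5, 0.4, 0.5],
}
individual = [0.7, 0.6, 0.3, 0.4]

ranking = rank_by_distance(individual, references)
for name, distance in ranking:
    print(f"{name}: {distance:.2f}")
```

The point of the design is that every reference population gets a number, not just the top two, so the customer can see how close the runners-up are and how far away everything else is.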
In summary, although some of the ideas behind the National Geographic test are interesting, the test itself is as bad as 23andMe (or worse). The basic problems are the same – lack of sufficient reference populations, lack of nuanced understanding of the meaning of the ancestral components, lack of real statistics, and the subjective labels given to ancestral components. If we couple this to a bad website, lack of explanation, and changes between all the different versions of the test (without sufficient explanation), this test is lucky to get a D, and not a D-.
Then we have LivingDNA, which has a leftist anti-racist narrative behind its founding, and which has received some criticism from customers online (but, then, of course, all these companies have their share of dissatisfied customers). The results from this company seem to be slightly more plausible than those generated by 23andMe, which isn’t saying much, but they suffer from the same basic problem – individuals from ethnies likely not well represented in their parental database get skewed results. I say “likely” because the company provides remarkably little information (that I can find) on their methodology and parental population database, but given the results they generate and given the general history of companies having weak representation of certain ethnies, it’s a fair bet that this company also exemplifies “parental privilege” for certain ethnies. So, basically, it is a real bad test, only slightly better (if that) than 23andMe and National Geographic.
They also exhibit the curious result that a person of 100% genealogical ancestry X turns out to be a mix of X, Y and Z – despite the fact that, e.g., Y and Z are known to be components of X. This is the same problem with all of these companies. It may well be that X is not represented well in their parental database; hence, the problem. That is more likely than the X person really having so much Y and Z that it presents in addition to the Y and Z inherent in X. Of course, the companies explain none of this nuance to their customers.
Indeed, a major weakness of this company (besides their politics and the questionable results) is the relative lack of information they provide about the test itself, and about the results, to their customers. On the one hand, it’s a weakness, but then, given that much of the information provided by other companies is questionable at best and bogus at worst, maybe being reticent is a positive. Addition by subtraction, so to speak.
No surprise of course that results from this test can differ very markedly from those obtained from, for example, 23andMe. Who expects consistency, what with different methodologies, parental population databases, gaps in those databases, labels given to ancestral components, etc.? Don’t expect careful statistical analysis either. We certainly can’t have that!
I note that they say results can be “refined” in the future as their database expands, a tacit admission that they do not presently have good coverage of certain ethnies. That also emphasizes the impossibility of utilizing precise cut-offs as the always-fuzzy boundaries are ever-shifting.
So, with all these weaknesses, balanced out by (possibly) marginally more plausible results than 23andMe, this company gets a “healthy” D+ for their efforts. Really, I could have given them a D, but they seem to be relatively new, so I’ll be generous for now, and we’ll see if they improve or get worse (more likely). I do not like their politics, but I’m not grading them on that. I’ll expect them to ruin their test with “upgrades” the same as every other company; in that case, they would then get the D (or D-) they likely really deserve.
Getting back to inconsistency of results – as we can read in various online articles and blog posts, people who use multiple companies typically get markedly divergent results. The main ancestry is usually similar but after that it all falls apart. Now, if the tests and their interpretations were all sound and consistent, how could that be possible? The answer of course is that with different sets of narrowly defined parental populations with insufficient coverage and different ways of breaking down ancestral components and different approaches to labeling those components, of course the results will be different. And, lacking sufficient information, as well as statistical information, how can we say one result is more accurate than another? The only thing we can go on is how well the results match what academic population genetics data say about the ethny or ethnies making up a person’s genealogical ancestry. If that’s the case, then why take the tests? Just go to the published papers. And, laughably, the companies do not even give customers remotely similar calculations for percentages of Neanderthal ancestry. What is it? Do they use different caveman reference populations? One company uses Fred Flintstone and the other uses Barney Rubble?
The deCODEme site used to have a free, good (albeit qualitative) kinship comparison based on 23andMe data – ranking relatedness to global ethnic groups, arranged by continent, and those results seemed reasonable, but it seems to be no longer offered. The original 23andMe site used to have a more quantitative estimate of relatedness at the continental and sub-continental (e.g., Northern vs. Southern European) level, as well as a PCA plot, but unfortunately they did away with that in favor of material less politically relevant (or not relevant at all).
I suppose if someone has the money to try every testing service they could look into it, and try all the companies, for personal interest. Again, this essay is not meant to be a comprehensive analysis of every company; I may have missed a test that is particularly good or bad. This post is instead meant as a brief and cursory survey of some of the main current competitors in the field, coupled with some general commentary on the tests themselves.
In any case, I agree with Johnson here. Past “Old World” admixture is part of the European genepool. Certainly, we can always strive to improve the genetic situation (e.g., eugenics), but we are what we are. We have to look to the future, not the past.
Grades for (autosomal) ancestry testing companies:
National Geographic: D
DNA Tribes: F
Others are not worth mentioning or I have insufficient data.
The pattern is of very low grades, reflective of the reality that the overall state of current commercially available ancestry testing is poor. And just as the companies claim that the data they present to their customers may change and become more “refined” with more parental population coverage, so may the grades I give these companies change (likely for the worse, given their poor performance heretofore) and become more refined with more data as well. So, expect grade updates in the future. Also, new companies may come into existence and those may be evaluated as well.
The most urgent need is proper parental (or “reference”) population coverage. No more “parental privilege” affirmative action for some groups and not for others. Either add more parental population coverage or have the integrity only to offer the tests to customers who match the reference profiles. Otherwise, it is all a misleading fraud.
In addition, these tests need to be interpreted in a relative (e.g., greater or lesser degrees of different ancestral components comparatively speaking) rather than an absolute (e.g., definitive results with hard cutoffs, concerns about “purity”) fashion. Given the realities of uncertainty and methodology, even a good test would need to be interpreted in such a fashion, much less the mediocrity we have to deal with. Of course, Nutzis will remain incapable of understanding any of this.
Really, what is needed is genetic kinship assays on all populations, comparing individuals and populations to each other, but I suppose such a biopolitically relevant metric is nothing we should expect any time soon (or ever).
One could argue that ancestry testing as it exists today could be, at best, an amusing personal hobby for individuals, if it weren’t being politicized by actors on both the Retard Right (see quotes at the beginning of this post) and the Loony Left (LivingDNA’s anti-racialist agenda, deCODE’s “gotcha” of Watson, and the Cobb setup debacle). But we live in an age where everything is politicized, for better or worse. In that case, we had better focus on genetic kinship, which is politically relevant with respect to EGI.
But instead we’ll have more juvenile ignorant blustering from entitled Nutzis basking in their “parental privilege” affirmative action ancestry results.
Needless to say, I was very, very surprised with the results of my DNAPrint “geographic ancestry” test results when I received it, and it showed a 21% East Asian content and 79% European instead of a 100% European which I had expected. In discussing this with AncestrybyDNA lab personnel I have learned that surprisingly to them some other PA Germans tested have had similar significant high teen, low 20’s% East Asian content results. At present they have no clear explanation as to why.
The “clear explanation” seems obvious in retrospect. Compared to the Romneyite parental population for “European,” some Germans would appear to be 4/5 Romney and 1/5 Chairman Mao. If the parental population had been “PA Germans” then all those folks would have been “100% European.”
I’ll say it again for the mentally slow: The results of ancestral component testing are going to absolutely and directly depend on the choice of parental populations.
I need to summarize the whole “parental privilege” problem for the Nutzi crowd. I’ll try to make it as simple as possible. Let’s consider it first in outline form.
1. A company defines a particular ancestral component as “European.”
2. The reason for that label is that the ancestral component is defined by a parental (or reference) population (or populations) that is European.
3. But why does the company label a particular parental/reference population as European? Well, it is because the population is historically tied to Europe, it derives from a nation or region within the boundaries of Europe, and the population came into existence as a distinct group within Europe. All of which essentially matches much of what I define as an indigenous population.
4. Very good. So an ancestral component is European if it is derived from, or defined by, or represented by, a population that is European. European populations tend to possess ancestral components that are “European” because those components are defined from an analysis of European populations.
This is saved from being circular reasoning by the fact that the initial definition of a population as European is not based on the ancestral components (that are themselves defined as European because they come from populations labeled as European), but instead because of the historical existence of the population within Europe, as an indigenous population of Europe, so defined.
5. OK. But, if population groups A, B, C, and D are all historically European ethnies, if they all historically exist and existed within specific regions of Europe, then why should A and B be among the parental populations that define European ancestral components, and not C and D? There is no reason to privilege A and B over C and D. The only practical reason is that the company simply doesn’t have any, or enough, samples from C and D, while they have many samples from A and B.
6. Because of this deficit of C and D, and presence of A and B, individuals of ethnic background A and B, or ethnies very similar to A and B, are essentially being compared to themselves in the test. If A and B define the ancestral components of “European,” and your ancestry is A and/or B (or something similar), it stands to reason you will test out as being close to, or at, “100% European,” with the subpopulation being A and/or B. Again, you are essentially being compared to yourself.
On the other hand, individuals from C and D are being compared against a standard defined by A and B. So, individuals from C and D will be represented as “mostly A and/or B” but with some “E and F” – with “E and F” being ancestral components labeled as from other, non-European, populations that happen to be well-defined in the parental population database.
7. On the other hand, if C and D were included as parental populations, then their ancestral components would be included as “European” and results for individuals of C and D ancestry would be similarly “European” as for A and B, with the subpopulations in this case being C and/or D.
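The outline above can be put into a toy numerical sketch. To be clear about what is assumed: the allele frequencies below are invented for illustration, the population names `pop_a`, `pop_c` and `pop_e` simply follow the letters used in the outline, and the brute-force mixture fit in `estimate_components` is a hypothetical stand-in, not the algorithm of any actual testing company. Population C is constructed to sit a little way along a cline from A toward E, with some variation of its own.

```python
from itertools import product

def mix(weights, refs):
    """Expected allele frequencies for a weighted mixture of reference populations."""
    n_loci = len(refs[0])
    return [sum(w * r[i] for w, r in zip(weights, refs)) for i in range(n_loci)]

def estimate_components(target, refs, step=0.01):
    """Grid-search the mixture weights (non-negative, summing to 1) that best fit target."""
    n = round(1 / step)
    best, best_err = None, float("inf")
    for combo in product(range(n + 1), repeat=len(refs) - 1):
        if sum(combo) > n:
            continue  # weights would exceed 100% in total
        weights = [c / n for c in combo] + [(n - sum(combo)) / n]
        fit = mix(weights, refs)
        err = sum((f - t) ** 2 for f, t in zip(fit, target))
        if err < best_err:
            best, best_err = weights, err
    return best

N_LOCI = 40

# Invented frequencies: A is the well-sampled "European" reference,
# E is a well-sampled non-European reference.
pop_a = [0.10 + 0.02 * (i % 10) for i in range(N_LOCI)]
pop_e = [0.90 - 0.02 * (i % 7) for i in range(N_LOCI)]

# C is an unadmixed European ethny that simply sits 15% of the way along
# the cline from A toward E, plus some variation of its own (assumption).
pop_c = [
    0.85 * a + 0.15 * e + (0.03 if i % 2 == 0 else -0.03)
    for i, (a, e) in enumerate(zip(pop_a, pop_e))
]

# Case 1: C is missing from the reference panel, so it is scored against A and E only.
two_way = estimate_components(pop_c, [pop_a, pop_e])

# Case 2: C is included as a parental population in its own right.
three_way = estimate_components(pop_c, [pop_a, pop_c, pop_e])

print("C vs. panel [A, E]:   ", two_way)
print("C vs. panel [A, C, E]:", three_way)
```

In this toy setup, scoring C against a panel of only A and E should report roughly 85% A and 15% E, while adding C to the panel should report 100% C with nothing left over for E. Same population, same data, different “admixture,” and the only thing that changed was the choice of parentals. Real tests work from individual genotypes and far more markers, so treat this only as an illustration of the logic.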
And in the rare cases in which testing companies decide to be honest, they admit the reality of “parental privilege” – although of course they do not term it as such. Thus, we read:
On the old Decodeme site (login was required, so no URL available), the following was admitted (emphasis added):
The reference population samples were obtained from the HapMap project – they are:
1) European Americans from Utah – who most likely have a majority of north European ancestry
2) Yoruban Nigerians
3) Chinese from Beijing and Japanese from Tokyo.
The characteristics of these reference population samples and the clinal nature of human genetic variation (i.e. the fact that people typically become gradually more different as you travel further from your country) have several minor implications for the interpretation of the results. For example, a deCODEme user with a majority of ancestors (during the past >2 generations) from south-east Europe, will typically see higher percentages of African and Asian ancestry than a deCODEme user whose ancestry is mainly from north-west Europe. The difference will be small, but present.
So, deCODEme at least had the honesty to admit that populations not represented in the parentals would exhibit artefactual “admixture” due to clinal differences in gene frequencies. As to what level of difference is “small” they do not say, but keep in mind that another company was stating that close to 9% “admixture” was close to the levels of statistical significance.
Here’s the response from our scientist who developed the algorithm underlying ancestry painting: “There’s no case that I’ve seen where 9% Asian ancestry does not indicate genuine East Asian or Native American ancestry. I’ve looked at order thousands of individuals of known ancestry, that approximately cover the gamut of human diversity. Thus I would regard 9% as a reliable indication of East Asian or Native American ancestry. That said, 9% is close to the threshold above which the following statement can be made, so it is still theoretically possible, albeit very unlikely, that the prediction is not true.”
If that is so, and then you add to that the extra uncertainty due to “parental privilege” what are we talking about here as potential error for non-privileged populations? 10%? 15%? More? In some cases that falls within the error bars provided by companies like AncestryDNA!
Now, of course, there really is some (modern, historical) admixture in Europe, higher in some regions than in others. But the amount of real admixture is much lower than what one would suppose from looking at commercially available ancestry testing that inflates admixture for the reasons explained above – an inflation that, by some happy coincidence, just so happens to be compatible with the leftist political views of the companies, their founders, and their employees.
While single locus markers are absolutely useless on an individual basis, they do have some utility for populations, with results averaged out over large sample sizes. Such data suggest that real admixture in Europe tops out at about 5%. And much of that is non-European Caucasian or Central Asian. More divergent sub-Saharan African or East Asian admixture is going to be significantly less than 5%.
So, in the end, the real reason why something like the post linked here is essentially correct is that the typical “movement” activist is too stupid to understand all of the points made in my post that you are currently reading here at EGI Notes. Even when the companies themselves admit that “parental privilege” is real, even when the companies admit the fairly large statistical error, and even when confronted with the obvious logic that someone essentially compared to themselves is going to be, by necessity, “pure,” the Nutzi retards still won’t get it. Or, maybe it is not that they are too stupid, but that they lack the incentive. After all, those who benefit from affirmative action rarely criticize the program; the same applies to “parental privilege.” Let some testing company start using, say, Sardinians as the reference population to define “European,” and all the Nutzis suddenly start getting “exotic mixture,” and I’m sure they’ll all cry bloody murder. All of a sudden, everything written here, and all the open admissions of the companies themselves, will become crystal clear and acceptable.