More Testing Follies and Other News

More 23andMe fails and other news.

As background, read this.  Also read this.

Prepare for an unexpected shock – Sallis is proven right once again.

Over the last year or two, companies such as 23andMe have been updating their customers’ ancestry results; in almost all cases that has been as a direct result of expanding their parental (reference) population sample database with all sorts of non-European samples. They do this (concentrating on the expansion of samples from outside Europe) even though they have grossly insufficient coverage from various parts of Europe (particularly the South and East) and even though most of their customers are of European descent. 

In the months since I posted the above linked criticisms, I’ve been studying online forums in which customers discuss their results, including the more recent updates, as well as looking at statements by the companies themselves, and also material forwarded to me by correspondents.  

The problems accompanying these updates, combined with the pre-existing problems of the tests, essentially completely confirm my previous criticisms and interpretations of these ancestry tests, particularly with respect to the issue of “parental privilege.”

In these updates, in general, the changes in ancestral proportions perfectly mirror the additions of parental population samples that are likely inappropriate for the customers in question (based on their actual, proven genealogical ancestry). Thus, customers who have poor parental population coverage of their actual ancestry exhibit increased ancestral proportions for precisely those (genealogically non-ancestral) parental reference populations that had their numbers increased in companies’ databases.

Therefore, and 100% consistent with my past criticisms, the results are completely dependent upon the choice of parental populations, and the degree to which particular parental populations are represented in the databases. More of a certain parental population shifts ancestral proportions precisely in that direction, causing customer results to fluctuate wildly dependent on the parental population choices.

In addition, with these updates, the “unassigned” percentages for the conservative estimate (90% confidence) markedly increased for these same customers (in these cases specifically 23andMe, which provides confidence levels and “unassigned” portions of the genome – other companies do not generally do so), clearly demonstrating that the updated results are less accurate than the preceding. 

As a model of this, look at the first example here. Consider a scenario in which the testing company refuses to add (more, if they have any) “green” parental population samples, but significantly increases the representation of “yellow” (but not “blue”) samples in their database. What happens? “Green” individuals are suddenly shown to exhibit a much greater percentage of “yellow” ancestry – which is purely a consequence of the shifting representation of different groups in the parental population database.  What if the number of “yellow” was decreased, and “blue” increased?  Then the “greens” would be more “blue.” But, here’s the rub – if significant numbers of “green” were introduced, then the “blue-yellow” “greens” would – presto! – be represented as mostly “green.”
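The green/yellow/blue scenario can be sketched as a toy calculation. This is only an illustration under loud assumptions, not any company's actual algorithm: populations are reduced to hypothetical centres on a single made-up axis (`CENTRES`), every population has the same assumed spread (`SIGMA`), the prior for each population is assumed proportional to its number of reference samples, and a segment is labeled “unassigned” when no population reaches the chosen confidence level. All names and numbers are invented for the example.

```python
import math

# Hypothetical population centres on a single made-up "genetic axis".
CENTRES = {"blue": 0.0, "green": 0.5, "yellow": 1.0}
SIGMA = 0.15  # assumed within-population spread


def posterior(x, panel):
    """Posterior over the panel's populations for one segment at position x.

    `panel` maps population name -> number of reference samples. The prior is
    taken to be proportional to that count, which is an assumption of this toy.
    """
    weights = {}
    for pop, n in panel.items():
        if n > 0:
            z = (x - CENTRES[pop]) / SIGMA
            weights[pop] = n * math.exp(-0.5 * z * z)
    total = sum(weights.values())
    return {pop: w / total for pop, w in weights.items()}


def paint(segments, panel, confidence):
    """Fraction of segments given each label at the chosen confidence level."""
    counts = {}
    for x in segments:
        post = posterior(x, panel)
        best = max(post, key=post.get)
        label = best if post[best] >= confidence else "unassigned"
        counts[label] = counts.get(label, 0) + 1
    return {label: c / len(segments) for label, c in counts.items()}


# A "green" customer: 40 segments spread evenly around the green centre.
customer = [0.4025 + 0.005 * i for i in range(40)]

no_green = {"blue": 100, "yellow": 100}
more_yellow = {"blue": 100, "yellow": 1000}
with_green = {"blue": 100, "green": 100, "yellow": 100}

for name, panel in [("no green", no_green),
                    ("more yellow", more_yellow),
                    ("with green", with_green)]:
    for confidence in (0.5, 0.9):
        print(name, confidence, paint(customer, panel, confidence))
```

In this toy, the same “green” customer is split between “blue” and “yellow” when no “green” references exist, shifts toward “yellow” at 50% confidence when only the “yellow” panel grows, and becomes mostly “green” with little left unassigned at 90% once “green” references are added. How the “unassigned” share responds to a panel expansion depends on the model; this sketch only shows that labels track the panel's composition.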

Again, my criticisms have been 100% confirmed as legitimate by the direct correspondence between expansion of certain parental populations in the databases and the increased ancestral proportions for those same populations among customers who lack proper parental population representation.

An equally valid confirmation of my criticisms is that for many of these customers, the updated ancestral proportions have been accompanied by an ever-increasing “unassigned ancestry” percentage when considering results at the (more proper) “conservative” (“90% confidence” – which itself is a bit too low) estimate levels – often increasing to ludicrous levels. If “updates” markedly increase the amount of unknown ancestry at reasonable confidence levels, then this is strong evidence that the updates are providing ancestry estimates that are less accurate than those preceding.  How could it be otherwise? By introducing parental populations that are more distant from customers’ actual ancestral backgrounds, in the context of refusing to increase the appropriate parental population representation for those customers, of course the results will be less accurate, with less of the genome being reliably assigned at higher levels of confidence. The more the parental populations are unrepresentative of the customer, the less likely they will fit the data at the highest confidence level – hence, “unassigned ancestry.”

Anyone getting over 20-25% “unassigned” at the 90% level should view their results with extreme skepticism.  What if it is over 40%?  That is in my opinion essentially useless.  And what about levels exceeding 50% (!) – and some of them (believe it or not) do?  That is in my opinion a tragicomic embarrassment. That’s what one could expect if one tried to represent, say, Russians using some English reference samples and an increasing Japanese reference database. That the company actually releases data with such high “unassigned” levels is shocking.  If person A has an “unassigned” (at 90% confidence) of, say, 5-15% (or less) and someone else has 40-55% (or more) – how can you possibly equate the validity of those two sets of data?  In some cases, the differences are at the level of an order of magnitude.  

Note to testing companies: More reference samples from Europe. Many, many more, covering ALL areas.  Most of your customers are of European origins.  You need high level coverage from throughout Europe, all of Europe, before you do your SJW sampling of other areas to satisfy the diversity-mongers.  Get all of Europe covered before you handle those Egyptians, Tibetans, Nepalese, Martians, Neptunians, or whatever. Your customers are your customers, not SJWs screeching about “diversity” in reference populations. You want “diversity?”  First start with Europe.

ALL of your customers should have “unassigned” in the low range at 90% confidence – not just those with “parental privilege.” And even for those latter customers, who are much better off than the others, the results are still suboptimal.  Consider Derbyshire’s data, which is not fully matching his actual ethnic ancestry; however, at least Northwest Europeans fall within the correct sub-region, even if national-ethnic affiliations are not always on target. The swarthoids and slavoids often do not get even that.

For now, 23andMe may be useful for the raw data (that can in theory be used for kinship analysis, which is biopolitically relevant) as well as the health data. The ancestry testing is laughable.  And, by the way, the “timeline” feature is a bad joke, based as it is on the flawed “chromosome painting” and consequent ancestry estimates. Note to company geniuses: Just because you model someone’s ancestry with your limited and inappropriate reference parental samples, does NOT mean their actual ancestry derives from those sources, so that you can “time” when that non-existent ancestry entered their ancestral line (shown, ludicrously and objectively mistakenly, to be recent).

Going back to the Russian (23andMe) customer scenario, let’s model it differently for the sake of illustration. In one scenario, there are no Russian reference (parental) samples, only Germans and Central Asians.  At 50% confidence, the Russian would likely be represented as mostly German but with a significant Central Asian ancestral component.  At the low level of 50% (!) confidence, some chromosome fragments would seem slightly more Central Asian than German and would be assigned thus – it’s only at the coin-flip level of confidence, remember.  At 90% confidence, likely 40-50+% of the chromosome fragments, and hence the ancestry, would be “unassigned” – since at that more reasonable level of confidence, many of the chromosome fragments do not at all match either German or Central Asian. Of the remainder, most would be German, with a small minority of Central Asian. What if the Central Asian reference population was suddenly increased with more samples – increasing the chances that at 50% confidence a match was more likely with some new Central Asian sample than with the original German parental samples?  The Central Asian proportion of the Russian customer’s “results” would be increased at 50% confidence, and the “unassigned” would increase at 90% confidence – the latter occurring because these new results are actually less accurate than the preceding. Thus, at 90% confidence, the chromosomal fragments are not matching these new Central Asian samples. What if the parental populations were Sardinian and Central Asian? Likely the Central Asian component would be larger at 50% confidence than with the German and Central Asian parentals, since Russians are more genetically distant from Sardinians than they are from Germans. And here, with Sardinian parentals, the “unassigned” at 90% confidence would be even larger than with the German parentals.

Now, let’s do another scenario.  Here, there is a large and very comprehensive Russian parental population – many reference samples from ethnic Russians from all parts of Russia. What happens then? This same Russian customer – the same individual with the same genome – is now represented as being overwhelmingly Russian (and since Russian would be considered “European” by the company labeling, the customer would be so labeled), with only smaller amounts of other ancestries (since the customer may not be an exact fit to the co-ethnic reference samples). Note that the results from the two scenarios would be completely, utterly different. Also, in the latter scenario, at 90% confidence, the “unassigned” percentage would be low, since there would be a good fit between the Russian customer’s chromosome fragments and a large and comprehensive Russian reference population.

Consider another scenario.  Imagine if “German” was defined only by samples from North Germany. A Bavarian at 50% confidence might be mostly German but with a strong minority of other ancestries, with a hefty “unassigned” at 90% confidence. If “German” was subsequently redefined to also include many South German/Bavarian samples, then the Bavarian would see his German results greatly increase and his “unassigned” decrease.  

This isn’t rocket science or nuclear physics.  When you identify ancestral components by comparison to reference samples, then the composition of those references will of course determine the outcome of the ancestry determination. The accuracy of that determination can be ascertained by how much of the ancestry is “unassigned” at higher levels of confidence.

Once again: Wrong, wrong, they’re always wrong.

An amusing comment that I’ve found online (emphasis added): 

So basically the ancestry DNA test claims I’m 58% Great Britain! I am not even from Great Britain, I’m German I live in Great Britain though

Whew!  It’s good he doesn’t live in Uganda, imagine what results he would have gotten then!

In all seriousness, AncestryDNA may be the worst test out there…either that or 23andMe…both are borderline D/F grades in my opinion, absolutely horrid. AncestryDNA specializes in providing bizarre data points that overlap with zero. 23andMe isn’t much better. They’re competing for last place, putting a lot of effort there. Probably using the raw data for health-related issues may be the best use of that nonsense.

The lack of proper parental populations for Europe is a major problem.  I believe that this is a fundamental reason why the results for European-derived peoples seemingly get worse and more absurd every time that these companies “update” their tests. These companies seem to be going “PC” and adding reference populations from non-White, non-European populations; and since results are modeled based on the available reference population samples, the more non-White references you add, the greater the probability of assigning ancestral components to those populations. Indeed, there seems to be a correlation between the politically-motivated stress on adding “diverse” parentals and increasingly absurd results. We need more parental populations from Europe – where most of the people using the tests derive their ancestry from.

Let’s take an example. Imagine a testing company wants to determine the ancestral proportions of Iraqis. They model the “admixture” under four scenarios. One – a large reference population from Iraq; many Iraqi samples as parentals. Two – few samples from Iraq, but many samples from Jordan, Germany, and Ghana. Three – the same as two, but with the addition of a large number of reference samples from South Asia. Four – the same as two, but with the addition of a moderate number of samples from Turkey and a large expansion of the samples from sub-Saharan Africa. Now, under those four scenarios, will the results from a given set of Iraqis be the same, or even very similar? Hardly. They would be markedly different. Only when there is a significant number of reference samples from the specific population of the person or persons being tested will the results be reasonably accurate, and even then the results can be altered when there are significant changes in the types and numbers of other reference populations used to model the “admixture.” These are facts that cannot be responsibly evaded by the testing companies, although they would like to pretend that this is not a factor.

The current state of commercially available ancestry testing means that such testing is virtually useless for significant numbers of European-derived people. Actually, less than worthless, as the results are absolutely incorrect. Again, the major advantage of this testing is using the data to make an “end run” around the paternalism of the medical community and getting a handle on health issues – assuming that the data are accurate, which is an issue that needs to be confirmed if something “bad” is discovered.

Genes and Health in Der News

In all cases, emphasis added.

But, but, but….I thought we were all exactly the same:

The inclusion of diverse ancestries in the present meta-analyses allowed us to identify two loci that would have been missed in meta-analyses of European-ancestry individuals alone. In particular, the lead variant (rs141588480) in the SNTA1 locus is only polymorphic in African and Hispanic ancestries, and the lead variant (rs190748049) in the CNTNAP2 locus is four times more frequent in African-ancestry than in European-ancestry. Our findings highlight the importance of multi-ancestry investigations of gene-lifestyle interactions to identify novel loci.

Comparing admixed Latin Americans to the Finnish population isolate: 

Most population isolates examined to date were founded from a single ancestral population. Consequently, there is limited knowledge about the demographic history of admixed population isolates. Here we investigate genomic diversity of recently admixed population isolates from Costa Rica and Colombia and compare their diversity to a benchmark population isolate, the Finnish. These Latin American isolates originated during the 16th century from admixture between a few hundred European males and Amerindian females, with a limited contribution from African founders. We examine whole-genome sequence data from 449 individuals, ascertained as families to build multigenerational pedigrees, with a mean sequencing depth of coverage of approximately 36×. We find that Latin American isolates have increased genetic diversity relative to the Finnish. However, there is an increase in the amount of identity by descent (IBD) segments in the Latin American isolates relative to the Finnish. The increase in IBD segments is likely a consequence of a very recent and severe population bottleneck during the founding of the admixed population isolates. Furthermore, the proportion of the genome that falls within a long run of homozygosity (ROH) in Costa Rican and Colombian individuals is significantly greater than that in the Finnish, suggesting more recent consanguinity in the Latin American isolates relative to that seen in the Finnish. Lastly, we find that recent consanguinity increased the number of deleterious variants found in the homozygous state, which is relevant if deleterious variants are recessive. Our study suggests that there is no single genetic signature of a population isolate.

Alon Ziv weeps.  In this case, the more admixed populations, with their bottlenecks and consanguinity, have significant stretches of homozygosity and more deleterious alleles than the more isolated Finns.  So, “increased genetic diversity” does not necessarily equate to fewer deleterious alleles.  And all of this doesn’t even consider outbreeding depression from breaking up coadapted gene complexes.

Alcohol consumption, SNPs, and ancestry:

Alcohol consumption is a complex trait determined by both genetic and environmental factors, and is correlated with the risk of alcohol use disorders. Although a small number of genetic loci have been reported to be associated with variation in alcohol consumption, genetic factors are estimated to explain about half of the variance in alcohol consumption, suggesting that additional loci remain to be discovered. We conducted a genome-wide association study (GWAS) of alcohol consumption in the large Genetic Epidemiology Research in Adult Health and Aging (GERA) cohort, in four race/ethnicity groups: non-Hispanic whites, Hispanic/Latinos, East Asians and African Americans. We examined two statistically independent phenotypes reflecting subjects’ alcohol consumption during the past year, based on self-reported information: any alcohol intake (drinker/non-drinker status) and the regular quantity of drinks consumed per week (drinks/week) among drinkers. We assessed these two alcohol consumption phenotypes in each race/ethnicity group, and in a combined trans-ethnic meta-analysis comprising a total of 86 627 individuals. We observed the strongest association between the previously reported single nucleotide polymorphism (SNP) rs671 in ALDH2 and alcohol drinker status (odds ratio (OR)=0.40, P=2.28 × 10^-72) in East Asians, and also an effect on drinks/week (beta=-0.17, P=5.42 × 10^-4) in the same group. We also observed a genome-wide significant association in non-Hispanic whites between the previously reported SNP rs1229984 in ADH1B and both alcohol consumption phenotypes (OR=0.79, P=2.47 × 10^-20 for drinker status and beta=-0.19, P=1.91 × 10^-35 for drinks/week), which replicated in Hispanic/Latinos (OR=0.72, P=4.35 × 10^-7 and beta=-0.21, P=2.58 × 10^-6, respectively). Although prior studies reported effects of ADH1B and ALDH2 on lifetime measures, such as risk of alcohol dependence, our study adds further evidence of the effect of the same genes on a cross-sectional measure of average drinking. Our trans-ethnic meta-analysis confirmed recent findings implicating the KLB and GCKR loci in alcohol consumption, with strongest associations observed for rs7686419 (beta=-0.04, P=3.41 × 10^-10 for drinks/week and OR=0.96, P=4.08 × 10^-5 for drinker status), and rs4665985 (beta=0.04, P=2.26 × 10^-8 for drinks/week and OR=1.04, P=5 × 10^-4 for drinker status), respectively. Finally, we also obtained confirmatory results extending previous findings implicating AUTS2, SGOL1 and SERPINC1 genes in alcohol consumption traits in non-Hispanic whites.

Jews and Europeans have, apparently, been enemies from the very beginning.

As members of Der Movement agonize over those dastardly “Big Pharma products” violating our precious bodily fluids via injection (the horrors of vaccination!  Louis Pasteur the cryptic Jew!  Jew doctors!), the real threat to White health is that the average White has a BMI rivalling that of a black hole singularity. That is why diseases like Type 2 Diabetes are increasing in frequency, including among the young. But, hey, those needles are real scary and all.  Big Pharma!  Big Pharma!  Pass another Big Mac, please.