Tales of Fst: Sallis vs. Lewontin

A small-scale yet informative analysis of genetic variation.

We are all aware of the “more variation within groups than between groups” argument against the biological validity of race.

Now, I believe – or at least hope – that honest population geneticists (albeit very few if any exist) know better not to make absurd claims about Lewontin’s “finding” – that it “makes race meaningless” or that “people are more closely related to members of other races than members of their own race.”  At least they won’t say that among themselves, in their publications, or among other types of academics, but maybe they’ll still try to fool the rubes; after all, from my personal experience most population geneticists are anti-White SJW leftists.

The problem is more with your rank-and-file leftist, your Tim Wise types, your opinion writers, “anthropologists,” openly political population geneticists (the majority who are apparently dishonest), writers of “popular science,” politicians, bloggers, anti-White activists, etc. who make absurd comments about “more genetic variation within than between.”  Not only do they foolishly proclaim that it invalidates the race concept by making distinctive grouping impossible – that is absurd as Edwards so cogently pointed out – but they are even in error on a more fundamental level.

You hear these people make the most bizarre claims – that “more variation within than between” means that “Whites are genetically more similar to Blacks than they are to other Whites”- comments that reflect a complete misunderstanding of the concept (to be fair, those “academics” who have for decades championed Lewontinism to the rubes have, in my opinion, intentionally attempted to promote such a misunderstanding for political reasons).

You see, the basic problem is that these people think there is something special – in the negative sense – about classifying people by race (or ethnicity) that creates the Lewontin finding.  Because there is more genetic variation within “races” – for example, more variation within Whites than between Whites and Blacks – they think that means that if you were to compare one random group of Whites to another similar group of Whites then there would be more genetic variation between those groups of Whites than within those same groups (ignore the gaps of logic in this implicit, or sometimes overt, leftist “argument).  In other words, they say or imply, something like this:

Race is such a bad way to divide people, it is so wrong and meaningless, that WHEN you divide people by race THEN you get the result that there is more genetic variation within groups than between them.  [Implication: this difference in the apportionment of variation occurs as a result of binning people by race].  If we were to bin people randomly, arbitrarily, or by how “closely related they are independent of race” (whatever that means), then there would be more variation between than within groups, but when we use this stupid artificial racial boundary we see more variation within.  Indeed, the fact that binning people by race creates a situation that genetic variation is greater within the group proves that race is an invalid concept – how can a grouping that creates “more genetic variation within groups” be better than random groupings or aracial groupings that do not (we assume) do so?

You see, this is the implied message.  Race (and ethnicity) are negatively “privileged” groupings that create the Lewontin “finding” – after all, that’s how he reported it, and after all, that’s how it’s been discussed for decades, through the lens of racial classification.

My argument has been that this is a complete misunderstanding.  See this.  Excerpts, emphasis added:

With respect to Lewontin’s well known “there is more genetic variation within groups than between groups” we need to clarify whether the 85:15 split has any meaning other than the fact that the bulk of human genetic variation is randomly distributed. 

Comparing Danes vs. Nigerians: 85% variation within each group and 15% between.  The same would be observed with Japanese vs. Iranians. 

What if you considered a mixed group of Danes + Nigerians as a single population, and the same for Japanese + Iranians?  If you then apportioned genetic variation between D+N vs. J+I you would still get more variation within than between. 

If you went in the opposite direction, and considered Japanese from Tokyo as one population and Japanese from Kyoto as another population, the same within/between distinction would hold.  If you compared one Japanese family to another, you would also see more genetic variation within the group (family) than between families. 

As has been pointed out previously by others, a significant amount of genetic variation is found within single individuals; thus, if you were to compare one Japanese individual to another, about half the genetic variation would be found within the single individual. 

For any set of human groups, one would expect to find more genetic variation within the group than between groups.  

Hence, the “within group” component of genetic variation is found within any defined set of individuals, and is randomly distributed among individuals.  It cannot be used to assert that members of an ethny are more dissimilar than to other ethnies, nor can it be used as a legitimate argument against the reality of genetically distinct population groups. 

And this doesn’t even touch upon the fact that with respect to many phenotypically relevant traits under selective pressure, racial differences in allele frequency are so great that there is actually greater genetic variation between compared to within groups.

Thus, most genetic variation is randomly distributed among individuals irrespective of classification. It has nothing to do with race (or ethnicity).  Racial classifications are not – as the leftists slyly imply – in any way special in exhibiting more variation within than between.  ALL and ANY human groups – even random, arbitrary groupings of people from within the same race or ethnic group – will show the same pattern of more variation within than between.  You can mix up groups of different races and get the same result.  You can create any arbitrary groups of individuals, in endless combination, and no matter how you do it, you will always get more variation within than between.

I doubt Lewontin and all the other academics who have foisted his “finding” on the masses were/are so stupid as to not realize this. They must understand that any and all human groupings, no matter how random or absurd, will show the same pattern.  Then, I suspect, knowing this, they decided to specifically choose racial classification as an example in order to trick people to believe that race is invalid, and do so for political reasons.

In actuality, the reality is the opposite, the genetic variation argument actually supports race, since the portion of genetic variation that is between groups is greatest when you bin people based on this concrete biological concept, and the between group variation portion is smaller (or in some cases virtually non-existent) when you bin people by random, or other arbitrary, methods.  Dividing Whites from Blacks is when you get the greatest amount of variation between, NOT dividing Whites from other Whites.  There was never reason to expect that human genetic differentiation was so extreme that the differences in genetic variation between groups would be greater than the unstructured variation found within groups.  If that was so, we would be totally different species, rather than variations (no pun intended) of one species.  

Let’s look at some data, but first, some comments on methods.  I have criticized Fst (and its variants) before – it is a lousy metric for measuring genetic distance, kinship, etc.  What it is, instead, is a measure of relative genetic variation:

The fixation index is a measure of how populations differ genetically. One derivation of the fixation index is FST = (HT – HS)/HT, in which HT and HS represent heterozygosity of the total population and of the subpopulation, respectively. This derivation measures the extent of genetic differentiation among subpopulations. The value of FST can theoretically range from 0.0 (no differentiation) to 1.0 (complete differentiation, in which subpopulations are fixed for different alleles).  

A simple visualization of this idea is that of two squirrel subpopulations that are physically separated by a canyon and therefore cannot interbreed. Each subpopulation is homozygous for one allele of a SNP (in other words, each individual of one subpopulation might have a C at that position, while individuals from the other subpopulation have a T). The heterozygosity of the total population (HT) would therefore be 0.5. The heterozygosity of each subpopulation (HS) would be 0.0 (because every member of the subpopulation is homozygous). The calculation of FST in this oversimplified case would be (0.5 – 0.0)/0.5 = 1.0. In other words, 100% of the genetic variation of this population is between subpopulations, with zero variation within subpopulations.  
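The formula quoted above can be checked with a few lines of Python. This is only a minimal sketch for a single biallelic locus: the function names and the equal-size default weighting are illustrative assumptions, not something taken from the quoted source.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_from_frequencies(freqs, weights=None):
    """F_ST = (H_T - H_S) / H_T for one biallelic locus.

    freqs   -- allele frequency in each subpopulation
    weights -- relative subpopulation sizes (assumed equal if omitted)
    """
    if weights is None:
        weights = [1.0] * len(freqs)
    total = float(sum(weights))
    w = [x / total for x in weights]

    # H_T uses the pooled allele frequency; H_S is the weighted mean
    # of the heterozygosities of the individual subpopulations.
    p_bar = sum(wi * pi for wi, pi in zip(w, freqs))
    h_t = heterozygosity(p_bar)
    h_s = sum(wi * heterozygosity(pi) for wi, pi in zip(w, freqs))

    if h_t == 0.0:
        raise ValueError("locus is monomorphic in the total population; F_ST is undefined")
    return (h_t - h_s) / h_t


# The canyon example: each subpopulation is fixed for a different allele.
print(fst_from_frequencies([1.0, 0.0]))
# Identical allele frequencies on both sides: no differentiation.
print(fst_from_frequencies([0.5, 0.5]))
```

In this formulation the within-group share of the variation is simply 1 minus the returned value.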

While a value of 1.0 for the fixation index is theoretically possible, such value in reality is usually much smaller. In general, high FST values reflect a low level of shared alleles between individuals in the sampled population and the total population. Conversely, low FST values indicate that members of the subpopulation share alleles with the total population. The proportion of individuals in a population that carry a certain allele varies over time and is influenced by the forces of migration, genetic drift, and natural selection.

But this is exactly the point – when discussing Lewontin, a measure of relative genetic variation is exactly what we need; the weakness of Fst for kinship is a strength when tackling Lewontin.  In other words, we can use Fst to measure that portion of genetic variation that is between groups, with the balance being that within the groups.  For getting a precise measure of kinship, genetic similarity and difference, Fst is suboptimal.  For measuring within/between genetic variation, Fst is exactly what you need (and can give a crude estimation of distance).

After all, consider what Lewontin did – from the Wikipedia article linked above:

In the 1972 study “The Apportionment of Human Diversity”, Richard Lewontin performed a fixation index (FST) statistical analysis using 17 markers, including blood group proteins, from individuals across classically defined “races” (Caucasian, African, Mongoloid, South Asian Aborigines, Amerinds, Oceanians, and Australian Aborigines). He found that the majority of the total genetic variation between humans (i.e., of the 0.1% of DNA that varies between individuals), 85.4%, is found within populations, 8.3% of the variation is found between populations within a “race”, and only 6.3% was found to account for the racial classification. Numerous later studies have confirmed his findings.[5] Based on this analysis, Lewontin concluded, “Since such racial classification is now seen to be of virtually no genetic or taxonomic significance either, no justification can be offered for its continuance.

Let’s consider “1000 Genomes” data for 99 Nigerians and 99 CEU Whites (Northwestern Europeans from Utah – in other words, folks like Mitt Romney).  Let’s consider three SNPs and calculate Fst for different examples of groupings. 

First, a direct comparison of these two racial groups (Nigerians vs. CEU Whites), as it is usually done – calculating Fst of different distinct population groups compared to each other.

(UPDATE: I have changed the data format after getting criticism from some correspondents that the original version was not optimally clear to the layman. Hopefully the new version is better).  The first data:

Nigerians vs. CEU Whites

Fst = 0.1718

We observe the usual result.  The Fst between these two groups is 0.1718.  So, essentially, 17% of the total genetic variation at these three SNPs among the 198 individuals is between the two racial groups of 99 each, and 83% is found within the groups.  The calculations from the Left always end there, with heavy breathing and triumphant cries of “more variation within than between” when we classify by “race.”  Let us continue the analysis.

Let us now arbitrarily break up each of the two populations into three subgroups of equal numbers (three groups of 33 Nigerians and three groups of 33 CEU Whites) and measure Fst, now comparing the intra-racial groups (Nigerians vs. Nigerians and Whites vs. Whites).

Nigerians broken up into three “populations” of 33 individuals each:

Three arbitrary groups of Nigerians

1 vs. 2  Fst =   0.0024

1 vs. 3  Fst =  -0.0150

2 vs. 3  Fst =  -0.0027

Negative Fst (in red) is essentially the same as zero.  Thus, there is very little to no fraction of the total variation between these groups, virtually all within.  So – hey! – “more variation within than between” even in randomly picked individuals from single ethnic groups, exactly as I predicted (and which is consistent with simple common sense – something leftists lack).  Of course, this is not surprising (or shouldn’t be), comparing Nigerians to Nigerians there should not be a significant difference in variation between the groups, as the individuals are derived from the same population. BUT THIS IS EXACTLY THE POINT. When comparing races, we DO see a significant fraction of between group variation, because races are distinct and valid biological entities.  The fact that the between group variation in the inter-racial case is smaller than within group variation does not invalidate race – why would one imagine that human races would be so differentiated that you would have most of the variation between groups?  Most of that variation is random. One sees any significant Fst only when comparing different population groups, because they are distinct. The same pattern holds with dog breeds – more variation within than between (see below). In the Nigerian example presented here, there may be a lot of variation within groups, but that’s on an individual-to-individual level; the group in general is similar to itself as shown when arbitrarily broken up into sub-groups.
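Why an estimate can dip below zero is easiest to see in code. The post does not say which estimator produced the numbers above, so the sketch below assumes the Hudson estimator (ratio of averages over loci), whose sample-size correction can push the numerator negative when two groups have nearly identical allele frequencies. The data are simulated, not taken from 1000 Genomes, and the locus count, allele frequencies and helper names are all made up for illustration.

```python
import random


def hudson_fst(counts_a, counts_b, n_a, n_b):
    """Hudson F_ST as a ratio of averages over loci (assumed estimator).

    counts_a, counts_b -- per-locus counts of one allele in each group
    n_a, n_b           -- chromosomes sampled per group (2 x individuals)
    """
    num = 0.0
    den = 0.0
    for ca, cb in zip(counts_a, counts_b):
        pa, pb = ca / n_a, cb / n_b
        # Squared frequency difference minus the part expected from sampling noise.
        num += (pa - pb) ** 2 - pa * (1 - pa) / (n_a - 1) - pb * (1 - pb) / (n_b - 1)
        den += pa * (1 - pb) + pb * (1 - pa)
    return num / den


def simulate_sample(freqs, n_individuals, rng):
    """Diploid genotypes (0, 1 or 2 allele copies per locus), synthetic data."""
    return [[(rng.random() < f) + (rng.random() < f) for f in freqs]
            for _ in range(n_individuals)]


def allele_counts(individuals):
    """Per-locus allele counts summed over a list of genotype rows."""
    return [sum(locus) for locus in zip(*individuals)]


def pairwise_fst(group_a, group_b):
    return hudson_fst(allele_counts(group_a), allele_counts(group_b),
                      2 * len(group_a), 2 * len(group_b))


rng = random.Random(0)
freqs = [rng.uniform(0.05, 0.95) for _ in range(200)]  # hypothetical loci
sample = simulate_sample(freqs, 99, rng)

# Arbitrary split of one sample into three groups of 33.
rng.shuffle(sample)
groups = [sample[0:33], sample[33:66], sample[66:99]]
for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(f"{i + 1} vs. {j + 1}  Fst = {pairwise_fst(groups[i], groups[j]):.4f}")
```

With only three SNPs, as in the tables here, the scatter around zero would be wider than with the 200 simulated loci used in this sketch.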

And of course, this individual-to-individual variation exists not only within groups but between groups, in fact between all individuals, and it does NOT in any way mean that members of a group are genetically more similar to members of other groups than they are to their own. The fact that group members are virtually ALWAYS more similar to same-group members has been shown (many times in fact and can be observed via private genetic testing – remember when Decode was giving ethnic similarity matches, even with 23andMe data?) – see here.

When looking at many markers at the same time, groups and the individuals within can be easily distinguished racially – see Edwards’ article for the logic there, and why Lewontin’s finding is a “fallacy” with respect to racial classification.

I would also like to point out that genetic variation is not the same as genetic difference.  Indeed, on a fundamental level, these concepts are not the same.  Degrees of variation is not the same as difference (or distance). If one were to catalog the types of ethnic populations extant in, say, New York City and San Diego, there would be differences.  One could clearly distinguish between the two – more Jews and Puerto Ricans and Dominicans and Caribbean Blacks in New York, and more Mexicans in San Diego, among other differences.  The populations of these two cities are distinct, and “distant” in their differences.  But the internal differences are even greater: consider the myriad ethnic types in NYC.  So, the ethnic variation within these cities is greater than that between, even though the two cities have highly distinct, easily classifiable populations, and these differences are not “trivial” but affect every aspect of life in those areas.

Back to the main point: if we apply the same SJW racial comments to the intra-group data, we’ll have to say that because Nigerians exhibit more variation within the group than between groups then there is no such thing as Nigerians, and yet at the same time the groups of Nigerians have low to zero Fst in comparison with each other, but significant Fst when compared to Whites and Asians.  So, at the same time, Nigerians do and do not exist as a group, a logical impossibility.  

And if, as the Left claims, Whites are more genetically different from each other than from Blacks, then White-White Fst should be greater than that of White-Black Fst, and one should see a considerable portion of the genetic variation in a White-White comparison to be between groups of Whites as opposed to within. 

[Note: In a logical sense, the leftist argument is absurd – they would clam that between group variation of the three White sub-groups would be great precisely because the amount of within group variation of the original White group is so large, and measure this with Fst, which compares the two.  But this is, again, the point: the claims of the Left are inherently and logically absurd, and when followed through to their conclusion leads one to a logical paradox – the greater the portion of within group variation then the greater the portion of between group variation when looking solely at that group.  On the other hand, Fst is a relative measure, and one can argue that the White-White comparison is qualitatively different from the logical perspective from the White-Black one. In either case, my approach achieves its goal – either the Left’s arguments are inherently illogical, OR, if you want to claim that their arguments are logical, I show in this post that the arguments are factually wrong, as the data yield the opposite results from leftist predictions].

The data for CEU Utah Whites:

Three arbitrary groups of CEU Whites

1 vs. 2  Fst =  -0.0094

1 vs. 3  Fst =   0.0072

2 vs. 3  Fst =  -0.0041

Again, minuscule to zero Fst (negative [red font] = zero). Once again, we observe the same “more within than between” pattern with arbitrary divisions of a group of humans.  Note that Fst is greatest with the inter-racial comparison (0.1718), precisely because races are valid biological entities with the greatest genetic distance between them (while Fst is not the best measure for distance, it does reflect differences in genetic distance, so is valid for such relative comparisons).

None of this should come as a surprise, since population genetics studies looking at Fst of different parts of a single country (indigenous natives) – such as, say, Germany or Italy, show relatively low Fst.  Nevertheless, it is useful to demonstrate that an arbitrary intra-population division not only mimics the racial finding (more variation within than between), but does so in a more extreme manner.

The preceding has been an appetizer; now we get to the main course.  The twin tenets of the radical Left view of Lewontin’s “finding” are:

1. Race (or even ethnicity) is an especially wrong classification scheme that (implied: specifically) results in “more genetic variation within than between” groups, because it artificially separates all the people of different “races” (leftist scare quotes) who are actually genetically similar.

2. Thus, the “more within than between” means that groups like Nigerians and Northwest European Utah Whites are more genetically similar to members of the other group than they are to members of their own group.

By now, we should know that this is nonsense, as is the claim that races don’t exist, but let’s continue to take this leftist farce at face value.  If these twin tenets are true, then arbitrarily creating multi-“racial” groups – say, random mixed groups each consisting of Nigerians and Utah Whites together in the same groups – would result in a larger Fst comparing these mixed groups, with relatively more variation between and relatively less within.  Or – let us be more charitable with all the leftist delusions and logical impossibilities.  Let us merely state that if the Left is correct, and that conceptions of race and ethnicity are meaningless due to the apportionment of genetic variation, then, at minimum, Fst comparisons of the mixed groups should be no less than that of between the racially defined groups.  That’s the most conservative interpretation of the Left, and the one that makes them seem less stupid and illogical.  What’s the data then?  Here it is (negative Fst again in red font):

Three arbitrary groups of mixed Nigerians and CEU Whites together

1 vs. 2  Fst =   0.0029

1 vs. 3  Fst =   0.0105

2 vs. 3  Fst =  -0.0047

This is crucially important. Mixing the two groups together has greatly reduced or eliminated Fst – it has essentially eliminated the between group genetic variation.  Here, virtually all the variation is within group.  This is a complete and perfect refutation of the extreme leftist (mis)interpretation of Lewontin’s “findings.”  The results from comparing variation between and within mixed race groups, contrasted to that obtained from monoracial groups, is exactly the opposite of what would obtain if leftist fantasies were correct.

One could continue playing around with genetic data in this manner, with larger data sets, random number generators to form groups, etc., but the point has already been established.  Thus, you can pick names randomly out of any diverse big city phonebook – New York for example – and use these random people to form groups, and if you would analyze the genetic variation of these random and arbitrary aracial groupings you will find more variation within than between AND a smaller Fst compared to real inter-racial comparisons.

Now, it can be – and should be – argued that the arguments and findings in this blog post are simple, common-sense, intuitive, even trivial.  OF COURSE random groups would have even more genetic variation within and OF COURSE racial groups will have a larger Fst, indicative of a larger share of variation between.  Of course races are real biological groups and of course the Left is wrong.  But given leftist hysteria and mendacity over race and genetics, the issue had to be formally demonstrated, which it was here.  It is unfortunate one must waste time “proving” things so obvious it is the equivalent of “the sky is blue” but so it goes in the modern world.

A comparison with the situation with dog breeds is also instructive.  There is much about dog breed genetics online, from both the Right and Left, and much of that is misleading; instead let’s read what an expert on the subject has to say, concentrating on the implications for Lewontinism:

The phenotypic diversity of the world’s 350 to 400 dog breeds is mirrored in their genetic diversity. Although most breeds have existed for less than two centuries, the level of diversity (FST) in dogs is about twice that found in humans (FST averages 0.28 among dog breeds).

So, we see that due to intense artificial selection, Fst between dog breeds is about twice of that between human races, despite the fact that many dog breeds are recent developments in evolutionary time.  Very well.  The most important fact that we observe here is that despite all of this intense artificial selection and the vast phenotypic differences between breeds, Fst for dog breeds is 0.28, meaning that the vast majority of the genetic variation – 72% in fact – is found within breeds; only 28% is between.  Consider the huge – existential in fact, defining the identities and utility of different dog types – marked heritable differences between dog breeds in physical appearance, physical capabilities, size, intelligence, and behavior and note that despite all these enormous differences there is still “more genetic variation within than between.”  More genetic variation within dog breeds than between!  What would the Left say?  Is the difference between a vicious Pit Bull and a placid Pug merely the figment of your imagination?  Does “more variation within than between” mean that the differences between a Chihuahua and a Mastiff are merely a “social construct?”  Do we claim that “dog breeds do not exist?”  Now consider again the vast differences between dog breeds and ponder the implications of the fact that human inter-racial between-group genetic variation reaches a full 50% of that between dog breeds.  Once again: the differences between humans is a full 50% of the enormity of difference between dog breeds that was derived from a regimen of constant intense and directed artificial selection. Racial differences are not only real, they are staggeringly large.

Let’s finish up by going back to the Wikipedia article on the Lewontin fallacy.

….biological anthropologist Jonathan Marks agrees with Edwards that correlations between geographical areas and genetics obviously exist in human populations, but goes on to note that “What is unclear is what this has to do with ‘race’ as that term has been used through much in the twentieth century—the mere fact that we can find groups to be different and can reliably allot people to them is trivial. Again, the point of the theory of race was to discover large clusters of people that are principally homogeneous within and heterogeneous between, contrasting groups. Lewontin’s analysis shows that such groups do not exist in the human species, and Edwards’ critique does not contradict that interpretation.

Typical Jewish flim-flam. Marks proposes an unachievable, unrealistic strawman definition of “race” so as to declare that it does not exist. Given the reality of unstructured (“random”) genetic variation that exists between any and all groupings of humans, it stands to reason that grouping by race will also show the same intra-group variation.  BUT THE GROUPS ARE STILL DIFFERENT AND MEMBERS OF THE SAME GROUP WILL ALWAYS BE MORE SIMILAR MEMBERS OF THEIR SAME GROUP COMPARED TO OTHER GROUPS.  That is what race is, no one says that a race has to consist of genetically homogeneous individuals.  Are families – apart from identical twins – genetically homogeneous?  Only in comparison to other families.  Each family will have considerable internal variation.  But they are different. Marks states “the mere fact that we can find groups to be different and can reliably allot people to them is trivial.”  So, he declares that the fact that humans can reliably be allotted to different groups – the essence of race – is trivial, before postulating his strawman version.  Well, Marks, the “trivial” differences lead to differences of phenotype that are acted upon by various forms of selection, thus affecting the underlying gene frequencies, and, hence, the adaptive fitness of the individuals in question. The underlying essence of life is natural selection and adaptive fitness based on genetic differences and kinship.  Thus, according to Marks, the fundamental basis of life on Earth – the genetic distinctiveness of organisms and their representation in subsequent generations – is “trivial.”  Genetic differences, no matter how ‘trivial,” can increase or decrease in frequency and thus constitute adaptive interests for evolved organisms, like humans.  To deny the fundamental meaning of this with misleading verbiage, to consider representation in the next generation is “trivial” is anti-science and anti-reality.  
Note to leftists: the variation equivalent of halfway from a Chihuahua and a Mastiff is not a “trivial” amount of genetic variation.

The real translation from the likes of Marks: racial preservation of Whites is “trivial.”  That’s what it is all about, of course.  Nonsense about races having to be hermetically sealed clones completely variant from other clonal races is just Jewish meme wars against White ethnic genetic interests.

The view that, while geographic clustering of biological traits does exist, this does not lend biological validity to racial groups, was proposed by several evolutionary anthropologists and geneticists prior to the publication of Edwards critique of Lewontin.

Err…” geographic clustering of biological traits” (including gene frequencies) is precisely what race is, so if such clustering exists, race exists.  I suppose one can, like Marks or any other mendacious Jew (a redundancy) redefine “race” using unrealistic criteria so you can proclaim “race does not exist” but that is meaningless.  One can define human” in like manner.  Thus, a “human” is any nine foot tall hominid with naturally blue hair who has an IQ of 10,000.  Such individuals do not exist, hence there are no such thing as humans. QED.

In summary:

1. There is nothing special, defining, or “privileged” about race (or ethnicity) with respect to “more genetic variation within than between.”  Any and all human groups or mixtures of groups, no matter how arbitrarily or randomly chosen will always exhibit the same pattern, because the pattern is due to individual human variation and that variation is present no matter how groups of humans are arranged. Making a big deal of this “finding” when it comes to race derives from leftist sociopolitical motivations.

2. The “more variation within than between” in no way invalidates the race concept, as Edwards (and I) pointed out.  Even Marks concedes classification is possible; he just labels it “trivial” – and this subjective assertion is also motivated by leftist social and political beliefs.  The apportionment of genetic variation certainly does not invalidate genetic differences and similarities between groups, and the greater genetic distances between the major racial groups.

3. Strawman definitions of race implying that races have to be genetically homogeneous are ludicrous and also motivated by leftist concerns.  Given that genetic variation is randomly distributed among all people, such will as a matter of course be found within groups, including races.  However, Fst increases as we consider ever more distinct racial groups, as an increasing portion of the total genetic variation derives from between group differences.  Given the large totality of such differences, a consistent distinctive genome is sufficient to define biological races, along with the background of random variation.  And that’s not trivial.  An analogy would be an extremely important radio message, of life-dependent importance, that you are listening to among a larger degree of random noise, of static.  The static may be louder, but it is the message that is important, and by proper adjustments to your methodology, you can cancel out the static and listen to the message.  For humans, the message is nothing less than our adaptive fitness, the over-riding importance of genetic continuity, of genetic interests – the ultimate interests.