Measurement Error, Regression to the Mean, and Group Differences

Regression to the mean, RTM for short, is a statistical phenomenon which occurs when a variable that is in some sense unreliable or unstable is measured on two different occasions. Another way to put it is that RTM is to be expected whenever there is a less than perfect correlation between two measurements of the same thing. The most conspicuous consequence of RTM is that individuals who are far from the mean value of the distribution on first measurement tend to be noticeably closer to the mean on second measurement. As most variables aren’t perfectly stable over time, RTM is a more or less universal phenomenon.
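To make the mechanism concrete, here is a minimal simulation sketch. It assumes a classical "true score plus independent error" model with a reliability of 0.7; the model, the parameter values, and the function name `simulate_rtm` are illustrative assumptions of mine, not something taken from data. People selected for extreme scores on the first measurement score closer to the mean on the second, while the spread of the whole population stays the same.

```python
import random
import statistics


def simulate_rtm(n=20000, reliability=0.7, seed=0):
    """Simulate two noisy measurements of the same stable trait.

    Assumed model (illustrative): observed = true score + independent error,
    scaled so that each observed score has variance of about 1 and the
    correlation between the two measurements is about `reliability`.
    """
    rng = random.Random(seed)
    true_sd = reliability ** 0.5
    error_sd = (1 - reliability) ** 0.5

    true_scores = [rng.gauss(0, true_sd) for _ in range(n)]
    test1 = [t + rng.gauss(0, error_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, error_sd) for t in true_scores]

    # Select the top 10% of scorers on the first measurement.
    cutoff = sorted(test1)[int(0.9 * n)]
    selected = [i for i in range(n) if test1[i] >= cutoff]

    mean1 = statistics.fmean([test1[i] for i in selected])
    mean2 = statistics.fmean([test2[i] for i in selected])

    # Population spread on each occasion (it should not shrink).
    sd1 = statistics.pstdev(test1)
    sd2 = statistics.pstdev(test2)
    return mean1, mean2, sd1, sd2


if __name__ == "__main__":
    mean1, mean2, sd1, sd2 = simulate_rtm()
    print(f"Top decile, first measurement:  {mean1:.2f}")
    print(f"Same people, second measurement: {mean2:.2f}")
    print(f"Population SD, first vs second:  {sd1:.2f} vs {sd2:.2f}")
```

Under these assumptions the selected group's second-measurement mean should be roughly the reliability times its first-measurement mean, which is the usual rule of thumb for how much regression to expect.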

In this post, I will attempt to explain why regression to the mean happens. I will also try to clarify certain common misconceptions about it, such as why RTM does not make people more average over time. Much of the post is devoted to demonstrating how RTM complicates group comparisons, and what can be done about it. My approach is didactic and I will repeat myself a lot, but I think that’s warranted given how often people are misled by this phenomenon.
New MQ paper

Kirkegaard, E. O. W. & Fuerst, J. (2016). Inequality in the United States: Ethnicity, Racial Admixture and Environmental Causes. Mankind Quarterly 56(4).

Previously, we looked at the association between overall state-level biogeographic ancestry (BGA) and overall state-level outcomes. It was found that European BGA relative to African and Amerindian BGA was associated with better outcomes. In this paper, the analysis is extended by looking at the state-level ancestry-outcome associations individually for black and Hispanic self-identified race-ethnicity (SIRE) groups. General socioeconomic factor (S) scores were calculated for US states by SIRE groups based on three indicators. The S factor loadings were generally stable across subgroup analyses and the factor scores were stable across factor analytic extraction methods (for the latter, almost all r’s ≈ 1). For Whites, Blacks and Hispanics, there were strong correlations between cognitive ability scores and S factor scores across states (r = .55 to .78; N = 28-50). This pattern also held when all data were analyzed together (r = .86, N = 115). Furthermore, the size of the Hispanic-White and Black-White S and cognitive ability gaps strongly correlated across states (r = .62 to .69; N = 36-37). Lastly, parasite prevalence did not plausibly explain SIRE gaps in cognitive ability because gaps were smaller in more parasite-rich states (combined analysis r = -.17, N = 91). We found that climatic and geospatial variables did not correlate strongly with cognitive ability and S scores when scores were decomposed by SIRE group, but did so at the total state level, even after statistically controlling for SIRE composition.

Philosophical Reflections on On Genetic Interest

I will leave a sum in my last will for my body to be carried to Brazil and to these forests… and this great Coprophanaeus beetle will bury me. They will enter, will bury, will live on my flesh; and in the shape of their children and mine, I will escape death. — Hamilton, 1991

Opening reflections

Through reproduction, living beings obtain immortality. This was the view of the ancients. All beings seek the divine, which is the eternal. For mortals, unending life can only be had through generation. While the individual particularity is doomed, through reproduction the general form can be perpetuated and a type of eternity can yet be grasped. In De Anima, Aristotle expresses the view thusly:

For any living thing … the most natural act is the production of another like itself, an animal producing an animal, a plant a plant, in order that, as far as its nature allows, it may partake in the eternal and divine. That is the goal towards which all things strive, that for the sake of which they do whatsoever their nature renders possible… Since then no living thing is able to partake in what is eternal and divine by uninterrupted continuance for nothing perishable can for ever remain one and the same, it tries to achieve that end in the only way possible to it[.]

In Plato’s Symposium, Diotima accounts for filial love likewise:

For among animals the principle is the same as with us, and mortal nature seeks so far as possible to live forever and be immortal. And this is possible in one way only: by reproduction… And in that way everything mortal is preserved, not, like the divine, by always being the same in every way, but because what is departing and aging leaves behind something new, something such as it had been… So don’t be surprised if everything naturally values its own offspring, because it is for the sake of immortality that everything shows zeal, which is Love.

Genetic Interest

A decade ago, Frank Salter published On Genetic Interests: Family, Ethnicity, and Humanity in an Age of Mass Migration (OGI). The book’s stated purpose was not to account for human behavior, “but rather to offer social and political theory about what individuals should do.” The book attempts to answer a theoretical question: “How would an individual behave in order to be adaptive in the modern world?”, where “adaptive” means maximizing the survival chances of the totality of one’s unique gene frequencies. In line with the book’s title, Salter concerns himself with individual, family, ethnic and species genetic stake. He concludes that a portfolio with a balanced investment in all of these is preferable. He asks, “Which [gene conserving] strategies are best?” And then replies that focusing exclusively on any one level of genetic interest is suboptimal. He concerns himself largely with “ethnic genetic interest” (EGI) for two reasons. First, reigning ideologies neglect it. They end up, as he notes, advancing species genetic interest (e.g., radical Christianity and humanism) and, when not, individual and family interest. And second, mass immigration presently threatens the existence, as coherent biocultural groups, of many ethnic groups.

Salter, both an ethnologist and political scientist by training, notes that he was motivated to write OGI after having discovered, with the help of anthropologist Henry Harpending, that the aggregate kinship shared by members of a typical ethnic or racial group, relative to random members of the species, “was typically 1000 times greater than” he originally anticipated. Prior to writing the book, he had been using van den Berghe’s theory of ethnic nepotism as a heuristic to understand ethnological findings. He wrote the book in light of his findings and the ongoing replacement-level immigration to the West. He felt that the biological impact of that process needed to be analyzed and discussed.
IQ and Permanent Income: Sizing Up the “IQ Paradox”

In his recent book Hive Mind economist Garett Jones argues that the direct effect of IQ on personal income is modest, and that most of the benefits of higher IQ flow from various spillover effects that make societies more productive, boosting everyone’s income. This, he says, explains the “IQ paradox” whereby IQ differences appear to explain a lot more of the economic differences between nations than within them.

Jones does not say in his book what he thinks the exact effect of IQ on personal income is, but on Twitter he has asserted that “Fans of g would do well to look at the labor lit: 1 IQ point predicts just 0.5% to 1.2% higher wages.” He has also said that, in terms of standardized effect sizes, IQ accounts for only about 10% of variance in personal income (a correlation of ~0.32).
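The figures quoted above can be put on a common footing with a little arithmetic. The sketch below assumes an IQ standard deviation of 15 points and that the per-point wage effect compounds multiplicatively (as in a log-wage regression); both are assumptions made here for illustration, not details given in the quoted tweet.

```python
import math

# Share of income variance attributed to IQ, as quoted above.
variance_explained = 0.10
correlation = math.sqrt(variance_explained)  # r is the square root of R^2

# Quoted wage effects per IQ point.
per_point_low, per_point_high = 0.005, 0.012

# Assumed for illustration: 15 IQ points per standard deviation,
# with per-point effects compounding multiplicatively.
IQ_SD = 15
per_sd_low = (1 + per_point_low) ** IQ_SD - 1
per_sd_high = (1 + per_point_high) ** IQ_SD - 1

print(f"Implied correlation: {correlation:.3f}")
print(f"Wage difference per SD of IQ: {per_sd_low:.1%} to {per_sd_high:.1%}")
```

On these assumptions, the quoted per-point range corresponds to roughly 8% to 20% higher wages per standard deviation of IQ.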

The Measured Proficiency of Somali Americans

The discussion of the performance of African immigrants led by Chanda Chisala has been of unusually poor quality. As such, I thought that I might write a brief tutorial post on how to locate data and estimate differences in hopes that this will inspire better research practices and more rigorous debate. I will also elaborate on the Jensenist position and its predictions, as Chanda, and apparently many others, do not seem to have a good grasp of it at least in its quantified form.

Heritability of Racial and Ethnic Pride, Preference, and Prejudice

A while back, in “People in the Future Will Not Look Like Brazilians”, Razib suggested that the great amalgamation will stall because those who are inclined to out mix will do so, taking with them their xenophilic dispositions. The suggestion prompted a commenter to question whether there was any evidence that preferences for (racial) endogamy had, as seemingly presumed by Razib’s argument, a non-trivial genetic component. Apparently, there has been very little genetically informed research on this or closely related topics. Nonetheless, I was able to locate eight studies based on five independent samples which provided heritability estimates for some measures of national, ethnic, or racial pride, preference, or prejudice. The study results are summarized in the table below.

Alice Brues on Race

Bias in Measures of Implicit Racial Bias

It is claimed that implicit association tests, or IATs, reveal unconscious biases against racial and ethnic minorities and other stigmatized groups. The tests are simple and their results appear to be straightforward to interpret: if you are quicker to associate positive words (or other positive stimuli) with the non-stigmatized group (e.g., whites) and quicker to associate negative words with the stigmatized group (e.g., blacks), you have an implicit preference for the former and against the latter. Moreover, it has been shown that IAT scores are (modestly) related to arguably discriminatory behaviors. Given that the IAT scores of most people suggest that they are biased against stigmatized groups, it has been claimed that implicit biases explain discriminatory behaviors in the real world.

Hart Blanton, a long-term critic of various theoretical and methodological absurdities in the IAT paradigm, has written, with some colleagues, a paper challenging a key assumption of the IAT. Re-analyzing several published implicit bias studies, they found that the standard IAT scoring procedure will typically label as implicitly biased people whose observed behavior is neutral and unbiased. IAT researchers assume that individuals who associate positive and negative IAT stimuli with different groups with equal ease are unbiased, but the research by Blanton et al. suggests that such individuals tend to be biased in favor of the stigmatized group. In other words, the zero point of the IAT scale is not associated with behavioral neutrality.

Nature of Race (Published)

Below is an expanded and much improved rewrite of a draft which I had posted last year. The improvements owe much to the helpful commentary of Davide Piffer, Emil Kirkegaard, Kevin MacDonald, Peter Frost, Meng Hu, and others. As for the work, the intent was to clarify the concept of race, understood from the perspective of natural history, so as to render the term which describes it inessential. It is hoped that the piece will also clarify the purpose of this blog, the focus of which is human varieties, of which races as constant varieties and natural divisions are but subtypes.

Fuerst, J. (2015). The Nature of Race: the Genealogy of the Concept and the Biological Construct’s Contemporaneous Utility. Open Behavioral Genetics.

Abstract: Racial constructionists, anti-naturalists, and anti-realists have challenged users of the biological race concept to provide and defend, from the perspective of biology, biological philosophy, and ethics, a biologically informed concept of race. In this paper, an onto-epistemology of biology is developed. What it is, by this, to be “biologically real” and “biologically meaningful” and to represent a “biological natural division” is explained. Early 18th century race concepts are discussed in detail and are shown to be both sensible and not greatly dissimilar to modern concepts. A general biological race concept (GBRC) is developed. It is explained what the GBRC does and does not entail and how this concept unifies the plethora of specific ones, past and present. Other race concepts as developed in the philosophical literature are discussed in relation to the GBRC. The sense in which races are both real and natural is explained. Racial essentialism of the relational sort is shown to be coherent. Next, the GBRC is discussed in relation to anthropological discourse. Traditional human racial classifications are defended from common criticisms: historical incoherence, arbitrariness, cluster discordance, etc. Whether or not these traditional human races could qualify as taxa subspecies, or even species, is considered. It is argued that they could qualify as taxa subspecies by liberal readings of conventional standards. Further, it is pointed out that some species concepts potentially allow certain human populations to be designated as species. It is explained why, by conventional population genetic and statistical standards, genetic differences between major human racial groups are at least moderate. Behavioral genetic differences associated with human races are discussed in general and in specific. The matter of race differences in cognitive ability is briefly considered. Finally, the race concept is defended from various criticisms. First, logical and empirical critiques are dissected. These include: biological scientific, sociological, ontological, onto-epistemological, semantic, and teleological arguments. None are found to have any merit. Second, moral-based arguments are investigated in context to a general ethical frame and are counter-critiqued. Racial inequality, racial nepotism, and the “Racial Worldview” are discussed. What is dubbed the Anti-Racial Worldview is rejected on both empirical and moral grounds. Finally, an area of future investigation, the politics of the destruction of the race concept, is pointed to.

Keywords: natural division, race, biology



I. Biology – A Philosophical Clarification
I-A. Existing Views: Confusions Abound
I-B. Biological Concepts in General
I-C. The Validity of Biological Concepts
I-D. Biological Kinds
I-E. Natural Biological Divisions
I-F. Races as Natural Biological Divisions
I-G. The Intraspecific Natural Division as Type of Biological Variation
I-H. The Natural Division as a Taxonomic Unit
I-I. Natural Divisions and Intraspecific Variation with Regards to the Subspecies Category
I-J. Biologically Meaningful Race Concepts
I-K. Biological Reality
I-L. Biologically Important Differences
I-M. Concepts of Biological Race

II. The General Biological Race Concept
II-A. The Genealogy of the Concept
II-B. Semantic Complexities and the Evolution of the Race Concept
II-C. Biological Race
II-D. What the Core Biological Race Concept Does Not Represent
II-E. Races, Clines, Clusters?
II-F. Clarification on the Meaning of “Arbitrary” and “Objective” in Context to Natural Divisions
II-G. Regarding Different Definitions of Biological Race: What Races Need Not Be
II-H. Genomic-Genealogical Complications
II-I. Estimated Genomic Similarity: Some Ambiguity
II-J. Race: Mixed and Undifferentiated
II-K. Essential and Cluster Classes; Fuzzy and Discrete Sets
II-L. Sociological Clarifications

III. The Ontology of Biological Race
III-A. Other Defenses of Biological Race
III-B. Biological Races and Biological Reality
III-C. Thin Biological Racial Essentialism

IV. The Races of Man
IV-A. A Very Brief Historical Review
IV-B. Human Biological Races and Scientific Consensus
IV-C. Racial Classifications and Biological Race Concepts
IV-D. Traditional Human Races
IV-E. THRs and Biologically Objective Races
IV-F. THRs and Migration, Intermixing, and Ancient Admixture
IV-G. THRs and Cluster Discordance
IV-H. THRs and Taxonomy
IV-I. THRs and Subspecies
IV-J. Are There Human Species?
IV-K. “Significant” Racial Differences
IV-L. Human Biodiversity (HBD) and Society
IV-M. Race and Intelligence

V. Critique of Anti-Biological Race Arguments
V-A. Anti-Biological Arguments
V-B. Biological Scientific Arguments
V-C. Sociological Arguments
V-D. Unnaturalistic Arguments and the Numbers Game
V-E. Onto-epistemology Arguments
V-F. Semantic Arguments
V-G. No-True-Race Arguments
V-H. Teleological Argument: The Future of Race
V-I. Can a Good Argument be Made Against (the) Race (concept)?

VI. A Troublesome Inheritance?
VI-A. The Social Destruction of a Biological Reality
VI-B. A Not So New Morality for Race
VI-C. The Moral Critiques: Arguments based on Outcome Differences
VI-D. The Moral Critiques: Arguments based on Racial Classification and Identity
VI-E. The Moral Critiques: Arguments based on Racial Favoritism
VI-F. The Moral Critiques: Arguments based on the “Racial Worldview”





Regional Admixture and Aptitude in Colombia

Emil and I set out to determine if regional variation in racial ancestry could (statistically) explain regional variation in cognitive ability. To keep things simple, we have limited focus to the Americas, which contain primarily trihybrid populations and for which there is a decent amount of admixture data. The results so far align with predictions. Both across nations and across regions within the U.S., Brazil, and Mexico, European ancestry positively correlates with regional-level cognitive ability. In contrast, African and Amerindian ancestry both correlate negatively with it. The broader importance of the project is that it involves the construction of an expansive data set which allows for the statistical controlling of continental lineage and associated factors (genes + deep culture), ones which presently confound many analyses. This data set will hopefully allow one to uncover regional and national level factors which are not tangled with ancestry. They must exist. For example, we find that regional levels of European ancestry are associated with better outcomes in both the U.S. and Brazil but also that there is a substantial between-nation effect that cannot be explained by factors correlated with continental ancestry.


Here, I will discuss a new analysis involving Colombia. Colombia is marked by extensive spatial variation in ancestry. The admixture map on the left, copied from Ruiz-Linares et al. (2014), and the ethnic map on the right, taken from Rodriguez-Palau et al. (2007), roughly capture the lay of the land. African admixture is concentrated along the Pacific and Caribbean coast, European admixture is highest in the north and central interior region, and Amerindian admixture is concentrated in the east and south. This ancestral variation allows for a test of our general model.


I computed the variables as follows:


Admixture: Estimating regional admixture for Colombia’s 32 departments plus the capital was not without difficulty, since existing studies provide admixture data for only half of the departments. Problematically, specific estimates were not available for the eastern and southeastern departments, which are reported to have high Amerindian components. Nonetheless, we were able to construct three sets of admixture estimates. First, 18 departmental + capital estimates were taken from Salzano and Sans’ (2014) compilation. The ancestry ratios from Salzano and Sans’ (2014) two main sources correlated at 0.9, so we felt that using the combined estimates was justified. Second, missing values were filled in based on regional values and on Ruiz-Linares et al.’s (2014) and Rodriguez-Palau et al.’s (2007) maps. For example, estimates for Caribbean-Pacific departments were averaged and used to fill in missing data for other departments in this region. In the U.S. context, this would be akin to filling in South Carolina values using the average of the Deep South ones. Third, admixture was estimated using ethnic identity data from the 2005 census in conjunction with average ethnoracial admixture percents as reported in all available studies. The ethnoracial admixture percents came out as follows:


The computation methods are detailed more precisely in the Excel file.
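The second and third estimation approaches described under Admixture can be sketched roughly as follows. This is only an illustration of the logic: the function names, department labels, group labels, and every number below are hypothetical placeholders, not values from the Excel file or from any of the cited studies.

```python
def fill_missing_with_region_mean(estimates, region_of):
    """Fill missing departmental estimates with the mean of known
    estimates from the same region (the second approach, simplified)."""
    known_by_region = {}
    for dept, value in estimates.items():
        if value is not None:
            known_by_region.setdefault(region_of[dept], []).append(value)

    filled = {}
    for dept, value in estimates.items():
        if value is None:
            known = known_by_region.get(region_of[dept])
            # Leave the gap if the region has no known estimates at all.
            filled[dept] = sum(known) / len(known) if known else None
        else:
            filled[dept] = value
    return filled


def admixture_from_identity(identity_shares, group_admixture):
    """Estimate departmental admixture as the identity-share-weighted
    average of each group's mean admixture (the third approach, simplified)."""
    components = ("european", "african", "amerindian")
    return {
        c: sum(share * group_admixture[group][c]
               for group, share in identity_shares.items())
        for c in components
    }


# Hypothetical European-ancestry fractions; "Dept C" has no estimate.
european = {"Dept A": 0.40, "Dept B": 0.60, "Dept C": None}
regions = {"Dept A": "coast", "Dept B": "coast", "Dept C": "coast"}
print(fill_missing_with_region_mean(european, regions))

# Hypothetical mean admixture per self-identified group.
group_admixture = {
    "group_1": {"european": 0.60, "african": 0.10, "amerindian": 0.30},
    "group_2": {"european": 0.20, "african": 0.70, "amerindian": 0.10},
    "group_3": {"european": 0.10, "african": 0.05, "amerindian": 0.85},
}
# Hypothetical identity shares for one department (they sum to 1).
identity_shares = {"group_1": 0.5, "group_2": 0.3, "group_3": 0.2}
print(admixture_from_identity(identity_shares, group_admixture))
```

The identity-based estimate is only as good as the assumption that group-level admixture averages apply uniformly across departments, which is presumably why it serves here as a cross-check and not as the primary measure.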

Cognitive ability: For cognitive scores, the Colombian national SABER exam scores were used. The average of the 2003 and 2005 grades 5 and 8 math and reading regional scores strongly correlated with the average of the 2012 and 2014 scores (about 0.85). The two sets of scores were on different metrics, and standard deviations were not available for the 2003 and 2005 scores (given the source used), so, in the end, the 2012 and 2014 average scores were employed.

Other variables: 2010 HDI scores were taken from Machado (2011). Ethnic identity percents were taken from the 2005 census. Population was taken from the census via Wikipedia.

Results: I uploaded the Excel file to facilitate future investigations. For the analyses reported below, in line with the general methods adopted for the meta-project, I excluded the capital and weighted by SQRT(population). Salzano and Sans’ (2014) admixture data showed only a weak negative correlation for Amerindian ancestry; this was because, as noted, data were missing for the most Amerindian parts of the country. When the data were filled in, the association became significantly negative as predicted. It seems that the negative results are driven by the low scores in five departments (Amazonas, La Guajira, Guainía, Vaupés, and Vichada), all of which have high percents of self-identifying indigenous people and large reservations.


The results immediately above were replicated using the ethnic-admixture data.


Generally, European ancestry was non-trivially associated with cognitive ability (shown below) and with HDI (not shown). These results held regardless of which admixture variable was employed; they were largely driven by the strong negative association between regional outcomes and African ancestry. It is interesting that regional Amerindian ancestry was not associated with regional ability in the case of Salzano and Sans’ (2014) admixture estimates. Because Amerindian ancestry correlated negatively with ability at the national level, with areas heavily populated by self-identifying Indigenous individuals doing poorly, one might expect a more constant effect, one that would show up in Salzano and Sans’ (2014) restricted data set, which included only interior and coastal departments. The lack of association might have been due to the unreliability of the data, the specific samples analyzed, or the specific sampling of interior and coastal departments. Possibly, Amerindian ancestry is not negatively correlated with regional outcomes outside of largely indigenous regions. A determination of the matter will have to wait for the publication of more Colombian regional admixture data.
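For readers who want to rerun this kind of analysis, the weighting scheme described under Results (excluding the capital and weighting by the square root of population) can be sketched as below. The function names and the toy rows are hypothetical; the actual computations are in the Excel file, and none of the numbers here come from it.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation between x and y with observation weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


def ancestry_score_correlation(rows, capital_name):
    """Drop the capital, then correlate ancestry with scores using
    sqrt(population) weights."""
    kept = [r for r in rows if r["name"] != capital_name]
    ancestry = [r["ancestry"] for r in kept]
    score = [r["score"] for r in kept]
    weights = [math.sqrt(r["population"]) for r in kept]
    return weighted_pearson(ancestry, score, weights)


# Toy rows with made-up values, purely to show the call pattern.
toy_rows = [
    {"name": "Capital", "ancestry": 0.70, "score": 310, "population": 7_000_000},
    {"name": "Dept A", "ancestry": 0.60, "score": 300, "population": 4_000_000},
    {"name": "Dept B", "ancestry": 0.45, "score": 285, "population": 1_000_000},
    {"name": "Dept C", "ancestry": 0.30, "score": 280, "population": 250_000},
]
print(round(ancestry_score_correlation(toy_rows, "Capital"), 3))
```

Weighting by the square root of population, as opposed to population itself, keeps large departments from dominating while still discounting very small ones whose averages are noisier.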


