# Human Varieties

## IQ and psychometrics

• On the Continued Misinterpretation of Stereotype Threat as Accounting for Black-White Differences on Cognitive Tests by Tomeh & Sackett. A common misconception about stereotype threat, and a major reason for the popularity of the idea, is that in the absence of threat in the testing situation, the black-white IQ gap is eliminated. This is of course not the case; rather, the experimental activation of stereotypes has (sometimes) been found to make the black-white gap larger than it normally is. In an analysis of early writings on stereotype threat, Sackett et al. (2004) reported that this misinterpretation was found in the majority of journal articles, textbooks, and popular press articles discussing the effect. In the new article, Tomeh and Sackett find that more recent textbooks and journal articles are still about equally likely to misinterpret stereotype threat in this way as to describe it correctly. I had hoped that the large multi-lab study of the effect would have put the whole idea to bed by now, but that study has unfortunately been delayed.
• Invariance: What Does Measurement Invariance Allow us to Claim? by John Protzko. In this study people were randomized to complete either a scale aiming to measure “search for meaning in life”, or an altered nonsense version of the same scale where the words “meaning” and “purpose” had been replaced with the word “gavagai”. The respondents indicated their level of agreement or disagreement with statements such as “I am searching for meaning/gavagai in my life”. Both groups also completed an unaltered “free will” scale, and confirmatory factor models where a single factor underlay the “meaning/gavagai” items while another factor underlay the “free will” items were estimated. The two groups showed not only configural but also metric and scalar invariance for these factors. Given the usual interpretation of factorial invariance in psychometrics, this would suggest that the mean difference observed between the two groups on the “meaning/gavagai” scale reflects a mean difference on a particular latent construct. The data used were made available online, and I was able to replicate the finding of configural, metric, and scalar invariance, given the ΔCFI/RMSEA criteria (strict invariance was not supported). The paradox appears to stem from the fact that individual differences on the “meaning in life” scale mostly reflect the wording and format of the items as well as response styles rather than tapping into a specific latent attitude which may not even exist, given the vagueness of the “meaning in life” scale. I found that I could move from scalar invariance to a more constrained model where all of the “meaning/gavagai” items had the same values for loadings and intercepts without worsening the model fit. So it seems that all the items were measuring the same thing (or things) but what that is is not apparent from a surface analysis of the items.
Jordan Lasker has written a long response to Protzko, taking issue with the idea that two scales can have the same meaning without strict invariance as well as with the specific fit indices used. While I agree that strict invariance should always be pursued, Protzko’s discovery of scalar invariance using the conventional fit criteria is nevertheless interesting and requires an explanation. I think Lasker also makes a mistake in his analysis by setting the variances of the “meaning in life/gavagai” factors both to 1 even though this is not a constraint required for any level of factorial invariance. The extraneous constraint distorts his loadings estimates.
• Effort impacts IQ test scores in a minor way: A multi-study investigation with healthy adult volunteers by Bates & Gignac. In three experiments (total N = 1201), adult participants first took a short spatial ability test (like this one) and were randomly assigned either to a treatment group or to a control group. Both groups then completed another version of the same test, with the treatment group participants promised a monetary reward if they improved their score by at least 10%. The effect of the incentives on test scores was small, d = 0.166, corresponding to 2.5 points on a standard IQ scale. This suggests that the effect size of d = 0.64 (or 9.6 points) reported in the meta-analysis by Duckworth et al. is strongly upwardly biased, as has been suspected. A limitation of the study is that the incentives were small, £10 at most. However, the participants were recruited through a crowdsourcing website and paid £1.20 for their participation (excluding the incentive bonuses), so it is possible that the rewards were substantial to them. Nevertheless, I would have liked to see if a genuinely large reward had a larger effect. Bates & Gignac also conducted a series of big observational studies (total N = 3007) where the correlation between test performance and a self-report measure of test motivation was 0.28. However, this correlation is ambiguous because self-reported motivation may be related to how easy or hard the respondent finds the test.
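
The conversion between standardized effect sizes and IQ points used in the Bates & Gignac item above is simple arithmetic. Here is a minimal sketch, assuming the conventional IQ standard deviation of 15; the function name `d_to_iq_points` is only illustrative.

```python
IQ_SD = 15.0  # conventional standard deviation of the IQ scale (assumption)

def d_to_iq_points(d: float, sd: float = IQ_SD) -> float:
    """Convert a standardized mean difference (Cohen's d) to IQ points."""
    return d * sd

# Effect sizes quoted in the text above
bates_gignac = d_to_iq_points(0.166)  # incentive effect in Bates & Gignac
duckworth = d_to_iq_points(0.64)      # meta-analytic estimate of Duckworth et al.

print(f"{bates_gignac:.1f}")  # 2.5
print(f"{duckworth:.1f}")     # 9.6
```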

## Education

• The Coin Flip by Spotted Toad. This is an illuminating commentary on the Tennessee Pre-K study (on which I commented here) and the difficulty of causal inference in long-term experiments.
• Do Meta-Analyses Oversell the Longer-Term Effects of Programs? Part 1 & Part 2 by Bailey & Weiss. This analysis found that in a meta-analytic sample of postsecondary education RCTs seeking to improve student outcomes, trials that reported larger initial effects were more likely to have long-term follow-up data collected and published. While this could be innocuous, with more effective interventions being selected for further study, it could also simply mean that studies more biased to the positive direction by sampling error were selected. So when you see a study touting the long-term benefits of some educational intervention, keep in mind that the sample may have been followed up only because the initial results were more promising than in other samples subjected to the same or similar interventions.
• An Anatomy of the Intergenerational Correlation of Educational Attainment - Learning from the Educational Attainments of Norwegian Twins and their Children by Baier et al. Using Norwegian register data on the educational attainment of twins and their children, this study finds that the intergenerational correlation for education is entirely genetically mediated in Norway. The heritability of education was about 60 percent in both parents and children, while the shared environmental variance was 16% in parents and only 2% in children. This indicates that the shared environment is much less important for educational attainment in Norway than elsewhere (cf., Silventoinen et al., 2020), although this is partly a function of how assortative mating is modeled.
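
The selection problem described in the Bailey & Weiss item above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not their analysis: every trial has the same true effect, each sample carries a chance imbalance that partly persists into the long-term outcome, and only the quarter of trials with the largest initial estimates are followed up. All parameter values and names are hypothetical.

```python
import random
import statistics

def simulate_followup_selection(n_trials=20000, true_effect=0.05,
                                persist=0.5, seed=1):
    """Return (mean long-term estimate over all trials,
               mean long-term estimate over followed-up trials)."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        # Chance imbalance between arms in this particular sample (assumed SD 0.10)
        sample_error = rng.gauss(0, 0.10)
        # Initial estimate = true effect + sample imbalance + measurement noise
        initial = true_effect + sample_error + rng.gauss(0, 0.05)
        # Part of the imbalance is assumed to carry over to the long-term outcome
        longterm = true_effect + persist * sample_error + rng.gauss(0, 0.05)
        trials.append((initial, longterm))

    # Only the top quarter by initial estimate gets long-term follow-up
    trials.sort(key=lambda t: t[0], reverse=True)
    followed = trials[: n_trials // 4]

    mean_all = statistics.fmean(t[1] for t in trials)
    mean_followed = statistics.fmean(t[1] for t in followed)
    return mean_all, mean_followed

mean_all, mean_followed = simulate_followup_selection()
print(f"all trials:         {mean_all:.3f}")
print(f"followed-up trials: {mean_followed:.3f}")
```

Under these assumptions the followed-up trials show a larger average long-term effect than the full set, even though the true effect is identical in every trial.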

## Genetics

• Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals by Okbay et al. This is the newest iteration of the educational attainment GWAS by the SSGAC consortium, now with a sample of three million people. It was published today and I have only skimmed it. The number of SNPs identified is about 4,000 now, up from 1,300 in the previous GWAS, while the $R^2$ increased from 11–13% to 12–16%, depending on the validation sample. They also conclude that there are no common SNPs with substantial dominance effects for educational attainment, underlining the validity of the additive genetic model. The within-family effect sizes are about 56% of the population effect sizes for educational attainment, while the same ratio is 82% for IQ and more than 90% for height and BMI. The discrepancy between the within-family and population estimates is probably mostly due to indirect genetic effects (“genetic nurture”) and assortative mating. Replicating SNP effects from the previous, smaller education GWAS sample, they find that 99.7% of the SNP effects have matching signs in the new data, and that 93% are significant at the 1% level or lower, which fits theoretical predictions well (it seems that the GWAS enterprise has vindicated the much-derided null hypothesis significance testing paradigm).
• Cross-trait assortative mating is widespread and inflates genetic correlation estimates by Border et al. The genetic correlation is a statistic measuring the extent to which genetic effects on two different traits are correlated. It is easy enough to calculate, but not easy to interpret because while the simplest interpretation is pleiotropy, several other causal and non-causal explanations are possible. This paper suggests that many genetic correlations are non-causal and result from cross-assortative mating, e.g., smarter than average women preferring to have children with taller than average men, which leads to a genetic correlation between IQ and height genes in the next generation even if height genes have no effect on IQ nor IQ genes on height. Among other findings, the paper suggests that the importance of the general factor of psychopathology has been overestimated due to a failure to consider cross-trait assortative mating.
• Modeling assortative mating and genetic similarities between partners, siblings, and in-laws by Torvik et al. This is a nice example of using psychometric methods to infer latent genetic parameters.
• Behavioral geneticist Lindon Eaves has died. He was one of the major creative forces behind the methodology of modern twin and family studies. I did not know that he was also an ordained Anglican priest. You do not see many men of the cloth in science these days (or at least their creed is rather different now). Eric Turkheimer says that he never heard Eaves utter an illiberal word, but I do notice some forbidden literature on his bookshelf at the first link.
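
The mechanism in the Border et al. item above, where cross-trait assortative mating produces a genetic correlation without any pleiotropy, can be sketched with a one-generation toy simulation. The assumptions are mine, not the paper's: genetic values for the two traits are independent within every parent, mothers are ranked on a noisy version of one trait and fathers on a noisy version of the other, couples are paired by rank, and each child receives the midparent value plus segregation noise.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_cross_trait_am(n_pairs=20000, seed=7):
    """Return (genetic correlation in parents, genetic correlation in children)."""
    rng = random.Random(seed)
    # Each person: (IQ genetic value, height genetic value), independent draws
    mothers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_pairs)]
    fathers = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_pairs)]

    # Cross-trait matching: rank mothers on noisy IQ, fathers on noisy height,
    # then pair them rank by rank
    mothers.sort(key=lambda g: g[0] + rng.gauss(0, 1))
    fathers.sort(key=lambda g: g[1] + rng.gauss(0, 1))

    seg_sd = 0.5 ** 0.5  # segregation noise that keeps child variance near 1
    child_iq, child_height = [], []
    for (m_iq, m_h), (f_iq, f_h) in zip(mothers, fathers):
        child_iq.append((m_iq + f_iq) / 2 + rng.gauss(0, seg_sd))
        child_height.append((m_h + f_h) / 2 + rng.gauss(0, seg_sd))

    parents = mothers + fathers
    parent_r = pearson([g[0] for g in parents], [g[1] for g in parents])
    child_r = pearson(child_iq, child_height)
    return parent_r, child_r

parent_r, child_r = simulate_cross_trait_am()
print(f"parents:  r = {parent_r:.3f}")
print(f"children: r = {child_r:.3f}")
```

In this sketch no genetic value for one trait has any effect on the other, yet the children show a positive correlation between the two sets of genetic values while the parents show essentially none.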

## Miscellaneous

• More waves in longitudinal studies do not help against initial sampling error by Emil Kirkegaard. Speaking of James Heckman, he has published another one of his endless reanalyses of the Perry Preschool study. Emil has a fun takedown of this absurd enterprise.
• Assortative Mating and the Industrial Revolution: England, 1754-2021 by Clark & Cummings. In another installment of Greg Clark’s studies into the persistence of social status across generations, he has apparently found a constant, latent status correlation of 0.80 between spouses in England over the last few centuries. This suggests that grooms and brides matched tightly on underlying educational and occupational abilities even when higher education was rare and female participation in the labor market was limited. I have previously commented briefly on Clark’s work and the role of assortative mating in it here.

## Genetics

• The “Golden Age” of Behavior Genetics? by Evan Charney. The author is a political scientist best known for his anti-hereditarian screeds, of which this is the latest. He likes to discuss various random phenomena from molecular biology, describing them as inscrutably complex, a hopeless tangle in the face of which genetic analyses are futile. Unfortunately, his understanding of the statistical models designed to cut through that tangle is very limited. For example, he endorses Eric Turkheimer’s howler that genome-wide association studies are “p-hacking”, and makes a ridiculous argument about GWAS findings being non-replicable (p. 8); he does not appear to know, among other things, that statistical power increases with sample size (Ns in the studies he cites range from ~100k to ~1000k), that the p-value is a random variable, or that SNP “hits” are subject to the winner’s curse (he cites but evidently never read Okbay et al., 2016 and Lee et al., 2018, wherein it is shown that GWAS replication rates match theoretical expectations). He seeks to identify and amplify all possible sources of bias that could inflate genetic estimates, while ignoring biases in the opposite direction (e.g., the attenuating effect of assortative mating on within-sibship genomic estimates). Often the article is weirdly disjointed, e.g., Charney first discusses how sibling models have been used to control for population stratification, and then a couple of pages later says that it is impossible to know whether differences in religious affiliation are due to heritability or stratification. All in all, the article is a good example of what Thomas Bouchard has called pseudo-analysis.
• Neither nature nor nurture: Using extended pedigree data to elucidate the origins of indirect genetic effects on offspring educational outcomes by Nivard et al. Contra naysayers like Charney, we are in the midst of a genuine golden age in behavior genetics. The underlying reason is the abundance of genomic data, which has spurred the development of so many new methods that it is hard to keep up with them. This preprint is the latest salvo in the debate about indirect genetic effects. Previous research has found indirect parental genetic effects in models where child phenotypes are regressed on child and parent polygenic scores. This study refines the design by doing the regression on adjusted parental polygenic scores that capture the personal deviations of parents’ scores from the mean scores of the parents and their own siblings. This refined design finds scant evidence for indirect parental genetic effects on children’s test scores, suggesting instead that apparent indirect effects are grandparental or “dynastic” effects of some sort. I think assortative mating is the most likely culprit. A limitation of this study is that even with a big sibling sample, the power to discriminate between different models is not high. Moreover, the study does not actually test the difference between the βs of the sibship and personal polygenic scores, and instead reasons from differences in significance, which is bad form.
• The genetics of specific cognitive abilities by Procopio et al. This impressively large meta-analysis finds the heritability of specific abilities to be similar to that of g. That may be the case although most measures of non-g abilities in the analysis are confounded by g. They can formally separate g and non-g only in the TEDS cohort which has psychometrically rather weak measures of g.

## IQ

• Ian Deary and Robert Sternberg answer five self-inflicted questions about human intelligence. The two interlocutors in this discussion are mismatched: Deary is the most important intelligence researcher of his generation, known for his careful, wide-ranging empirical work, while Sternberg is one of the greatest blowhards and empty suits in psychology, known for generating mountains of repetitive, grandiose verbiage and for his disdain for anything but the most perfunctory tests of the theoretical entities that proliferate in his writings. Sternberg’s entries provide little insight, but there is some comedy in first reading his bloviations and then Deary’s courteous but often quietly savage responses. Deary emphasizes the value of establishing empirical regularities before or even instead of formulating psychological theories; notes the ubiquity of the jangle fallacy in cognitive research; and argues that cognitive psychological approaches have not generated any reductionist traction in explaining intelligence. According to Deary, a hard problem in intelligence research is one of public relations, that is, getting “across all the empirical regularities known about intelligence test scores”, the establishment of which has been “a success without equal in psychology.”
• More articles by Stephen Breuning that need retraction by Russell Warne. Stephen Breuning is an erstwhile academic psychologist who was caught fabricating data on a large scale. He received the first US criminal conviction for research fraud in 1988. Nevertheless, many of his publications have not been retracted and continue to be cited, e.g., in the influential meta-analysis of the effects of test motivation on IQ by Duckworth et al. (2011). Warne reviews four of Breuning’s unretracted studies and identifies a number of inconsistencies and implausibilities that point to data fraud. It might be useful to further analyze these studies with GRIM, SPRITE, and the like.
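
As an illustration of the kind of check mentioned in the Warne item above, here is a minimal sketch of the GRIM (granularity-related inconsistency of means) idea: when the underlying data are integers, a mean reported for a sample of size N must equal some integer total divided by N, up to rounding. The function name and the example numbers are hypothetical and are not taken from Breuning's papers; the check is deliberately lenient about the direction of rounding.

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """Return True if some integer total divided by n matches the reported
    mean to the given number of decimals (assumes integer-valued data)."""
    half_unit = 0.5 * 10 ** (-decimals)  # largest error rounding can introduce
    nearest_total = round(reported_mean * n)
    # Check the nearest integer total and its neighbours
    for total in (nearest_total - 1, nearest_total, nearest_total + 1):
        if abs(total / n - reported_mean) <= half_unit + 1e-12:
            return True
    return False

# Hypothetical reported means for a sample of 28 integer scores
print(grim_consistent(5.18, 28))  # True:  145 / 28 = 5.1786
print(grim_consistent(5.19, 28))  # False: no integer total gives 5.19
```

The check only becomes informative when N is small relative to the reported precision; with two decimals and N of 100 or more, every mean passes.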

## Sex and race

• Sex differences in adolescents’ occupational aspirations: Variations across time and place by Stoet & Geary. More evidence for the gender equality paradox which postulates that sex differences are larger in wealthier and freer societies because heritable sex differences are suppressed in poorer societies where individuals have less choice.
• Why Are Racial Problems in the United States So Intractable? by Joseph Heath. Most modern societies have dealt with ethnic and racial diversity either by trying to integrate minorities to the majority population, or by recognizing the separateness of minorities and devolving political power to them. Some countries have judged the success of these efforts through the lens of equal opportunity, while others have sought outcome equality. Heath argues that race problems involving African-Americans are so intractable and acrimonious because there is no agreement on whether integration or separatism should be pursued, nor on how success and failure in racial affairs are to be judged. He manages to squeeze a good deal of analytic juice from this simple model while avoiding “bad actor” explanations which attribute all racial problems either to white malevolence or black incompetence. Money quote: “[T]he best way to describe the current American approach to racial inclusion would be to say that it is attempting to achieve Singaporean outcomes using Canadian methods and legal frameworks.”

I will try to get in the habit of collecting the most interesting studies, articles, posts, etc. related to human biodiversity in a monthly post, together with some commentary. The links are not necessarily to brand-new stuff; they are just what I happened to come across recently.

Gregory Connor and I submitted the paper, “Linear and partially linear models of behavioral trait variation using admixture regression,” to MDPI’s Behavioral Sciences. This is a methodological paper explicating & proposing some modifications to the frequently used – across hundreds of papers – admixture regression method. We illustrated this method and our proposed tweaks using the ABCD cohort. This manuscript was peer-reviewed by three reviewers, accepted, proof-edited, paid for, but not published. Breaking with MDPI’s clearly outlined protocol, the editor of Behavioral Sciences, who I am fairly sure has now blacklisted me, sent it to a mysterious and seemingly not particularly acute 4th “reviewer”. This “reviewer” argued that the paper was “racist” and based on an “outdated” method. We were not given a chance to respond. And the opinions of the original three reviewers, whom we patiently replied to and made revisions for, were discarded.

You might wonder whether this 4th “reviewer” caught a serious methodological error – or even a substantive one. Nope. Instead, he argued that admixture regression – frequently used since the early 2000s by numerous geneticists, genetic epidemiologists, medical researchers, and so on – is an “outdated approach (more of the 19th century)”. He kept repeating that the paper was about an outdated “biological concept” of race, when it concerned the relation between traits, genetic ancestry, and self-identified race/ethnicity. To note, typical MDPI reviews are not this ill-conceived and incoherent.

To let you judge if this post-hoc “review” had any merit, I provide the full comment below along with my point-by-point empty-chair reply. Since the paper already passed peer-review and was accepted by MDPI, but not published for obvious political reasons, Greg and I have decided to publish it as a chapter in a forthcoming book. I usually do not publish reviews. However, since I do not plan to have this paper peer-reviewed yet again, publishing the post-hoc commentary is warranted. Moreover, I usually do not speculate on motives, but it should be noted that, according to the editor, our post-hoc commenter was a knowledgeable geneticist. That fact, with the implication that the commenter understood the technique and literature, suggests that this was a hit job, with the goal of simply persuading the editor to cancel the paper. On the other hand, the commentary does read as if the “reviewer” was either clueless or was just trying to rationalize moral outrage.

“Peer-review” #4.

R4: Connor and Fuerst (here, C&F) proposed a new test that measure how differences in racial identity affects trait variation. They apply their variable to neuropsychological data collected by the Adolescent Brain Cognitive Development (ABCD) study and report that there exists a genetic component to neuropsychological traits and that there is a variation in the performance between different racial groups.

Empty chair reply: As we clearly explained in the introduction, admixture regression is commonly used in genetic epidemiology. Over the last two decades, hundreds of papers have been published using this technique by hundreds of well-published geneticists, genetic epidemiologists, medical researchers, and so on. In this paper, we explicate the underlying statistical model and propose some improvements to this frequently used technique.

R4: I found this paper unfounded, misleading, dishonest, and outdated, i.e., racist.

Empty chair reply: Did you get your 30 pieces of silver for this hit job?

R4: The authors are missing some important advances in the field of population genetics. They used outdated terms (races) and cite no literature to support their racial perception.

Empty chair reply: You clearly did not understand the paper. We explicitly contrasted self-identified race/ethnicity (SIRE) with genetic ancestry. The former is posited as tagging environmental effects while the latter is posited as tagging genetic effects. Thus, we note: “Admixture regression leverages these two data sources, self-identified race or ethnicity (SIRE) and genetically-measured admixture proportions, to decompose trait variation correspondingly.” In line with ASHG (2018), we contrast self-identified race/ethnicity, a social construct, with genetic ancestry, a genetic construct. As ASHG (2018) notes:

Although a person’s genetics influences their phenotypic characteristics, and self-identified race might be influenced by physical appearance, race itself is a social construct. Any attempt to use genetics to rank populations demonstrates a fundamental misunderstanding of genetics. The past decade has seen the emergence of strategies for assessing an individual’s genetic ancestry. Such analyses are providing increasingly accurate ways of helping to define individuals’ ancestral origins and enabling new ways to explore and discuss ancestries that move us beyond blunt definitions of self-identified race. [Emphasis added]

R4: Their assumptions about human races are from the previous century. They consistently imply that their usage of racial categories used in social sciences have genetic merit, that’s racism and, of course, wrong. It is not surprise that they cannot find papers to support their genetic model, because it is unfounded.

Empty chair reply: See above. Also, we cited a plethora of examples of papers using admixture regression in the introduction and conclusion.

R4: The authors model individuals as races + admixture, but the emphasis is on races, as admixture is simply defined as combination of more than 1 race. This is a very ignorant modelling of human populations that ignores the vast literature on the subject. The genetic analyses results are skewed to reproduce their perceived racist model.

Empty chair reply: No. Genetic ancestry is not a combination of more than one SIRE group. And there are literally hundreds of papers which employ admixture regression analysis using the same major ancestry groups we used. The ABCD consortium itself even has its own genetic ancestry variables (European, African, Amerindian, and East Asian ancestry). We only recomputed these so as to include South Asian ancestry.

R4: Throughout the manuscript, the authors omit results (i.e., graphs and code) necessary to evaluate their code.

Empty chair reply: We provided the code in the supplemental files. Either you did not check or the editors did not forward this to you.

R4: 1. Where is the support to: “Many diverse national populations descend demographically from isolated continental groups within a few hundred years.”? where did you get it from? where is the scientific reference? ancient DNA study show that mixture is the norm rather than the exception.

Empty chair reply: Admixture within continental groups obviously doesn’t preclude isolation between them.

R4: 2. “Modern genetic technology can measure with high accuracy the proportion of an individual’s ancestry associated with these continental groups.” – yes, modern tests can predict continental origins with high accuracy, but where is the citation?

Empty chair reply: This is from ASHG’s position statement on this topic.

R4: 3. “In many culturally diverse nations, most individuals can reliably self-identify as members of one or more racial or ethnic groups.” – nonsense. All self-reports are biased. No serious study uses self report ancestry. Of course, the authors must believe in that, because their entire method rests on this connection, but it is untrue. Unlike this unsupported claim of the authors, there are plenty of papers that prove otherwise :
Self-reported ancestry may not be a reliable method to reduce the possible impact of population stratification in genetic association studies of outbred populations, such as in the United States.
https://pubmed.ncbi.nlm.nih.gov/8761246/
https://pubmed.ncbi.nlm.nih.gov/10797159/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2350912/
Read: https://www.nature.com/articles/s41408-018-0132-1 to see the differences between self-reported ancestry and genomic ancestry, calculated very accurately.

Empty chair reply: We did not say that SIRE is a reliable index of genetic ancestry – after all, the whole method is based on the contrast between SIRE and genetic ancestry. Rather, we said that SIRE is a reliable index of itself, in the sense that people who identify as a particular SIRE group at one time identify the same way at another. Thus it reliably tracks a cultural environment.

R4: 4. Poor modeling: How can self-identified people report their % of ancestry? Hardly anyone mixed is 50%:50%.

Empty chair reply: How much did you bother to read beyond the abstract?

R4: 5. “The genotyped DNA samples are carefully decomposed into admixture proportions of geographic ancestry” – no. they are decomposed into a mixture of racial groups that the authors created after forcing the genetic data to show races. Races and admixture are two different concepts.

Empty chair reply: Translation: “The authors computed genetic ancestry in a standard way and entered this in a regression model with SIRE as have so many other researchers. This is bad: Reasons.”

R4: 6. “In most applications of admixture regression, individuals’ racial or ethnic group identities will have statistical relationships with individuals’ genetically identiﬁed geographic ancestries” – No! where is the evidence? Why this paper is completely devoid of reference for any fundamental assumption of the model. What does it mean “statistical relationships”?

Empty chair reply: Yes! Self-identified race/ethnicity generally, but imperfectly, correlates with genetic ancestry. This just restates ASHG’s (2018) position statement. But since you don’t even understand the meaning of “statistical relationship”, what can one expect?

R4: 7. “The objective of admixture regression is to decompose trait variation into linear components due to genetic ancestries and linear components due to racial/ethnic group related effects” – unlike admixture mapping techniques, which the author misleading cite as a parallel method, their method is not designed to link a loci with a trait, but rather link conditions with races with a biological support to the racial concept.

Empty chair reply: Whew!… admixture regression analysis is not ‘our’ method. And this frequently used method is not “designed” to provide “biological support to the racial concept”: it explicitly takes advantage of social constructive aspects of racial identification in admixed populations. Do you need this point illustrated with a crayon?

R4: 8. “We show that the admixture regression model can be viewed as a statistically feasible simpliﬁcation of this linear polygenic index model, in which proportional ancestries serve as statistical proxies for ancestry-related genetic differences.” – proportional ancestries serve as statistical proxies for ancestry-related genetic differences? You calculate ancestries from genetics, this statement means nothing. This is a tautology.

Empty chair reply: So now you finally realize that we used genetic ancestry. But, of course you are still wrong, since local ancestry is a subset of global ancestry. The statement reads: in our model, [global] ancestries serve as statistical proxies for [local] ancestry-related genetic differences.

R4: 9. “an assumption of random mating across ancestral populations” – really? where is the reference for this assumption?

Empty chair reply: Unsurprisingly, no other reviewers had a problem interpreting this statement. To spell it out: It is an assumption made by the theoretical model – thus a limitation – not an assumption about the world.

R4: 10. “A key assumption of the admixture regression model is that admixture arises from recent random mating between the previously geographically-isolated ancestral groups.” – of course no reference, because it is untrue. Your key assumption is not supported by reality.

Empty chair reply:… we restate that random mating is a theoretical assumption of the commonly used admixture regression model which may or may not be violated to a practically significant extent in the real world.

R4: 11. “Many individuals self-identify as belonging to two or more racial or ethnic groups” – you of course model those groups as RACES, by the biological definition, i.e., groups that are completely separate from one another and didn’t mix. Again, where is your evidence (from this century)? Surely you realize that the racial groups that you used do not satisfy this condition, south and east Asians are closer to each other than to Africans, but you ignore that. There are relationships between those groups, it’s not a star topology.

Empty chair reply: We explicitly do not model self-identified “racial or ethnic groups” as “groups that are completely separate from one another and didn’t mix”! If they didn’t mix, we wouldn’t have admixture for our admixture regression! Nowhere in this paper do we talk about “biological races”. We talk about “genetic ancestry” and SIRE. Perhaps you could try reading our actual paper…

R4: 12. The author removed 80% of the genetic data. They claim that they follow the instruction of ADMIXTURE, but there are no such instruction or recommendation.

Empty chair reply: 100,000 random SNPs…. 100,000 random SNPs…

R4: 13. They force the genetic data into 5 racial categories to fit their made up racial categories. They never show a single result of the genetic analyses. we don’t see the STRUCTURE analysis, nor the PCA. We don’t see the scripts that they used. They through populations because they are “overly admixed”?? what does it mean? You think that Hispanics are less admixed than Druze? Where is the evidence? Why everything in this manuscript is made up BS?

Empty chair reply: You mean: we use K=5 (European, Amerindian, African, East Asian, & South Asian) instead of the K=4 (European, African, Amerindian, & East Asian) provided by the National Institutes of Health for the ABCD dataset… Yes, only “racists” would use these ancestry components.

R4: 14. The authors don’t report their results. Are they afraid? Where are the findings of the model (blacks are poor and uneducated, bla bla). What is the point of this paper if the authors don’t stand behind their results? Why should anyone believe in it?

Empty chair reply: So you missed the part about this being a methodological paper, which then illustrates the methodology using the ABCD sample.

R4: 15. Where is the null hypothesis?

R4: 16. I have major ethical concerns due to the extensive use of races, biologically defined. I think that it is wrong and unsupported by the data nor literature.

Empty chair reply: …so, again, we used SIRE vs. genetic ancestry. Which one, exactly, is the “wrong and unsupported” “races, biologically defined”?

R4: Minor comments 1. “It has particular value in the case of complex behavioral traits where reliably identifying genetic loci associated with trait variation is beyond the current reach of science” – so it is not beyond the reach of science?

Empty chair reply: Would you like it to be?

R4: I have a few more comments, but I think that the trend here is pretty obvious. It is an outdated approach (more of the 19th century).

Empty chair reply: Well, maybe you should tell that to the hundreds of research teams that currently use this method.

Thank you so much for your support! We met our yearly fundraising goal within 12 hours of yesterday’s post. We look forward to finishing and publishing these analyses.

In Lasker, Pesta, Fuerst, and Kirkegaard (2019), we found an unstandardized beta for European genetic ancestry, when predicting g, of .85 among African Americans (model 2; Table 6). Simply put: going from 0% to 100% European (vs. African) ancestry was associated with a 0.85 d increase in intelligence. We interpreted these results as strong support for a partial hereditarian model. As did others in the HBD sphere.

Bird (2021a), in contrast, argued that our regression analyses suffered from omitted variable bias. Notably, he did not disagree that the results would support a hereditarian model were they robust.

Given the 2.053 d (or 30.8 point) measured test score difference between continental Africans and Europeans which Bird (2021a) adopts, genetic effects alone, based on our results, would represent .85 d / 2.053 d = 41% of the phenotypic difference. Expressed in terms of variance explained, this would be (.85 d)^2/(2.053 d)^2 = 17.14%. [1] However, this is relative to an average within-groups heritability for g of 66.5% for this specific sample (Mollon et al., 2018; Pesta, Kirkegaard, te Nijenhuis, Lasker, & Fuerst, 2020). Since the expected differences are proportionate to the within-groups heritability, the variance explained would be predicted to be around 17.14%/66.5%*50% = 12.89% conditioned on a heritability of 50%.[2]
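The arithmetic in the paragraph above can be retraced in a few lines. This is only a sketch that plugs in the figures quoted in the text (the .85 beta, the 2.053 d gap, and the 66.5% and 50% heritabilities); the variable names are illustrative, and the script checks the numbers, not the assumptions behind them.

```python
# Figures quoted in the text; the names are illustrative, not from the paper.
beta_d = 0.85        # unstandardized beta for European ancestry, in d units
gap_d = 2.053        # phenotypic gap adopted by Bird (2021a), in d units
h2_sample = 0.665    # average within-groups heritability reported for this sample
h2_reference = 0.50  # heritability used for the comparison

# Share of the gap on the linear (d) scale.
linear_share = beta_d / gap_d

# The same share expressed as variance explained (square of the linear share).
variance_share = linear_share ** 2

# Rescale the variance share from the sample heritability to the reference one.
rescaled_variance_share = variance_share / h2_sample * h2_reference

print(f"linear share:            {linear_share:.2%}")
print(f"variance share:          {variance_share:.2%}")
print(f"rescaled variance share: {rescaled_variance_share:.2%}")
```

The linear and variance shares come out at roughly 41% and 17%, and the rescaled share at roughly 13%.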

Now, based on his analysis of SNP data, Bird (2021a) estimated a variance explained of 12% given a heritability of 50%. Thus, these two very different methodologies (global admixture analysis & SNP Fst comparisons) derive very similar estimates conditioned on the same heritability coefficient.[3]

But Bird (2021a) goes on to interpret his result as “no support for a hereditarian hypothesis”. Well, one could define a ‘hereditarian hypothesis’ such that these magnitudes do not support it. But, in that case, one could just cite our own widely discussed research results against it. Bird (2021b) should then also state that, “Lasker et al. (2019) also found ‘no support for the hereditarian hypothesis of the Black–White achievement gap’ and, in fact, Fuerst is strongly supportive of an environmental model, despite what some disreputable sites claim.”

I won’t complain. I am sure that being labeled an environmentalist will not hurt my career prospects. However, don’t call me a hereditarian for arguing X but then go on to argue X and also call that ‘no support for a hereditarian hypothesis’.

Notes:
[1] To convert between variance metrics, such as R^2, and linear metrics, such as r, you take the square root of the former or the square of the latter. The difference between variance and linear metrics can lead to misinterpretations, since variance metrics do not align with our intuitive sense of distance. Phil Birnbaum (2006) gives the following example: if you were playing baseball and you made it to second base, you might think you made it 2/4 = .5 or one-half of the way home, but in terms of variance metrics you really only ran 2^2/4^2 = 4/16 = .25 or one-quarter of the total variance to home base. This is why, in the context of the continental African and European differences discussed, a between-group variance of 17.14% is equivalent to a real-world percent explained of sqrt(17.14%) = 41%.

[2] Originally, I reported an average heritability for g in the TCP sample of 81.5%; the correct value is 66.5% (White = 72%; Black = 61%). The text has been updated.

[3] As for which estimates to use, a point which Bird (2021b) touches on, ideally one would employ both within-groups broad-sense heritability and total genetic variance between populations so as to calculate the broad-sense between-group heritability and the total expected differences. This holds insofar as one is interested in the overall differences, not in predicting offspring values from parental ones or testing specific evolutionary models. Now Bird (2021a) cites Polderman et al. (2015). For adults (age 16 to 65), Polderman et al. (2015) give meta-analytic MZ and DZ correlations of .68 and .28 (Figure 3; High-level cognitive functioning), which, using Falconer’s formula, yields a meta-analytic broad-sense heritability of 80%.

Of this, most of the variance is additive genetic; nearly all the remainder is due to an unknown mix of active gene-environmental covariance and dominance variance. Now, if for methodological or theoretical reasons one uses within-groups narrow-sense heritability and additive genetic variance between populations, one simply derives the expected differences due to additive genetic differences. That can be useful for certain purposes; however, it will underestimate total genetic differences (unless, unexpectedly, in this case, the genetic variance components go in discordant directions between populations). Regardless, since global admixture results will relate to broad-sense heritability, one needs to adjust the heritability when comparing the results of Bird (2021a) to those of Lasker et al. (2019).
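As a quick check on note [3], Falconer’s formula estimates heritability as twice the difference between the MZ and DZ twin correlations. The sketch below applies it to the two correlations quoted from Polderman et al. (2015); the function name is illustrative.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate: twice the MZ-DZ difference in twin correlations."""
    return 2.0 * (r_mz - r_dz)


# Meta-analytic correlations quoted in note [3].
h2 = falconer_h2(r_mz=0.68, r_dz=0.28)
print(f"Falconer heritability estimate: {h2:.0%}")
```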

References
Bird, K. A. (2021a). No support for the hereditarian hypothesis of the Black–White achievement gap using polygenic scores and tests for divergent selection. American Journal of Physical Anthropology.
Bird, K. A. (2021b, February 12). Still No Support For Hereditarianism. Accessed at: https://kevinabird.github.io/
Lasker, J., Pesta, B. J., Fuerst, J. G., & Kirkegaard, E. O. (2019). Global ancestry and cognitive ability. Psych, 1(1), 431-459.
Mollon, J., Knowles, E. E., Mathias, S. R., Gur, R., Peralta, J. M., Weiner, D. J., … & Glahn, D. C. (2018). Genetic influence on cognitive development between childhood and adulthood. Molecular Psychiatry, 1-10.
Pesta, B. J., Kirkegaard, E. O., te Nijenhuis, J., Lasker, J., & Fuerst, J. G. (2020). Racial and ethnic group differences in the heritability of intelligence: A systematic review and meta-analysis. Intelligence, 78, 101408.
Polderman, T. J., Benyamin, B., De Leeuw, C. A., Sullivan, P. F., Van Bochoven, A., Visscher, P. M., & Posthuma, D. (2015). Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nature Genetics, 47(7), 702-709.

It’s March!

Which means that the Human Phenome Diversity Foundation’s (HPDF) annual fundraising drive has commenced.

Our goal is \$2,500.

We have some great projects which we would like to support this year if we can afford to.

Last year’s fundraising helped finance an important admixture paper, currently under review, which is up at bioRxiv.

We would like to continue to fund genetically informed research with your support.

If you care to see this research done, you can donate at the HPDF’s official gofundme charity site. Donations are tax-deductible since the HPDF is a 501(c)(3) organization.

Also, the HPDF now has an associated corporate Kraken account, so you can donate directly with cryptocurrencies, too:

Bitcoin(XBT):34fHxYLwEVEpcn7GLLuYtZ4PZcZp9qWbhA

Ethereum(ETH):0x53d65c5f757D59153Cf9fffC44D40989FCcFB602

Monero(XMR):83SiAyTG7GdE5uvUuDj61SKmAQhHXuuxE5EKP3kao5GMiveZf
3oLSbsgc5Pejk5PajQjGVUF6YV11ZQbEWikJFxX2tRgX9R

Kevin Bird has a paper out in which he claims, more or less, to evidence “insignificant” race differences. There is a lot there to criticize: misinterpretations, odd analytic choices, a crucially wrong formula [1], etc.

Maybe I will write a formal critique.

For now, it’s sufficient to point out that the results strongly agree with a hereditarian model:

• The predicted differences, given the genetic divergence in the educational and intelligence SNPs, are medium to large given reasonable estimates of broad-sense heritability (H2)[2].
• While there is inconsistent evidence of divergent selection (for this pairwise comparison), there is zero evidence of homogenizing or stabilizing selection.

To illustrate the first point, Table 1 shows the expected BGH given the 30.8 point continental European-African difference which Bird adopts, along with the expected phenotypic gaps when environments are equal (i.e., when BGH is set to unity). I use the lowest Fst value Bird reports in his table. Proofs for the formulas used are provided in the .doc file.

Table 1. Expected Between Group Heritability (BGH) and Expected IQ Point Differences between Europeans and Africans Given Different Values of the Genetic Intraclass Correlation (r and r_c) and H2, assuming an eduSNP Fst = .111 from Bird (2021; Table 1; 1301 clumped EA SNPs)

| H2 | r | t_observed | BGH | t_expected | Expected IQ difference | Cohen’s Interpretation |
|---|---|---|---|---|---|---|
| 0.20 | 0.1990 | 0.5132 | 0.047 | 0.0473 | 6.69 | Medium |
| 0.35 | 0.1990 | 0.5132 | 0.083 | 0.0800 | 8.85 | Medium |
| 0.50 | 0.1990 | 0.5132 | 0.118 | 0.1105 | 10.58 | Medium |
| 0.65 | 0.1990 | 0.5132 | 0.153 | 0.1391 | 12.06 | Large |
| 0.80 | 0.1990 | 0.5132 | 0.189 | 0.1658 | 13.38 | Large |

| H2 | r_c | t_observed | BGH | t_expected | Expected IQ difference | Cohen’s Interpretation |
|---|---|---|---|---|---|---|
| 0.20 | 0.2844 | 0.5132 | 0.075 | 0.0736 | 8.46 | Medium |
| 0.35 | 0.2844 | 0.5132 | 0.132 | 0.1221 | 11.19 | Medium |
| 0.50 | 0.2844 | 0.5132 | 0.188 | 0.1657 | 13.37 | Large |
| 0.65 | 0.2844 | 0.5132 | 0.245 | 0.2053 | 15.25 | Large |
| 0.80 | 0.2844 | 0.5132 | 0.302 | 0.2412 | 16.91 | Large |

Note: H2 = broad-sense heritability; r = intraclass genetic correlation; r_c = intraclass genetic correlation corrected for mathematical constraints on Fst; t_observed = intraclass phenotypic correlation, i.e., phenotypic variance between groups (given d = 2.053); BGH = between group heritability; t_expected = expected phenotypic variance between groups when environments are equalized; Expected IQ difference = expected IQ differences when environments are equalized; Cohen’s Interpretation = conventional interpretation of effect sizes.
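The BGH and expected-difference columns of Table 1 can be approximately regenerated from the other columns. The sketch below assumes the relation BGH = H2 × [r(1 − t)] / [t(1 − r)]; that formula is inferred from the tabled values and is not confirmed here against the post’s .doc file. The expected difference is taken as sqrt(BGH) × the observed gap, as stated later in the post. The function names are illustrative. Because r and t_observed are entered as rounded in the table, results can differ from the table in the last digit.

```python
import math


def between_group_heritability(h2: float, r: float, t: float) -> float:
    """Assumed relation (inferred from Table 1): H2 * r(1 - t) / (t(1 - r))."""
    return h2 * (r * (1.0 - t)) / (t * (1.0 - r))


def expected_gap(bgh: float, observed_gap: float) -> float:
    """Expected gap with environments equalized: sqrt(BGH) * observed gap."""
    return math.sqrt(bgh) * observed_gap


T_OBSERVED = 0.5132      # phenotypic intraclass correlation from Table 1
OBSERVED_GAP_IQ = 30.8   # observed gap in IQ points adopted in the post

for label, r in (("r", 0.1990), ("r_c", 0.2844)):
    for h2 in (0.20, 0.35, 0.50, 0.65, 0.80):
        bgh = between_group_heritability(h2, r, T_OBSERVED)
        gap = expected_gap(bgh, OBSERVED_GAP_IQ)
        print(f"{label:>3} = {r:.4f}  H2 = {h2:.2f}  BGH = {bgh:.3f}  gap = {gap:.2f}")
```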

You can argue that one should use narrow-sense heritability, instead of broad-sense, contra Jensen (1972; 1998), then lowball the estimates, and finally take advantage of statistical illiteracy and portray the differences as ‘small’ or ‘insignificant’ by emphasizing the portion of variance explained. However, the expected differences (which are equal to sqrt(BGH) x observed phenotypic differences) are still medium to large. Of course, Bird (2021) argues that the differences could go either way with equal likelihood.  This would be true if you knew nothing else.  However, in his prior analyses, he uses polygenic score (PGS) weights, and the eduPGS weights are directional.  For the same set of eduSNPs the PGS differences are shown below:

Table 2. Mean MTAG-based PGS for CEU and YRI Calculated using population-GWAS and Within Family Betas.

| SNP set | CEU (N = 99), population-GWAS betas | YRI (N = 108), population-GWAS betas | p-value (Welch’s two-sample t-test) | CEU (N = 99), within-family betas | YRI (N = 108), within-family betas | p-value (Welch’s two-sample t-test) |
|---|---|---|---|---|---|---|
| All SNPs | 0.866 | -0.794 | < 0.0001 | 0.614 | -0.563 | < 0.0001 |
| Derived SNPs | 0.938 | -0.860 | < 0.0001 | 0.702 | -0.643 | < 0.0001 |
| Ancestral SNPs | 0.605 | -0.554 | < 0.0001 | 0.528 | -0.484 | < 0.0001 |

Note: SNPs were filtered for MAF >0.01 for both CEU and YRI. Scores represent standard scores calculated using the standard deviation in the total sample. Sample sizes for the t-test were N = 99 for CEU and N=108 for YRI.

Thus, it makes no sense to say that the expected difference could go either way, with equal probability, when the eduPGS weights indicate a direction. When this is recognized, the only option is to declare that the eduPGS is biased between populations. This is possible, of course.

However, this leaves the evolutionary default or null, which is that differences will be commensurate with neutral variation. As Rosenberg, Edge, Pritchard, & Feldman (2019) note: “One key component of the inference of polygenic adaption is the use of an appropriate null expectation for polygenic scores distributions and phenotypic differences…[P]henotypic differences among populations are predicted under neutrality to be similar in magnitude to typical genetic differences among populations.”  The authors, of course, go on to cite Lewontin and slyly note that differences “are small in comparison with variation within populations”. But, of course, “large” differences in the conventional sense are also “small in comparison with variation within” (e.g., .80d = 14% variance). And while the evolutionary default is directionless, the totality of the behavioral genetic and psychometric data assembled on this topic points one way.
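The conversion behind the “.80d = 14% variance” remark can be sketched as follows. It assumes two equal-sized groups with equal within-group variances, in which case the share of total variance lying between groups is d^2 / (d^2 + 4). That equal-size assumption and the function name are mine, not the post’s.

```python
def variance_between_groups(d: float) -> float:
    """Between-group share of variance for a standardized difference d,
    assuming two equal-sized groups with equal within-group variance."""
    return d ** 2 / (d ** 2 + 4.0)


# A conventionally "large" effect of d = 0.80.
print(f"d = 0.80  -> {variance_between_groups(0.80):.1%} of variance between groups")

# The d = 2.053 difference used for t_observed in Table 1.
print(f"d = 2.053 -> {variance_between_groups(2.053):.1%} of variance between groups")
```

Under these assumptions the first value is close to the 14% quoted above, and the second is close to the t_observed of about .513 used in Table 1.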

[1] See, for example, equation 4 in Bird (2021).  However,

total between phenotypic variance = phenotypic variance due to genes + phenotypic variance due to environment

which can be rewritten, in linear metrics, as PD^2 = GD^2 + ED^2  or PD = sqrt( GD^2 + ED^2 )

Since, BGH = phenotypic variance due to genes / total between phenotypic variance

BGH = GD^2 / PD^2 and, therefore, GD = sqrt(BGH)*PD

This is approximated but underestimated by 2*sqrt(BGH) * sqrt(2/pi) (*15) which Bird (2021) uses.

e.g., sqrt(.12)*30.8 = 10.67 (correct) versus 2*sqrt(.12)*sqrt(2/pi) (*15) = 8.29 (Bird, 2021)
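The two figures in the worked example can be checked directly. This sketch evaluates GD = sqrt(BGH) × PD alongside the expression that this note attributes to Bird (2021). That second expression is taken from the example line above, not from Bird’s paper, so treat it as this post’s characterization.

```python
import math

bgh = 0.12      # between-group heritability used in the example
gap_iq = 30.8   # observed phenotypic difference (PD) in IQ points
sd_iq = 15.0    # IQ standard deviation used as the scaling factor

# Expected genetic difference from GD = sqrt(BGH) * PD.
gd_direct = math.sqrt(bgh) * gap_iq

# Expression attributed to Bird (2021) in the note's worked example.
gd_attributed = 2.0 * math.sqrt(bgh) * math.sqrt(2.0 / math.pi) * sd_iq

print(f"sqrt(BGH) * PD:                 {gd_direct:.2f}")
print(f"2 * sqrt(BGH) * sqrt(2/pi) * 15: {gd_attributed:.2f}")
```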

[2] While the use of narrow-sense heritability is recommended for Qst-Fst comparisons and the assessment of directional selection, narrow-sense heritability and the corresponding narrow-sense Qst underestimate between-group genetic variance by not taking into account non-additive genetic variation between populations, along with active gene-environment covariance (which is commonly classed as a genetic effect; Sesardic, 2005). Thus, when it comes to calculating the expected difference due to genes, it makes sense to use the broad-sense heritability, at least for an upper-bound estimate, as hereditarians have done (e.g., Jensen, 1998).

In 1969, Harvard Educational Review published a long, 122-page article under the title “How Much Can We Boost IQ and Scholastic Achievement?” It was authored by Arthur R. Jensen (1923–2012), a professor of educational psychology at the University of California, Berkeley. The article offered an overview of the measurement and determinants of cognitive ability and its relation to academic achievement, as well as a largely negative assessment of attempts to ameliorate intellectual and educational deficiencies through preschool and compensatory education programs. Jensen also made some suggestions on how to change educational systems to better accommodate students with disparate levels of ability.

While most of the article did not deal with race, Jensen did argue that it was “a not unreasonable hypothesis” that genetic differences between whites and blacks were an important cause of IQ and achievement gaps between the two races. This set off a huge academic controversy—Google Scholar says that the article was cited more than 1,200 times in the decade after its publication and almost 5,400 times by December 2019. The dispute about the article centered on the question of racial differences, which is understandable as Jensen’s thesis came out on the heels of the civil rights movement and its attendant controversies, such as school integration, busing of students, and affirmative action. Jensen questioned whether it is in fact possible to eliminate racial differences in socially valued outcomes through conventional policy measures, striking at the foundational assumption of liberal and radical racial politics. His floating of the racial-genetic hypothesis was what set his argument apart from the general tenor of the era’s scholarly and policy debate.

In this post, I will take a look at Jensen’s arguments and their development over time. The focus will be on the race question, but many related, more general topics will be discussed as well. The post has four parts. The first is a synopsis of Jensen’s argument as it was presented in the 1969 article. The second part offers an updated restatement of Jensen’s model of race and intelligence, while in the third part I argue, using the Bradford Hill criteria, that the model has many virtues as a causal explanation. In the fourth and concluding part I will make some more general remarks about the status and significance of racialist thinking about race and IQ.[Note]