Author: Chuck (Page 1 of 6)

SAT/ACT Scores by Detailed Race/Ethnicity From Applicants on Common App (2021)

We recently published the IQ scores for major ethnic groups, based on the broadly representative Adolescent Brain Cognitive Development sample. These ethnic averages correlated very strongly (r = .90 to .94) with scholastic aptitude scores (SAT or ACT scores) based on nationally representative samples of American-born college students between the years 2012 and 2020.  The aptitude scores came from the NPSAS surveys, which, unfortunately, have a limited number of ethnic classifications.

As Dalliard noted, understanding racial/ethnic differences in aptitude tests is important since it is a guide to the composition of the USA’s future cognitive elite.  Since different ethnic groups have different political interests, which, in turn, shape policy, understanding the cognitive capital of ethnic groups is essential to predicting the trajectory of the USA in the coming century.

While no open-source national surveys provide data on SAT/ACT scores decomposed by detailed ethnic groups, Common Application provides some data for USA citizens and residents.  Common App is an undergraduate college admission application service that allows one to apply to over one thousand member colleges in the USA. While the data sample is large, with over 1 million applicants each year, several issues have been reported by Freeman et al. (2021):

  • The percentage of applicants reporting a test score fell from 70% and 73% in 2018-19 and 2019-20, respectively, to 40% in 2020-21. This is likely because 89% of Common App’s members (900+ colleges) no longer required SAT/ACT scores for admission in 2020-21.
  • Nearly 60% of applicants applied from ZIP Codes in the top 20% of the median household income distribution. The decline in reporting rates between 2019-20 and 2020-21 was greater in lower-income communities.
  • Underrepresented minorities (not including Asians) report test scores at lower rates than non-underrepresented minorities (71% vs 77%). The drop in reporting rates between 2019-20 and 2020-21 was larger for underrepresented minorities (31% vs 47%).
  • In 2019-20, 78% of non-first-generation students reported test scores versus 69% of first-generation students while in 2020-2021 the rates were 48% and 30%.
  • Applications to private, more selective institutions were the most likely to include test scores (83% in 2019–20 and 44% in 2020–21), while applications to private, less selective institutions were the least likely to include test scores (67% and 28%).

The high rates of missing test scores, especially for low-achieving groups,  may mean that certain group averages are biased. Moreover, many ethnic groups suffer from ethnic attrition (Emeka, 2019), in which case group identification is correlated with aptitude. To illustrate, in the case of Nigerian-Americans, Emeka (2019) noticed that Nigerians residing in poor families with parents who have not completed high school or college degrees are much more likely to drop out of the Nigerian group in favor of the African American or Black group. This is because for them, “it is not Nigerian not to go to college”.

Those caveats noted, the test scores reported by Kim et al. (2022) are more or less as expected. Average scores for the Asian, White, Black, and Native Hawaiian or Pacific Islander groups are 1382, 1278, 1108, and 1181, respectively. Applicants who did not report racial/ethnic information scored higher (1378) than non-underrepresented minorities (1297). Among Asian ethnicities, Asians from India, China, Korea, Japan, Malaysia, and “None provided” scored substantially higher (around 1400 vs 1300 or less) than Asians from Cambodia, Philippines, Vietnam, Pakistan, Other Southeast, or Other South Asia (Figure 2a).

The following table reports the SAT/ACT means by race/ethnicity, including mixed races, as well as Hispanic groups by both region and race. The columns display the N (unadjusted), % of reports, average SAT/ACT, GPA, N (adjusted for % of reports), SAT/ACT in IQ metrics. The IQ metric SAT/ACT scores were computed using the NPSAS20 total SAT/ACT standard deviations.

Group N Reports SAT¹ GPA N² IQ-metric
White
Average 570400 0.53 1278 92 302312 100.0
Two+ Provided 17640 0.52 1310 93 9173 102.4
Europe 464670 0.55 1287 93 255569 100.7
Middle East 22720 0.39 1236 90 8861 96.9
None Provided 39580 0.43 1206 91 17019 94.6
Other 25800 0.35 1159 88 9030 91.1
African American
Average 140010 0.36 1108 85 50404 87.2
Africa 13840 0.34 1185 87 4706 93.0
Two+ Provided 17740 0.39 1170 87 6919 91.9
None Provided 670 0.29 1141 85 194 89.7
Caribbean 10610 0.37 1116 85 3926 87.8
Other 680 0.27 1113 84 184 87.6
U.S. Af-Am 96470 0.35 1084 84 33765 85.4
Asian
Average 115490 0.61 1382 95 70449 107.8
None Provided 1600 0.65 1438 97 1040 112.0
Korea 10480 0.67 1421 96 7022 110.7
India 32750 0.72 1415 96 23580 110.3
China 24620 0.64 1414 97 15757 110.2
Other East Asia 2800 0.59 1411 95 1652 110.0
Malaysia 240 0.61 1380 95 146 107.6
Two+ Provided 8520 0.58 1376 96 4942 107.3
Japan 1330 0.57 1364 94 758 106.4
Other South Asia 5620 0.45 1309 92 2529 102.3
Pakistan 5500 0.50 1301 92 2750 101.7
Vietnam 9090 0.55 1284 94 5000 100.5
Philippines 8100 0.47 1262 94 3807 98.8
Other Southeast Asia 4130 0.44 1261 92 1817 98.7
Cambodia 700 0.36 1216 92 252 95.4
Pacific Islander
Group Average 1770 0.32 1181 88 566 92.7
None Provided 100 0.36 1246 91 36 97.6
Guam 220 0.34 1233 91 75 96.6
Other (Excl. Philippines) 570 0.30 1204 86 171 94.5
Two+ Provided 150 0.28 1187 87 42 93.2
Hawaii 420 0.35 1180 88 147 92.6
Samoa 320 0.33 1082 88 106 85.3
American Indian
Average 2760 0.36 1162 87 994 91.3
OK Citizen Potawatomi 20 0.29 1338 90 6 104.5
OK Choctaw 90 0.55 1267 93 50 99.2
OK Chickasaw 50 0.52 1252 90 26 98.1
OK Muscogee (Creek) Nation 50 0.59 1241 91 30 97.2
OK Cherokee 140 0.53 1218 95 74 95.5
MI Sault Ste. Marie 40 0.57 1192 88 23 93.5
NY Saint Regis 50 0.16 1170 86 8 91.9
None Provided 80 0.29 1153 87 23 90.7
Unenrolled 1370 0.34 1147 86 466 90.2
Other Enrolled 640 0.34 1146 87 218 90.1
SD Oglala Sioux 20 0.18 1123 90 4 88.3
AZ Navajo 160 0.29 1096 89 46 86.3
NC Eastern Cherokee 40 0.41 1079 87 16 85.1
Two/More Races
Group Average 56130 0.50 1289 92 28065 100.8
Asian & White 25400 0.60 1354 95 15240 105.7
Asian & Pacific Islander 1020 0.43 1278 93 439 100.0
Asian & American Indian 140 0.45 1266 88 63 99.1
White & Pacific Islander 1010 0.48 1265 92 485 99.0
White & Native American 4620 0.50 1248 91 2310 97.7
Three or More Races 3610 0.41 1241 90 1480 97.2
Asian & African Am. 2680 0.43 1224 90 1152 96.0
White & African Am. 15680 0.40 1192 88 6272 93.5
African Am. & Pacific Isl. 40 0.19 1118 83 8 88.0
Native Am. & Pacific Isl. 1540 0.31 1095 84 477 86.2
African Am. & Native Am. 380 0.34 1093 84 129 86.1
Hispanic (Region)
Group Average 194060 0.37 1195 89 71802 93.8
Spain 4950 0.48 1284 92 2376 100.4
South America 24800 0.46 1247 91 11408 97.7
Cuba 6860 0.61 1236 92 4185 96.9
Two+ Provided 28730 0.41 1211 90 11779 94.9
Mexico 70270 0.32 1170 89 22486 91.9
Central America 17400 0.32 1168 89 5568 91.8
Puerto Rico 22540 0.37 1168 87 8340 91.7
None Provided 1850 0.29 1161 87 537 91.2
Other 16670 0.30 1152 87 5001 90.5
Hispanic (Races)
Group Average 194060 0.37 1195 89 71802 93.8
Asian 3290 0.43 1277 92 1415 99.9
Two+ Provided 9750 0.43 1235 90 4193 96.7
White 96690 0.45 1219 90 43511 95.6
Hispanic or Latinx Only 60870 0.27 1146 88 16435 90.1
American Indian 4740 0.29 1133 87 1375 89.1
African American 17740 0.29 1116 85 5145 87.8
Pacific Islander 980 0.25 1097 85 245 86.4

¹SAT/ACT average
²Real N estimated by multiplying N column by % Reports
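
The two derived columns in the table above (N² and IQ-metric) can be reproduced with simple arithmetic. The following is a minimal sketch, not the code used for the post: the function names are hypothetical, and the reference standard deviation of 200 is an assumption inferred from the table rows (the post states only that the NPSAS20 total SAT/ACT standard deviations were used). The White group average (1278) is taken as the reference mean because that row is listed at 100.0.

```python
# Illustrative sketch only; not the author's original code.
REFERENCE_MEAN = 1278  # White group average SAT/ACT, taken from the table
REFERENCE_SD = 200     # ASSUMPTION: inferred from the table rows, not stated in the post


def adjusted_n(n_applicants: int, share_reporting: float) -> int:
    """Estimate the number of applicants who reported a score (the N² column)."""
    return round(n_applicants * share_reporting)


def iq_metric(score: float,
              ref_mean: float = REFERENCE_MEAN,
              ref_sd: float = REFERENCE_SD) -> float:
    """Rescale an SAT/ACT mean so the reference group sits at 100 with SD 15."""
    return 100.0 + 15.0 * (score - ref_mean) / ref_sd


if __name__ == "__main__":
    # Asian group average row: N = 115490, Reports = 0.61, SAT = 1382
    print(adjusted_n(115490, 0.61))
    print(f"{iq_metric(1382):.1f}")
```

Because the Reports column is rounded to two decimals, a recomputed N² can differ from the tabled value by an applicant or two for some rows.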

Despite issues with the data that were pointed out above, one clearly notices the strong similarity between these IQ-metric SAT/ACT scores by race/ethnicity, including various mixed-race categories, and the IQ-metric SAT/ACT estimates from the NPSAS reported in a previous post.

The full dataset made available by Common App can be found at the following link.  See also the following report. Additional data is included in the file such as AP scores, the number of academic honors reported, and household income.

 

References

  1. Freeman, M., Magouirk, P., & Kajikawa, T. (2021). Applying to college in a test‐optional admissions landscape: Trends from Common App data.
  2. Kim, B.H., Freeman, M., Kajikawa, T., Karimi, H., & Magouirk, P. (2022). Unpacking applicant race and ethnicity, part 2: disparities in key indicators of applicant readiness and resources across detailed backgrounds. Common Application.

IQ scores by ethnic group in a nationally-representative sample of 10-year old American children

Note: We computed these results based on multiple versions of the ABCD data (v2.01 & v3.01) and with different inclusion/exclusion criteria.  I originally posted a version based on the ABCD 2.01 data filtered for missing admixture, and other scores. However, after looking, I found a version that uses the maximum 3.01 sample with age-corrected NIHTBX scores (N = 11474).  While the scores for the two versions correlate at r = .98, in some cases (e.g., Vietnamese), there is a notable difference.  I have now replaced the original table with the one based on N = 11474 and moved the original table to the end of the post.  For replicability and modifiability, I attached the latest R code which I had in my file.

……

In our manuscript, titled “Reply to Warne,” we present average eduPGS and NIH Toolbox composite scores from the ABCD study, categorized by ethnic and religious groups. In our analyses, we used unweighted means instead of sample weighted scores, since we were only interested in the correlation between mean eduPGS and cognitive ability. However, we also computed weighted NIH Toolbox scores, which may be of interest to some readers.

These scores were computed using the survey package for R as recommended by Heeringa and Berglund (2020). These weighted scores, reported below, represent the “neuropsychological performance” scores, measured between 2016 and 2018, of broadly representative samples of 10-year-old American children. (Children were excluded by the ABCD consortium, however, if they were not fluent in English or if one of their parents was not fluent in either English or Spanish.) The first three columns, after the group labels, display the sample size, means, and standard deviations, respectively. The fourth column presents the scores normalized with the non-Hispanic White mean set to 100.00 and standard deviations set to 15.00. To norm scores, we pooled the standard deviations across all groups (pooled SD = 16.45) and transformed the values using the pooled SD. At a reader's request, I added average years of parental education, which I had previously outputted, in the fifth column.
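
The norming step amounts to a linear rescaling. As a sketch, write M_g for a group's weighted mean, M_W for the non-Hispanic White weighted mean, and SD_pooled for the pooled standard deviation of 16.45. These symbols are notation introduced here for illustration and do not come from the manuscript.

```latex
\mathrm{IQ}_g = 100 + 15 \cdot \frac{M_g - M_W}{SD_{\text{pooled}}}
% Worked example with M_W = 104.11 and the Chinese group mean M_g = 116.53:
% 100 + 15 \cdot \frac{116.53 - 104.11}{16.45} \approx 111.3
```

Small differences in the second decimal relative to the tabled values are to be expected, since the means and the pooled SD shown here are themselves rounded.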

The ethnic groups are mutually exclusive, and the specific variables used to code them are provided in the supplementary materials of the manuscript. Classifications are based on the race/ethnicity of the child as reported by the responding parent in conjunction with the nationality and immigrant status of the parents; see the Parent Demographics Survey for specific variables and the second table for definitions. To be clear, some of the definitions do not perfectly overlap with ones commonly used in the social sciences. For example, the classification “USA Blacks” refers to children who were identified as being Black, not being White, not being Hispanic, but also not having an immigrant parent or grandparent. This was done because, when computing the scores, we were interested in mutually exclusive ethnocultural groups.

Bear in mind that the sample sizes are often small and so the corresponding estimates are imprecise and also that the NIH Toolbox battery is fluid-intelligence loaded. For comparison, Sailer, in 2009, reported cognitive abilities of legal American immigrants based on the digit span backwards test. Additionally, in 2015, I summarized scores by immigrant generation and ethnic groups mostly based on scholastic tests.

Ethnic/National Group       N     M       SD     IQ-Metric Score  Parental Education (Years)
Chinese                     81    116.53  21.02  111.32           16.95
Korean & Japanese           33    115.13  19.15  110.05           16.36
White & Asian Indian        44    114.66  14.19  109.62           16.75
White & Korean/Japanese     78    111.41  18.02  106.65           15.89
White & Chinese             77    109.77  18.16  105.16           16.42
White & Filipino            60    109.67  18.09  105.07           16.16
Filipino                    51    107.99  17.53  103.53           15.5
Other Asian                 52    106.8   20.17  102.46           15.91
Asian Indian                53    106.77  17.03  102.42           16.8
White                       5858  104.11  16.51  100              15.45
White & Pacific Islander    25    103.74  16.9   99.66            15.47
Vietnamese                  24    102.68  14.51  98.69            15.95
N. Africa & Mid. East       47    100.33  20.01  96.56            14.92
Pacific Islander            17    99.79   12.18  96.06            14.07
White & Native American     144   99.32   15.36  95.63            14.51
Central & South American    352   98.36   16.98  94.76            14.15
Not Identified              217   96.69   17.78  93.24            13.4
Dominican                   38    95.1    16.65  91.79            13.99
White Mexican               775   95.1    16.37  91.78            12.8
White Cuban                 151   94.98   16     91.67            13.92
NH Black & White            418   94.93   16.91  91.63            14.14
Other Hispanic              518   94.56   17.75  91.29            13.95
White Puerto Rican          133   94.22   17.23  90.98            13.74
Black African               59    93.84   13.41  90.63            14.97
Other Cuban                 30    92.81   18.3   89.69            14.33
Native American             39    92.29   16.1   89.22            13.14
Other Mexican               460   91.82   16.02  88.79            11.9
Black Caribbean             51    91.74   16.79  88.72            14.26
Black & Other Puerto Rican  90    90.61   15.49  87.69            13.22
USA Black                   1499  85.44   14.8   82.98            13.32


Decoding Admixture Results

Race/ethnic cognitive/academic achievement gaps are considered so important in the social sciences that number 4 in the social sciences’ top 10 list of “grand challenge questions that are both foundational and transformative” (Giles, 2011) is: “How do we reduce the ‘skill gap’ between black and white people in America?” Illustrating just how much effort has been focused on this topic, Google Scholar yields 48,200 search results when queried for “race” and “achievement gaps.” The concern is arguably well justified as race/ethnic-related social outcome gaps can largely be accounted for by differences in cognitive ability (e.g., Fryer, 2014).

Given the intensity of academic interest in this subject, the fact that only a handful of researchers are focused on understanding why achievement gaps so tightly track genetically-identified ancestry within socially-identified racial/ethnic groups is indeed curious. For instance, in Guo, Lin, & Harris (2019), the authors report results for the Peabody Picture Vocabulary Test based on the Add Health sample. In Table 3 of said publication, among Hispanic and non-Hispanic Blacks, ancestry principal components PC1 and PC3 are strongly associated with verbal intelligence. And among non-Black Hispanics, ancestry principal components PC2 and PC3 show the strongest association.

In their report, Braudt & Harris (2020) provide the Rosetta Stone for interpreting these otherwise opaque results. PC1, in this sample, separates Sub-Saharan African ancestry from out-of-African ancestry, while PC2 separates European ancestry from non-European out-of-African ancestry. PC3 is not shown, but we can deduce from the distribution among Black and non-Black Hispanics that it separates out Amerindian ancestry. In other words, in this large national sample – just as in the nationally representative Adolescent Brain Cognitive Development study (Fuerst, 2021) – African and Amerindian genetic ancestry are strongly negatively related to intelligence among socially-defined Blacks and Hispanics.

Once again, despite these robust findings, armies of sociologists nominally interested in the source of racial and ethnic-related cognitive/academic achievement gaps continue to flagrantly ignore genetic ancestry. Not only do they ignore such results, but they also censor them. Thus, predictably, the published version of Guo, Lin, & Harris (2019) drops the results for non-Whites, along with Table 3 shown above, on reviewers’ insistence. Other researchers, who have looked at predictors of cognitive ability, have informed me that reviewers similarly have demanded PCs in place of more interpretable ancestry percentages and then, also, that the PC variables not be reported in the tables.

So, unsurprisingly Google Scholar yields just 48 hits, or two orders of magnitude fewer search results, for “genetic ancestry” and “achievement gaps.” But why? One has to suspect that our sociologists are not particularly interested in understanding the true cause of race/ethnic differences. Reality evasion continues unabated in academia.

References

Braudt, D., & Harris, K. M. (2020). Polygenic scores (PGSs) in the National Longitudinal Study of Adolescent to Adult Health (Add Health) – Release 2.

Fryer, R. (2014). 21st-century inequality: The declining significance of discrimination. Issues in Science and Technology, 31(1), 27-32.

Fuerst, J. G. (2021). Robustness analysis of African genetic ancestry in admixture regression models of cognitive test scores. Mankind Quarterly, 62(2).

Giles, J. (2011). Social science lines up its biggest challenges. Nature, 470(7332), 18–19.

Guo, G., Lin, M. J., & Harris, K. M. (2019). Socioeconomic and genomic roots of verbal ability. bioRxiv, 544411.

The Post-hoc 4th Review

Gregory Connor and I submitted the paper, “Linear and partially linear models of behavioral trait variation using admixture regression,” to MDPI’s Behavioral Sciences. This is a methodological paper explicating & proposing some modifications to the frequently used – across hundreds of papers – admixture regression method. We illustrated this method and our proposed tweaks using the ABCD cohort. This manuscript was peer-reviewed by three reviewers, accepted, proof-edited, paid for, but not published. Breaking with MDPI’s clearly outlined protocol, the editor of Behavioral Sciences – who I am fairly sure has now blacklisted me — sent it to a mysterious and seemingly not particularly acute 4th “reviewer”. This “reviewer” argued that the paper was “racist” and based on an “outdated” method. We were not given a chance to respond. And the opinions of the original three reviewers, whom we patiently replied to and made revisions for, were discarded.

You might wonder whether this 4th “reviewer” caught a serious methodological error – or even a substantive one. Nope. Instead, he argued that admixture regression – frequently used, since the early 2000s by numerous geneticists, genetic epidemiologists, medical researchers, and so on – is an “outdated approach (more of the 19th century)”. He kept repeating that the paper was about an outdated “biological concept” of race, when it concerned the relation between traits, genetic ancestry, and self-identified race/ethnicity. To note, typical MDPI reviews are not this ill-conceived and incoherent.

To let you judge whether this post-hoc “review” had any merit, I have provided the full comment along with my point-by-point empty-chair reply. Since the paper already passed peer review and was accepted by MDPI, but not published for obvious political reasons, Greg and I have decided to publish it as a chapter in a forthcoming book. I usually do not publish reviews. However, since I do not plan to have this paper peer-reviewed yet again, publishing the post-hoc commentary is warranted. Moreover, I usually do not speculate on motives, but it should be noted that, according to the editor, our post-hoc commenter was a knowledgeable geneticist. That fact, with the implication that the commenter understood the technique and literature, suggests that this was a hit job, with the goal of simply persuading the editor to cancel the paper. On the other hand, the commentary does read as if the “reviewer” was either clueless or was just trying to rationalize moral outrage.

“Peer-review” #4.

R4: Connor and Fuerst (here, C&F) proposed a new test that measure how differences in racial identity affects trait variation. They apply their variable to neuropsychological data collected by the Adolescent Brain Cognitive Development (ABCD) study and report that there exists a genetic component to neuropsychological traits and that there is a variation in the performance between different racial groups.

Empty chair reply: As we clearly explained in the introduction, admixture regression is commonly used in genetic epidemiology. Over the last two decades, hundreds of papers have been published using this technique by hundreds of well published geneticists, genetic epidemiologists, medical researchers and so on. In this paper, we explicate the underlying statistical model and propose some improvements to this frequently used technique.

R4: I found this paper unfounded, misleading, dishonest, and outdated, i.e., racist.

Empty chair reply: Did you get your 30 pieces of silver for this hit job?

R4: The authors are missing some important advances in the field of population genetics. They used outdated terms (races) and cite no literature to support their racial perception.

Empty chair reply: You clearly did not understand the paper. We explicitly contrasted self-identified race/ethnicity (SIRE) with genetic ancestry. The former is posited as tagging environmental effects while the latter is posited as tagging genetic effects. Thus, we note: “Admixture regression leverages these two data sources, self-identified race or ethnicity (SIRE) and genetically-measured admixture proportions, to decompose trait variation correspondingly.” In line with ASHG (2018), we contrast self-identified race/ethnicity, a social construct, with genetic ancestry, a genetic construct. As ASHG (2018) notes:

Although a person’s genetics influences their phenotypic characteristics, and self-identified race might be influenced by physical appearance, race itself is a social construct. Any attempt to use genetics to rank populations demonstrates a fundamental misunderstanding of genetics. The past decade has seen the emergence of strategies for assessing an individual’s genetic ancestry. Such analyses are providing increasingly accurate ways of helping to define individuals’ ancestral origins and enabling new ways to explore and discuss ancestries that move us beyond blunt definitions of self-identified race. [Emphasis added]

R4: Their assumptions about human races are from the previous century. They consistently imply that their usage of racial categories used in social sciences have genetic merit, that’s racism and, of course, wrong. It is not surprise that they cannot find papers to support their genetic model, because it is unfounded.

Empty chair reply: See above. Also, we cited a plethora of examples of papers using admixture regression in the introduction and conclusion.

R4: The authors model individuals as races + admixture, but the emphasis is on races, as admixture is simply defined as combination of more than 1 race. This is a very ignorant modelling of human populations that ignores the vast literature on the subject. The genetic analyses results are skewed to reproduce their perceived racist model.

Empty chair reply: No. Genetic ancestry is not a combination of more than one SIRE group. And there are literally hundreds of papers which employ admixture regression analysis using the same major ancestry groups we used. The ABCD consortium, itself, even has its own genetic ancestry variables (European, African, Amerindian, and East Asian ancestry). We only recomputed these so as to include South Asian ancestry.

R4: Throughout the manuscript, the authors omit results (i.e., graphs and code) necessary to evaluate their code.

Empty chair reply: We provided the code in the supplemental files. Either you did not check or the editors did not forward this to you.

R4: 1. Where is the support to: “Many diverse national populations descend demographically from isolated continental groups within a few hundred years.”? where did you get it from? where is the scientific reference? ancient DNA study show that mixture is the norm rather than the exception.

Empty chair reply: Admixture within continental groups obviously doesn’t preclude isolation between them.

R4: 2. “Modern genetic technology can measure with high accuracy the proportion of an individual’s ancestry associated with these continental groups.” – yes, modern tests can predict continental origins with high accuracy, but where is the citation?

Empty chair reply: This is from ASHG’s position statement on this topic.

R4: 3. “In many culturally diverse nations, most individuals can reliably self-identify as members of one or more racial or ethnic groups.” – nonsense. All self-reports are biased. No serious study uses self report ancestry. Of course, the authors must believe in that, because their entire method rests on this connection, but it is untrue. Unlike this unsupported claim of the authors, there are plenty of papers that prove otherwise :
https://academic.oup.com/aje/article/163/5/486/61161?login=true
Self-reported ancestry may not be a reliable method to reduce the possible impact of population stratification in genetic association studies of outbred populations, such as in the United States.
https://pubmed.ncbi.nlm.nih.gov/8761246/
https://pubmed.ncbi.nlm.nih.gov/10797159/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2350912/
Read: https://www.nature.com/articles/s41408-018-0132-1 to see the differences between self-reported ancestry and genomic ancestry, calculated very accurately.

Empty chair reply: We did not say that SIRE is a reliable index of genetic ancestry – after all, the whole method is based on the contrast between SIRE and genetic ancestry. Rather, we said that SIRE is a reliable index of itself, in the sense that people who identify as a particular SIRE group at one time identify the same way at another. Thus it reliably tracks a cultural environment.

R4: 4. Poor modeling: How can self-identified people report their % of ancestry? Hardly anyone mixed is 50%:50%.

Empty chair reply: How much did you bother to read beyond the abstract?

R4: 5. “The genotyped DNA samples are carefully decomposed into admixture proportions of geographic ancestry” – no. they are decomposed into a mixture of racial groups that the authors created after forcing the genetic data to show races. Races and admixture are two different concepts.

Empty chair reply: Translation: “The authors computed genetic ancestry in a standard way and entered this in a regression model with SIRE as have so many other researchers. This is bad: Reasons.”

R4: 6. “In most applications of admixture regression, individuals’ racial or ethnic group identities will have statistical relationships with individuals’ genetically identified geographic ancestries” – No! where is the evidence? Why this paper is completely devoid of reference for any fundamental assumption of the model. What does it mean “statistical relationships”?

Empty chair reply: Yes! Self-identified race/ethnicity generally, but imperfectly, correlates with genetic ancestry. This just restates ASHG’s (2018) position statement. But since you don’t even understand the meaning of “statistical relationship” what can one expect?

R4: 7. “The objective of admixture regression is to decompose trait variation into linear components due to genetic ancestries and linear components due to racial/ethnic group related effects” – unlike admixture mapping techniques, which the author misleading cite as a parallel method, their method is not designed to link a loci with a trait, but rather link conditions with races with a biological support to the racial concept.

Empty chair reply: Whew!… admixture regression analysis is not ‘our’ method. And this frequently used method is not “designed” to provide “biological support to the racial concept”: it explicitly takes advantage of social constructive aspects of racial identification in admixed populations. Do you need this point illustrated with a crayon?

R4: 8. “We show that the admixture regression model can be viewed as a statistically feasible simplification of this linear polygenic index model, in which proportional ancestries serve as statistical proxies for ancestry-related genetic differences.” – proportional ancestries serve as statistical proxies for ancestry-related genetic differences? You calculate ancestries from genetics, this statement means nothing. This is a tautology.

Empty chair reply: So now you finally realize that we used genetic ancestry. But, of course you are still wrong, since local ancestry is a subset of global ancestry. The statement reads: in our model, [global] ancestries serve as statistical proxies for [local] ancestry-related genetic differences.

R4: 9. “an assumption of random mating across ancestral populations” – really? where is the reference for this assumption?

Empty chair reply: Unsurprisingly, no other reviewers had a problem interpreting this statement. To spell it out: It is an assumption made by the theoretical model – thus a limitation – not an assumption about the world.

R4: 10. “A key assumption of the admixture regression model is that admixture arises from recent random mating between the previously geographically-isolated ancestral groups.” – of course no reference, because it is untrue. Your key assumption is not supported by reality.

Empty chair reply:… we restate that random mating is a theoretical assumption of the commonly used admixture regression model which may or may not be violated to a practically significant extent in the real world.

R4: 11. “Many individuals self-identify as belonging to two or more racial or ethnic groups” – you of course model those groups as RACES, by the biological definition, i.e., groups that are completely separate from one another and didn’t mix. Again, where is your evidence (from this century)? Surely you realize that the racial groups that you used do not satisfy this condition, south and east Asians are closer to each other than to Africans, but you ignore that. There are relationships between those groups, it’s not a star topology.

Empty chair reply: We explicitly do not model self-identified “racial or ethnic groups” as “groups that are completely separate from one another and didn’t mix”! If they didn’t mix, we wouldn’t have admixture for our admixture regression! Nowhere in this paper do we talk about “biological races”. We talk about “genetic ancestry” and SIRE. Perhaps you could try reading our actual paper…

R4: 12. The author removed 80% of the genetic data. They claim that they follow the instruction of ADMIXTURE, but there are no such instruction or recommendation.

Empty chair reply: 100,000 random SNPs…. 100,000 random SNPs…

R4: 13. They force the genetic data into 5 racial categories to fit their made up racial categories. They never show a single result of the genetic analyses. we don’t see the STRUCTURE analysis, nor the PCA. We don’t see the scripts that they used. They through populations because they are “overly admixed”?? what does it mean? You think that Hispanics are less admixed than Druze? Where is the evidence? Why everything in this manuscript is made up BS?

Empty chair reply: You mean: we use K=5 (European, Amerindian, African, East Asian, & South Asian) instead of the K=4 (European, African, Amerindian, & East Asian) provided by the National Institutes of Health for the ABCD dataset… Yes, only “racists” would use these ancestry components.

R4: 14. The authors don’t report their results. Are they afraid? Where are the findings of the model (blacks are poor and uneducated, bla bla). What is the point of this paper if the authors don’t stand behind their results? Why should anyone believe in it?

Empty chair reply: So you missed the part that this was a methodological paper which then illustrated the methodology using the ABCD sample.

R4: 15. Where is the null hypothesis?

Empty chair reply: Whew!

R4: 16. I have major ethical concerns due to the extensive use of races, biologically defined. I think that it is wrong and unsupported by the data nor literature.

Empty chair reply: …so, again, we used SIRE vs. genetic ancestry. Which one, exactly, is the “wrong and unsupported” “races, biologically defined”?

R4: Minor comments 1. “It has particular value in the case of complex behavioral traits where reliably identifying genetic loci associated with trait variation is beyond the current reach of science” – so it is not beyond the reach of science?

Empty chair reply: Would you like it to be?

R4: I have a few more comments, but I think that the trend here is pretty obvious. It is an outdated approach (more of the 19th century).

Empty chair reply: Well, maybe you should tell that to the hundreds of research teams that currently use this method.

Calling a Deer a Horse

In Lasker, Pesta, Fuerst, and Kirkegaard (2019), we found an unstandardized beta for European genetic ancestry, when predicting g, of .85 among African Americans (model 2; Table 6). Simply put: a 100% increase in European (vs. African) ancestry was associated with a 0.85 d increase in intelligence. We interpreted these results as strong support for a partial hereditarian model. As did others in the HBD sphere.

Bird (2021a), in contrast, argued that our regression analyses suffered from omitted variable bias. Notably, he did not disagree that the results would support a hereditarian model were they robust.

Given the 2.053 d (or 30.8 point) measured test score difference between continental Africans and Europeans which Bird (2021a) adopts, genetic effects alone, based on our results, would represent .85 d /2.053 d = 41% of the phenotypic difference. Expressed in terms of variance explained, this would be (.85 d)^2/(2.053 d)^2 = 17.14%. [1] However, this is relative to an average within-groups heritability for g of 66.5% for this specific sample (Mollon et al., 2018; Pesta, Kirkegaard, te Nijenhuis, Lasker, & Fuerst, 2020). Since the expected differences are proportionate to the within-groups heritability, the variance explained would be predicted to be around 17.14%/66.5%*50% = 12.88% conditioned on a heritability of 50%.[2]
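The arithmetic in the paragraph above can be retraced with a few lines of Python. This is only a sketch for checking the numbers: the variable names are illustrative, and the inputs are simply the values quoted in the text.

```python
# Inputs quoted in the text above (names are illustrative).
beta_d = 0.85      # unstandardized beta for European ancestry, in d units
gap_d = 2.053      # continental African-European gap adopted by Bird (2021a), in d units
h2_sample = 0.665  # average within-groups heritability for this sample
h2_target = 0.50   # heritability used for the comparison

linear_share = beta_d / gap_d                       # share of the gap in linear (d) units
variance_share = linear_share ** 2                  # the same quantity in variance units
rescaled = variance_share / h2_sample * h2_target   # proportional rescaling to h2 = 0.50

print(f"linear share:          {linear_share:.1%}")
print(f"variance share:        {variance_share:.2%}")
print(f"rescaled to h2 = 0.50: {rescaled:.2%}")
```

The last step assumes, as the text does, that the expected variance share scales proportionately with the within-groups heritability.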

Now, based on his analysis of SNP data, Bird (2021a) estimated a variance explained of 12% given a heritability of 50%. Thus, these two very different methodologies (global admixture analysis & SNP Fst comparisons) derive very similar estimates conditioned on the same heritability coefficient.[3]

But Bird (2021) goes on to interpret his result as “no support for a hereditarian hypothesis”. Well, one could define a ‘hereditarian hypothesis’ such that these magnitudes do not support it. But, in that case, one could just cite our own widely discussed research results against it. In this case, Bird (2021b) should then also state that, “Lasker et al. (2019) also found ‘no support for the hereditarian hypothesis of the Black–White achievement gap’ and, in fact, Fuerst is strongly supportive of an environmental model, despite what some disreputable sites claim.”

I won’t complain. I am sure that being labeled an environmentalist will not hurt my career prospects. However, don’t call me a hereditarian for arguing X but then go on to argue X and also call that ‘no support for a hereditarian hypothesis’.

Notes:
[1] To convert between variance metrics, such as R^2, and linear metrics, such as r, you take the square root of the former or the square of the latter. The difference between variance and linear metrics can lead to misinterpretations, since variance metrics do not align with our intuitive sense of distance. Phil Birnbaum (2006) gives the following example: if you were playing baseball and you made it to second base, you might think you made it 2/4 = .5 or one-half of the way home, but in terms of variance metrics you really only ran 2^2/4^2 = 4/16 = .25 or one-quarter of the total variance to home base. This is why, in the context of the continental African and European differences discussed, a between-group variance of 17.14% is equivalent to a real-world percent explained of sqrt(17.14%) = 41%.

[2] Originally, I reported an average heritability for g in the TCP sample of 81.5%; the correct value was 66.5% (White = 72%; Black = 61%). The text has been updated.

[3] As for which estimates to use, a point which Bird (2021b) touches on, ideally one would employ both within-groups broad-sense heritability and total genetic variance between populations so as to calculate the broad-sense between-group heritability and the total expected differences. This is insofar as one is interested in the overall differences, not predicting offspring values from parental ones or testing specific evolutionary models. Now Bird (2021a) cites Polderman et al. (2015). For adults (age 16 to 65), Polderman et al. (2015) give meta-analytic MZ and DZ correlations of .68 and .28 (Figure 3; High-level cognitive functioning), which, using Falconer’s formula, yields a meta-analytic broad-sense heritability of 80%.

Of this, most of the variance is additive genetic; nearly all the remainder is due to an unknown mix of active gene-environmental covariance and dominance variance. Now, if for methodological or theoretical reasons, one uses within-groups narrow-sense heritability and additive genetic variance between populations, one simply derives the expected differences due to additive genetic differences. That can be useful for certain purposes; however, it will underestimate total genetic differences (unless, unexpectedly, in this case, the genetic variance components go in discordant directions between populations). Regardless, since global admixture results will relate to broad-sense heritability, one needs to adjust the heritability when comparing the results of Bird (2021) to those of Lasker et al. (2019).
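The Falconer step in note [3] is a one-line calculation. As a minimal sketch (the function name is illustrative), using the MZ and DZ correlations quoted from Polderman et al. (2015):

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate: twice the difference between the MZ and DZ twin correlations."""
    return 2.0 * (r_mz - r_dz)

# Meta-analytic correlations quoted in note [3].
h2 = falconer_h2(r_mz=0.68, r_dz=0.28)
print(f"Falconer heritability estimate: {h2:.2f}")
```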

References
Bird, K. A. (2021a). No support for the hereditarian hypothesis of the Black–White achievement gap using polygenic scores and tests for divergent selection. American Journal of Physical Anthropology.
Bird, K. A. (2021b, February 12). Still No Support For Hereditarianism. Accessed at: https://kevinabird.github.io/
Lasker, J., Pesta, B. J., Fuerst, J. G., & Kirkegaard, E. O. (2019). Global ancestry and cognitive ability. Psych, 1(1), 431-459.
Mollon, J., Knowles, E. E., Mathias, S. R., Gur, R., Peralta, J. M., Weiner, D. J., … & Glahn, D. C. (2018). Genetic influence on cognitive development between childhood and adulthood. Molecular psychiatry, 1-10.
Pesta, B. J., Kirkegaard, E. O., te Nijenhuis, J., Lasker, J., & Fuerst, J. G. (2020). Racial and ethnic group differences in the heritability of intelligence: A systematic review and meta-analysis. Intelligence, 78, 101408.
Polderman, T. J., Benyamin, B., De Leeuw, C. A., Sullivan, P. F., Van Bochoven, A., Visscher, P. M., & Posthuma, D. (2015). Meta-analysis of the heritability of human traits based on fifty years of twin studies. Nature genetics, 47(7), 702-709.

Human Phenome Diversity Foundation 2021 Fundraiser

It’s March!

Which means that the Human Phenome Diversity Foundation’s (HPDF) annual fundraising drive has commenced.

Our goal is $2,500.

We have some great projects which we would like to support this year if we can afford to.

Last year’s fundraising helped finance an important admixture paper, currently under review, which is up at bioRxiv.

We would like to continue to fund genetically informed research with your support.

If you care to see this research done, you can donate at the HPDF’s official gofundme charity site. Donations are tax-deductible since the HPDF is a 501(c)(3) organization.

Also, the HPDF now has an associated corporate Kraken account, so you can donate directly with cryptocurrencies, too:

Bitcoin(XBT):34fHxYLwEVEpcn7GLLuYtZ4PZcZp9qWbhA

Ethereum(ETH):0x53d65c5f757D59153Cf9fffC44D40989FCcFB602

Monero(XMR):83SiAyTG7GdE5uvUuDj61SKmAQhHXuuxE5EKP3kao5GMiveZf
3oLSbsgc5Pejk5PajQjGVUF6YV11ZQbEWikJFxX2tRgX9R

“Insignificant” Differences

Kevin Bird has a paper out in which he claims, more or less, to evidence “insignificant” race differences. There is a lot there to criticize: misinterpretations, odd analytic choices, a crucial wrong formula [1], etc.

Maybe I will write a formal critique.

For now, it’s sufficient to point out that the results strongly agree with a hereditarian model:

  • The predicted differences, given the genetic divergence in the educational and intelligence SNPs, are medium to large given reasonable estimates of broad-sense heritability (H2)[2].
  • While there is inconsistent evidence of divergent selection (for this pairwise comparison), there is zero evidence of homogenizing or stabilizing selection.

To illustrate the first point, Table 1 shows the expected BGH given the 30.8 point continental European-African difference which Bird adopts along with the expected phenotypic gaps when environments are equal (i.e., when BGH is set to unity). I use the lowest Fst value Bird reports in his table. Proofs are provided for the formulas used in the .doc file.

Table 1. Expected Between Group Heritability (BGH) and Expected IQ Point Differences between Europeans and Africans Given Different Values of the Genetic Intraclass Correlation (r and r_c) and H2, assuming an eduSNP Fst = .111 from Bird (2021; Table 1; 1301 clumped EA SNPs)

| H2 | r | t_observed | BGH | t_expected | Expected IQ difference | Cohen’s Interpretation |
|------|--------|--------|-------|--------|-------|--------|
| 0.20 | 0.1990 | 0.5132 | 0.047 | 0.0473 | 6.69 | Medium |
| 0.35 | 0.1990 | 0.5132 | 0.083 | 0.0800 | 8.85 | Medium |
| 0.50 | 0.1990 | 0.5132 | 0.118 | 0.1105 | 10.58 | Medium |
| 0.65 | 0.1990 | 0.5132 | 0.153 | 0.1391 | 12.06 | Large |
| 0.80 | 0.1990 | 0.5132 | 0.189 | 0.1658 | 13.38 | Large |

| H2 | r_c | t_observed | BGH | t_expected | Expected IQ difference | Cohen’s Interpretation |
|------|--------|--------|-------|--------|-------|--------|
| 0.20 | 0.2844 | 0.5132 | 0.075 | 0.0736 | 8.46 | Medium |
| 0.35 | 0.2844 | 0.5132 | 0.132 | 0.1221 | 11.19 | Medium |
| 0.50 | 0.2844 | 0.5132 | 0.188 | 0.1657 | 13.37 | Large |
| 0.65 | 0.2844 | 0.5132 | 0.245 | 0.2053 | 15.25 | Large |
| 0.80 | 0.2844 | 0.5132 | 0.302 | 0.2412 | 16.91 | Large |

Note: H2 = broad-sense heritability; r = intraclass genetic correlation; r_c = intraclass genetic correlation corrected for mathematical constraints on Fst; t_observed = intraclass phenotypic correlation, i.e., phenotypic variance between groups (given d = 2.053); BGH = between group heritability; t_expected = expected phenotypic variance between groups when environments are equalized; Expected IQ difference = expected IQ differences when environments are equalized; Cohen’s Interpretation = conventional interpretation of effect sizes.
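As a rough consistency check on Table 1, the "Expected IQ difference" column can be recomputed from the BGH column using the relation sqrt(BGH) x observed phenotypic difference. The sketch below copies the rounded BGH values from the table, so small discrepancies in the second decimal place are expected; the function and variable names are illustrative.

```python
import math

OBSERVED_GAP_IQ = 30.8  # observed gap in IQ points that Bird (2021) adopts

# (BGH, reported expected IQ difference) pairs copied from Table 1.
table_rows = [
    (0.047, 6.69), (0.083, 8.85), (0.118, 10.58), (0.153, 12.06), (0.189, 13.38),
    (0.075, 8.46), (0.132, 11.19), (0.188, 13.37), (0.245, 15.25), (0.302, 16.91),
]

def expected_difference(bgh: float, observed_gap: float = OBSERVED_GAP_IQ) -> float:
    """Expected gap when environments are equalized: sqrt(BGH) times the observed gap."""
    return math.sqrt(bgh) * observed_gap

for bgh, reported in table_rows:
    recomputed = expected_difference(bgh)
    print(f"BGH = {bgh:.3f}  recomputed = {recomputed:6.2f}  reported = {reported:6.2f}")
```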

You can argue that one should use narrow-sense heritability, instead of broad-sense, contra Jensen (1972; 1998), then lowball the estimates, and finally take advantage of statistical illiteracy and portray the differences as ‘small’ or ‘insignificant’ by emphasizing the portion of variance explained. However, the expected differences (which are equal to sqrt(BGH) x observed phenotypic differences) are still medium to large. Of course, Bird (2021) argues that the differences could go either way with equal likelihood. This would be true if you knew nothing else. However, in his prior analyses, he uses polygenic score (PGS) weights, and the eduPGS weights are directional. For the same set of eduSNPs the PGS differences are shown below:

Table 2. Mean MTAG-based PGS for CEU and YRI Calculated using population-GWAS and Within Family Betas.

| SNP set | W/ population-GWAS: CEU (N = 99) | W/ population-GWAS: YRI (N = 108) | p-value (Welch’s Two Sample t-test) | W/ Within Family Betas: CEU (N = 99) | W/ Within Family Betas: YRI (N = 108) | p-value (Welch’s Two Sample t-test) |
|----------------|-------|--------|----------|-------|--------|----------|
| All SNPs | 0.866 | -0.794 | < 0.0001 | 0.614 | -0.563 | < 0.0001 |
| Derived SNPs | 0.938 | -0.860 | < 0.0001 | 0.702 | -0.643 | < 0.0001 |
| Ancestral SNPs | 0.605 | -0.554 | < 0.0001 | 0.528 | -0.484 | < 0.0001 |

Note: SNPs were filtered for MAF >0.01 for both CEU and YRI. Scores represent standard scores calculated using the standard deviation in the total sample. Sample sizes for the t-test were N = 99 for CEU and N=108 for YRI.

Thus, it makes no sense to say that the expected difference could go either way, with equal probability, when the eduPGS weights indicate a direction. When this is recognized, the only option is to declare that the eduPGS is biased between populations. This is possible, of course.

However, this leaves the evolutionary default or null, which is that differences will be commensurate with neutral variation. As Rosenberg, Edge, Pritchard, & Feldman (2019) note: “One key component of the inference of polygenic adaption is the use of an appropriate null expectation for polygenic scores distributions and phenotypic differences…[P]henotypic differences among populations are predicted under neutrality to be similar in magnitude to typical genetic differences among populations.”  The authors, of course, go on to cite Lewontin and slyly note that differences “are small in comparison with variation within populations”. But, of course, “large” differences in the conventional sense are also “small in comparison with variation within” (e.g., .80d = 14% variance). And while the evolutionary default is directionless, the totality of the behavioral genetic and psychometric data assembled on this topic points one way.
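The ".80d = 14% variance" figure can be reproduced with the standard conversion from a standardized mean difference to the share of total variance lying between two groups. The conversion itself is not spelled out in the post, so the sketch below rests on an assumption: two equal-sized groups, for which the share is d^2 / (d^2 + 4). The function name is illustrative.

```python
def variance_between(d: float) -> float:
    """Share of total variance between two equal-sized groups separated by d (in SD units)."""
    return d ** 2 / (d ** 2 + 4.0)

# A conventionally "large" effect of d = 0.80.
print(f"d = 0.80 -> {variance_between(0.80):.1%} of variance")

# The 30.8-point gap expressed in SD units of 15.
print(f"d = 30.8/15 -> {variance_between(30.8 / 15):.4f}")
```

Under the same assumption, the second line gives the t_observed value used in Table 1.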

[1] See, for example, equation 4 in Bird (2021).  However,

total between phenotypic variance = phenotypic variance due to genes + phenotypic variance due to environment

which can be rewritten, in linear metrics, as PD^2 = GD^2 + ED^2  or PD = sqrt( GD^2 + ED^2 )

Since BGH = phenotypic variance due to genes / total between phenotypic variance

BGH = GD^2 / PD^2 and, therefore, GD = sqrt(BGH)*PD

This is approximated but underestimated by 2*sqrt(BGH) * sqrt(2/pi) (*15), which Bird (2021) uses.

e.g., sqrt(.12)*30.8 = 10.67 (correct) versus 2*sqrt(.12)*sqrt(2/pi) (*15) = 8.29 (Bird, 2021)
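The two worked figures in note [1] can be reproduced as follows. This is a sketch only: the second expression is transcribed from the example line above rather than from Bird's paper, and the variable names are illustrative.

```python
import math

bgh = 0.12              # between-group heritability used in the example
observed_gap_iq = 30.8  # observed phenotypic difference (PD) in IQ points

# GD = sqrt(BGH) * PD, as derived in note [1].
gd_from_bgh = math.sqrt(bgh) * observed_gap_iq

# Expression attributed to Bird (2021) in the example line above.
gd_bird_style = 2 * math.sqrt(bgh) * math.sqrt(2 / math.pi) * 15

print(f"sqrt(BGH) * PD:                 {gd_from_bgh:.2f}")
print(f"2*sqrt(BGH)*sqrt(2/pi) (*15):   {gd_bird_style:.2f}")
```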

[2] While the use of narrow-sense heritability is recommended for Qst-Fst comparisons and the assessment of directional selection, narrow-sense heritability and the corresponding narrow-sense Qst underestimate between-group genetic variance by not taking into account non-additive genetic variation between populations, along with active gene-environment covariance (which is commonly classed as a genetic effect; Sesardic, 2005). Thus, when it comes to calculating the expected difference due to genes, it makes sense to use the broad-sense heritability, at least for an upper-bound estimate, as hereditarians have done (e.g., Jensen, 1998).

Dissertation Bounties

Last updated: 4/18/2018

I was asked to meta-analyze a century (1914-2014) of IQ/Academic achievement and racial admixture (genealogy, gestalt racial appearance, and color) research. There is a lot out there, especially when one takes into account MA & PhD dissertations. To this end I am posting $20 (negotiable) bounties for each of the following dissertations (to be paid in bitcoin):

–Snider, J. G. (1953). A Comparative Study of the Intelligence and Aptitudes of Whites and Nezperce Indians (Doctoral dissertation, University of Idaho).

–Zimmerman, H. E. (1934). The Indian’s Ability to Learn Mathematics According to Degree of Indian Blood. MA, Kansas State Teacher’s College, Pittsburg.

–Rainey, C. D. (1932). A study of the Salem Indian High School, comparing the cultural background, the intelligence scores, and the per cents of white blood, and the classroom grades (Doctoral dissertation, Willamette University).

–Ross, D. D. (1962). A comparative intergroup study of the academic achievement and attendance patterns between the full-blood type and the mixed-blood type Oglala Sioux Indian student in the secondary department of Oglala Community School, Pine Ridge, South Dakota (Doctoral dissertation, Chadron State College).

The full dissertations are not needed; a copy (or photo) of the relevant tables and/or correlation matrices will do, along with the following sample characteristics: country of sample, first order administrative unit of sample (e.g., North Carolina), group type (e.g., school, college, random stratified), ethnic group, age range, sample size, description of the ancestry index, cognitive tests used, statistical methods for comparing admixed groups (e.g., means & SD, correlations, Chi-square).

This should be an easy job if you are at one of the schools at which there is a copy of the dissertation. If interested, email me at j122177@hotmail.com. I will update this list as I go along.

Biogeographic ancestry and endophenotypes, etc.

There are a couple of new, well-designed, obtainable surveys out (with ancestry, MRI, and cognitive data) which should allow for the (dis)confirmation of certain conjectures of ill repute:

–Neurodevelopmental Genomics: Trajectories of Complex Phenotypes (age 8-21, N ~ 10,000)
–The Brain Genomics Superstruct Project (age 18-35, N ~ 1,500)

For example, Greg Cochran likes to go on about how major ancestry groups often differ in crude brain morphology, and how these differences probably explain a significant chunk (> 20%) of bio-ancestry related differences in CA. I doubt it. But if he agrees to specify the analytic strategy, I will try to get the data and run the analyses.

I did look through the PING survey (age 3-21, N ~ 1,500), which might not be very informative owing to the age structure. Going by this, Greg seems to be more or less correct about some of the endo differences and probably about their origins. As an example, Figures 1 & 2 show the B/W diffs for intracranial and total brain volume by age. (AAs are picked out for illustration since they are the largest non-White ethnic group, showing the biggest deviation from Whites.) And Figure 3 shows the relation between brain volume and ancestry in the self-identified AA group; the results were basically the same for intracranial volume, etc., and so are not shown.

Yet, as seen in Tables 1 & 2, CA was more or less uncorrelated with these particular endophenotypes (r = 0.07-0.08); unsurprisingly, CA explained virtually no endo differences, and vice versa. Yet, CA was strongly (negatively) associated with both African and Amerindian ancestry, and also, though to a lesser degree, with Oceanian.

Perhaps there is a more sound way to run the numbers? Or a better way to take into account age? Dunno, it’s not my position to defend.

Results below.