Color Differences: Corrections and Further Analysis. Part 1

It has come to my attention that interest in IQ-HBD is on the wane even amongst HBDers and HBDresses. Given this state of affairs, I have decided to compose this post as pithily as possible in hopes that my main points will be delivered before my few readers have fully tuned out:

There was an error in the original syntax. We therefore thought it worthwhile to redo the previous analysis. While doing so, we took a more in-depth look at the data. Part 1 deals with the full NLSY sample (color and outcomes for all racial and ethnic groups). Part 2 will focus specifically on the associations within the African American population.

Part 1. (Updated and corrected results with syntax can be found here.)

It has been claimed that a ubiquitous, generalized color discrimination induces the association between color and outcomes in the US and abroad. It has also been claimed that “race” is a social construct; from this, some have somehow deduced that the concept of “race” is ill-defined and that the study of some types of racial differences is therefore scientifically invalid. Given this situation, we feel justified in looking at the association between color and outcomes in general. By doing so, we address the claim of pervasive colorism and circumvent the problems of conceptual ambiguity surrounding race/ethnicity and differences. Color, after all, is a very real percept. One can hardly argue that color hierarchies don’t exist on account of imprecision in racial categorization. That said:

First, the discussion below will only make sense if you have read “Color Differences: Ubiquitous Yet Understudied”. If you have not read that post, do so now.

Figure 1 gives the updated full-sibling color, AFQT, and HGE correlations. Correlations with the effects of age and sex partialed out were added. (The syntax used for this is included in the updated Excel file.) In the case of rho, partial correlations were calculated using rank-ordered variables. We also added correlations between averaged PIAT scores and color, and between averaged PIAT scores and color for cases in which AFQT scores were missing. Within families, only 1 of the 24 color-outcome correlations was significant.
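For readers who want to see what "partialing out" age and sex amounts to, here is a minimal sketch in Python. It is not the syntax from the Excel file; the function names are ours, and it assumes the standard recursive formula for partial correlations, with rho obtained by rank-ordering every variable first.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank-order a variable, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def partial_corr(x, y, controls):
    """Correlation of x and y with every variable in `controls` partialed out.

    Uses the recursion
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)),
    applied one control variable at a time.
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def partial_rho(x, y, controls):
    """Rank-based version: rank every variable, then take the partial r."""
    return partial_corr(average_ranks(x), average_ranks(y),
                        [average_ranks(c) for c in controls])
```

With lists `color`, `afqt`, `age`, and `sex` (hypothetical names), `partial_corr(color, afqt, [age, sex])` would give the age- and sex-controlled r, and `partial_rho` the rank-based counterpart. A same-race/different-race interviewer dummy could be added to the list of controls in the same way.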

Figure 1

It has been reported that interviewer race influences color ratings; e.g., White interviewers code Blacks as darker than Black interviewers do. It’s possible, then, that an interviewer race effect is attenuating the within-family correlations, since for some sib pairs the subjects were color coded by individuals of different races. For example, in some cases a White individual would color code sibling 1 in wave 8 and a Black individual would color code sibling 2 in wave 10. To investigate the possible influence of interviewer race, we created a dummy variable coded: interviewers of different races = 1, interviewers of the same race = 0. We then partialed out the effect of interviewer race. The results are shown in Figure 1a. Interviewer race had no apparent independent effect on the within-family associations.

Figure 1a

To better understand the correlations in Figure 1, we employed bootstrapping to estimate their 95% confidence intervals. The between- and within-family results based on unweighted scores are reported in Figure 1c. The within-population 95% CIs were: r(within, AFQT) = -0.84 ≤ ρ ≤ 0.04, rho(within, AFQT) = -0.117 ≤ ρ ≤ 0.0113; r(within, HGE) = -0.74 ≤ ρ ≤ 0.27, rho(within, HGE) = -0.112 ≤ ρ ≤ 0.025.
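As a rough illustration of the bootstrapping step, the sketch below resamples cases with replacement and reads the interval off the percentiles of the resampled correlations. The percentile method, the number of resamples, and the function names are our assumptions for the example; the intervals reported above were not necessarily computed this way.

```python
import math
import random

def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the correlation of x and y.

    Cases (e.g., sibling pairs) are resampled with replacement, so each
    x value stays attached to its own y value.
    """
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        # A resample with no variance has no defined correlation; skip it.
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue
        stats.append(pearson(xs, ys))
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper
```

For within-family work, `x` and `y` would be the sibling difference scores on color and on the outcome (hypothetical inputs here), so that whole sibling pairs are resampled together.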


As there was concern about assumption violations, we employed robust regression as suggested by Erceg-Hurn and Mirosevich (2008). Specifically, we used a Wilcoxon analysis, which is said to be robust to assumption violations with respect to the dependent variable. The program used can be located at: Figure 2 gives the Wilcoxon and least-squares regression results and graphs; between-family results are on the right and within-family results are on the left. Generally, assumption violations with respect to the dependent variable (AFQT) were not driving the lack of association within families.

Figure 2

This naturally led us to explore curve fits. We were primarily concerned with the associations within families. It is possible, for example, that there is a non-linear or non-monotonic association between color and outcomes within families that is not being detected by our analyses. Figure 3 shows some of the models explored. Between families, the association is monotonic and nonlinear. Within families, we were unable to find a good fit.

Figure 3

To explore the issue further, we conducted the simplest analysis that we could think of: a binomial comparison in which color and cognitive ability were dichotomously coded. We looked at how often lighter-colored individuals were also smarter and more educated than their darker-colored siblings. Contrary to previous statements, using this simplest of methods, a significant difference was found. Lighter sibs were smarter and more educated ~55% of the time.
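The binomial comparison can be sketched as follows. This is an illustration rather than our actual syntax: the tuple layout, the convention that a lower color rating means lighter, and the dropping of pairs tied on color or score are all assumptions made for the example.

```python
from math import comb

def count_lighter_higher(pairs):
    """Count sibling pairs in which the lighter sib also has the higher score.

    Each pair is (color_sib1, score_sib1, color_sib2, score_sib2).
    Assumption: a lower color rating means lighter. Pairs tied on color
    or on score carry no directional information and are dropped.
    Returns (successes, informative_pairs).
    """
    successes = 0
    informative = 0
    for color1, score1, color2, score2 in pairs:
        if color1 == color2 or score1 == score2:
            continue
        informative += 1
        sib1_lighter = color1 < color2
        sib1_higher = score1 > score2
        if sib1_lighter == sib1_higher:
            successes += 1
    return successes, informative

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely than
    the observed count k under the null success probability p.
    """
    pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # The small tolerance guards against floating-point ties.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))
```

Under the null hypothesis of no association, the lighter sib should be the higher scorer in about half of the informative pairs, so `binom_two_sided_p(successes, informative)` tests a split like 55% versus 45% against 50/50. Whether such a split reaches significance depends on the number of informative pairs.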

Figure 4

Putting this issue to the side for a moment, we replicated previous findings concerning cognitive ability mediating the color-HGE association. Figure 5 shows the results using linear regression and linear regression with rank-ordered variables. The latter simulates a non-parametric regression and was included because the color-AFQT association is monotonic non-linear. Again, AFQT differences more than completely explained subsequent outcome differences.

Figure 5

We then replicated our non-full sibling analysis. Here we included age- and sex-controlled correlations. This is shown in Figure 6. The results of importance were the same as before: there was a significant association between color and cognitive ability within families between non-full siblings.

Figure 6

We also replicated the g- and t- factor analysis. The results are shown in figures 7 and 8. These are mostly the same as before.

Figure 7 & 8


We did not go through the tedious process of replicating our MCV analysis. The results from our previous analysis are shown in figure 9. We feel that these are reasonably accurate. The between family color-cognitive ability correlation is g-loaded.

Figure 9


Returning to the question of whether color is associated with cognitive ability within families, we conducted a number of analyses based on dichotomously coded color. The concern was that the color scale was only a semi-interval scale. It’s possible that treating color differences as interval-scaled differences masks a “true” color-ability association.

Within populations: We compared lighter siblings with darker siblings. If the first sibling was lighter, we coded the pair accordingly (first sib lighter = 1, all else = 0). If the first sibling was darker, we coded the pair accordingly (first sib darker = 1, all else = 0). We then entered these two dummy variables into linear regressions. The results are given in (10A). Next, we created a dichotomous variable (first sib darker = 1, first sib lighter = 0) and computed the point-biserial correlation, r(pb). The results are given in (10B). The purpose here was to remove the attenuating effect of sibling pairs for which there were no color differences. Finally, we added the scores of the pairs in which the first sibling was lighter to those of the pairs in which the first sibling was darker, the latter multiplied by -1, and conducted a simple t-test. The logic is that if there is no significant association between color and cognitive ability, then the mean AFQT difference scores, computed by subtracting the scores of sib1 from those of sib2, should not differ significantly from zero, whether we look separately at first sibs who were lighter and first sibs who were darker or, since we are working with a dichotomous pair, at the combined set (first sibs lighter + first sibs darker × -1). The results are given in (10C). The results of this t-test are, of course, the same as those of the point-biserial correlation, since the statistic is the same; this is just another way of presenting the results, one in which the mean score differences can be seen.

Between populations: We repeated the above analyses, comparing the average scores of the lightest half of the sib pairs to the average scores of the darkest half of the sib pairs. To split the population we used median color scores (since the mean color scores were skewed).
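The sign-flipping step in the within-family comparison can be sketched as below. Again, this is a simplified illustration and not the syntax we ran: the tuple layout and the convention that a lower color rating means lighter are assumptions, and the function returns only the t statistic and degrees of freedom, leaving the p-value to a t table or a statistics package.

```python
import math
from statistics import mean, stdev

def oriented_differences(pairs):
    """Difference scores (sib2 minus sib1), sign-flipped when sib1 is darker.

    Each pair is (color_sib1, score_sib1, color_sib2, score_sib2).
    Assumption: a lower color rating means lighter. After the flip, every
    difference is "darker sib minus lighter sib", so a negative value
    means the lighter sib scored higher. Pairs with no color difference
    are dropped, since they only attenuate the comparison.
    """
    diffs = []
    for color1, score1, color2, score2 in pairs:
        if color1 == color2:
            continue
        raw = score2 - score1                  # sib2 minus sib1
        sign = 1 if color1 < color2 else -1    # flip when sib1 is the darker one
        diffs.append(sign * raw)
    return diffs

def one_sample_t(values):
    """One-sample t statistic against a mean of zero, with its degrees of freedom."""
    n = len(values)
    t = mean(values) / (stdev(values) / math.sqrt(n))
    return t, n - 1
```

If color is unrelated to ability within families, the oriented differences should average out to zero and the t statistic should be small.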

Figure 10 presents the results based on unweighted scores. We found no significant association within families, but some associations were trending towards significance. (For example, when weighted, the point-biserial correlation was r(pb) = 0.07, p = 0.051, 1-tail.) Generally, these analyses substantiated our previous finding of a very small, nonsignificant relation within families for the full sample. The broad picture was that the more information we included in the analysis, the smaller the within-family correlations were. To review:

Figure 10

1. Simple binomial association (lighter sib, dichotomously coded, is smarter, dichotomously coded). Difference: 55% to 45%. Significant.
2. Point-biserial correlation (lighter sib, dichotomously coded, is smarter). Difference: r(pb) = 0.07. Not significant, but trending.
3. Spearman’s correlation (rank lightness associated with rank smartness). Difference: rho(unweighted) = 0.05. Not significant, but close.
4. Pearson’s correlation (lightness, interval scale, associated with smartness, interval scale). Difference: r(unweighted) = 0.02. Not significant, not even close.

This pattern could arise because the underlying statistical assumptions were violated, so that more information means more bias; alternatively, it could be that the “true” association within families between full siblings is almost undetectable. Whatever the case, the cognitive ability scores we are looking at were measured in adolescence and so are antecedent to adult outcomes. An association within families, if substantiated, would therefore not support “colorism”, which holds that cognitive ability differences are consequent to outcome differences, which are themselves consequent to labor market discrimination. But then what could be the cause of such differences? Previously, we suggested pleiotropy, but two alternatives strike us. The first is an additive genetic model in conjunction with mis-identified full sibs. We already showed a robust, statistically significant association between color and cognitive ability within families between non-full siblings. Given this, it’s reasonable to conjecture that a slight within-family correlation could be driven by mis-identification. The second is some form of parental-level discrimination in favor of lighter siblings. How this latter hypothesis could possibly be disentangled from a pleiotropy hypothesis, given the existing data sets, is beyond us. Before indulging in more speculation, though, we had better first look at “colorism” within socially defined races (e.g., Blacks).


Erceg-Hurn, D. M., & Mirosevich, V. M. (2008). Modern robust statistical methods: an easy way to maximize the accuracy and power of your research. American Psychologist, 63(7), 591.

4 thoughts on “Color Differences: Corrections and Further Analysis. Part 1”

  1. One reason for full sibling mis-identification could be non-paternity events. There’s one study that says that the non-paternity rate in African Americans is about 10 percent. The study is actually from the 1960s, so the rate could be higher today.

    • What happened to your recent post? For Blacks, except in the case of PIAT, none of the within-family full-sib associations were significant by r, rho, r(pb), or t, regardless of whether weights were used or whether sex, age, and interviewer race were controlled for. But many of the associations were approaching significance and were only half the size of those between families, virtually all of which were significant. So I suspect a true small within-family association. Full sibling mis-identification seems plausible. We might compare the rate of identified HS to the “true” rate. With the Add Health sample, this hypothesis could be directly tested using the genetic data.
