It has been noted that in the Americas racial identification and genomic ancestry frequently do not correspond well. In Latin America, the association seems to be modest on the individual level. For example, Ruiz-Linares et al. (2014) found a correlation of 0.48 between self-identified European and Amerindian racial identity and genomic ancestry in a five-country sample. In principle, the same could hold on the aggregate national level, and in some instances there is a clear discordance. While the Argentinean and Brazilian national populations have roughly the same degree of pre-1500 European ancestry, Argentina has a White European national image while Brazil has a multiracial one. One might wonder, then, to what extent average racial self-identification concords with average racial admixture on the national level. Other questions in a similar vein can be asked. For example: to what degree are differences in national racial identification related to given outcomes independent of genomic ancestry? Perhaps members of countries with a more European identity act, in aggregate, differently from those that have developed, net of genotype, a less European one: an “acting White” effect on the national level. Ruiz-Linares et al. (2014) found that, on the individual level, White identity was associated with wealth (but not educational attainment) net of European ancestry (see note 1). If such a pattern can exist on the level of the individual, it could also exist on the level of the nation.
Here, the first question is explored. I first present several indexes of national ancestry for the Americas: national genomic percents, aggregate self-identified race percents, Putterman’s ancestry percents, and national skin reflectance scores. For comparability, these values are expressed in terms of major racial categories (White European, Black African, and Amerindian) plus an “Other” group.
I then use correlation analysis to validate these estimates.
Genomic Ancestry (variables: Eugenomic, Afrgenomic, Amergenomic): Average genomic ancestry percents were created for the 36 American nations for which admixture data was available. Most admixture studies decomposed geographic ancestry into three components (European, African, and Amerindian). For some nations, a significant fraction of the population had another regional ancestral component (e.g., South Asian, East Asian, or Oceanian). As such, an “Other” category was included. Not all available studies were used in creating the averages; only estimates from the most methodologically sound and nationally representative studies were included. Roughly 70 different estimates were employed in creating the 36 national ones. For some countries up to four sets of estimates were averaged, while for others only one was available. The results are shown in Table 1. For Belize and Paraguay no regional or national level data was available; estimates were instead calculated from those of the surrounding nations, which was justified given the migration histories of the countries and ancillary facts. For Trinidad and Tobago data was available only for the Black population, which constitutes approximately 40% of the total. Estimates were made based on self-reported ethnicity and reasonable assumptions given the known admixture in the Black population. For the Virgin Islands, again, admixture data was available only for the Black population, which constitutes 76% of the population. National level estimates were made on the assumption that the White population, which comprises 16% of the total, was fully European and that the mixed/other ethnic population, which comprises 8% of the total, was half European and half African. For the U.S., data was available only for specific ethnic populations (e.g., African Americans, Hispanics, Whites, and Native Americans). National level estimates were created by weighting these by the percent of individuals who identified with each ethnic category.
Asians (~4.5% of the population) were treated as 100% Other. Pacific Islanders and mixed-race individuals (~1.5%) were left out of the calculation. For Canada, the national estimate was made using U.S. ethnic admixture percents in conjunction with Canadian ethnic identity percents. Computations and sources are provided in the Excel file. To make rates more comparable across countries, national admixture was also expressed in terms of the three main source populations: European/West Caucasian, African, and Amerindian. Note that Middle Eastern and North African ancestry components were generally lumped with the “European” one.
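The share-weighted estimates described above can be sketched roughly as follows. This is a minimal illustration, not the computation in the Excel file: the function names are hypothetical, and the admixture assigned to the Black population is a placeholder value chosen only to make the example run. The 76/16/8 shares and the assumptions for the White and mixed/other groups are the ones stated in the text.

```python
def national_admixture(groups):
    """Population-share-weighted average of group-level ancestry fractions.

    groups: list of (share, ancestry) pairs, where share is the group's
    fraction of the national population and ancestry maps a component
    name to a fraction between 0 and 1.
    """
    total_share = sum(share for share, _ in groups)
    components = {}
    for share, ancestry in groups:
        for name, frac in ancestry.items():
            components[name] = components.get(name, 0.0) + share * frac
    # Divide by the total share in case the shares do not sum exactly to 1.
    return {name: value / total_share for name, value in components.items()}


def three_way(ancestry, keep=("European", "African", "Amerindian")):
    """Re-express ancestry over the three main source populations only."""
    kept_total = sum(ancestry.get(k, 0.0) for k in keep)
    return {k: ancestry.get(k, 0.0) / kept_total for k in keep}


# Virgin Islands-style example.
virgin_islands = national_admixture([
    # PLACEHOLDER admixture for the Black population (not from the source).
    (0.76, {"European": 0.15, "African": 0.85}),
    # White population assumed fully European (assumption stated in the text).
    (0.16, {"European": 1.0}),
    # Mixed/other assumed half European, half African (stated in the text).
    (0.08, {"European": 0.5, "African": 0.5}),
])
```

Passing a four-component estimate (including "Other") through three_way drops the "Other" share and rescales the remaining three so that they sum to one, which is one plausible reading of how the three-population version was produced.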
Table 1. Genomic Ancestry Estimates by Nation
(Black: estimates were reported in the sources and averaged. Blue: estimates were derived partly from self-identified ethnic rates in conjunction with admixture results. Red: ancestry was estimated from that of adjacent nations.)
Self-Identified Race (variables: IDCIAWhite, IDCIABlack, IDCIAAmer): Self-reported ethnicity and race percents as given by the CIA World Factbook were used to create national racial identification (ID) averages, except for Canada, for which 2011 Canadian census data was used. As with genomic ancestry, European, African, Amerindian, and Other percents were computed. Specific ethnic groups such as “Spanish” or “Aymara” were grouped into regional racial identities. For hybrid identities such as Mestizo and Mulatto, percents were split by parental group (e.g., one half European and one half Amerindian). For tribrid identities such as Montubio, percents were split three ways. Assumptions had to be made for a number of nations. For example, Costa Rica was said to be 83.6% “White and Mestizo”; this was treated as 83.6% Mestizo (that is, as 41.8% European and 41.8% Amerindian). St. Lucia was said to be 85.3% Black, 3.9% White, and 10.9% Mixed; it was assumed that the “Mixed” group was mixed Black and White, i.e., Mulatto. Thus, the African identity component was 85.3 + 1/2 × 10.9 = 90.75% and the European component was 3.9 + 1/2 × 10.9 = 9.35%. Judgment calls such as these are noted in the Excel file. Again, to make estimates more comparable across countries, national racial identities were also expressed in terms of the three main races: European/West Caucasian, Amerindian, and African.
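The splitting rule for hybrid identities can be sketched as below. The mapping and function names are hypothetical, and the three-way Montubio split into European, African, and Amerindian is an assumption about which three groups were meant. The St. Lucia and Costa Rica inputs are the figures quoted above.

```python
# Hypothetical mapping from a reported identity label to racial components.
SPLITS = {
    "White": {"European": 1.0},
    "Black": {"African": 1.0},
    "Amerindian": {"Amerindian": 1.0},
    "Mestizo": {"European": 0.5, "Amerindian": 0.5},
    "Mulatto": {"European": 0.5, "African": 0.5},
    # Assumed three-way split for tribrid identities.
    "Montubio": {"European": 1 / 3, "African": 1 / 3, "Amerindian": 1 / 3},
}


def identity_components(percents, splits=SPLITS):
    """Convert reported identity percents into racial component percents."""
    totals = {}
    for label, pct in percents.items():
        for component, weight in splits[label].items():
            totals[component] = totals.get(component, 0.0) + pct * weight
    return totals


# St. Lucia: the "Mixed" group is treated as Mulatto, as in the text.
st_lucia = identity_components({"Black": 85.3, "White": 3.9, "Mulatto": 10.9})

# Costa Rica: "White and Mestizo" is treated as Mestizo, as in the text.
costa_rica = identity_components({"Mestizo": 83.6})
```

Up to floating-point rounding, st_lucia reproduces the 90.75% African and 9.35% European components worked out above, and costa_rica gives 41.8% each for European and Amerindian.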
Table 2. Racial Identity by Nation
Putterman and Weil’s World Migration Matrix (variables: PuttermanEU, PuttermanAfr, PuttermanAmer): Ancestry components were also computed from Putterman and Weil’s ancestry matrix for 165 countries. For each nation, the matrix gives the percent of ancestors who, in the year 1500, lived in each nation. Putterman and Weil based their estimates on a mix of genetic studies, immigration data, and other sources. As above, four ancestral components were created: European, African, Amerindian, and Other (including Middle Eastern and North African). This was done by summing the year-1500 national ancestry components into these four broad categories. Again, to make scores more comparable across countries, ancestral components were also expressed in terms of the three main racial groups.
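As a rough sketch of the summing step, one row of the matrix can be collapsed into the four categories as follows. The country-to-category mapping is a hypothetical, partial illustration (it assumes year-1500 ancestors located in the Americas count as Amerindian), and the example row is invented, not taken from Putterman and Weil’s data.

```python
# Hypothetical, partial mapping of year-1500 source countries to categories.
REGION_OF = {
    "Spain": "European",
    "Portugal": "European",
    "Italy": "European",
    "Nigeria": "African",
    "Angola": "African",
    "Mexico": "Amerindian",  # assumed: ancestors in the Americas in 1500
    "Peru": "Amerindian",
}


def aggregate_row(row, region_of=REGION_OF):
    """Sum one nation's source-country ancestry shares into four categories."""
    totals = {"European": 0.0, "African": 0.0, "Amerindian": 0.0, "Other": 0.0}
    for source, share in row.items():
        # Sources missing from the mapping fall into "Other".
        totals[region_of.get(source, "Other")] += share
    return totals


# Invented row for illustration only (shares sum to 1).
example_row = {"Spain": 0.50, "Nigeria": 0.10, "Peru": 0.35, "Lebanon": 0.05}
example_components = aggregate_row(example_row)
```

Here the Lebanon share falls into "Other", in line with the treatment of Middle Eastern ancestry described for this variable.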
Table 3. Putterman’s Ancestral Components by Nation
Skin Reflectance (variable: SkinRefl): National skin reflectance data was provided by Gerhard Meisenberg (personal communication, 2014). It has previously been used in a number of analyses, e.g., Meisenberg and Woodley (2013). For this variable, higher values correspond to lighter skin color.
Table 4. National Skin Reflectance Scores
The (most recent) data file can be found here.
Method: Correlation analyses were run. Since nations differed widely in population size (e.g., Cayman Islands Pop = 56,732; Brazil Pop = 202,656,788), and since the oddity of giving equal weight to countries that vary by up to four orders of magnitude in size has been pointed out (e.g., Hunt and Sternberg, 2006), weights were created by taking the square root of the population size. Weighted correlations are presented below the diagonal (in blue) and unweighted correlations above it. All data is made available in case some wish to employ alternative methods.
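A weighted correlation of the kind described can be sketched as follows. This assumes a standard weighted Pearson correlation, which may differ in detail from what the statistical software used for the analysis computes. The function name is hypothetical; the two population figures are the ones quoted above.

```python
import math


def weighted_corr(x, y, w):
    """Weighted Pearson correlation of x and y with non-negative weights w."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / sw
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / sw
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(var_x * var_y)


# Square-root-of-population weights for the two nations named in the text.
populations = [56_732, 202_656_788]  # Cayman Islands, Brazil
weights = [math.sqrt(p) for p in populations]
```

With raw population weights Brazil would count about 3,570 times as much as the Cayman Islands; with square-root weights the ratio falls to about 60 to 1. With equal weights the function reduces to the ordinary unweighted Pearson correlation.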
Results: Results are shown in Figures 1 through 3 below. European, African, and Amerindian genomic estimates correlate strongly with estimates based on racial identification and on Putterman and Weil’s ancestry matrix. As expected, White/European ancestry is a strong positive predictor of national skin reflectance, while Black/African ancestry is a strong negative one.
Figure 1: Correlations for European/White
Figure 2: Correlations for African/Black
Figure 3: Correlations for Amerindian
Overall, the results establish the validity of the ancestry estimates. They also establish that there is a high correspondence between genomic ancestry and average racial identification on the national level.
1. In their supplementary file (S2), Ruiz-Linares et al. (2014) report a highly significant association between European ancestry and both wealth and education (r = 0.12, p < 2.2 × 10^-16). Net of genotype, wealth and education were not associated with African or Amerindian racial identity. Wealth, however, was significantly, but apparently weakly, associated with European/White identity; the authors report a regression coefficient of 0.00291 (p = 6.1 × 10^-4).
Hunt, E., & Sternberg, R. J. (2006). Sorry, wrong numbers: An analysis of a study of a correlation between skin color and IQ. Intelligence, 34(2), 131-137.
Meisenberg, G., & Woodley, M. A. (2013). Global behavioral variation: A test of differential-K. Personality and Individual Differences, 55(3), 273-278.
Ruiz-Linares, A., et al. (2014). Admixture in Latin America: Geographic structure, phenotypic diversity and self-perception of ancestry based on 7,342 individuals. PLoS Genetics, 10(9), e1004572.