Regional Admixture and Aptitude in Colombia

Emil and I set out to determine whether regional variation in racial ancestry could (statistically) explain regional variation in cognitive ability. To keep things simple, we have limited our focus to the Americas, which contain primarily trihybrid populations and for which there is a decent amount of admixture data. The results so far align with predictions. Both across nations and across regions within the U.S., Brazil, and Mexico, European ancestry correlates positively with regional-level cognitive ability, whereas African and Amerindian ancestry both correlate negatively. The broader importance of the project is that it involves the construction of an expansive data set that allows for statistical control of continental lineage and associated factors (genes + deep culture), which presently confound many analyses. This data set will hopefully allow one to uncover regional- and national-level factors that are not entangled with ancestry. They must exist. For example, we find that regional levels of European ancestry are associated with better outcomes in both the U.S. and Brazil, but also that there is a substantial between-nation effect that cannot be explained by factors correlated with continental ancestry.

[Image: nations and regions]

Here, I will discuss a new analysis involving Colombia. Colombia is marked by extensive spatial variation in ancestry. The admixture map on the left, copied from Ruiz-Linares et al. (2014), and the ethnic map on the right, taken from Rodriguez-Palau et al. (2007), roughly capture the lay of the land. African admixture is concentrated along the Pacific and Caribbean coasts, European admixture is highest in the north and central interior region, and Amerindian admixture is concentrated in the east and south. This ancestral variation allows for a test of our general model.

 

I computed the variables as follows:

 

Admixture: Estimating regional admixture for Colombia’s 32 departments plus the capital was not without difficulty, since existing studies provide admixture data for only half of the departments. Problematically, specific estimates for the eastern and southeastern departments, which are reported to have high Amerindian components, were not available. Nonetheless, we were able to construct three sets of admixture estimates. First, 18 departmental + capital estimates were taken from Salzano and Sans’ (2014) compilation. The ancestry ratios from Salzano and Sans’ (2014) two main sources correlated at 0.9, so we felt that using the combined estimates was justified. Second, missing values were filled in based on regional values and on Ruiz-Linares et al.’s (2014) and Rodriguez-Palau et al.’s (2007) maps. For example, estimates for Caribbean-Pacific departments were averaged and used to fill in missing data for other departments in this region. In the U.S. context, this would be akin to filling in South Carolina values using the average of the Deep South ones. Third, admixture was estimated using ethnic identity data from the 2005 census in conjunction with average ethnoracial admixture percents as reported in all available studies. The ethnoracial admixture percents came out as follows:

[Image: ethnoracial admixture percents (coladmix)]

The computation methods are detailed more precisely in the Excel file.
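The third estimation method described above (census identity shares combined with average admixture per self-identified group) can be sketched roughly as follows. This is only an illustration of the weighted-average idea; the group labels and all numbers are hypothetical placeholders, not values from the Excel file, and the actual spreadsheet computation may differ in detail.

```python
# Hypothetical placeholder values, NOT taken from the post's data file.
# Average ancestry fractions (European, Amerindian, African) per
# self-identified group.
GROUP_ADMIXTURE = {
    "mestizo_white": (0.60, 0.30, 0.10),
    "afro_colombian": (0.20, 0.10, 0.70),
    "indigenous": (0.10, 0.85, 0.05),
}


def estimate_admixture(identity_shares):
    """Return (European, Amerindian, African) fractions for one department.

    identity_shares maps a group name to that group's share of the
    department's population. Shares are renormalised to sum to 1, then
    each group's average admixture is weighted by its share.
    """
    total = sum(identity_shares.values())
    if total <= 0:
        raise ValueError("identity shares must sum to a positive number")
    estimate = [0.0, 0.0, 0.0]
    for group, share in identity_shares.items():
        weight = share / total
        for i, fraction in enumerate(GROUP_ADMIXTURE[group]):
            estimate[i] += weight * fraction
    return tuple(estimate)


# A hypothetical department that is half "mestizo_white", half "afro_colombian".
example = estimate_admixture({"mestizo_white": 0.5, "afro_colombian": 0.5})
```

Because each group's ancestry fractions sum to 1, the departmental estimate does as well.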

Cognitive ability: For cognitive scores, the Colombian national SABER exam scores were used. The average of the 2003 and 2005 grade 5 and grade 8 math and reading regional scores correlated strongly (about 0.85) with the average of the 2012 and 2014 scores. The two sets of scores were on different metrics, and standard deviations were not available for the 2003 and 2005 scores (given the source used), so, in the end, the 2012 and 2014 average scores were employed.

Other variables: 2010 HDI scores were taken from Machado (2011). Ethnic identity percents were taken from the 2005 census. Population was taken from the census via Wikipedia.

Results: I uploaded the Excel file to facilitate future investigations. For the analyses reported below, in line with the general methods adopted for the meta-project, I excluded the capital and weighted by SQRT(population). Salzano and Sans’ (2014) admixture data showed only a weak negative correlation for Amerindian ancestry; this was because, as noted, data were missing for the most Amerindian parts of the country. When the missing data were filled in, the association became significantly negative, as predicted. It seems that the negative results are driven by the low scores in five departments (Amazonas, La Guajira, Guainía, Vaupés, and Vichada), all of which have high percentages of self-identifying indigenous residents and large reservations.
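For readers who want to rerun the analysis from the data file, a correlation weighted by SQRT(population) can be sketched as below. The post does not say which software was used, so this is only a minimal standard weighted Pearson correlation; the function names are illustrative.

```python
import math


def sqrt_population_weights(populations):
    """Weights proportional to the square root of each region's population."""
    return [math.sqrt(p) for p in populations]


def weighted_pearson(x, y, weights):
    """Weighted Pearson correlation between x and y."""
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    mean_x = sum(w * a for w, a in zip(weights, x)) / total
    mean_y = sum(w * b for w, b in zip(weights, y)) / total
    cov = sum(w * (a - mean_x) * (b - mean_y)
              for w, a, b in zip(weights, x, y)) / total
    var_x = sum(w * (a - mean_x) ** 2 for w, a in zip(weights, x)) / total
    var_y = sum(w * (b - mean_y) ** 2 for w, b in zip(weights, y)) / total
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)
```

With equal weights this reduces to the ordinary Pearson correlation; the square-root weighting gives larger regions more influence without letting the largest ones dominate.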

[Image: Colombia regression results (colregression1)]

The results immediately above were replicated using the ethnic-admixture data.

[Image: Colombia regression results using the ethnic-admixture data (colreg2)]

Generally, European ancestry was non-trivially associated with cognitive ability (shown below) and with HDI (not shown). These results held regardless of which admixture variable was employed; they were largely driven by the strong negative association between regional outcomes and African ancestry. It is interesting that regional Amerindian ancestry was not associated with regional ability in the case of Salzano and Sans’ (2014) admixture estimates. On the national level, Amerindian ancestry correlated negatively with ability, since areas heavily populated by self-identifying Indigenous individuals did poorly; even so, one might expect a more constant effect, one that would show up in Salzano and Sans’ (2014) restricted data set, which included only interior and coastal departments. The lack of association might have been due to the unreliability of the data, the specific samples analyzed, or the specific sampling of interior and coastal departments. Possibly, Amerindian ancestry is not negatively correlated with regional outcomes outside of largely indigenous regions. A determination of the matter will have to wait for the publication of more Colombian regional admixture data.

 

The Bell Curve, 20 years after

Or nearly so. I was planning to publish this blog article on the 31st of December 2014. As you can see, I failed in this task and didn’t finish in time. Anyway, I wrote this article mainly because I am bothered that when people cite The Bell Curve, the typical opponent responds with a link to Wikipedia, specifically the part related to the “controversy” of The Bell Curve. It goes without saying that these persons did not read the books written in response to The Bell Curve. In fact, they have certainly read none of them. It is ridiculous to cite a book you didn’t read, but apparently it does not bother many people, as far as I can see.

For the 20th anniversary of the book, I found it appropriate to write a defense of it. Or, more precisely, a critical comment on the critics. I decided to read carefully one of the books to which I have access, and from what I have read here and there, it is probably the best book ever written against The Bell Curve. I know that Richard Lynn (1999) has already written a review. But I wanted to go into the details. The title of the book I’m reviewing is:

Devlin, B. (1997). Intelligence, Genes and Success: Scientists Respond to the Bell Curve. Springer.

In fact, I read that book some time ago, but did not see the need to read everything in detail, and I was unwilling to write a lengthy review. But I have changed my mind because of some nasty cowards.

Racial Ancestry in the Americas. Part 2: Cognitive Variation between Nations: Parasite Load, Climate, and Ancestry

Following up on a previous analysis, I examined the cognitive variation across the whole of the Americas using a newly constructed data set. Files can be found here and here, with the latest versions provided on request. The analysis was restricted to sovereign nations, excluding, e.g., departments such as Martinique or territories such as the Virgin Islands. Non-sovereign regions were excluded so as to avoid an inter-nation x intra-national interaction and because international exam data were not available for these regions. The following 35 countries were included: Argentina, Antigua and Barbuda, The Bahamas, Belize, Bolivia, Brazil, Barbados, Chile, Colombia, Costa Rica, Cuba, Dominica, Dominican Republic, Ecuador, Guyana, Grenada, Honduras, Haiti, Jamaica, St. Kitts and Nevis, St. Lucia, Mexico, Nicaragua, Panama, Peru, Paraguay, El Salvador, Suriname, Trinidad and Tobago, Uruguay, United States, St. Vincent, Venezuela, and Canada. Eight regression analyses were run, using the following dependent variables:

  • (Skinrefl) Skin reflectance.
  • (AchQ) National Achievement Scores –  this was an updated set provided by Gerhard Meisenberg during October of 2014.
  • (NIQ) National IQ scores – these were based on Richard Lynn’s 2014 (work in progress) results and Jason Malloy’s 2013 to 2014 estimates, with adjustments.
  • (AHQ) 1880 to 1930 birth cohort age heaping scores — this is a measure of education/numeracy.
  • (logSciresearch) Log of scientific researchers from 2005 to 2012.
  • (logGDP) Average of 1990, 2000, and 2010 log of World Bank per capita GDP.
  • (Crimes) Violent Crime rates.
  • (HDI2012) 2012 Human Development Index scores.

The following independents were included:

  • (relativeEu) European Ancestry percent — the percent of European ancestry out of the percent of  European + Amerindian + African ancestry.  (For a discussion of this variable, refer here.)
  • (notUSCanada) Not US or Canada — whether the region was not US or Canada.
  • (logparasiteload) Log Parasite load — the log of the 2004 WHO parasite infections per 100,000 for each country.
  • (logColddemand) Log Cold demand — the log of Van de Vliert’s (2013) cold stress scores.
  • (PopUnder1million)  Population under 1 million — whether the country’s population was under one million.
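As a small illustration of the relativeEu variable defined in the list above, the sketch below renormalises European ancestry against the three continental components only, so that any “Other” ancestry is left out of the denominator. The function name and the example figures are hypothetical.

```python
def relative_european(european, amerindian, african):
    """European ancestry as a percent of European + Amerindian + African.

    Inputs may be percents or fractions, as long as all three are on the
    same scale; "Other" ancestry is deliberately excluded.
    """
    total = european + amerindian + african
    if total <= 0:
        raise ValueError("the three ancestry components must sum to a positive number")
    return 100.0 * european / total


# Hypothetical country: 45% European, 30% Amerindian, 15% African, 10% Other.
example = relative_european(45, 30, 15)
```

In this hypothetical case the raw European share is 45%, but the relative share is 50% once the 10% “Other” component is excluded.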

Simple correlation analysis demonstrated that ancestry, cold weather, and parasite load intercorrelated. This situation makes it difficult to isolate causal associations. To illustrate, skin reflectance was set as a dependent with Eu ancestry, cold weather, parasite load, population under 1 million, and not US and Canada as independents. The correlation between Eu ancestry and skin reflectance is clearly mostly genetic in origin. To the extent that the association between ancestry and skin reflectance is mediated by other variables, it is suggested that these variables co-vary with causal effects related to genes (and thus that controlling for them controls for ancestry-related causal effects). Regression results are shown in Table 1, below. Generally, parasite load and cold weather seem to partially index ancestry effects. Parasite load is a particularly problematic “environmental factor” because it significantly correlates with STD and HIV rates (at 0.47). Yet the spread of HIV throughout the Americas, in the ’70s and ’80s, was subsequent to the origin of cognitive ability differences, which, in the form of national age heaping rates, were already present in the 1800s. Thus, STD and HIV rates, and with them parasite load, are to some extent consequences of cognitive ability differences.

Results will not be discussed in detail. The data file is made freely available; readers can run the analyses as desired. Generally, European ancestry was a robust predictor of lower violent crime rates, and also of scientific activity, achievement scores, and combined achievement plus national IQ scores. (For national IQ alone, in the final model, none of the predictors were significant; this was because the NIQ sample had many missing values.) In contrast to cognitive ability and the other indexes mentioned, European ancestry was generally not significantly associated with GDP or Human Development Index scores. The results for national achievement scores are shown in Table 2, below; a regression plot is shown in Figure 1.

Table 1.  Regression Results for Skin reflectance

[Image: reg1skinrefl]

Table 2.  Regression Results for ACHQ2014

[Image: ACH2014]

Figure 1.  National Achievement Scores by % European Ancestry for Sovereign American Nations

[Image: ACH2014AncestryAmer]

Racial Ancestry in the Americas. Part 1: National Genomic Racial Admixture: Estimates and Validation

It has been noted that in the Americas racial identification and genomic racial ancestry frequently do not correspond well. In Latin America, the association seems to be modest on the individual level. For example, Ruiz-Linares et al. (2014) found a correlation of 0.48 between self-identified European and Amerindian racial identity and genomic ancestry in a five-country sample. In principle, the same could hold on the aggregate national level. And in some instances there’s a clear discordance. While the Argentinean and Brazilian national populations have roughly the same degree of pre-1500 European ancestry, Argentina has a White European national image while Brazil has a multiracial one. One might wonder, then, to what extent average racial self-identification concords with average racial admixture on the national level. This is an interesting question, and others in a similar vein can be asked. For example: To what degree are differences in national racial identification related to such and such outcomes independent of genomic ancestry? Perhaps, for example, members of countries with a more European identity act in aggregate differently than ones that have developed, net of genotype, a less European one (an acting White effect on the national level). Ruiz-Linares et al. (2014) found that, on the individual level, White identity was associated with wealth (but not educational attainment) net of European ancestry (see note 1). If such a pattern can exist on the level of the individual, it could do so on the level of the nation. Here, the first matter will be explored. I first present several indexes of national ancestry for the Americas; these include national genomic percents, aggregate self-identified race percents, Putterman’s ancestry percents, and national skin reflectance scores. For comparability, these values are expressed in terms of major racial categories (e.g., White European, Black African, and Amerindian), plus an “Other” group. I then use correlation analysis to validate these estimates.


District-Level Variation in Continental Racial Admixture Predicts Outcomes in Mexico

Previously, a literature review was conducted regarding continental racial admixture and educational attainment and/or socioeconomic status. Across the Americas, Amerindian and African (versus European) ancestry was found to be negatively correlated in admixed populations (e.g., Hispanic Americans) with income, educational attainment, occupational rank, and other cognitively correlated indexes of socioeconomic status. Multiple possible explanations were discussed. Some of these predict that the ancestry-outcome association will generalize spatially, such that admixture will be correlated with outcomes across regions and nations. This need not be the case, and it is not directly predicted by other accounts of the individual-level admixture-outcome associations, such as phenotype-based discrimination accounts, which operate on the individual level. The association between regional ancestral variation and cognitive-related outcome variation in Mexico will be explored first, since for this country reliable regional admixture estimates (at least with regard to European and Amerindian ancestry) and outcome measures are available, and since there is a good deal of spatial variation in admixture (and outcomes). Subsequently, the analysis will be generalized to the whole of the Americas. Figure 1 below depicts the spatial variation in racial ancestry in Mexico.

Method: Admixture estimates: Admixture estimates were taken from Salzano and Sans (2014) and Moreno-Estrada et al. (2014). For the two sources, the Pearson correlation was 0.94 for European admixture, -0.60 for African admixture, and 0.94 for Amerindian admixture. Regarding European and Amerindian admixture, the estimates exhibited high reliability, thus justifying their combination. The African admixture estimates were unreliable due to the noisiness of the measures in conjunction with the limited range and variance in admixture. Admixture estimates were averaged for each district. Missing district data were then estimated based on the measured admixture of adjacent regions. This produced four different admixture estimates: (a) Salzano and Sans (2014), (b) Moreno-Estrada et al. (2014), (c) the average of (a) and (b), and (d) estimates based on (c) taking into account regional proximity. Descriptives are presented in Table 1. Cognitive ability estimates: 2003, 2006, 2009, and 2012 average math and reading PISA scores were computed for each district. Regional scores were highly correlated across years, thus justifying the use of cross-year average scores; deviation scores relative to the Mexican national mean were computed and averaged across years. 2002 and 2005 average district-level Raven’s Matrices scores were also computed. Human Development Index: 2010 and 2012 Human Development Index scores were highly correlated across years, and average scores were computed. The Excel data file is attached.
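The deviation-score step for the PISA data can be sketched as follows. This is a minimal illustration under the assumption that each year's deviation is simply the district score minus that year's national mean; the data layout, district labels, and numbers are hypothetical and not taken from the attached file.

```python
def average_deviation_scores(scores_by_year, national_mean_by_year):
    """Average each district's deviation from the national mean across years.

    scores_by_year: {year: {district: score}}
    national_mean_by_year: {year: national mean score}
    Districts missing in some years are averaged over the years they have.
    """
    deviations = {}
    for year, district_scores in scores_by_year.items():
        national_mean = national_mean_by_year[year]
        for district, score in district_scores.items():
            deviations.setdefault(district, []).append(score - national_mean)
    return {district: sum(values) / len(values)
            for district, values in deviations.items()}


# Hypothetical scores for two districts in two survey years.
scores = {
    2003: {"A": 410, "B": 390},
    2006: {"A": 420, "B": 400},
}
national_means = {2003: 400, 2006: 410}
result = average_deviation_scores(scores, national_means)
```

Working with deviations rather than raw scores keeps year-to-year shifts in the national mean from leaking into the district averages.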

Results: Since % African estimates were unreliable, these were treated as noise and this noise was partialed out in the correlation analyses. Thus, for these analyses the total ancestry was the Amerindian + European ancestry. The correlations between district-level Amerindian ancestry (with and without estimates) and district-level cognitive ability and human development are shown below. Percent Amerindian admixture was a strong negative predictor of district-level outcomes. Partialing out the noisy African admixture didn’t have a substantive effect on the correlations. Correlations using estimated and measured-only admixture were similar, as were ones using averaged admixture estimates and those provided by Salzano and Sans (2014) and Moreno-Estrada et al. (2014) independently. Results are shown in Table 2. The regression plot showing district-level admixture (without African ancestry partialed out) and district-level cognitive scores is shown in Figure 2. District-level cognitive ability substantially mediated the association between continental ancestry and HDI (with and without African ancestry controlled for). (Without African admixture controlled for: Pearson correlation (AmerAdmix x HDI) = -0.603; (AmerAdmix x HDI) correlation with PISA scores partialed out = -0.271.)
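Partialing one variable out of a correlation, as done here for African admixture and for PISA scores, can be sketched with the standard first-order partial correlation formula. The post does not state how the partial correlations were computed, so treat this as an assumption; the function and argument names are illustrative.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialed out.

    Uses the standard first-order formula:
        r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("partial correlation is undefined when |r_xz| or |r_yz| equals 1")
    return (r_xy - r_xz * r_yz) / denominator
```

Here x would be Amerindian admixture, y would be HDI, and z the variable being controlled (for example PISA scores); the three pairwise correlations would come from the data file.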

Discussion: Regional Amerindian admixture was a robust predictor of regional outcome differences. The Amerindian ancestry-outcome association found based on admixture mapping generalizes spatially, at least in Mexico. These findings are as would be predicted by an evolutionary genetic explanation. Variations of shared environmental (“cultural”) accounts are possible insofar as shared environment can be genealogically sticky.

Tables and Figures:

Is there no population genetic ‘support’ for a racial hereditarian hypothesis?

(10/18/2014 update:  data from two additional studies — Martínez et al.  (2007) and Ruiz-Linares et al. (2014) — have been added.)

Over the last decade, scores of large scale admixture-mapping studies have been conducted largely in an attempt to elucidate the origin of ethnic disparities in disease rates and medical outcomes.  In the simplest type of such studies, researchers determine if there is a robust association between genotypically defined continental racial ancestry (typically: African, European, and Amerindian) and relevant outcomes in admixed populations.  To control for potential confounding effects, measures of educational attainment and other indexes of SES are often included in the analyses.  These variables are often treated as environmental indicators, which is odd, since within populations they are found to be under non-trivial genetic influence.  For example, based on a recent international meta-analysis of biometric studies involving 51,545 kinship pairs, Branigan, et al. (2013) found that educational attainment had a kinship-based heritability of 0.40, meaning that genes explained 40% of inter-individual educational differences; based on a sample involving 7,959 individuals, Rietveld et al. (2013, table S12)  found a GCTA-based heritability, one which takes into account only the effects of population-wide common genetic variants, of 0.22.  These results were replicated by Marioni, et al. (2014, table 3), who found a kinship-based heritability of 0.40 and a GCTA-based one of 0.21.  When genes explain some of the variance in a trait within groups, they plausibly explain an indefinite portion of the variance between groups.  Curious it is, then, that these external outcomes are often assumed to represent environmental influences between groups.
