Category: Stats

DIF Review and Analysis of Racial Bias in Wordsum Test using IRT and LCA

February 10, 2023 / Meng Hu / 0 Comments

As reviewed in my previous article, the majority of studies on measurement bias, whether at the item or the subtest level, agree that IQ tests are fair. Unfortunately, even among studies that use acceptable Differential Item Functioning (DIF) methods, the procedure was often sub-optimal, which probably leads to more spurious DIF being detected.

The advantages (and shortcomings) of each DIF method are presented. GSS data are then used to compare the performance of the two best DIF methods, namely IRT and LCA, at detecting bias in the Wordsum vocabulary test between whites and blacks.

How to calculate and use predicted Y-values in multiple regression

December 18, 2014 / Meng Hu / 0 Comments

Here, I will explain how to use the so-called “Yhat” or predicted values of Y when doing regression (OLS, logistic and multilevel).

(Update 2017) This article is based on my paper: Hu, M. (2017). An update on the secular narrowing of the Black-White gap in the Wordsum vocabulary test (1974-2012). Mankind Quarterly, 58(2), 324-354.
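As a minimal sketch of what a predicted value is in the simplest OLS case, the snippet below fits a one-predictor least-squares line and computes Yhat for each observation. The data and the function names (`fit_simple_ols`, `predict`) are made up for illustration and are not taken from the article or the paper; multiple, logistic and multilevel regression follow the same idea with more terms or a link function.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted values (Yhat) for each value of the predictor."""
    return [intercept + slope * xi for xi in x]


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(x, y)
y_hat = predict(intercept, slope, x)

# Residuals are the observed values minus the predicted values.
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

for xi, yi, yh in zip(x, y, y_hat):
    print(f"x={xi}  y={yi:.2f}  yhat={yh:.2f}")
```

With an intercept in the model, the residuals sum to zero, so the mean of Yhat equals the mean of Y.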

The Fallacy of Significance Tests

June 10, 2014 / Meng Hu / 3 Comments

It must be known that a p-value, or any other statistic based on the Chi-Square, is not a useful number. It has two components: sample size and effect size. Its ability to detect a non-zero difference increases when either sample size or effect size increases. If only sample size increases, with effect size held constant, the statistic still becomes inflated. There is also a problem with the underlying assumption. Detecting a “non-zero” difference is of no use if the magnitude of that difference, i.e., the effect size, is of no practical importance. I will provide several examples of the deceptiveness of significance tests.
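The dependence on sample size can be sketched with a small calculation. The example assumes a 2x2 table, where the Pearson Chi-Square equals n times the squared phi coefficient, and uses the fact that the upper-tail probability of a Chi-Square with 1 degree of freedom is erfc(sqrt(chi2 / 2)). The effect size of 0.05 and the function names are hypothetical choices for illustration.

```python
import math


def chi_square_from_phi(phi, n):
    """Pearson Chi-Square of a 2x2 table, given effect size phi and sample size n."""
    return n * phi ** 2


def p_value_df1(chi2):
    """Upper-tail p-value of a Chi-Square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical, very small effect size held constant while n grows.
phi = 0.05
for n in (100, 1_000, 10_000):
    chi2 = chi_square_from_phi(phi, n)
    print(f"n={n:>6}  chi2={chi2:6.2f}  p={p_value_df1(chi2):.2g}")
```

The effect size never changes in this loop. Only n does, yet the statistic grows in proportion to n and the p-value moves from far above 0.05 to far below it.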

Multiple Regression, Multiple Fallacies

June 7, 2014 / Meng Hu / 5 Comments

It goes without saying that multiple regression is one of the most popular and widely applied statistical methods. Thus, it would be odd if most practitioners among scientists and researchers did not understand it and misapplied it. And yet, this provocative conclusion seems the most likely.

Because a simple bivariate correlation does not disentangle confounding effects, multiple regression is said to be preferable. The technique attempts to evaluate the strength of an independent (predictor) variable in the prediction of an outcome (dependent) variable when controlling for, i.e., holding constant, every other variable entered as an independent variable into the regression model, either step by step or all at once. The rationale is to isolate the effect that belongs to that independent variable alone. But this is a fallacy.

What does it mean to have a low R-squared ? A warning about misleading interpretation

March 31, 2014 / Meng Hu / 9 Comments

It is a common argument, one we read all the time and everywhere, and always with the same mistake: squaring the correlation. For example: “Your brain-IQ correlation is r=0.40, so if you square it, that only amounts to a tiny 16% (r²=0.40*0.40=0.16) of variance explained which is not impressive”. Or something in this vein. The use and abuse of R² has caused enough damage. It is high time to put an end to this utter fallacy.
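The excerpt does not spell out the full argument, but one standard observation can be sketched here: with standardized variables, the regression slope is r itself, not r². The simulation below is an illustration under assumed conditions (bivariate normal data, a true correlation of 0.40, hypothetical function names), not a reproduction of the article's analysis.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_standardized_pair(r, n, seed=0):
    """Draw n pairs of standard normal variables with population correlation r."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        e = rng.gauss(0, 1)
        xs.append(x)
        ys.append(r * x + math.sqrt(1 - r * r) * e)
    return xs, ys


xs, ys = simulate_standardized_pair(0.40, 20_000)
r = pearson_r(xs, ys)
r_squared = r ** 2

# Expected gap in Y, in SD units, between someone at +1 SD and someone at -1 SD on X.
gap_sd = r * (1 - (-1))

print(f"r={r:.3f}  r_squared={r_squared:.3f}  predicted gap={gap_sd:.2f} SD")
```

The same correlation that yields roughly 16% of "variance explained" also implies a predicted difference of about 0.8 SD in Y between two people who are 2 SD apart on X, which is why the two numbers can leave very different impressions.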
