Racial/Ethnic Differences in the SAT in 2023

This post is a quick update on my in-depth review of group differences in the SAT, occasioned by the publication of the College Board’s reports on the SAT scores of the 2023 high school graduate cohort. Using national-level test data as well as data from selected states, I will examine how the most recent results relate to the trends that I previously identified.

1. Test participation

The following graph shows the numbers of SAT-takers from different groups across years, with the 2023 figures highlighted in red. The 2023 data are from this report, while sources for the earlier years are listed in Chapter 1 of the earlier post.

Even with the retreat of the Covid pandemic, you might expect SAT participation to be in serious decline as most colleges have gone “test-optional” or “test-blind” in recent years. However, the 2023 participation numbers actually point to a continuing recovery. The total number of test-takers in 2023 was about 1.9 million, which is clearly short of the record 2.2 million seen in 2019 and 2020, but well above the long-term participation trend. (Note that the SAT administration years refer to high school graduate cohorts rather than the actual dates when the tests were taken.) The recovery in participation was reasonably balanced across racial/ethnic groups.

2. National results

The following graph shows national SAT total score means by race/ethnicity in 1987–2023. The 2023 data are from this report, while sources for the earlier years are listed in Chapter 1 of the earlier post, where the various time trends seen in the graph are discussed in detail. In 2006–2016, SAT total scores ranged from 600 to 2400, while a scale of 400 to 1600 has been used in all other years. In the graph, the scores have been transformed so that the 400 to 1600 scale is used in all years.
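The rescaling can be illustrated with a short R sketch. It assumes a simple proportional conversion (multiplying by 2/3), which maps 600 to 400 and 2400 to 1600 and is consistent with the 2006–2016 values in the appendix data. The function name is made up for this illustration.

```r
# Sketch: convert a total score from the 600-2400 scale to the 400-1600 scale.
# Assumption: a proportional (2/3) conversion; rescale_to_1600 is a hypothetical helper.
rescale_to_1600 <- function(score_2400) {
  score_2400 * 2 / 3
}

rescale_to_1600(c(600, 2400))  # the scale endpoints map to 400 and 1600
rescale_to_1600(1291)          # about 860.67
```

Fractional values such as 860.67 in the appendix data are what one would expect from this kind of conversion of whole-number means.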

The SAT means for all groups have tended to decline in the last few years, and this trend continued in 2023. Not even Asians are immune to this downturn. While this is consistent with learning loss due to the pandemic, in the national data this explanation is difficult to disentangle from changes in test participation. State-level data, to be examined next, may shed some more light on this phenomenon.

3. Results from high-participation states

In 2020, there were 11 states where 98–100 percent of high school graduates took the SAT. In 2023, the rate was that high only in the District of Columbia. However, 10 of the 11 states had a rate of at least 90 percent in 2023, so they still provide reasonably representative samples of high school graduates. The table below lists the high-participation states, with links to SAT reports for each state.

States with high SAT participation in 2023
State N Participation rate
Colorado 57293 90%
Connecticut 40405 93%
Delaware 10368 95%
District of Columbia 4987 100%
Florida 205159 90%
Idaho 21813 95%
Illinois 142769 96%
Michigan 102466 97%
Rhode Island 10745 95%
West Virginia 16154 90%
Overall 612159 93%

The figure below depicts total mean scores by race/ethnicity in the ten states in 2023. The states are sorted so that white means increase from left to right.

These ten states had ~100 percent participation rates in 2020, so it is useful to compare the results in the two years. This is done in the next graph. Sources for the 2020 data are linked in Table 3.1 of the earlier post.

With the exception of Native Americans, the mean scores of all groups decreased from 2020 to 2023, as measured by median scores across the high-participation states. This happened despite the fact that participation rates were somewhat lower in 2023, probably with more attrition among less able students. The declining scores are consistent with the pandemic learning loss hypothesis. Alternatively, it could be that students’ level of effort on the test is declining together with its importance.
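As a concrete check of the cross-state medians, the following R sketch recomputes them for two groups from the state means listed in the appendix data. The vector names are introduced here for illustration only.

```r
# State means in the order CO, CT, DE, DC, FL, ID, IL, MI, RI, WV (from the appendix data)
white_2020  <- c(1072, 1095, 1056, 1263, 1060, 1026, 1073, 1033, 1052, 940)
white_2023  <- c(1060, 1067, 1040, 1232, 1030, 1023, 1075, 995, 1020, 925)
native_2020 <- c(860, 928, 844, 782, 927, 853, 839, 833, 850, 817)
native_2023 <- c(898, 919, 862, 798, 891, 900, 876, 886, 821, 885)

# Change in the cross-state median from 2020 to 2023
white_change  <- median(white_2023) - median(white_2020)    # 1035 - 1058 = -23
native_change <- median(native_2023) - median(native_2020)  # 885.5 - 847 = 38.5
```

The white median falls by 23 points, while the Native American median rises by 38.5 points, which is the exception noted above.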

The results for Native Americans may signal a turnaround for them after many years of peculiarly poor SAT scores. However, the sample sizes are small, and no similar reversal is seen in the national data, so it may be simply a selection artifact.

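The final table shows numerical differences in SAT total score means between whites and non-whites across the two years. Positive gaps mean that the non-white group outscored whites, while negative gaps indicate a white advantage. The MEDIAN row gives the difference between the groups’ median scores across the ten states. Judging by the medians, non-whites generally gained ground on whites between 2020 and 2023, a reflection of greater declines in white than non-white mean scores: the Asian advantage grew from 98 to 114 points, and the black and Native American deficits shrank from 172 to 162 and from 211 to 150 points, respectively, while the Hispanic deficit was essentially unchanged (137 vs. 136 points).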

Differences in SAT total scores between whites and non-whites in high-participation states in 2020 and 2023
  Asian–white   Black–white   Hispanic–white   Native American–white
State   2020 2023   2020 2023   2020 2023   2020 2023
Colorado 50 55 -167 -166 -156 -166 -212 -162
Connecticut 122 133 -198 -203 -178 -175 -167 -148
Delaware 115 147 -172 -161 -153 -155 -212 -178
District of Columbia -53 -6 -386 -367 -294 -265 -481 -434
Florida 85 115 -170 -158 -80 -67 -133 -139
Idaho 51 70 -137 -146 -126 -126 -173 -123
Illinois 104 106 -186 -200 -129 -147 -234 -199
Michigan 135 157 -173 -168 -108 -93 -200 -109
Rhode Island 68 64 -173 -143 -162 -132 -202 -199
West Virginia 193 175 -85 -72 -12 1 -123 -40
MEDIAN 98 114 -172 -162 -137 -136 -211 -150
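To make the MEDIAN row concrete, here is a small R sketch for the Asian–white gap in 2023, using the state means from the appendix data. As computed in the appendix code, the MEDIAN row is the difference between the two groups' cross-state medians, which is not the same thing as the median of the state-level gaps. The vector names are illustrative.

```r
# State means in the order CO, CT, DE, DC, FL, ID, IL, MI, RI, WV (from the appendix data)
asian_2023 <- c(1115, 1200, 1187, 1226, 1145, 1093, 1181, 1152, 1084, 1100)
white_2023 <- c(1060, 1067, 1040, 1232, 1030, 1023, 1075, 995, 1020, 925)

# State-level gaps: non-white mean minus white mean (Colorado: 1115 - 1060 = 55)
state_gaps <- asian_2023 - white_2023

# Difference between cross-state medians, as in the MEDIAN row
gap_of_medians <- median(asian_2023) - median(white_2023)  # 1148.5 - 1035 = 113.5
round(gap_of_medians)                                      # 114, as in the table

# Median of the state-level gaps, shown for comparison only
median_of_gaps <- median(state_gaps)                       # 110.5
```

The two summaries differ by a few points here, so the MEDIAN row should be read together with the individual state rows.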

4. Digital SAT

In 2024, the paper-and-pencil SAT will be replaced by a new digital test. The new test is based on a multistage adaptive design, meaning that both the verbal and math sections are split into two parts, and the test-taker’s performance in the first part determines the difficulty of the items in the second part. A benefit of this design is that the same test score reliability as before can be achieved with fewer items and a shorter testing time.
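The two-stage logic can be sketched in R as follows. This is only a toy illustration of the general idea of multistage routing: the function name, the use of the share of correct answers, and the 0.6 cutoff are all assumptions made for this example, not the College Board's actual routing rule.

```r
# Toy sketch of two-stage adaptive routing (hypothetical rule and cutoff).
# The share of correct answers in the first module decides which
# second module the test-taker receives.
route_second_module <- function(first_module_correct, first_module_items, cutoff = 0.6) {
  stopifnot(first_module_items > 0,
            first_module_correct >= 0,
            first_module_correct <= first_module_items)
  share_correct <- first_module_correct / first_module_items
  if (share_correct >= cutoff) "harder" else "easier"
}

route_second_module(18, 22)  # "harder"
route_second_module(9, 22)   # "easier"
```

Because the second module is matched to the test-taker's approximate level, fewer items are spent on questions that are far too easy or far too hard, which is how a shorter test can retain its reliability.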

According to the College Board, scores on the digital SAT will be equivalent to those from the current test, so that there will be no need to convert scores between test versions. This would suggest that racial/ethnic gaps in the new test will be very much like they are today. That remains to be seen.

The College Board has also promised to conduct various validity studies related to the digital test. As I discussed in my previous SAT post, published analyses on the predictive validity and measurement invariance of the current test are somewhat unsatisfactory. Hopefully research on the new test will be better.

Appendix: R code
# read data for 2023
hp_sat2023 <- read.csv(text="Year,State,Group,N,Total,ERW,Math
2023,National,All Groups,1913742,1028,520,508
2023,National,Native American,15384,901,458,443
2023,National,Asian,194108,1219,593,626
2023,National,Black,225954,908,466,441
2023,National,Hispanic/Latino,462186,943,482,461
2023,National,Pacific Islander,3791,925,473,452
2023,National,White,752632,1082,550,532
2023,National,Two or More Races,69410,1091,556,535
2023,National,No Response,190277,955,478,477
2023,Colorado,All Groups,57293,996,508,488
2023,Colorado,Native American,510,898,458,440
2023,Colorado,Asian,2211,1115,553,562
2023,Colorado,Black,2353,894,458,436
2023,Colorado,Hispanic/Latino,19202,894,456,438
2023,Colorado,Pacific Islander,138,906,464,442
2023,Colorado,White,29348,1060,542,518
2023,Colorado,Two or More Races,2499,1052,538,514
2023,Colorado,No Response,1032,960,490,470
2023,Connecticut,All Groups,40405,1007,512,495
2023,Connecticut,Native American,80,919,467,452
2023,Connecticut,Asian,2343,1200,591,609
2023,Connecticut,Black,4341,864,444,421
2023,Connecticut,Hispanic/Latino,10185,892,456,436
2023,Connecticut,Pacific Islander,42,899,455,444
2023,Connecticut,White,20913,1067,543,524
2023,Connecticut,Two or More Races,1545,1051,535,516
2023,Connecticut,No Response,956,1034,522,512
2023,Delaware,All Groups,10368,958,489,469
2023,Delaware,Native American,73,862,440,423
2023,Delaware,Asian,459,1187,592,595
2023,Delaware,Black,2289,879,450,429
2023,Delaware,Hispanic/Latino,1790,885,452,433
2023,Delaware,Pacific Islander,6,NA,NA,NA
2023,Delaware,White,4048,1040,532,508
2023,Delaware,Two or More Races,565,975,500,475
2023,Delaware,No Response,1138,849,428,420
2023,District of Columbia,All Groups,4987,969,495,474
2023,District of Columbia,Native American,25,798,400,399
2023,District of Columbia,Asian,147,1226,613,613
2023,District of Columbia,Black,2302,865,443,422
2023,District of Columbia,Hispanic/Latino,679,967,495,471
2023,District of Columbia,Pacific Islander,4,NA,NA,NA
2023,District of Columbia,White,1006,1232,630,602
2023,District of Columbia,Two or More Races,190,1161,592,568
2023,District of Columbia,No Response,634,822,421,402
2023,Florida,All Groups,205159,966,503,463
2023,Florida,Native American,1255,891,465,426
2023,Florida,Asian,7434,1145,575,570
2023,Florida,Black,35432,872,459,413
2023,Florida,Hispanic/Latino,66716,963,503,460
2023,Florida,Pacific Islander,392,893,468,425
2023,Florida,White,68592,1030,534,496
2023,Florida,Two or More Races,7114,1020,531,489
2023,Florida,No Response,18224,828,436,393
2023,Idaho,All Groups,21813,970,494,476
2023,Idaho,Native American,504,900,458,442
2023,Idaho,Asian,311,1093,546,547
2023,Idaho,Black,216,877,451,426
2023,Idaho,Hispanic/Latino,2798,897,456,441
2023,Idaho,Pacific Islander,62,930,474,456
2023,Idaho,White,11364,1023,521,502
2023,Idaho,Two or More Races,614,1008,514,494
2023,Idaho,No Response,5944,901,460,442
2023,Illinois,All Groups,142769,970,492,478
2023,Illinois,Native American,1370,876,446,430
2023,Illinois,Asian,7745,1181,583,599
2023,Illinois,Black,14048,875,448,427
2023,Illinois,Hispanic/Latino,30921,928,470,457
2023,Illinois,Pacific Islander,119,874,448,425
2023,Illinois,White,47968,1075,544,531
2023,Illinois,Two or More Races,3942,1064,541,524
2023,Illinois,No Response,36656,855,436,419
2023,Michigan,All Groups,102466,967,493,474
2023,Michigan,Native American,1229,886,453,433
2023,Michigan,Asian,4368,1152,570,582
2023,Michigan,Black,14136,827,426,401
2023,Michigan,Hispanic/Latino,9903,902,462,440
2023,Michigan,Pacific Islander,77,901,462,440
2023,Michigan,White,67583,995,507,488
2023,Michigan,Two or More Races,2341,1042,532,510
2023,Michigan,No Response,2829,914,469,445
2023,Rhode Island,All Groups,10745,958,489,468
2023,Rhode Island,Native American,74,821,412,409
2023,Rhode Island,Asian,369,1084,540,544
2023,Rhode Island,Black,966,877,447,431
2023,Rhode Island,Hispanic/Latino,1744,888,454,435
2023,Rhode Island,Pacific Islander,13,918,473,445
2023,Rhode Island,White,5771,1020,522,498
2023,Rhode Island,Two or More Races,374,988,508,480
2023,Rhode Island,No Response,1434,811,414,397
2023,West Virginia,All Groups,16154,923,478,445
2023,West Virginia,Native American,33,885,462,422
2023,West Virginia,Asian,178,1100,546,554
2023,West Virginia,Black,656,853,443,410
2023,West Virginia,Hispanic/Latino,319,926,479,447
2023,West Virginia,Pacific Islander,9,NA,NA,NA
2023,West Virginia,White,14274,925,479,446
2023,West Virginia,Two or More Races,84,1061,550,511
2023,West Virginia,No Response,601,883,458,425")

# read pre-2023 cohort size data
cohort_sizes <- read.csv(text="Year,Group,N
2002,All Groups,1327831
2002,Asian/Pacific Islander,103242
2002,Black,122684
2002,Hispanic/Latino,104155
2002,Native American,7506
2002,No Response,252618
2002,Other,38967
2002,White,698659
2003,All Groups,1406324
2003,Asian/Pacific Islander,100970
2003,Black,125657
2003,Hispanic/Latino,107492
2003,Native American,7452
2003,No Response,355347
2003,Other,39146
2003,White,670260
2004,All Groups,1419007
2004,Asian/Pacific Islander,112542
2004,Black,137953
2004,Hispanic/Latino,122380
2004,Native American,8219
2004,No Response,271545
2004,Other,46615
2004,White,719753
2005,All Groups,1475623
2005,Asian/Pacific Islander,134996
2005,Black,153132
2005,Hispanic/Latino,144196
2005,Native American,8916
2005,No Response,151440
2005,Other,58167
2005,White,824776
2006,All Groups,1465744
2006,Asian/Pacific Islander,138303
2006,Black,150643
2006,Hispanic/Latino,151761
2006,Native American,9301
2006,No Response,135346
2006,Other,54469
2006,White,825921
2007,All Groups,1494531
2007,Asian/Pacific Islander,140794
2007,Black,159849
2007,Hispanic/Latino,168544
2007,Native American,9897
2007,No Response,133508
2007,Other,53901
2007,White,828038
2008,All Groups,1518859
2008,Asian/Pacific Islander,151235
2008,Black,174383
2008,Hispanic/Latino,190203
2008,Native American,9595
2008,No Response,82866
2008,Other,52016
2008,White,858561
2009,All Groups,1530128
2009,Asian/Pacific Islander,158757
2009,Black,187136
2009,Hispanic/Latino,206584
2009,Native American,8974
2009,No Response,66448
2009,Other,51215
2009,White,851014
2010,All Groups,1547990
2010,Asian/Pacific Islander,166064
2010,Black,196961
2010,Hispanic/Latino,222380
2010,Native American,8550
2010,No Response,67098
2010,Other,48702
2010,White,838235
2011,All Groups,1647123
2011,Asian/Pacific Islander,183853
2011,Black,215816
2011,Hispanic/Latino,252703
2011,Native American,9244
2011,No Response,61148
2011,Other,58699
2011,White,865660
2012,All Groups,1664479
2012,Asian/Pacific Islander,192577
2012,Black,217656
2012,Hispanic/Latino,272633
2012,Native American,9716
2012,No Response,57413
2012,Other,62340
2012,White,852144
2013,All Groups,1660047
2013,Asian/Pacific Islander,196030
2013,Black,210151
2013,Hispanic/Latino,284261
2013,Native American,9818
2013,No Response,62603
2013,Other,62251
2013,White,834933
2014,All Groups,1672395
2014,Asian/Pacific Islander,206564
2014,Black,212524
2014,Hispanic/Latino,300357
2014,Native American,9767
2014,No Response,55588
2014,Other,64774
2014,White,822821
2015,All Groups,1698521
2015,Asian/Pacific Islander,211238
2015,Black,219018
2015,Hispanic/Latino,322873
2015,Native American,10031
2015,No Response,70062
2015,Other,65063
2015,White,800236
2016,All Groups,1637589
2016,Asian,196735
2016,Asian/Pacific Islander,199106
2016,Black,199306
2016,Hispanic/Latino,355829
2016,Native American,7778
2016,No Response,84070
2016,Other,20604
2016,Pacific Islander,2371
2016,Two or More Races,28460
2016,White,742436
2017,All Groups,1715481
2017,Asian,158031
2017,Asian/Pacific Islander,162162
2017,Black,225860
2017,Hispanic/Latino,408067
2017,Native American,7782
2017,No Response,94199
2017,Pacific Islander,4131
2017,Two or More Races,57049
2017,White,760362
2018,All Groups,2136539
2018,Asian,217971
2018,Asian/Pacific Islander,223591
2018,Black,263318
2018,Hispanic/Latino,499442
2018,Native American,10946
2018,No Response,131339
2018,Pacific Islander,5620
2018,Two or More Races,77078
2018,White,930825
2019,All Groups,2220087
2019,Asian,228527
2019,Asian/Pacific Islander,233957
2019,Black,271178
2019,Hispanic/Latino,554665
2019,Native American,12917
2019,No Response,112350
2019,Pacific Islander,5430
2019,Two or More Races,87178
2019,White,947842
2020,All Groups,2198460
2020,Asian,223451
2020,Asian/Pacific Islander,228558
2020,Black,261326
2020,Hispanic/Latino,569370
2020,Native American,14050
2020,No Response,125513
2020,Pacific Islander,5107
2020,Two or More Races,89656
2020,White,909987
2021,All Groups,1509133
2021,Asian,167208
2021,Asian/Pacific Islander,170223
2021,Black,168454
2021,Hispanic/Latino,352094
2021,Native American,10288
2021,No Response,117627
2021,Pacific Islander,3015
2021,Two or More Races,54961
2021,White,635486
2022,All Groups,1737678
2022,Asian,175468
2022,Asian/Pacific Islander,178844
2022,Black,201645
2022,Hispanic/Latino,396422
2022,Native American,14800
2022,No Response,146319
2022,Pacific Islander,3376
2022,Two or More Races,66702
2022,White,732946")

# merge cohort size data
cohort_sizes <- merge(cohort_sizes, subset(hp_sat2023, State=="National", select=c("Year", "Group", "N")), all=TRUE)
cohort_sizes <- rbind(cohort_sizes, list(2023, "Asian/Pacific Islander", 194108+3791))

# graph of SAT cohort sizes
#install.packages("ggplot2")
library(ggplot2)
ggplot(data=subset(cohort_sizes, Year<2023), aes(Year, N))+
geom_point(color="green4")+
geom_point(data=subset(cohort_sizes, Year==2023), color="red")+
geom_line(linetype = "solid",color="green4")+
geom_line(data=subset(cohort_sizes, Year>2021), linetype = "solid", color="red")+
theme_classic()+
scale_y_continuous(labels=function(x) format(x, big.mark = ",", scientific = FALSE))+
theme(panel.grid.major = element_line(color = "gray87", linetype = "solid"), axis.title.x = element_text(margin = margin(t = 10)), axis.title.y = element_text(vjust=1),text=element_text(size=16), plot.caption = element_text(hjust = 0, margin = margin(t = 15), size = 16), axis.text.x = element_text(size=9, angle=90,vjust=0.5), axis.text.y = element_text(size=9))+
scale_x_continuous(breaks=c(2002:2023))+
labs(caption="Number of SAT takers by race/ethnicity in 2002–2023", y="Number of test takers")+
theme(legend.position="none")+
facet_wrap(. ~ Group, ncol=3,scales = "free_y")
ggsave("sat_cohort_sizes.png", height=5.4, width=9.9, dpi=300)

# total score means by race/ethnicity in 1987-2022
# (2006-2016 values have been rescaled from the 600-2400 scale to the 400-1600 scale)
total_scores <- read.csv(text="Year,Group,Total
1987,Asian/Pacific Islander,1020
1987,Black,839
1987,Hispanic/Latino,912
1987,Native American,934
1987,White,1038
1991,Asian/Pacific Islander,1033
1991,Black,846
1991,Hispanic/Latino,911
1991,Native American,938
1991,White,1031
1997,Asian/Pacific Islander,1056
1997,Black,857
1997,Hispanic/Latino,918
1997,Native American,950
1997,White,1052
2001,Asian/Pacific Islander,1067
2001,Black,859
2001,Hispanic/Latino,915
2001,Native American,960
2001,White,1060
2002,Asian/Pacific Islander,1070
2002,Black,857
2002,Hispanic/Latino,911
2002,Native American,962
2002,White,1060
2003,Asian/Pacific Islander,1083
2003,Black,857
2003,Hispanic/Latino,912
2003,Native American,962
2003,White,1063
2004,Asian/Pacific Islander,1084
2004,Black,857
2004,Hispanic/Latino,916
2004,Native American,971
2004,White,1059
2005,Asian/Pacific Islander,1091
2005,Black,864
2005,Hispanic/Latino,923
2005,Native American,982
2005,White,1068
2006,Asian/Pacific Islander,1066.66666666667
2006,Black,860.666666666667
2006,Hispanic/Latino,913.333333333333
2006,Native American,970
2006,White,1054.66666666667
2007,Asian/Pacific Islander,1070
2007,Black,858
2007,Hispanic/Latino,913.333333333333
2007,Native American,969.333333333333
2007,White,1052.66666666667
2008,Asian/Pacific Islander,1073.33333333333
2008,Black,853.333333333333
2008,Hispanic/Latino,908.666666666667
2008,Native American,964
2008,White,1055.33333333333
2009,Asian/Pacific Islander,1082
2009,Black,850.666666666667
2009,Hispanic/Latino,907.333333333333
2009,Native American,965.333333333333
2009,White,1054
2010,Asian/Pacific Islander,1090.66666666667
2010,Black,851.333333333333
2010,Hispanic/Latino,909.333333333333
2010,Native American,962.666666666667
2010,White,1053.33333333333
2011,Asian/Pacific Islander,1093.33333333333
2011,Black,848
2011,Hispanic/Latino,905.333333333333
2011,Native American,958
2011,White,1052.66666666667
2012,Asian/Pacific Islander,1094
2012,Black,848.666666666667
2012,Hispanic/Latino,901.333333333333
2012,Native American,955.333333333333
2012,White,1052
2013,Asian/Pacific Islander,1096.66666666667
2013,Black,852
2013,Hispanic/Latino,902.666666666667
2013,Native American,951.333333333333
2013,White,1050.66666666667
2014,Asian/Pacific Islander,1100.66666666667
2014,Black,852
2014,Hispanic/Latino,902
2014,Native American,952
2014,White,1050.66666666667
2015,Asian/Pacific Islander,1102.66666666667
2015,Black,851.333333333333
2015,Hispanic/Latino,896
2015,Native American,948.666666666667
2015,White,1050.66666666667
2016,Asian,1110
2016,Black,846.666666666667
2016,Hispanic/Latino,891.333333333333
2016,Native American,924
2016,Pacific Islander,862
2016,White,1048
2017,Asian,1181
2017,Black,941
2017,Hispanic/Latino,990
2017,Native American,963
2017,Pacific Islander,986
2017,White,1118
2018,Asian,1223
2018,Black,946
2018,Hispanic/Latino,990
2018,Native American,949
2018,Pacific Islander,986
2018,White,1123
2019,Asian,1223
2019,Black,933
2019,Hispanic/Latino,978
2019,Native American,912
2019,Pacific Islander,964
2019,White,1114
2020,Asian,1217
2020,Black,927
2020,Hispanic/Latino,969
2020,Native American,902
2020,Pacific Islander,948
2020,White,1104
2021,Asian,1239
2021,Black,934
2021,Hispanic/Latino,967
2021,Native American,927
2021,Pacific Islander,950
2021,White,1112
2022,Asian,1229
2022,Black,926
2022,Hispanic/Latino,964
2022,Native American,936
2022,Pacific Islander,945
2022,White,1098")

# merge total scores from 2023 and before
total_scores <- merge(total_scores, subset(hp_sat2023, State=="National" & !Group %in% c("No Response", "Two or More Races", "All Groups"), select=c("Year", "Group", "Total")), all=TRUE)

# graph of national SAT total score means by race/ethnicity in 1987-2023
ggplot(data=total_scores, aes(Year, Total, color=Group, shape=Group))+
geom_point()+
geom_line(linetype = "solid")+
theme_classic()+
theme(axis.text.x = element_text(size=11, vjust=0.5,angle=90), panel.grid.major = element_line(color = "gray87", linetype = "dotted"), text=element_text(size=16), plot.caption = element_text(hjust = 0, margin = margin(t = 15), size = 16), axis.title.y = element_text(margin = margin(r = 10)), legend.title=element_text(size=16), legend.text=element_text(size=15),axis.title.x = element_text(margin = margin(t = 5)))+
scale_y_continuous(name="Total score mean", breaks=c(850,900,950,1000,1050,1100,1150,1200,1250))+
scale_x_continuous(breaks=c(1987:2023))+
scale_shape_manual(values=c(12,15,17,18,10,7,16))+
scale_color_manual(values=c("turquoise3", "#F0E442", "black", "#009E73", "brown1", "blue", "purple"))+
labs(caption="SAT total mean scores by race/ethnicity in 1987–2023, national data", color="Race/ethnicity", shape="Race/ethnicity")
ggsave("total_score_1987to2023.png", height=5.4, width=9.9, dpi=300)

# read data for high SAT-participation in 2023
hp_states_2023 <- read.csv(text="State,Participation_rate
Colorado,90%
Connecticut,93%
Delaware,95%
District of Columbia,100%
Florida,90%
Idaho,95%
Illinois,96%
Michigan,97%
Rhode Island,95%
West Virginia,90%
Overall,NA")

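# add Ns (taken from the "All Groups" rows, which are in the same state order as above),
# then the overall N and the N-weighted overall participation rate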
hp_states_2023$N <- c(hp_sat2023$N[hp_sat2023$State!="National" & hp_sat2023$Group=="All Groups"],NA)
hp_states_2023$N[hp_states_2023$State=="Overall"] <- sum(hp_states_2023$N[1:10])
hp_states_2023$Participation_rate[hp_states_2023$State=="Overall"] <-  paste(round(weighted.mean(as.numeric(gsub("%", "", hp_states_2023[1:10,2])), hp_states_2023[1:10,3]),0), "%", sep="")
hp_states_2023 <- hp_states_2023[,c(1,3,2)] 

# table of high-participation states
#install.packages("ztable")
library(ztable)
colnames(hp_states_2023) <- c("State", "N", "Participation rate")
hp_states_2023_html  <- ztable(hp_states_2023 ,zebra=2,zebra.color="#d4effc;", caption="States with high SAT-participation in 2023", caption.placement="top",caption.position="l", caption.bold=TRUE, align="lrr",include.rownames=FALSE,size=3,colnames.bold=TRUE)
hp_states_2023_html <- hlines(hp_states_2023_html, add = c(10))
capture.output(hp_states_2023_html,file="hp_states_2023.html")

# reorder states so that white means increase from left to right in the graph
hp_sat2023$State <- factor(hp_sat2023$State, levels=c("West Virginia", "Michigan", "Rhode Island", "Idaho", "Florida", "Delaware", "Colorado", "Connecticut", "Illinois",  "National", "District of Columbia"))

# graph of SAT total scores in high-participation states in 2023
library(ggplot2)
ggplot(data=subset(hp_sat2023,State!="National" & !Group %in% c("All Groups", "Pacific Islander", "Two or More Races", "No Response")),aes(x=State, y=Total,group=Group,color=Group))+ 
geom_line(linetype = "dashed") +
geom_point(aes(size=N)) +
theme_classic()+
theme(panel.grid.major = element_line(color = "gray87", linetype = "dotted"), plot.caption = element_text(face="bold",hjust = 0, margin = margin(t = 15), size = 17),text=element_text(size=16), axis.text.x = element_text(angle = 45,margin = margin(t = 27)), axis.title.x = element_text(margin=margin(t=-25)))+
labs(caption="Mean SAT total scores by race/ethnicity in high-participation states in 2023", x="State", y = "Mean score")+
scale_color_discrete(name = "Race/ethnicity", labels = c("Asian", "Black", "Hispanic/Latino", "Native American", "White"),type=c("purple", "azure4", "#009E73", "brown1", "#56B4E9"))+
scale_size_continuous(name = "Sample size", breaks=c(50,500,5000,25000,50000), labels=c("50","500","5,000","25,000","50,000"))+
scale_x_discrete(labels = c("District of Columbia" = "District of\nColumbia"))+
scale_y_continuous(breaks=c(800,900,1000,1100,1200,1300), limits=c(770,1310))+
guides(color=guide_legend(order=1), size=guide_legend(order=2))
ggsave("sat_high_participation_2023.png", height=5.4, width=9.9, dpi=300)

# read data for high-participation states in 2020
hp_sat2020 <- read.csv(text="Year,State,Group,Total,N
2020,West Virginia,Asian,1133,196
2020,West Virginia,Black,855,720
2020,West Virginia,Hispanic/Latino,928,432
2020,West Virginia,Native American,817,246
2020,West Virginia,Pacific Islander,852,14
2020,West Virginia,White,940,14664
2020,Idaho,Asian,1077,295
2020,Idaho,Black,889,223
2020,Idaho,Hispanic/Latino,900,3199
2020,Idaho,Native American,853,420
2020,Idaho,Pacific Islander,914,62
2020,Idaho,White,1026,12792
2020,Michigan,Asian,1168,4407
2020,Michigan,Black,860,13425
2020,Michigan,Hispanic/Latino,925,9962
2020,Michigan,Native American,833,1701
2020,Michigan,Pacific Islander,852,129
2020,Michigan,White,1033,69303
2020,Rhode Island,Asian,1120,501
2020,Rhode Island,Black,879,905
2020,Rhode Island,Hispanic/Latino,890,2378
2020,Rhode Island,Native American,850,114
2020,Rhode Island,Pacific Islander,807,14
2020,Rhode Island,White,1052,6379
2020,Delaware,Asian,1171,477
2020,Delaware,Black,884,2399
2020,Delaware,Hispanic/Latino,903,1836
2020,Delaware,Native American,844,102
2020,Delaware,Pacific Islander,820,16
2020,Delaware,White,1056,4819
2020,Florida,Asian,1145,6767
2020,Florida,Black,890,35955
2020,Florida,Hispanic/Latino,980,63510
2020,Florida,Native American,927,1316
2020,Florida,Pacific Islander,921,330
2020,Florida,White,1060,63548
2020,Colorado,Asian,1122,2306
2020,Colorado,Black,905,2475
2020,Colorado,Hispanic/Latino,916,19366
2020,Colorado,Native American,860,596
2020,Colorado,Pacific Islander,928,133
2020,Colorado,White,1072,31260
2020,Illinois,Asian,1177,7726
2020,Illinois,Black,887,18273
2020,Illinois,Hispanic/Latino,944,36688
2020,Illinois,Native American,839,1301
2020,Illinois,Pacific Islander,890,144
2020,Illinois,White,1073,64670
2020,Connecticut,Asian,1217,2631
2020,Connecticut,Black,897,4704
2020,Connecticut,Hispanic/Latino,917,9580
2020,Connecticut,Native American,928,98
2020,Connecticut,Pacific Islander,929,42
2020,Connecticut,White,1095,23334
2020,District of Columbia,Asian,1210,158
2020,District of Columbia,Black,877,2416
2020,District of Columbia,Hispanic/Latino,969,767
2020,District of Columbia,Native American,782,21
2020,District of Columbia,Pacific Islander,NA,7
2020,District of Columbia,White,1263,990")

# merge 2020 and 2023 data
hp_sat_2020_and_2023 <- merge(subset(hp_sat2020, Group!="Pacific Islander"), subset(hp_sat2023, State!="National" & !Group %in% c("All Groups", "Pacific Islander", "Two or More Races", "No Response"))[,1:5], all=TRUE)

# compute cross-state medians for each group and year, appended as rows with State "MEDIAN"
for (year in c(2020, 2023)) {
  for (group in unique(subset(hp_sat_2020_and_2023, Year==year)$Group)) {
    hp_sat_2020_and_2023[nrow(hp_sat_2020_and_2023) + 1,] <- list(year, "MEDIAN", group, median(subset(hp_sat_2020_and_2023, Year==year & Group==group)$Total), NA)
  }
}

# state abbreviations
state_abb <- function(state_name) {
  if(state_name == "Colorado") return("CO")
  else if(state_name == "Connecticut") return("CT")
  else if(state_name == "Delaware") return("DE")
  else if(state_name == "District of Columbia") return("DC")
  else if(state_name == "Florida") return("FL")
  else if(state_name == "Idaho") return("ID")
  else if(state_name == "Illinois") return("IL")
  else if(state_name == "Michigan") return("MI")
  else if(state_name == "Rhode Island") return("RI")
  else if(state_name == "West Virginia") return("WV")
  else return("MEDIAN")
}
hp_sat_2020_and_2023$State_Abb <- sapply(hp_sat_2020_and_2023$State, function(state) state_abb(state))

# graph of SAT total scores in high-participation states in 2020 and 2023
#install.packages("ggrepel")
library(ggrepel)
high_participation_2020_and_2023 <- ggplot(data=hp_sat_2020_and_2023, aes(Year, Total, group=State, color=State, label=ifelse(State!="MEDIAN", State_Abb, paste(sep="", "Median=", round(Total,0)))))+
geom_point(show.legend=TRUE, aes(size=N))+
geom_line(data=subset(hp_sat_2020_and_2023, State=="MEDIAN"), show.legend=FALSE, linetype="solid")+
geom_text_repel(aes(segment.linetype="dashed"), max.overlaps=20, size=2,show.legend=FALSE)+
theme_classic()+
scale_x_continuous(breaks=c(2020:2023), limits=c(2019.5,2023.5))+
labs(y="Mean score", caption="SAT total score means in high-participation states in 2020 and 2023",  tag = "Dots show means in each state, while lines indicate medians across states.")+
theme(legend.title = element_text(size = 12), legend.position = c(.85, .35), panel.background = element_rect(fill = NA, color = "black"), axis.title.x = element_text(margin = margin(t = 10)), axis.title.y = element_text(margin = margin(r = 10)), axis.title=element_text(size=13), axis.text=element_text(size=12), strip.text = element_text(size = 14), plot.margin = unit(c(0,0.4,0.4,0.4), "in"), plot.tag = element_text(size = 11), plot.tag.position =  c(0.363,-0.011),plot.caption = element_text(hjust = 0, margin = margin(t = 15), size = 16, face="bold"), panel.grid.major = element_line(color = "gray87", linetype = "dotted"))+
facet_wrap(. ~ Group, ncol=3, scales = "free")+
guides(group="none", color="none")+
scale_size(name="Sample size", breaks = c(50,500,5000,10000,25000,50000), labels=c("50","500","5,000","10,000","25,000","50,000"))+
scale_color_manual(values=c("#9E0142", "#D53E4F", "#F46D43", "#FDAE61", "#FEE08B", "#ABDDA4", "#E6F598", "black", "#66C2A5", "#3288BD", "#5E4FA2"))+
scale_y_continuous(breaks=seq(780,1260,30))
ggsave("high_participation_2020_and_2023.png", height=10.8, width=9.9, dpi=300)

# table of gaps between whites and non-whites
sat_gaps_2023 <- data.frame(State=unique(hp_sat_2020_and_2023$State))
sat_gaps_2023[,2:5]<-as.data.frame(with(subset(hp_sat_2020_and_2023, Year==2020 & State!="National" & !Group %in% c("Pacific Islander", "Two or More Races", "No Response")), sapply(unique(Group[Group!="White"]), function(group) round(-Total[Group=="White"]+Total[Group==group],0))))
sat_gaps_2023[,6:9]<-as.data.frame(with(subset(hp_sat_2020_and_2023, Year==2023 & State!="National" & !Group %in% c("Pacific Islander", "Two or More Races", "No Response")), sapply(unique(Group[Group!="White"]), function(group) round(-Total[Group=="White"]+Total[Group==group],0))))
sat_gaps_2023 <- sat_gaps_2023[,c(1,2,6,3,7,4,8,5,9)]

# html table of racial/ethnic gaps
colnames(sat_gaps_2023) <- c("State", "2020", "2023", "2020", "2023", "2020", "2023", "2020", "2023")
cgroup <- c("", "Asian–white", "Black–white", "Hispanic–white", "Native American–white")
n.cgroup <- c(1,2,2,2,2)
sat_gaps_2023_html <- ztable(roundDf(sat_gaps_2023,0),zebra=2,zebra.color="#d4effc;", caption="Differences in SAT total scores between whites and non-whites in high-participation states in 2020 and 2023", caption.placement="top",caption.position="l", caption.bold=TRUE, align="lrrrrrrrr",include.rownames=FALSE,colnames.bold=TRUE)
sat_gaps_2023_html <- addcgroup(sat_gaps_2023_html, cgroup, n.cgroup)
sat_gaps_2023_html <- hlines(sat_gaps_2023_html, add = c(10))
capture.output(sat_gaps_2023_html,file="sat_gaps.html")

 



1 Comment

  1. eah

    In addition to the number of test takers, it is helpful to know what fraction of the respective cohort they represent, e.g. if approx 200k Asians took the test in 2023, how many didn’t?

    In 2020, there were 11 states where 98–100 percent of high school graduates took the SAT. In 2023, the rate was that high only in the District of Columbia. However, 10 of the 11 states had a rate of at least 90 percent in 2023, so they still provide reasonably representative samples of high school graduates.

    For some reason these high participation rates remind me of the old ‘A mind is a terrible thing to waste’ commercials for the United Negro College Fund that used to air on TV several decades ago.

    It seems absurd to me that such a high percentage of HS grads take the SAT, especially if you think e.g. of an ‘average’ white HS grad having an ‘average’ IQ of 100 — there is simply no way someone with an IQ of 100 belongs at a university, yet one has to assume that with a participation rate close to 100%, plenty of them are taking the test.

    Some sort of tracked education system makes a lot more sense — personally, I think it is pointless to attempt to educate the vast majority of black kids in an academic setting past the 8th grade — teach them how to read and write; build their vocabulary, including by making them read and look up words they don’t know; teach them to competently do basic arithmetic, including fractions. as well as something about the binary system and the basics of how a computer works; cover elementary biology, including Linnaean taxonomy and human biology; also something about the natural sciences, including a historical outline (physics, chemistry); show them how to use a library to find material and do a little research — a reasonable curriculum for this could be developed.
    Then instead of an academic HS setting, allow them to choose some sort of career or vocational training.
    Many people who go on to college and land in junk academia would be better off doing something similar — then university spending on that nonsense could be drastically cut too.
    It looks like in the not too distant future 60% of college grads will be female — obviously they are heavily concentrated in junk academic majors.
