Performance metrics

The tables in this section summarize the performance metrics of all the models used to predict success in college-level courses. The tables include the following metrics: AUC ROC, AUC PR, and MSE. The tables highlight the model with the highest AUC ROC, and the model that used only high school GPA to predict success in college-level courses. The latter is highlighted because previous research suggests GPA is a strong predictor of college-level success.

AUC ROC is a metric of area under the curve (AUC) for a receiver operating characteristic (ROC) curve (the ROC curves are explained further in the next section). AUC ROC is a simple but effective metric for determining performance of binary models. The higher the value, the better the model is at making predictions.

In the case of imbalanced data, when one of the categories is very rare, ROC curves can give misleading results and be overly optimistic. In this case, AUC of precision recall curve (AUC PR) may be a better metric. AUC PR is a metric of area under the curve (AUC) for a precision recall (PR) curve. AUC PR is a better metric than AUC ROC for imbalanced data because AUC PR does not account for true negatives, unlike in AUC ROC.

Finally, we can examine mean squared error (MSE), which measures deviations between an observed value and predicted value. The lower the MSE, the better the model performs. The definition of MSE is the average of the squared deviations between the observed and predicted values.

English

color.me <- which(testMetricsE$AUC_ROC == max(testMetricsE$AUC_ROC) | 
                    testMetricsE$Predictors == "GPA")
testMetricsE %>%
  select(Predictors, AUC_ROC, AUC_PR, MSE) %>% 
  kable(digits = 3) %>% 
  kable_styling(full_width = F) %>% 
  row_spec(color.me, bold = T, color = "white", background = mdrc_colors["blue","c"])
Predictors AUC_ROC AUC_PR MSE
ACT English test 0.565 0.924 0.084
ACT English test + GPA 0.675 0.952 0.082
All ACT tests 0.582 0.927 0.083
All ACT tests + Class rank + GPA 0.678 0.951 0.082
All ACT tests + GPA 0.675 0.951 0.082
All predictors 0.706 0.952 0.079
Attendance 0.575 0.929 0.083
Class rank 0.546 0.922 0.084
Class rank + GPA 0.681 0.953 0.082
GPA 0.681 0.954 0.082
Grade 10 tests 0.550 0.929 0.084
Grade 9 tests 0.538 0.932 0.084
High School + GPA 0.701 0.951 0.080
Passed Algebra 1 0.493 0.948 0.084
Passed Algebra 2 0.531 0.930 0.084
Passed Geometry 0.519 0.925 0.084
Passed math courses 0.535 0.923 0.084
Passed math courses + GPA 0.675 0.952 0.082

Math

color.me <- which(testMetricsM$AUC_ROC == max(testMetricsM$AUC_ROC) | 
                    testMetricsM$Predictors == "GPA")
testMetricsM %>%
  select(Predictors, AUC_ROC, AUC_PR, MSE) %>% 
  kable(digits = 3) %>% 
  kable_styling(full_width = F) %>% 
  row_spec(color.me, bold = T, color = "white", background = mdrc_colors["blue","c"])
Predictors AUC_ROC AUC_PR MSE
ACT Math test 0.607 0.897 0.120
ACT Math test + GPA 0.697 0.926 0.116
All ACT tests 0.611 0.900 0.120
All ACT tests + Class rank + GPA 0.695 0.924 0.115
All ACT tests + GPA 0.695 0.925 0.116
All predictors 0.708 0.920 0.112
Attendance 0.567 0.886 0.122
Class rank 0.547 0.883 0.122
Class rank + GPA 0.700 0.927 0.116
GPA 0.706 0.932 0.117
Grade 10 tests 0.561 0.893 0.121
Grade 9 tests 0.526 0.887 0.122
High School + GPA 0.695 0.918 0.115
Passed Algebra 1 0.499 0.861 0.123
Passed Algebra 2 0.497 0.876 0.123
Passed Geometry 0.501 0.871 0.123
Passed math courses 0.498 0.860 0.123
Passed math courses + GPA 0.707 0.930 0.117

ROC Curves

The graphs in this section show the ROC curves for the better performing models (the models with an AUC ROC great than 0.65).

An ROC curve plots false positive rate on the x-axis against the true positive rate on the y-axis, as the threshold changes. The first point in the bottom left corner of the ROC curve corresponds to the TPR and FPR at a threshold of 1. Next, imagine we change the threshold to 0.95. Then all students with a predicted value above 0.95 are predicted to pass the college-level course, and all students with a predicted value below 0.95 are predicted to fail. At this threshold, most of the students we classify as passing will be observed positives, so we expect a high true positive rate and a low false positive rate. As we lower the threshold, we will classify more and more student as passing, which increases the false positive rate and decreases the true positive rate. A good model has a high true positive rate and a low false positive rate, so points to the left and up are markers of a good model.

We usually compare a model’s ROC curve to a line that goes straight up the diagonal. This diagonal line corresponds to the performance of a random classifier (e.g., a model that predicts as well as a coin flip). Ideally, a trained model informed by the patterns in the data performs much better than a random classifier.

English

rocResultsTestE %>% 
  filter(highROC == 1 | learnerName == "glm_predset_act_engl_1") %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = predSet_info)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +    
  coord_equal() +
  theme_bw() +
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curve (Passing College-level English)') +
  guides(color = "none")

English - Focused

rocResultsTestE %>% 
  filter(learnerName %in% c("glm_predset_gpa_1", "glm_predset_act_engl_1")) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = predSet_info, color = predSet_info)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
  scale_color_manual(values = unname(unlist(mdrc_colors["blue",c(2,3)])))+
  coord_equal() +
  theme_bw() +
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       color    = "Predictors",
       title = 'ROC Curve (Passing College-level English)')

Math

rocResultsTestM %>% 
  filter(highROC == 1 | learnerName == "glm_predset_act_math_1") %>%  
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_path(aes(linetype = predSet_info)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
  coord_equal() +
  theme_bw() +
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curve (Passing College-level Math)')

Math - Focused

rocResultsTestM %>% 
  filter(learnerName %in% c("glm_predset_gpa_1", "glm_predset_act_math_1")) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = predSet_info, color = predSet_info)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
  scale_color_manual(values = unname(unlist(mdrc_colors["blue",c(2,3)])))+  
  coord_equal() +
  theme_bw() +
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",,
       color    = "Predictors",
       title = 'ROC Curve (Passing College-level Math)')

Examining GPA Thresholds

If GPA is used to place students into college-level courses, it is be useful to explore which GPA cut-offs will place more students accurately.

Consider the following 2 x 2 table. First, (in the rows) it classifies students by their placement recommendation for whether they should be in a college-level course or not. This recommendation may be based a single measure, like high school GPA, or on a combination of multiple measures. Then, (in columns) students are classified by their observed success in the college-level course. For a given threshold, students with GPAs above it are recommended to be placed in the college-level course, and students with GPAs below it are not recommended for the college-level course.

Table 1: 2x2 Table of Placement Recommendations vs. Success in College-Level Course

Placement Success in college-level
Student succeeded Student did not succeed
College-level a. Correct (true positive) b. Incorrect (false positive)
Non-college-level c. Incorrect (false negative) d. Correct (true negative)

The 2 x 2 table assumes that a placement is correct if a student was placed in the college-level course of focus and passed the course. The 2x2 table also assumes that a placement is correct if a student was not placed into the college-level course and would not have passed the course had they taken it. In the 2 x 2 table, these correct placements are represented in cells (a) and (d). Otherwise, there are two types of incorrect placements: A student who is placed in the college-level course who does not pass is considered overplaced - cell (b). A student who is not placed in the college-level course but would have passed if they had taken the college level course is considered underplaced - cell (c).

The challenge in filling in this table is that we often cannot observe students who fall into cells (c) and (d). That is, for students who do not not take a college-level course, we cannot know their college-level course outcome. However, we address this gap in our data by restricting our analysis only to students who took the college-level course of focus. By relying only on these students, we were able to fill in the columns of the 2x2 table using their observed outcomes. And we are able to fill in the rows of the 2x2 table by specifying different GPA cut scores (or thresholds).

For each threshold, we can fill in a new 2x2 table, and from the 2x2 table, we can compute various performance metrics:

  • The true positive rate: Among students who would succeed in a college-level course, this is the proportion correctly placed (a/(a+c)).
  • The false positive rate: Among students who would not succeed in a college-level course, this is the proportion incorrectly placed (b/(b+d)).
  • The accuracy: the proportion of all students who were correctly placed ((a+d)/(a+b+c+d)).

Because our approach only relies on students who enrolled in college-level courses, the generalizability of the findings is limited. The findings do not include information about the extent to which students who did not enroll in college-level would have succeeded if they had taken the college-level course.

English

GPA cut score of 2.5

df <- matrix_table(data = data_eng_cl, 
                   thresh = 2.5,
                   outcome = "c10btvcretengc")
df %>% 
  kable() %>% 
  kable_styling(full_width = F)
Predicted to Pass Passed Did not Pass
Yes 17003 1605
No 1544 356
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
                    df[1,3] /(df[1,3] + df[2,3]),
                   (df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")

stats %>% 
  kable() %>% 
  kable_styling(full_width = F)
True Positive Rate False Positive Rate Accuracy
0.916752 0.81846 0.8464502

GPA cut score of 3.0

df <- matrix_table(data = data_eng_cl, 
                   thresh = 3.0,
                   outcome = "c10btvcretengc")
df %>% 
  kable() %>% 
  kable_styling(full_width = F)
Predicted to Pass Passed Did not Pass
Yes 14458 1061
No 4089 900
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
                    df[1,3] /(df[1,3] + df[2,3]),
                   (df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")

stats %>% 
  kable() %>% 
  kable_styling(full_width = F)
True Positive Rate False Positive Rate Accuracy
0.7795331 0.5410505 0.7488785

Put everyone in College-level

df <- matrix_table(data = data_eng_cl, 
                   thresh = 0,
                   outcome = "c10btvcretengc")
df %>% 
  kable() %>% 
  kable_styling(full_width = F)
Predicted to Pass Passed Did not Pass
Yes 18547 1961
No 0 0
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
                    df[1,3] /(df[1,3] + df[2,3]),
                   (df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")

stats %>% 
  kable() %>% 
  kable_styling(full_width = F)
True Positive Rate False Positive Rate Accuracy
1 1 0.9043788

Math

GPA cut score of 2.5

df <- matrix_table(data = data_math_cl, 
                   thresh = 2.5,
                   outcome = "c10btvcretmathc")
df %>% 
  kable() %>% 
  kable_styling(full_width = F)
Predicted to Pass Passed Did not Pass
Yes 12879 2250
No 909 357
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
                    df[1,3] /(df[1,3] + df[2,3]),
                   (df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")

stats %>% 
  kable() %>% 
  kable_styling(full_width = F)
True Positive Rate False Positive Rate Accuracy
0.9340731 0.863061 0.8073193

GPA cut score of 3.0

df <- matrix_table(data = data_math_cl, 
                   thresh = 3.0,
                   outcome = "c10btvcretmathc")
df %>% 
  kable() %>% 
  kable_styling(full_width = F)
Predicted to Pass Passed Did not Pass
Yes 11452 1640
No 2336 967
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
                    df[1,3] /(df[1,3] + df[2,3]),
                   (df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")

stats %>% 
  kable() %>% 
  kable_styling(full_width = F)
True Positive Rate False Positive Rate Accuracy
0.8305773 0.6290756 0.757487

Put everyone in College-level

df <- matrix_table(data = data_math_cl, 
                   thresh = 0,
                   outcome = "c10btvcretmathc")
df %>% 
  kable() %>% 
  kable_styling(full_width = F)
Predicted to Pass Passed Did not Pass
Yes 13788 2607
No 0 0
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
                    df[1,3] /(df[1,3] + df[2,3]),
                   (df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")

stats %>% 
  kable() %>% 
  kable_styling(full_width = F)
True Positive Rate False Positive Rate Accuracy
1 1 0.8409881

Subgroups

Age

bind_rows(rocResultsTestEold %>% mutate(Age = "Age 21+" , Subject = "English"), 
          rocResultsTestEyng %>% mutate(Age = "Age 0-20", Subject = "English"),
          rocResultsTestMold %>% mutate(Age = "Age 21+" , Subject = "Math"), 
          rocResultsTestMyng %>% mutate(Age = "Age 0-20", Subject = "Math")) %>% 
  mutate(Predictors = case_when(Predictors == "ACT English test"       ~ "Subject-specific ACT test",
                                Predictors == "ACT Math test"          ~ "Subject-specific ACT test",
                                Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
                                Predictors == "ACT Math test + GPA"    ~ "Subject-specific ACT test + GPA",
                                TRUE ~ Predictors)) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = Predictors)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +    
  coord_equal() +
  facet_grid(cols = vars(Age), labeller = "label_value",
             rows = vars(Subject)) +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 0))+
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curves by Age') +
  guides(color = "none")

Gender

bind_rows(rocResultsTestEmen    %>% mutate(Gen = "Men"  , Subject = "English"), 
          rocResultsTestEwomen  %>% mutate(Gen = "Women", Subject = "English"),
          rocResultsTestMmen    %>% mutate(Gen = "Men"  , Subject = "Math"), 
          rocResultsTestMwomen  %>% mutate(Gen = "Women", Subject = "Math")) %>% 
  mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
                                Predictors == "ACT Math test" ~ "Subject-specific ACT test",
                                Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
                                Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
                                TRUE ~ Predictors)) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = Predictors)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +    
  coord_equal() +
  facet_grid(cols = vars(Gen), labeller = "label_value",
             rows = vars(Subject)) +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 0))+
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curves by Gender') +
  guides(color = "none")

Pell Eligibility

bind_rows(rocResultsTestEpelln  %>% mutate(Pell = "Not Eligible"  , Subject = "English"), 
          rocResultsTestEpelly  %>% mutate(Pell = "Eligible"      , Subject = "English"),
          rocResultsTestMpelln  %>% mutate(Pell = "Not Eligible"  , Subject = "Math"), 
          rocResultsTestMpelly  %>% mutate(Pell = "Eligible"      , Subject = "Math")) %>% 
  mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
                                Predictors == "ACT Math test" ~ "Subject-specific ACT test",
                                Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
                                Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
                                TRUE ~ Predictors)) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = Predictors)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +    
  coord_equal() +
  facet_grid(cols = vars(Pell), labeller = "label_value",
             rows = vars(Subject)) +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 0))+
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curves by Pell Eligibility') +
  guides(color = "none")

Race/Ethnicity

bind_rows(rocResultsTestEcolor  %>% mutate(Race = "Students of Color", Subject = "English"), 
          rocResultsTestEwhite  %>% mutate(Race = "White Students"   , Subject = "English"),
          rocResultsTestMcolor  %>% mutate(Race = "Students of Color", Subject = "Math"), 
          rocResultsTestMwhite  %>% mutate(Race = "White Students"   , Subject = "Math")) %>% 
  mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
                                Predictors == "ACT Math test" ~ "Subject-specific ACT test",
                                Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
                                Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
                                TRUE ~ Predictors)) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = Predictors)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +    
  coord_equal() +
  facet_grid(cols = vars(Race), labeller = "label_value",
             rows = vars(Subject)) +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 0))+
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curves by Race / Ethnicity') +
  guides(color = "none")

bind_rows(rocResultsTestEwhite  %>% mutate(Race = "White"   , Subject = "English"),
          rocResultsTestEblack  %>% mutate(Race = "Black"   , Subject = "English"),
          rocResultsTestEhisp   %>% mutate(Race = "Hispanic", Subject = "English"),
          rocResultsTestEasianother %>% mutate(Race = "Other"   , Subject = "English"),
          ) %>% 
  mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
                                Predictors == "ACT Math test" ~ "Subject-specific ACT test",
                                Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
                                Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
                                TRUE ~ Predictors)) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = Predictors)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +    
  coord_equal() +
  facet_wrap("Race") +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 0),
        axis.text.x = element_text(angle = 45, hjust = 0.8, vjust = 0.9)) +
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curves by Race / Ethnicity (English)') +
  guides(color = "none")

bind_rows(rocResultsTestMwhite  %>% mutate(Race = "White"   , Subject = "Math"),
          rocResultsTestMblack  %>% mutate(Race = "Black"   , Subject = "Math"),
          rocResultsTestMhisp   %>% mutate(Race = "Hispanic", Subject = "Math"),
          rocResultsTestMasianother %>% mutate(Race = "Other"   , Subject = "Math"),
          ) %>% 
  mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
                                Predictors == "ACT Math test" ~ "Subject-specific ACT test",
                                Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
                                Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
                                TRUE ~ Predictors)) %>% 
  ggplot(aes(x = 1 - specificity, y = sensitivity)) +
  geom_line(aes(linetype = Predictors)) +
  geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +    
  coord_equal() +
  facet_wrap("Race") +
  theme_bw() +
  theme(strip.text.y = element_text(angle = 0),
        axis.text.x = element_text(angle = 45, hjust = 0.8, vjust = 0.9)) +
  labs(x = "False Positive Rate",
       y = "True Positive Rate",
       linetype = "Predictors",
       title = 'ROC Curves by Race / Ethnicity (Math)') +
  guides(color = "none")

AUC ROC by Subgroup

The best performing model for the full sample was the model that used all available information to make predictions about college-level success. The tables below show the AUC ROC values from this best performing model but using each subgroup instead of the full sample.

bind_rows(rocResultsTestEwhite      %>% mutate(Subgroup = "White"         , Subject = "English"),
          rocResultsTestMwhite      %>% mutate(Subgroup = "White"         , Subject = "Math"),
          rocResultsTestEblack      %>% mutate(Subgroup = "Black"         , Subject = "English"),
          rocResultsTestMblack      %>% mutate(Subgroup = "Black"         , Subject = "Math"),
          rocResultsTestEhisp       %>% mutate(Subgroup = "Hispanic"      , Subject = "English"),
          rocResultsTestMhisp       %>% mutate(Subgroup = "Hispanic"      , Subject = "Math"),
          rocResultsTestEasianother %>% mutate(Subgroup = "Other"         , Subject = "English"),
          rocResultsTestMasianother %>% mutate(Subgroup = "Other"         , Subject = "Math"),
          rocResultsTestEpelln      %>% mutate(Subgroup = "Not Eligible"  , Subject = "English"), 
          rocResultsTestEpelly      %>% mutate(Subgroup = "Eligible"      , Subject = "English"),
          rocResultsTestMpelln      %>% mutate(Subgroup = "Not Eligible"  , Subject = "Math"), 
          rocResultsTestMpelly      %>% mutate(Subgroup = "Eligible"      , Subject = "Math"),
          rocResultsTestEmen        %>% mutate(Subgroup = "Men"           , Subject = "English"), 
          rocResultsTestEwomen      %>% mutate(Subgroup = "Women"         , Subject = "English"),
          rocResultsTestMmen        %>% mutate(Subgroup = "Men"           , Subject = "Math"), 
          rocResultsTestMwomen      %>% mutate(Subgroup = "Women"         , Subject = "Math"),
          rocResultsTestEold        %>% mutate(Subgroup = "Age 21+"       , Subject = "English"), 
          rocResultsTestEyng        %>% mutate(Subgroup = "Age 0-20"      , Subject = "English"),
          rocResultsTestMold        %>% mutate(Subgroup = "Age 21+"       , Subject = "Math"), 
          rocResultsTestMyng        %>% mutate(Subgroup = "Age 0-20"      , Subject = "Math")) %>% 
  filter(Predictors == "All predictors") %>%
  select(Subject, Subgroup, AUC_ROC) %>% 
  distinct() %>% 
  mutate(AUC_ROC = round(AUC_ROC,3)) %>% 
  pivot_wider(id_cols = Subgroup, 
              names_from = Subject, 
              values_from = AUC_ROC) %>% 
  kable() %>% 
  kable_styling(full_width = F) %>% 
  pack_rows("Race/Ethnicity",1,4) %>% 
  pack_rows("Pell Eligibility",5,6) %>% 
  pack_rows("Gender",7,8) %>% 
  pack_rows("Age",9,10)
Subgroup English Math
Race/Ethnicity
White 0.622 0.667
Black 0.663 0.638
Hispanic 0.544 0.704
Other 0.549 0.457
Pell Eligibility
Not Eligible 0.595 0.654
Eligible 0.654 0.627
Gender
Men 0.636 0.662
Women 0.666 0.642
Age
Age 21+ 0.556 0.624
Age 0-20 0.738 0.701