ROC Curves
The graphs in this section show the ROC curves for the better
performing models (the models with an AUC ROC great than 0.65).
An ROC curve plots false positive rate on the x-axis against the true
positive rate on the y-axis, as the threshold changes. The
first point in the bottom left corner of the ROC curve corresponds to
the TPR and FPR at a threshold of 1. Next, imagine we change the
threshold to 0.95. Then all students with a predicted value above 0.95
are predicted to pass the college-level course, and all students with a
predicted value below 0.95 are predicted to fail. At this threshold,
most of the students we classify as passing will be observed
positives, so we expect a high true positive rate and a low false
positive rate. As we lower the threshold, we will classify more and more
student as passing, which increases the false positive rate and
decreases the true positive rate. A good model has a high true
positive rate and a low false positive rate, so points to
the left and up are markers of a good model.
We usually compare a model’s ROC curve to a line that goes straight
up the diagonal. This diagonal line corresponds to the performance of a
random classifier (e.g., a model that predicts as well as a coin flip).
Ideally, a trained model informed by the patterns in the data performs
much better than a random classifier.
English
rocResultsTestE %>%
filter(highROC == 1 | learnerName == "glm_predset_act_engl_1") %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = predSet_info)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
theme_bw() +
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curve (Passing College-level English)') +
guides(color = "none")
English - Focused
rocResultsTestE %>%
filter(learnerName %in% c("glm_predset_gpa_1", "glm_predset_act_engl_1")) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = predSet_info, color = predSet_info)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
scale_color_manual(values = unname(unlist(mdrc_colors["blue",c(2,3)])))+
coord_equal() +
theme_bw() +
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
color = "Predictors",
title = 'ROC Curve (Passing College-level English)')
Math
rocResultsTestM %>%
filter(highROC == 1 | learnerName == "glm_predset_act_math_1") %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_path(aes(linetype = predSet_info)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
theme_bw() +
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curve (Passing College-level Math)')
Math - Focused
rocResultsTestM %>%
filter(learnerName %in% c("glm_predset_gpa_1", "glm_predset_act_math_1")) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = predSet_info, color = predSet_info)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
scale_color_manual(values = unname(unlist(mdrc_colors["blue",c(2,3)])))+
coord_equal() +
theme_bw() +
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",,
color = "Predictors",
title = 'ROC Curve (Passing College-level Math)')
Examining GPA Thresholds
If GPA is used to place students into college-level courses, it is be
useful to explore which GPA cut-offs will place more students
accurately.
Consider the following 2 x 2 table. First, (in the rows) it
classifies students by their placement recommendation for whether they
should be in a college-level course or not. This recommendation may be
based a single measure, like high school GPA, or on a combination of
multiple measures. Then, (in columns) students are classified by their
observed success in the college-level course. For a given threshold,
students with GPAs above it are recommended to be placed in the
college-level course, and students with GPAs below it are not
recommended for the college-level course.
Table 1: 2x2 Table of Placement Recommendations vs. Success in
College-Level Course
|
Student succeeded |
Student did not succeed |
College-level |
a. Correct (true positive) |
b. Incorrect (false positive) |
Non-college-level |
c. Incorrect (false negative) |
d. Correct (true negative) |
The 2 x 2 table assumes that a placement is correct if a student was
placed in the college-level course of focus and passed the course. The
2x2 table also assumes that a placement is correct if a student was not
placed into the college-level course and would not have passed the
course had they taken it. In the 2 x 2 table, these correct placements
are represented in cells (a) and (d). Otherwise, there are two types of
incorrect placements: A student who is placed in the college-level
course who does not pass is considered overplaced - cell (b). A student
who is not placed in the college-level course but would have passed if
they had taken the college level course is considered underplaced - cell
(c).
The challenge in filling in this table is that we often cannot
observe students who fall into cells (c) and (d). That is, for students
who do not not take a college-level course, we cannot know their
college-level course outcome. However, we address this gap in our data
by restricting our analysis only to students who took the college-level
course of focus. By relying only on these students, we were able to fill
in the columns of the 2x2 table using their observed outcomes. And we
are able to fill in the rows of the 2x2 table by specifying different
GPA cut scores (or thresholds).
For each threshold, we can fill in a new 2x2 table, and from the 2x2
table, we can compute various performance metrics:
- The true positive rate: Among students who would succeed in a
college-level course, this is the proportion correctly placed
(a/(a+c)).
- The false positive rate: Among students who would not succeed in a
college-level course, this is the proportion incorrectly placed
(b/(b+d)).
- The accuracy: the proportion of all students who were correctly
placed ((a+d)/(a+b+c+d)).
Because our approach only relies on students who enrolled in
college-level courses, the generalizability of the findings is limited.
The findings do not include information about the extent to which
students who did not enroll in college-level would have succeeded if
they had taken the college-level course.
English
GPA cut score of 2.5
df <- matrix_table(data = data_eng_cl,
thresh = 2.5,
outcome = "c10btvcretengc")
df %>%
kable() %>%
kable_styling(full_width = F)
Predicted to Pass
|
Passed
|
Did not Pass
|
Yes
|
17003
|
1605
|
No
|
1544
|
356
|
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
df[1,3] /(df[1,3] + df[2,3]),
(df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")
stats %>%
kable() %>%
kable_styling(full_width = F)
True Positive Rate
|
False Positive Rate
|
Accuracy
|
0.916752
|
0.81846
|
0.8464502
|
GPA cut score of 3.0
df <- matrix_table(data = data_eng_cl,
thresh = 3.0,
outcome = "c10btvcretengc")
df %>%
kable() %>%
kable_styling(full_width = F)
Predicted to Pass
|
Passed
|
Did not Pass
|
Yes
|
14458
|
1061
|
No
|
4089
|
900
|
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
df[1,3] /(df[1,3] + df[2,3]),
(df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")
stats %>%
kable() %>%
kable_styling(full_width = F)
True Positive Rate
|
False Positive Rate
|
Accuracy
|
0.7795331
|
0.5410505
|
0.7488785
|
Put everyone in College-level
df <- matrix_table(data = data_eng_cl,
thresh = 0,
outcome = "c10btvcretengc")
df %>%
kable() %>%
kable_styling(full_width = F)
Predicted to Pass
|
Passed
|
Did not Pass
|
Yes
|
18547
|
1961
|
No
|
0
|
0
|
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
df[1,3] /(df[1,3] + df[2,3]),
(df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")
stats %>%
kable() %>%
kable_styling(full_width = F)
True Positive Rate
|
False Positive Rate
|
Accuracy
|
1
|
1
|
0.9043788
|
Math
GPA cut score of 2.5
df <- matrix_table(data = data_math_cl,
thresh = 2.5,
outcome = "c10btvcretmathc")
df %>%
kable() %>%
kable_styling(full_width = F)
Predicted to Pass
|
Passed
|
Did not Pass
|
Yes
|
12879
|
2250
|
No
|
909
|
357
|
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
df[1,3] /(df[1,3] + df[2,3]),
(df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")
stats %>%
kable() %>%
kable_styling(full_width = F)
True Positive Rate
|
False Positive Rate
|
Accuracy
|
0.9340731
|
0.863061
|
0.8073193
|
GPA cut score of 3.0
df <- matrix_table(data = data_math_cl,
thresh = 3.0,
outcome = "c10btvcretmathc")
df %>%
kable() %>%
kable_styling(full_width = F)
Predicted to Pass
|
Passed
|
Did not Pass
|
Yes
|
11452
|
1640
|
No
|
2336
|
967
|
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
df[1,3] /(df[1,3] + df[2,3]),
(df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")
stats %>%
kable() %>%
kable_styling(full_width = F)
True Positive Rate
|
False Positive Rate
|
Accuracy
|
0.8305773
|
0.6290756
|
0.757487
|
Put everyone in College-level
df <- matrix_table(data = data_math_cl,
thresh = 0,
outcome = "c10btvcretmathc")
df %>%
kable() %>%
kable_styling(full_width = F)
Predicted to Pass
|
Passed
|
Did not Pass
|
Yes
|
13788
|
2607
|
No
|
0
|
0
|
stats <- data.frame(df[1,2] /(df[1,2] + df[2,2]),
df[1,3] /(df[1,3] + df[2,3]),
(df[1,2] + df[2,3])/(sum(df[1:2,2:3])))
names(stats) <- c("True Positive Rate", "False Positive Rate", "Accuracy")
stats %>%
kable() %>%
kable_styling(full_width = F)
True Positive Rate
|
False Positive Rate
|
Accuracy
|
1
|
1
|
0.8409881
|
Subgroups
Age
bind_rows(rocResultsTestEold %>% mutate(Age = "Age 21+" , Subject = "English"),
rocResultsTestEyng %>% mutate(Age = "Age 0-20", Subject = "English"),
rocResultsTestMold %>% mutate(Age = "Age 21+" , Subject = "Math"),
rocResultsTestMyng %>% mutate(Age = "Age 0-20", Subject = "Math")) %>%
mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
Predictors == "ACT Math test" ~ "Subject-specific ACT test",
Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
TRUE ~ Predictors)) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = Predictors)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
facet_grid(cols = vars(Age), labeller = "label_value",
rows = vars(Subject)) +
theme_bw() +
theme(strip.text.y = element_text(angle = 0))+
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curves by Age') +
guides(color = "none")
Gender
bind_rows(rocResultsTestEmen %>% mutate(Gen = "Men" , Subject = "English"),
rocResultsTestEwomen %>% mutate(Gen = "Women", Subject = "English"),
rocResultsTestMmen %>% mutate(Gen = "Men" , Subject = "Math"),
rocResultsTestMwomen %>% mutate(Gen = "Women", Subject = "Math")) %>%
mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
Predictors == "ACT Math test" ~ "Subject-specific ACT test",
Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
TRUE ~ Predictors)) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = Predictors)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
facet_grid(cols = vars(Gen), labeller = "label_value",
rows = vars(Subject)) +
theme_bw() +
theme(strip.text.y = element_text(angle = 0))+
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curves by Gender') +
guides(color = "none")
Pell Eligibility
bind_rows(rocResultsTestEpelln %>% mutate(Pell = "Not Eligible" , Subject = "English"),
rocResultsTestEpelly %>% mutate(Pell = "Eligible" , Subject = "English"),
rocResultsTestMpelln %>% mutate(Pell = "Not Eligible" , Subject = "Math"),
rocResultsTestMpelly %>% mutate(Pell = "Eligible" , Subject = "Math")) %>%
mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
Predictors == "ACT Math test" ~ "Subject-specific ACT test",
Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
TRUE ~ Predictors)) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = Predictors)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
facet_grid(cols = vars(Pell), labeller = "label_value",
rows = vars(Subject)) +
theme_bw() +
theme(strip.text.y = element_text(angle = 0))+
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curves by Pell Eligibility') +
guides(color = "none")
Race/Ethnicity
bind_rows(rocResultsTestEcolor %>% mutate(Race = "Students of Color", Subject = "English"),
rocResultsTestEwhite %>% mutate(Race = "White Students" , Subject = "English"),
rocResultsTestMcolor %>% mutate(Race = "Students of Color", Subject = "Math"),
rocResultsTestMwhite %>% mutate(Race = "White Students" , Subject = "Math")) %>%
mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
Predictors == "ACT Math test" ~ "Subject-specific ACT test",
Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
TRUE ~ Predictors)) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = Predictors)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
facet_grid(cols = vars(Race), labeller = "label_value",
rows = vars(Subject)) +
theme_bw() +
theme(strip.text.y = element_text(angle = 0))+
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curves by Race / Ethnicity') +
guides(color = "none")
bind_rows(rocResultsTestEwhite %>% mutate(Race = "White" , Subject = "English"),
rocResultsTestEblack %>% mutate(Race = "Black" , Subject = "English"),
rocResultsTestEhisp %>% mutate(Race = "Hispanic", Subject = "English"),
rocResultsTestEasianother %>% mutate(Race = "Other" , Subject = "English"),
) %>%
mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
Predictors == "ACT Math test" ~ "Subject-specific ACT test",
Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
TRUE ~ Predictors)) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = Predictors)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
facet_wrap("Race") +
theme_bw() +
theme(strip.text.y = element_text(angle = 0),
axis.text.x = element_text(angle = 45, hjust = 0.8, vjust = 0.9)) +
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curves by Race / Ethnicity (English)') +
guides(color = "none")
bind_rows(rocResultsTestMwhite %>% mutate(Race = "White" , Subject = "Math"),
rocResultsTestMblack %>% mutate(Race = "Black" , Subject = "Math"),
rocResultsTestMhisp %>% mutate(Race = "Hispanic", Subject = "Math"),
rocResultsTestMasianother %>% mutate(Race = "Other" , Subject = "Math"),
) %>%
mutate(Predictors = case_when(Predictors == "ACT English test" ~ "Subject-specific ACT test",
Predictors == "ACT Math test" ~ "Subject-specific ACT test",
Predictors == "ACT English test + GPA" ~ "Subject-specific ACT test + GPA",
Predictors == "ACT Math test + GPA" ~ "Subject-specific ACT test + GPA",
TRUE ~ Predictors)) %>%
ggplot(aes(x = 1 - specificity, y = sensitivity)) +
geom_line(aes(linetype = Predictors)) +
geom_abline(color = unname(unlist(mdrc_colors["red","c"]))) +
coord_equal() +
facet_wrap("Race") +
theme_bw() +
theme(strip.text.y = element_text(angle = 0),
axis.text.x = element_text(angle = 45, hjust = 0.8, vjust = 0.9)) +
labs(x = "False Positive Rate",
y = "True Positive Rate",
linetype = "Predictors",
title = 'ROC Curves by Race / Ethnicity (Math)') +
guides(color = "none")
AUC ROC by Subgroup
The best performing model for the full sample was the model that used
all available information to make predictions about college-level
success. The tables below show the AUC ROC values from this best
performing model but using each subgroup instead of the full sample.
bind_rows(rocResultsTestEwhite %>% mutate(Subgroup = "White" , Subject = "English"),
rocResultsTestMwhite %>% mutate(Subgroup = "White" , Subject = "Math"),
rocResultsTestEblack %>% mutate(Subgroup = "Black" , Subject = "English"),
rocResultsTestMblack %>% mutate(Subgroup = "Black" , Subject = "Math"),
rocResultsTestEhisp %>% mutate(Subgroup = "Hispanic" , Subject = "English"),
rocResultsTestMhisp %>% mutate(Subgroup = "Hispanic" , Subject = "Math"),
rocResultsTestEasianother %>% mutate(Subgroup = "Other" , Subject = "English"),
rocResultsTestMasianother %>% mutate(Subgroup = "Other" , Subject = "Math"),
rocResultsTestEpelln %>% mutate(Subgroup = "Not Eligible" , Subject = "English"),
rocResultsTestEpelly %>% mutate(Subgroup = "Eligible" , Subject = "English"),
rocResultsTestMpelln %>% mutate(Subgroup = "Not Eligible" , Subject = "Math"),
rocResultsTestMpelly %>% mutate(Subgroup = "Eligible" , Subject = "Math"),
rocResultsTestEmen %>% mutate(Subgroup = "Men" , Subject = "English"),
rocResultsTestEwomen %>% mutate(Subgroup = "Women" , Subject = "English"),
rocResultsTestMmen %>% mutate(Subgroup = "Men" , Subject = "Math"),
rocResultsTestMwomen %>% mutate(Subgroup = "Women" , Subject = "Math"),
rocResultsTestEold %>% mutate(Subgroup = "Age 21+" , Subject = "English"),
rocResultsTestEyng %>% mutate(Subgroup = "Age 0-20" , Subject = "English"),
rocResultsTestMold %>% mutate(Subgroup = "Age 21+" , Subject = "Math"),
rocResultsTestMyng %>% mutate(Subgroup = "Age 0-20" , Subject = "Math")) %>%
filter(Predictors == "All predictors") %>%
select(Subject, Subgroup, AUC_ROC) %>%
distinct() %>%
mutate(AUC_ROC = round(AUC_ROC,3)) %>%
pivot_wider(id_cols = Subgroup,
names_from = Subject,
values_from = AUC_ROC) %>%
kable() %>%
kable_styling(full_width = F) %>%
pack_rows("Race/Ethnicity",1,4) %>%
pack_rows("Pell Eligibility",5,6) %>%
pack_rows("Gender",7,8) %>%
pack_rows("Age",9,10)
Subgroup
|
English
|
Math
|
Race/Ethnicity
|
White
|
0.622
|
0.667
|
Black
|
0.663
|
0.638
|
Hispanic
|
0.544
|
0.704
|
Other
|
0.549
|
0.457
|
Pell Eligibility
|
Not Eligible
|
0.595
|
0.654
|
Eligible
|
0.654
|
0.627
|
Gender
|
Men
|
0.636
|
0.662
|
Women
|
0.666
|
0.642
|
Age
|
Age 21+
|
0.556
|
0.624
|
Age 0-20
|
0.738
|
0.701
|