Authors: Shima Salehi, Eric Burkholder, G. Peter Lepage, Steven Pollock, Carl Wieman
First author’s institution: Stanford University
Journal: Physical Review Physics Education Research 15, 020114 (2019) [Open Access].
Students are not a homogeneous group; they come from many different backgrounds and bring varied academic preparation and experiences. Yet many studies have ignored these differences when evaluating an educational technique or intervention, reporting only how it affects the average student (such as this one and this one). To make our classes more inclusive and equitable, studies should instead look at how interventions affect subpopulations of students, an approach that is becoming more standard in PER.
One area where subpopulation analysis has not been thoroughly applied is demographic performance gaps. For example, prior work has established gender differences on physics concept inventories and in course performance, but has not accounted for factors correlated with gender that could instead explain the gap, such as prior academic preparation. Failing to recognize these possible intervening factors can lead to bias or negative expectations toward a specific demographic group. The goal of today’s study is to see whether various performance gaps can instead be attributed to differences in prior preparation between demographic groups. As the title of the paper suggests, the performance gaps can largely be attributed to gaps in preparation.
To reach their conclusion, the authors collected data from physics 1 courses (which typically cover classical mechanics) at three large, research-intensive universities: a highly selective east coast university (HSEC), a highly selective west coast university (HSWC), and a public university in the middle of the United States (PM). The data included each student’s gender, whether they identify as an underrepresented racial or ethnic minority (URM), whether they were a first-generation student, their math ACT or SAT score, their pre-course concept inventory score (measured by the Force and Motion Conceptual Evaluation or the Force Concept Inventory), and their final exam score. The authors then used three multivariable linear regression models to predict the students’ final exam scores: the first model used only the students’ demographic information, the second added their math ACT/SAT score, and the third added their concept inventory (CI) score on top of the second.
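This nested-model strategy can be sketched with ordinary least squares on toy data. Everything below is a hypothetical illustration, not the paper’s data or code: the exam score is constructed to depend only on preparation (here, a math SAT score), so the apparent demographic gap in the demographics-only model vanishes once preparation enters the regression.

```python
import numpy as np

# Hypothetical toy data (not from the paper): 8 students.
# group: 0/1 demographic indicator; sat: math SAT score; exam: final exam score.
group = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
sat   = np.array([700, 720, 680, 740, 620, 640, 600, 660], dtype=float)
exam  = 0.1 * sat  # exam depends only on preparation, not on group membership

def fit(X, y):
    """Ordinary least squares via numpy; returns the coefficient vector."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(len(exam))

# Model 1: demographics only -> an apparent "performance gap"
b1 = fit(np.column_stack([ones, group]), exam)

# Model 2/3 analogue: demographics + preparation -> the gap vanishes
b3 = fit(np.column_stack([ones, group, sat]), exam)

print(round(b1[1], 2))       # -8.0  (gap estimated from demographics alone)
print(round(abs(b3[1]), 2))  # 0.0   (gap after controlling for preparation)
```

The group coefficient in the demographics-only model is just the difference in group means; once the variable that actually drives the exam score is included, that coefficient collapses to zero.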
When modeling the students’ final exam scores using only their demographics, the researchers found significant differences between the average scores of the various demographic groups (male vs female, URM vs non-URM, and first-generation vs non-first-generation students). However, once the math ACT/SAT scores were taken into account, the performance gap between URM and non-URM students disappeared and the gap between first-generation and non-first-generation students decreased. When the concept inventory scores were also included, the gender performance gap disappeared. These results were consistent across institutions and across the years in which the course was offered (figure 1).
To better understand these results, the authors used structural equation modeling to test for a mediation relationship between demographics, prior preparation, and final exam scores. Structural equation modeling let the researchers check whether demographics had any direct effect on final exam performance, or whether the gaps appeared only because demographic factors are associated with prior preparation, and it is prior preparation that actually affects final exam scores. Sure enough, with the exception of first-generation status at one of the universities, the measures of prior preparation were found to be mediating variables, meaning that the demographic performance gaps arose from the demographics’ association with prior preparation (see figure 2).
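The mediation logic can be illustrated with a simpler, regression-based (Baron–Kenny style) decomposition rather than the full structural equation models the authors fit. The data below are invented for illustration: the exam score depends only on a preparation measure, so the total demographic effect is entirely indirect (carried through preparation) and the direct effect is zero — the "full mediation" pattern the authors found.

```python
import numpy as np

# Hypothetical toy data (not from the paper): 8 students.
group = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
prep  = np.array([70, 72, 68, 74, 62, 64, 60, 66], dtype=float)  # prior-preparation score
exam  = 0.9 * prep + 5.0  # exam driven entirely by preparation

def ols(X, y):
    """Ordinary least squares via numpy; returns the coefficient vector."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(len(exam))

a = ols(np.column_stack([ones, group]), prep)[1]        # path a: group -> preparation
joint = ols(np.column_stack([ones, group, prep]), exam) # exam on group AND preparation
direct, b = joint[1], joint[2]                          # direct effect; path b: prep -> exam
total = ols(np.column_stack([ones, group]), exam)[1]    # unadjusted group -> exam effect

indirect = a * b
# Full mediation: total effect equals the indirect effect, and the direct effect is zero.
```

Here `total` (-7.2 in this toy example) is exactly `a * b`, i.e., the whole "gap" flows through preparation — the same qualitative conclusion the paper reaches with SEM.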
Finally, the authors wanted to see what the impact of better preparation was on passing the class. They assumed that scoring in the bottom quartile on the final exam would not earn a grade sufficient to move on to the next course, and hence treated it as failing. They found that a student in the bottom quartile of the prior preparation measures was four times as likely to fail the course as a student in the upper three quartiles (figure 3).
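That "four times as likely" figure is a relative risk, which is straightforward to compute from failure counts. The counts below are invented to reproduce the headline ratio and are not the paper’s data:

```python
import numpy as np

# Hypothetical cohort of 32 students; the 8 in the bottom quartile of
# prior preparation are flagged True.
bottom_prep = np.array([True] * 8 + [False] * 24)
# Invented failure flags: 4 of the 8 low-prep students fail; 3 of the 24 others do.
failed = np.array([True] * 4 + [False] * 4 + [True] * 3 + [False] * 21)

risk_bottom = failed[bottom_prep].mean()   # 4/8  = 0.500
risk_rest   = failed[~bottom_prep].mean()  # 3/24 = 0.125
print(risk_bottom / risk_rest)             # -> 4.0
```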
The key takeaway of this work is that performance gaps between different demographic groups can be attributed to prior preparation rather than to demographic factors themselves. From observing the courses used in this study, the authors noted that they were targeted toward better-prepared students, making them extra challenging for less-prepared students. Courses and teaching methods should therefore be structured to match the preparation of incoming students, so that all students have a chance to be successful. In addition, future research investigating possible performance gaps needs to consider measures of prior preparation, both general and subject-specific, rather than demographic information alone.
Figures used under Creative Commons Attribution 4.0 International license.
I am a postdoc in education data science at the University of Michigan and the founder of PERbites. I’m interested in applying data science techniques to analyze educational datasets and improve higher education for all students.