What can explain the gender gap on conceptual inventories?

Title: Gender gap on concept inventories in physics: What is consistent, what is inconsistent, and what factors influence the gap?
Authors: Adrian Madsen, Sarah B. McKagan, Eleanor C. Sayre
First author’s institution: American Association of Physics Teachers (AAPT)
Journal: Physical Review Special Topics - Physics Education Research 9, 020121 (2013)

While today’s article is a few years old, it raises some important questions for researchers and instructors that still haven’t been resolved: namely, why do men outperform women on conceptual inventories (“the gender gap”)? Conceptual inventories are highly studied tests used to determine whether a student understands a concept and are key tools in physics education research and in comparing different teaching methods such as active learning and lecturing. However, many studies at many institutions around the world have found that men outperform women on these tests. For example, men tend to score 13% higher on pre-tests (given before instruction) and 12% higher on post-tests (given at the end of the semester) than women do on mechanics (forces and energy) conceptual inventories (Figure 1). On the other hand, men score 3.7% higher on pre-tests and 8.5% higher on post-tests than women do on electricity and magnetism (E&M) conceptual inventories. The goal of today’s paper is to explore what factors may be influencing this gap.

Figure 1: Gender gap on various conceptual inventories. The FCI and FMCE are mechanics-based while the BEMA and CSEM are electricity and magnetism-based. Any bar with non-zero height indicates a gender gap. (Figure 3 in paper)
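To make the quantity being discussed concrete, here is a minimal sketch of how a gender gap could be computed, assuming it is measured as the men's mean score minus the women's mean score in percentage points. The function name `gender_gap` and the scores are invented for illustration and are not data from the paper.

```python
from statistics import mean

def gender_gap(men_scores, women_scores):
    """Men's mean score minus women's mean score, in percentage points."""
    return mean(men_scores) - mean(women_scores)

# Hypothetical post-test scores (percent correct), invented for illustration only
men = [70, 55, 80, 65]
women = [60, 50, 62, 48]

print(f"Gender gap: {gender_gap(men, women)} percentage points")
```

A positive value means the men in the sample scored higher on average, a negative value means the women did, and zero means no gap.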

The researchers looked at six areas that could possibly influence the gender gap: background and preparation, the gender gap on other measures, differences in personal beliefs and answers a “scientist” would give, teaching methods, stereotype threat, and the construction of the test questions. The authors then reviewed studies that investigated the gender gap in each of these areas to determine how an area could influence the gender gap.

So what did they find? Well, it turns out that the evidence is rather inconclusive. When considering background and preparation, one study found that 70% of the gender gap on a post-test mechanics assessment and 62% of the gender gap on a post-test electricity and magnetism conceptual inventory could be accounted for by differences in men’s and women’s math scores (determined from the ACT, SAT, or a math placement exam) and their performance on the pre-test. However, other measures of preparation and background, such as years of high school physics or calculus and high school grade point average, were not found to contribute to the gender gap. On the other hand, a different study that grouped male and female students by their pre-test score on a mechanics conceptual inventory found no gender gap on the post-test between male and female students in the same group.

Next, the researchers considered that the gender gap on conceptual inventories may be a result of a gender gap in performance in the students’ physics courses. Various studies have found that men outperform women on in-class exams by a few percentage points, which is consistent with the gender gap on the E&M conceptual inventories but not with the gender gap on the mechanics inventories. The researchers also considered overall class grade, but here the results disagree: some studies have found that males do slightly better in physics courses than females do, while other studies have found no difference. Men and women were also found to fail or withdraw from their physics courses at the same rate and to pass at similar rates. Taken together, these factors can account for only a small part of the gender gap on mechanics inventories.

But what if how the course was taught mattered? Maybe something about the course was the reason males outperformed females. Previous work has suggested that women receive significant benefits from taking active learning courses, meaning the gender gap should be smaller in classes with large amounts of student interactivity. Once again, the results are mixed. One study found that a highly interactive course decreased the post-test gender gap, another study found that partially interactive and highly interactive courses did not change the pre- or post-test gender gaps, and yet another study found that an interactive course actually increased the post-test gender gap compared to traditional lecture courses! Adding to the mixed results, one instructor gave half of his interactive class one E&M conceptual inventory and the other half a different E&M conceptual inventory and found that only one of them showed a post-test gender gap.

At this point, you’re probably thinking, “well if one inventory showed a gender gap and the other didn’t, maybe something is wrong with the tests themselves and the tests somehow favor males.” This was the next area the researchers looked at, but they found this not to be the case. One study rewrote a mechanics conceptual inventory to include more feminine and everyday contexts but found no significant difference in the women’s performance compared to the original mechanics conceptual inventory. While the men did perform better on some questions and worse on others, the overall effect was that the gender gap was unchanged. In addition, one study found that three items on a mechanics conceptual inventory were substantially biased for or against women, but even when these items were removed, the gender gap was still present. Finally, the researchers considered that maybe women answer the questions based on their own beliefs while men answer the questions based on how they think a “scientist” would answer. While this may contribute to the gender gap, the authors conclude it accounts for only a small part at most.

The last area the researchers investigated to try to explain the gender gap was the presence of stereotype threat. For example, maybe asking women to indicate their gender caused them to recall stereotypes of women being bad at math and science, and that caused them to perform worse. However, there was no definitive pattern in the gender gap based on how gender information was collected (before the test, after the test, or from a university database). To account for the strength of the stereotype threat, the researchers looked at how the gender gap varied with the fraction of the students taking the test who were women (Figure 2). They found no significant relationship on either the pre- or post-tests. Finally, value affirmation exercises have been used to try to combat stereotype threat, and while one study found that these eliminated the gender gap on a mechanics conceptual inventory, a study in subsequent semesters found the gender gap to still be present.

Figure 2: The gender gap as a function of the percentage of the group taking the test that is female. The regression lines shown in the figure are not statistically significant. (Figure 5 in paper)
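For readers wondering what "not statistically significant" means for a regression line like those in Figure 2, here is a rough sketch of one way to check whether a slope differs from what chance alone would produce. It uses a permutation test, which is a stand-in chosen for simplicity and not necessarily the test the authors used. The function names and the data points are hypothetical.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Two-sided p-value: share of shuffles whose |slope| is at least the observed |slope|."""
    rng = random.Random(seed)
    observed = abs(slope(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between xs and ys
        if abs(slope(xs, shuffled)) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical data, invented for illustration: one point per class
percent_women = [15, 20, 25, 30, 35, 40, 45, 50]
gap_points = [11, 14, 9, 13, 12, 10, 15, 11]

print("slope:", slope(percent_women, gap_points))
print("p-value:", permutation_p_value(percent_women, gap_points))
```

A large p-value means that shuffling the gaps among the classes often produces a slope at least as steep as the observed one, so the data give no good evidence that the gap depends on the percentage of women.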

So what does this all mean and why do we care? Well, it means that whatever is responsible for the gender gap is a complex phenomenon. The researchers note that background and preparation seem to have the largest influence on the gender gap, but many of the studies used performance on conceptual inventories as a measure of background and preparation, so the results should be interpreted with caution. In addition, testing anxiety and stereotype threat likely influence performance on conceptual inventories, so the apparent influence of background and preparation may just be an artifact of these.

In terms of classrooms and research, the authors note that instructors should work to eliminate the gender gap but that there is no simple way to do so. Regardless of the gender gap, the authors say that conceptual inventories are still valid to use but that caution is needed if the inventories are used to compare teaching methods and the differences in scores are not large. From a research perspective, the authors stress the need for more replication studies to see which factors influence the gender gap. They also stress that the gender distribution of a class could affect studies comparing the effectiveness of different teaching styles: since men do outperform women on these tests, a course with a higher fraction of women could lead to a different conclusion about a teaching method than a course with a higher fraction of men.
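The point about gender distribution can be illustrated with a bit of arithmetic. The sketch below assumes two sections taught so that men and women score exactly the same in both, with a 12 percentage point gap like the mechanics post-test gap quoted earlier. The specific means (66 and 54) and the section compositions are hypothetical, chosen only to produce that gap.

```python
def class_average(frac_women, men_mean, women_mean):
    """Class mean score (percent), weighting each group by its share of the class."""
    return frac_women * women_mean + (1 - frac_women) * men_mean

# Hypothetical post-test means; only their 12-point difference echoes the article
MEN_MEAN = 66.0
WOMEN_MEAN = 54.0

# Two hypothetical sections with identical per-gender performance
section_a = class_average(0.20, MEN_MEAN, WOMEN_MEAN)  # 20% women
section_b = class_average(0.60, MEN_MEAN, WOMEN_MEAN)  # 60% women

print(f"Section A average: {section_a:.1f}")
print(f"Section B average: {section_b:.1f}")
print(f"Difference: {section_a - section_b:.1f} points")
```

Under these assumptions the two sections differ by several points in their class averages even though no student group did better in either one, which is the kind of difference that could be mistaken for an effect of the teaching method.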

Figures are used under Creative Commons Attribution 3.0 License
