Title: Interactive-engagement Versus Traditional Methods: A Six-thousand-student Survey of Mechanics Test Data for Introductory Physics Courses
Author: Richard R. Hake
Topic and Selected Reading Number: Interactive Engagement in General, 168
Journal: American Journal of Physics, 66, 64 (1998) [Closed Access]
Since you are reading a physics education blog, you are probably familiar with the idea of active learning and the reasons for choosing to use active learning in the classroom instead of opting for a more traditional lecture. Today’s paper from the Selected Readings in PER document is the paper that arguably showed active learning’s benefits to the physics community at large.
Back in the 1990s when this paper was written, physics education research was still in its early stages, focused on assessing what students knew and on developing new approaches to teaching. Results from conceptual inventories such as the Force Concept Inventory (FCI) showed that students in traditional lectures weren't learning the concepts of Newtonian mechanics (forces, energy, momentum) and hence scored poorly on these inventories. Various active learning and interactive engagement (IE) techniques were then developed to try to increase how much of the course's conceptual content students actually learned. Their effectiveness was measured by comparing scores on the FCI or a similar conceptual inventory against the scores of students in more traditional lecture courses. Today's paper is an analysis of FCI scores from students in traditional lectures and students in IE courses, defined as courses that feature hands-on learning activities with immediate feedback through discussion with the course instructors or with peers.
Specifically, today’s paper is an analysis of FCI scores from 14 traditional courses and 48 IE courses, representing over 6,000 students, a huge number for an educational study. To compare scores from different courses, the author devised a measure called normalized gain, which is shown in figure 1.
Normalized gain can be thought of as the fraction of what students learned relative to what they could have learned. For example, if the class average on the FCI was 20% at the start of the semester and 40% at the end, the normalized gain would be (40-20)/(100-20) = 0.25, meaning the students learned 25% of what they possibly could have. Hake defined a normalized gain below 0.3 as low, between 0.3 and 0.7 as middle, and above 0.7 as high.
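The calculation and Hake's three categories can be sketched in a few lines of Python (the function names here are my own, for illustration; the category boundaries follow the description above):

```python
def normalized_gain(pre_percent: float, post_percent: float) -> float:
    """Fraction of the possible improvement actually achieved:
    g = (post - pre) / (100 - pre), with scores as class-average percentages."""
    return (post_percent - pre_percent) / (100.0 - pre_percent)

def hake_category(g: float) -> str:
    """Hake's bins: g < 0.3 is 'low', 0.3 <= g < 0.7 is 'middle', higher is 'high'."""
    if g < 0.3:
        return "low"
    if g < 0.7:
        return "middle"
    return "high"

# The example from the text: 20% pre, 40% post.
g = normalized_gain(20, 40)
print(g)                  # 0.25
print(hake_category(g))   # "low"
```

Note that a course can improve its average score substantially and still land in the "low" bin: what matters is the gain relative to the room left for improvement.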
So what did Hake find? Using his measure of normalized gain, he found that the traditional lecture courses had an average normalized gain of <g>=0.23±0.04 while the IE courses had <g>=0.48±0.14, meaning the students in IE courses did significantly better on the FCI than students in the traditional courses. The story becomes even clearer the more the data is analyzed. For example, Hake found that all of the traditional courses had normalized gains in the "low" region, while only 15% of the IE courses fell in that range; the rest of the IE courses were in the "middle" region. Hake attributed the low normalized gains in some of the IE courses to problems implementing the IE methods, such as the instructor not being sufficiently trained in the method or the students not taking the class activities seriously.
Next, Hake looked at the normalized gains across educational levels, such as high school, two-year colleges, and four-year universities. He found that while pre-semester scores rose with increasing student level (from high school to university), the normalized gains were basically the same for all three levels. In each case, the IE courses produced higher normalized gains than the traditional courses did. In the case of the high school students, Hake found that honors physics courses tended to produce higher normalized gains than a non-honors course taught using the same instructional method (traditional or IE), suggesting that the motivation of the students plays a role in how much they learn from the course. Interestingly, the non-honors IE courses produced larger normalized gains than the honors physics courses taught in the traditional way.
Since all of the results have been based on the FCI, which features mainly qualitative questions, Hake also looked at scores on the Mechanics Baseline test, which is a more quantitative test. Even though IE courses tend to have a greater conceptual focus than traditional courses do, students in the IE courses still outperformed students in the traditional courses on the Mechanics Baseline test, suggesting that conceptual instruction could actually enhance students’ ability to solve quantitative problems.
All of the findings so far suggest that interactive engagement methods are superior to traditional methods when comparing the amount students learn during the course. But what if the apparent difference in normalized gain were just the result of some type of error? To address this, Hake considered both random and systematic errors. He found that random errors could account for the "spread" (standard deviation) of the normalized gain for the traditional courses but not for the IE courses, meaning either a systematic error was present or there really was a difference.
First, he considered whether the FCI itself was somehow contributing to the difference, for example through students making lucky guesses on questions or getting an answer correct based on faulty reasoning. However, a revised version of the FCI was later used in IE courses at two universities, and the normalized gains were consistent with those found using the original FCI, suggesting the test wasn't responsible for the difference.
Next, Hake considered that maybe instructors in the IE courses had "taught to the test," either by including identical or near-identical questions to those on the FCI in their classroom assignments or tests, or by spending more time on FCI material than an instructor in a typical course would. From surveys given to course instructors, Hake found that the first of these propositions was not true, since all the instructors indicated they had tried to avoid "teaching to the test." For the second proposition, Hake matched traditional and IE courses at the same institution that spent roughly the same fraction of the course on ideas included on the FCI and found the difference in their normalized gains was consistent with the differences observed in the overall data. Interestingly, the differences in normalized gains between the IE and traditional courses did not change based on the fraction of the course that was devoted to ideas on the FCI.
Finally, Hake considered the students themselves; maybe there was something about the students in the IE courses that contributed to the difference. First, Hake considered that maybe students in the IE courses were motivated to do well on the FCI. However, the normalized gains in IE courses where there was no grade incentive for doing well on the FCI were not different from IE courses where there was a grade incentive for doing well. Alternatively, maybe the students in the IE courses did better because they knew they were in an experimental condition and being observed, a Hawthorne effect. If this were the case, then institutions where IE courses have been taught for a few semesters should show a smaller effect since the IE course would no longer seem to be special but just another course. Yet, the normalized gain in the IE courses that had been in place for a few semesters was practically the same as the normalized gain in newer IE courses, suggesting that this was not the case. Given that the most likely sources of error had been excluded, Hake concluded that the observed differences in normalized gain were actually a result of the teaching method and hence, IE instructional methods were more effective than their traditional counterparts.
Despite the many positive benefits of interactive engagement courses, such as the improved conceptual understanding Hake wrote about nearly twenty years ago, traditional instruction is still very common in physics. A recent Science [Closed Access] article suggests that a majority of physics courses (and STEM courses in general) are still taught in the traditional style, meaning there is still much work to do in terms of providing students an education based on research-backed methods rather than "the way it has always been done."
I am a postdoc in education data science at the University of Michigan and the founder of PERbites. I'm interested in applying data science techniques to analyze educational datasets and improve higher education for all students.