Title: Investigating students’ behavior and performance in online conceptual assessment
Authors: Bethany R. Wilcox and Steven J. Pollock
First Author’s Institution: University of Colorado, Boulder
Journal: Physical Review Physics Education Research 15, 020145 (2019)
Research-based assessments (RBAs) are typically administered on paper during class. This eats into precious class time and places additional demands on instructors. One possible way to mitigate this is to deliver RBAs in a computer-based format and let students complete them at their convenience. But such a move can introduce new problems: participation rates may change, students might use external resources to answer questions, and students might save questions from the test and post them online, which could compromise test integrity. In today's paper the authors estimate the extent to which students engage in such behaviors when RBAs are delivered on a computer and students may take them at a time and place of their choosing. This work is part of a larger effort to determine whether moving RBAs online shifts students' average performance compared to in-class administration.
The authors used 4 RBAs: 2 at the introductory level, the Force Concept Inventory (FCI) and the Conceptual Survey of Electricity and Magnetism (CSEM), and 2 at the upper-division level, the Quantum Mechanics Conceptual Assessment (QMCA) and the Colorado Upper-division Electrostatics Diagnostic (CUE). They gave 1543 students from 2 introductory physics classes and 336 students from 10 upper-division classes, spread over 8 institutions, a link to the appropriate test. Students could complete the test at a place of their choosing, any time before a set deadline.
The authors used the Qualtrics platform to deliver the tests, and students took them in a regular web browser. The web pages were arranged so that the test closely resembled the paper version: each page of the paper test was presented as a single web page, and students could navigate between pages. Using JavaScript code embedded in the pages, the authors tracked the following behaviors: copying the text of a question, printing a page from within the browser, and moving to another browser tab (or window) and staying there for more than 4 seconds (the authors call this an extended browser hidden event). For a subset of students, those in the second semester of the study, which included all of the introductory students, the authors could also track which question's text was copied, not just whether a copy occurred.
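The paper describes what was tracked but not the tracking code itself. As a rough illustration of how such tracking can be done with standard browser APIs, here is a minimal sketch in TypeScript. Only the 4-second threshold comes from the paper; the event payloads, the data-question-id attribute, and the logEvent() reporter are assumptions for the sake of the example.

```typescript
// Minimal sketch of in-browser behavior tracking using standard DOM APIs.
// This is illustrative, not the authors' actual instrumentation.

const EXTENDED_HIDDEN_THRESHOLD_MS = 4000; // the paper's 4-second threshold

type TrackedEvent =
  | { kind: "copy"; questionId: string | null; at: number }
  | { kind: "print"; at: number }
  | { kind: "extended_hidden"; hiddenForMs: number; at: number };

// Hypothetical reporter; a real survey platform would send this to a server.
function logEvent(event: TrackedEvent): void {
  console.log(JSON.stringify(event));
}

// 1. Copy events: record that text was copied and, where possible, which
//    question the selected text belongs to.
document.addEventListener("copy", () => {
  const selection = document.getSelection();
  const anchor = selection?.anchorNode?.parentElement;
  const questionId =
    anchor?.closest("[data-question-id]")?.getAttribute("data-question-id") ?? null;
  logEvent({ kind: "copy", questionId, at: Date.now() });
});

// 2. Print events: fired when the student prints the page from the browser.
window.addEventListener("beforeprint", () => {
  logEvent({ kind: "print", at: Date.now() });
});

// 3. Extended browser hidden events: the tab or window loses visibility for
//    more than 4 seconds (e.g. the student switches to another tab).
let hiddenSince: number | null = null;
document.addEventListener("visibilitychange", () => {
  if (document.hidden) {
    hiddenSince = Date.now();
  } else if (hiddenSince !== null) {
    const hiddenForMs = Date.now() - hiddenSince;
    hiddenSince = null;
    if (hiddenForMs > EXTENDED_HIDDEN_THRESHOLD_MS) {
      logEvent({ kind: "extended_hidden", hiddenForMs, at: Date.now() });
    }
  }
});
```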
Results
Participation
Overall participation increased in the new format compared with the paper-in-class format. Typical participation rates for in-class, paper-based RBAs range from 60% to 85% for both introductory and upper-division courses. In this study the rate ranged from 79% to 97% across the participating classes. Increased participation is a good thing: the more students who participate, the better our picture of how much students have learned, which is ultimately the goal of RBAs.
Printing from within the browser
Out of the 1879 responses, the software detected 5 with at least one print event: 2 from introductory sessions and 3 from upper-division sessions. The 2 introductory sessions had only 1-2 distinct print events, suggesting that only a few questions were saved. All 3 upper-division sessions had a large number of print events, consistent with students deliberately saving the questions.
One of these 3 students presented the printouts during office hours, wanting to discuss the answers to the questions. The desire to understand the answers is thus one motivation for students to print the questions.
One issue with students printing questions is that the questions might end up online along with answers, which would undermine the integrity of the test items. To check whether the printed questions had made it to external websites, the authors did the simplest thing one could do: search Google. Several weeks after the last of the RBAs were administered, they Googled the text of questions from all 4 RBAs used in the study. They did not find any questions from the 2 upper-division RBAs, except as examples in research papers. For the 2 introductory RBAs, however, they found exact replicas of question prompts, along with solutions available for purchase; the authors note that these solutions predate the current study.
It is possible that the availability of solutions for the 2 introductory RBAs simply reflects the fact that they are much older than the upper-division RBAs, and that solutions to the latter may appear online in the future.
Clicking into another browser tab
Clicking into another browser tab while answering questions is a likely sign of distraction. How common are these distractions? About 50% of respondents in both introductory and upper-division classes had at least 1 extended browser hidden event, that is, they left the RBA window for more than 4 seconds. Of these, about 1/3 had only 1 such event. In about 2/3 of the events students stayed away for less than 1 minute, and only about 10% of the events lasted more than 5 minutes. In other words, students did get distracted, but not by much.
To check whether browser hidden events affected students' performance, the authors compared the z-scores of students who had such events with those of students who didn't. For the introductory assessments, the average z-score of students with at least 1 browser hidden event was 0.26 higher than that of students without any, about a quarter of a standard deviation. For the upper-division assessments, the average z-score of students with browser hidden events was 0.19 lower than that of students without. The difference was statistically significant for the introductory group, but with a small effect size, and it was not statistically significant for the upper-division group. In short, the average difference between students with and without browser hidden events appears small and inconclusive.
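To make the z-score comparison concrete, here is a small sketch of the calculation: standardize each student's score against the whole class, then compare the average standardized score of the two groups. The scores and group labels below are invented for illustration; only the method follows the description above (significance tests and effect sizes are omitted).

```typescript
// Sketch of the z-score comparison. The data is made up; only the method
// (standardize against the class, then compare group means) follows the paper.

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function stdDev(xs: number[]): number {
  const m = mean(xs);
  return Math.sqrt(xs.reduce((a, x) => a + (x - m) ** 2, 0) / (xs.length - 1));
}

interface Response {
  score: number;           // raw RBA score
  hadHiddenEvent: boolean; // at least one extended browser hidden event
}

// Hypothetical data standing in for one class's responses.
const responses: Response[] = [
  { score: 22, hadHiddenEvent: true },
  { score: 18, hadHiddenEvent: false },
  { score: 25, hadHiddenEvent: true },
  { score: 15, hadHiddenEvent: false },
  { score: 20, hadHiddenEvent: false },
];

// Standardize every score against the whole class.
const m = mean(responses.map(r => r.score));
const s = stdDev(responses.map(r => r.score));
const zScores = responses.map(r => ({ ...r, z: (r.score - m) / s }));

// Difference in average z-score between the two groups; the paper reports
// this kind of difference (e.g. +0.26 for introductory students).
const withEvent = zScores.filter(r => r.hadHiddenEvent).map(r => r.z);
const withoutEvent = zScores.filter(r => !r.hadHiddenEvent).map(r => r.z);
console.log("Difference in average z-score:", mean(withEvent) - mean(withoutEvent));
```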
Copying text of a question and moving to another window
About 1 in 10 students had at least one copy event, that is, they copied the text of a question displayed in the browser. In 2/3 of these cases, the student then switched to another browser tab and stayed there for more than 4 seconds. The authors interpret this as consistent with students trying to find answers online, though of course they cannot prove it. Even so, we can ask a few exploratory questions.
Do students with copy events perform differently from those without them? Among introductory students, those with copy events had an average z-score 0.45 higher than those without. Among upper-division students, those with copy events had an average z-score 0.46 lower than those without. Both differences are statistically significant with moderate effect sizes.
In the second semester of the study (which included all the introductory sessions), the authors used JavaScript to track which question each copy event was associated with. This allows us to ask whether students who copied the text of a particular question were more likely to get that question correct than students who didn't. It turns out they were, and the result is statistically significant.
Another way to probe the effect of copying is to check whether each student performed better, on average, on questions with a copy event than on questions without one. To answer this, the authors focused on the 147 students who had at least one copy event and calculated, for each student, a z-score on the questions they copied and a z-score on the questions they didn't. On average, the z-score on copied questions was 0.44 higher than on non-copied questions, a statistically significant difference with a moderate effect size.
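The paper does not spell out the exact standardization here, so the sketch below is only one plausible reading of the within-student comparison: standardize correctness on each item against the class, then, for each student with a copy event, compare their average item z-score on copied versus non-copied items.

```typescript
// Rough sketch of the within-student comparison described above. The data
// layout and item-level standardization are assumptions, not the authors' code.

interface ItemResponses {
  correct: boolean[]; // correctness on each question
  copied: boolean[];  // whether that question's text was copied
}

function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

function stdDev(xs: number[]): number {
  const m = mean(xs);
  return Math.sqrt(xs.reduce((a, x) => a + (x - m) ** 2, 0) / (xs.length - 1));
}

// Average, over students with copy events, of (mean z on copied items) minus
// (mean z on non-copied items); the paper reports a gap of roughly +0.44.
function copyVsNonCopyGap(students: ItemResponses[]): number {
  const nItems = students[0].correct.length;
  const itemScores = (i: number) => students.map(s => (s.correct[i] ? 1 : 0));
  const itemMean = Array.from({ length: nItems }, (_, i) => mean(itemScores(i)));
  const itemStd = Array.from({ length: nItems }, (_, i) => stdDev(itemScores(i)));

  const gaps: number[] = [];
  for (const s of students) {
    const copiedZ: number[] = [];
    const otherZ: number[] = [];
    s.correct.forEach((c, i) => {
      if (itemStd[i] === 0) return; // skip items everyone got right or wrong
      const z = ((c ? 1 : 0) - itemMean[i]) / itemStd[i];
      (s.copied[i] ? copiedZ : otherZ).push(z);
    });
    if (copiedZ.length > 0 && otherZ.length > 0) {
      gaps.push(mean(copiedZ) - mean(otherZ));
    }
  }
  return mean(gaps);
}
```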
These results are very suggestive, though of course not conclusive, that the introductory students with copy events and extended browser hidden events were looking up answers, while the upper-division students who did the same were either simply distracted or tried to look up answers and couldn't find any.
RBAs are used to assess how students perform as a group without external aid; they are not recommended for making judgments about an individual student's learning. So the relevant question is: do the effects above influence the overall quality of the RBA results? The authors answer this by recomputing the introductory class average in two ways. First, they simply drop all students who had a copy event and recalculate the average score of the remaining students. The average drops by only 1.1%, a very small change that is not statistically significant. Second, they assume that on any question with a copy event the student would have scored zero had they not copied the question and moved to another tab; this is a worst-case scenario in which everyone with such behavior successfully looked up the answer. Recalculating the class average under this assumption lowers it by 1.2%. Again, a small drop.
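As a concrete illustration of the second, worst-case check, here is a small sketch. The data shapes are invented; only the procedure of zeroing out copied items and recomputing the class average follows the description above.

```typescript
// Sketch of the worst-case check: assume every copied question would have been
// answered incorrectly without the copy, zero those items out, and recompute
// the class average. Data shapes and values are illustrative.

interface ScoredResponse {
  itemScores: number[]; // 1 = correct, 0 = incorrect
  copied: boolean[];    // whether the item's text was copied
}

function classAverage(students: ScoredResponse[]): number {
  const percents = students.map(
    s => (100 * s.itemScores.reduce((a, b) => a + b, 0)) / s.itemScores.length
  );
  return percents.reduce((a, b) => a + b, 0) / percents.length;
}

// Worst case: every copied item is scored as if the student had gotten it
// wrong. In the study this lowered the introductory average by only ~1.2%.
function worstCaseAverage(students: ScoredResponse[]): number {
  const adjusted = students.map(s => ({
    ...s,
    itemScores: s.itemScores.map((x, i) => (s.copied[i] ? 0 : x)),
  }));
  return classAverage(adjusted);
}
```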
These results suggest the following worst-case interpretation: for an individual student, copying text can make a difference to their score, but for the class as a whole the effect is negligible. This is because only a small fraction of students engage in copying, and typically the copying occurs on only a small number of questions.
Comparison with available historical data
For 1 introductory class and 2 upper-division classes, the authors had data from previous years in which the same instructor gave the same RBA on paper and in class. Comparing online and paper-based scores for these cases shows that the online scores are about 5% lower. The effect size is small, and the difference is statistically significant only for the introductory class. The authors suggest that, if the effect is real, then, combined with the higher participation rates, it could be due to a larger number of lower-performing students participating in online RBAs than in in-class, paper-based RBAs.
Takeaways for instructors
- Only a tiny fraction of students (5 out of 1879 responses) printed questions from within the browser, and only a few of them did so in a manner suggesting they were deliberately saving the questions. Some of these students may have been saving questions to study them later. The authors could not find any of the upper-division RBA questions via a Google search, but they did find questions and answers from the introductory RBAs online; those RBAs are much older than the upper-division ones.
- Even though clicking away from the RBA browser tab is common, long distractions are rare. The amount of time spent away from the RBA window had only a small positive correlation with RBA score, and only for the introductory students.
- Copy events were observed for about 10% of students. Of these, 3/4 engaged in behavior that could be interpreted as looking up the question in external resources, though this cannot be confirmed. Upper-division students with copy events had, on average, lower scores than students without them. The trend reversed for introductory students: those with copy events had higher scores than those without.
- For introductory students, the authors were able to associate each copy event with the exact question that was copied. From these data they concluded that students "who copied text for a particular question more often got that question correct". In addition, students who copied scored higher, on average, on the items they copied than on the items they didn't.
- The improvement in a student's score, assuming they did look up answers in external resources, is on average modest, and only a small fraction of students are affected. As a result, the impact of such behavior on the class average, which is what RBAs are used for, is negligibly small. Of course, this holds only for the population of students in this study; another group of students might produce a different result.
- Where the authors could compare with historical data, the online administration showed about a 5% drop in overall average score relative to the historical paper-based data. This might actually reflect an advantage when considered together with the increase in participation: the additional participants could very well be students who weren't participating when RBAs were administered on paper and in class.
Even though this study has obvious limitations, for instance, only behaviors within the browser were tracked, and what students actually did during an "extended browser hidden event" remains speculative, it represents a step forward in the physics education community's efforts to reduce the burden that RBAs place on both teachers and students. The finding that average performance doesn't seem to be affected by the negative behaviors is especially encouraging and is worthy of further study.
Header image from bluefieldphotos bp, used under CC-BY-2.0
Prasanth Nair is a freelance software developer with strong interests in STEM education research, especially Physics Education Research.