I’ve spent way too much time trying to understand this now-retracted study, “Elevated ceiling heights reduce the cognitive performance of higher-education students during exams,” and now the dear reader must suffer as well. When the study came out last summer, it grabbed a lot of headlines about high ceilings causing reduced exam scores. It was temporarily retracted when the authors discovered some data errors, and then the revised paper was deemed unworthy by the editors, leading to permanent retraction. A few concerns:
- The paper clearly states that while exam room ceiling height is used in the model, it’s only a proxy for a variety of highly correlated variables such as room area and volume, ventilation, number of students in the room, etc. So it’s sad that the title and press coverage focused on ceiling height. It’s nice that the revised preprint was re-titled to use “enlarged scale of the interior built environment” instead of “ceiling height.”
- The effect is very tiny. In the original paper, they report the ceiling height effect to be 0.185 score percentage points per meter. In the revised preprint, they include four models with effect-size estimates of -0.22, -0.11, -0.07, and 0.10.
- The effect is in the wrong direction! That original estimate for the ceiling height parameter is positive, indicating that scores increase with higher ceilings. Presumably that was part of the self-caught error in the original paper, but even in the revised preprint the model that accounts for the most covariates is the one with the positive effect (0.10).
Data
One great thing about this paper is that the data and R code are shared. And the data repository includes both the original and corrected data. Here’s a look at exam scores versus ceiling height as a whole, without accounting for any covariates.

You can already tell a few things about the data just from this simple scatter plot.
- There are only a few discrete room heights (the x-spread around each height value is just jitter I added to reduce overstriking).
- Most of the exams occurred in big rooms with 6.5 m or 9.5 m ceilings, both pretty high.
- There isn’t a noticeable effect at this scale.
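The jitter mentioned above is a purely cosmetic trick. A minimal Python sketch of it, using made-up height values rather than the actual data (the authors’ shared code is R, so this is just the idea):

```python
import random

# A few discrete ceiling heights, repeated, standing in for the real data.
heights = [3.2, 6.5, 9.5] * 100

# Add small uniform horizontal noise so overlapping points separate on the
# plot. The jitter width (±0.1 m) is cosmetic; the underlying values are
# untouched for any actual analysis.
random.seed(1)
jittered = [h + random.uniform(-0.1, 0.1) for h in heights]

# Every jittered value stays within 0.1 m of its true height.
max_shift = max(abs(j - h) for j, h in zip(jittered, heights))
```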
A slight negative effect exists and can be seen if we zoom in on the y scale.
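One simple way to quantify a slope like that is an ordinary least-squares fit. This Python sketch runs OLS on synthetic data with a small negative effect built in; it illustrates the mechanics only and is not the paper’s model or its data:

```python
import random

random.seed(2)

# Synthetic scores: a baseline minus a small ceiling-height effect plus
# noise. The -0.2 slope is an assumption for illustration only.
heights = [random.choice([3.2, 6.5, 9.5]) for _ in range(500)]
scores = [60 - 0.2 * h + random.gauss(0, 10) for h in heights]

def ols_slope(x, y):
    """Slope of a simple least-squares line y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

slope = ols_slope(heights, scores)
```

With noise this large relative to the effect, the fitted slope bounces around its true value, which is a decent intuition for why the paper’s four models landed on four different estimates.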

As more covariates are added, the ceiling height effect gets murkier. Here’s year and unit mixed in as grouping variables.
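Grouping variables like year and unit are commonly absorbed as fixed effects, which amounts to demeaning scores within each group. A hedged sketch of that step in Python, with invented group labels and numbers:

```python
from collections import defaultdict

# Invented (score, group) pairs; a group could be a year-unit combination.
data = [
    (70, "2018-A"), (74, "2018-A"), (66, "2018-A"),
    (55, "2019-B"), (59, "2019-B"), (51, "2019-B"),
]

# Compute each group's mean score.
sums = defaultdict(lambda: [0.0, 0])
for score, group in data:
    sums[group][0] += score
    sums[group][1] += 1
means = {g: s / n for g, (s, n) in sums.items()}

# Demeaned scores: within-group deviations, with group-level differences
# (say, an easy year or an easy unit) removed before estimating the
# remaining effects.
demeaned = [(score - means[group], group) for score, group in data]
```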

I won’t try to go deeper into assessing the ceiling height effect. Besides the already evident irregularity in the variable distributions, there’s also the matter of some students appearing multiple times in the data set, which makes proper analysis tricky. The authors account for the repeat students with a fixed effect in the model.
Course work effect
Not surprisingly, the main factor in the exam score is the student’s coursework score. But the effect is interestingly non-linear.

I can imagine explanations like students slacking off on the coursework and cramming for the exam, but whatever the explanation the trend seems plausible.
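One quick way to see a non-linear trend like this is to compare binned means: under a straight-line relationship the gaps between adjacent bin means would be roughly constant. A sketch on synthetic data whose curvature is built in for illustration (this is not the actual coursework-exam relationship):

```python
import random

random.seed(3)

# Synthetic coursework and exam scores with deliberate curvature: exam
# gains flatten out at high coursework scores (illustrative only).
coursework = [random.uniform(0, 100) for _ in range(1000)]
exam = [100 * (c / 100) ** 0.5 + random.gauss(0, 5) for c in coursework]

# Mean exam score within coursework bins of width 20.
bins = {b: [] for b in range(0, 100, 20)}
for c, e in zip(coursework, exam):
    bins[min(int(c // 20) * 20, 80)].append(e)
bin_means = {b: sum(v) / len(v) for b, v in bins.items()}

# Gaps between adjacent bin means; these shrink as coursework rises,
# which is the signature of a flattening (non-linear) trend.
gaps = [bin_means[b + 20] - bin_means[b] for b in range(0, 80, 20)]
```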
Critical boundary
The data set also includes the overall grade, and a dot plot of it reveals an interesting pattern. There’s one dot per student; I’m only showing 2018-2019 data to keep the number of dots manageable.

It looks like lots of just-below-50 grades got bumped up, presumably because 50 is the equivalent of a passing grade in these courses.
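That kind of grade bumping shows up as “bunching”: a deficit just below the boundary and an excess at or just above it. A sketch of the check on invented grades, where the bump is simulated rather than taken from the real data:

```python
import random

random.seed(4)

# Simulate grades, then bump anything in [47, 50) up to 50 to mimic the
# pattern visible in the dot plot (purely illustrative).
raw = [min(max(random.gauss(60, 15), 0), 100) for _ in range(2000)]
grades = [50 if 47 <= g < 50 else g for g in raw]

# Count grades in narrow windows on either side of the boundary.
just_below = sum(1 for g in grades if 47 <= g < 50)
at_or_above = sum(1 for g in grades if 50 <= g < 53)

# After bumping, the just-below window is empty and the window at or
# above 50 is inflated, which is exactly the asymmetry to look for.
```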
