As an outsider to the conventions of academia and research journals, I find it strange but interesting when papers about data science methods are published in journals for other domains. That seems to be the case with a recent paper in the Journal of Intelligence showing the merits of change-point detection over smooth curve fitting. The paper is Tracing Cognitive Processes in Insight Problem Solving: Using GAMs and Change Point Analysis to Uncover Restructuring, by psychology researchers Graf et al., and its topic is using a change-point detection R package to visualize a sudden change in a response rather than fitting a smooth curve.

Their smoking gun is the following graph, which shows the average response in a subject’s focus after a hint was given for the puzzle being solved.

The complaint with the smoother is that the curve starts rising before the stimulus is given. That’s true, but I have two minor qualms.

- From a discovery point of view (to “uncover restructuring”), the smoother does a fine job of pointing to the region of interest.
- In this case, the researchers know when the hint was given, so there’s nothing to uncover. Why not use two disjoint curves?
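For readers who haven't met change-point detection, here is a minimal sketch of its simplest form: try every split of a series into two constant-mean segments and keep the split with the smallest total squared error. This is not the R package used in the paper, and the function name and the numbers are made up for illustration.

```python
import numpy as np

def single_change_point(y):
    """Return the index k that best splits y into y[:k] and y[k:],
    each modeled by its own constant mean (least-squares criterion)."""
    y = np.asarray(y, dtype=float)
    best_k, best_sse = None, np.inf
    for k in range(1, len(y)):  # both segments must be non-empty
        left, right = y[:k], y[k:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_k, best_sse = k, sse
    return best_k

# Synthetic series (NOT the paper's data): a flat stretch, then a jump.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(10, 1, 30), rng.normal(20, 3, 15)])
k = single_change_point(y)
print("estimated change point at index", k)
```

When the time of the stimulus is already known, as it is here, this search is unnecessary because the split can simply be fixed at that time.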

The paper does provide enough raw data to explore this graph. The original uses a GAM (generalized additive model) smoother, and here is my recreation with a spline smoother along with the data points.

And here are separate smoothers for the before and after segments of the data.

Either way, there’s a good indication that the hint made a difference. I’m not familiar with GAM, but I’m guessing from the confidence intervals that it’s assuming constant variance across the domain, which is clearly not the case. My smoothers are showing bootstrapped confidence regions which do reflect the variance difference.
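The idea of separate smoothers with bootstrapped confidence regions can be sketched roughly as follows. This is not how my graphs were made: it swaps in a low-degree polynomial for the spline, uses synthetic data rather than the paper's, and the function name `bootstrap_band` and all parameter values are assumptions for illustration.

```python
import numpy as np

def bootstrap_band(x, y, grid, degree=2, n_boot=500, seed=0):
    """Fit a polynomial smoother to (x, y) and return (fit, lower, upper)
    evaluated on grid. The band is the 2.5th to 97.5th percentile of
    fits to resampled (x, y) pairs."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    fit = np.polyval(np.polyfit(x, y, degree), grid)
    boots = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        idx = rng.integers(0, len(x), len(x))  # resample pairs with replacement
        boots[b] = np.polyval(np.polyfit(x[idx], y[idx], degree), grid)
    lower, upper = np.percentile(boots, [2.5, 97.5], axis=0)
    return fit, lower, upper

# Synthetic data (NOT the paper's): quiet before the hint, noisier after it.
rng = np.random.default_rng(1)
x_before, x_after = np.arange(0, 30), np.arange(30, 45)
y_before = 10 + rng.normal(0, 1, x_before.size)
y_after = 20 + rng.normal(0, 4, x_after.size)

# One smoother per segment, each with its own band.
fit_b, lo_b, hi_b = bootstrap_band(x_before, y_before, x_before.astype(float))
fit_a, lo_a, hi_a = bootstrap_band(x_after, y_after, x_after.astype(float))

print("mean band width before:", (hi_b - lo_b).mean())
print("mean band width after: ", (hi_a - lo_a).mean())
```

Because each segment is resampled on its own, the band is free to be wider where the data are noisier, instead of inheriting one variance estimate for the whole domain.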

## Experiment details

The experiment asked subjects to solve a matchstick equation puzzle and used eye tracking to determine which part of the puzzle they focused on during binned time intervals. There were five regions: three numerals and two operators. If they didn’t solve it within 5 minutes, they were given a hint directing them toward the operators. The graphs above are for one of the operators and for the non-solvers. Presumably the period after the hint is also 5 minutes, but the greater variation within intervals makes me suspect a shorter time period.

## New discovery?

Playing around with the data, I noticed something else interesting. The percentages of time spent in the five areas of interest do not add up to 100%, and I assume the shortfall represents times when the subjects were looking away from the puzzle. What’s interesting is that solvers have a different pattern of looking away than the non-solvers. Here are the area-of-focus percentages as stacked areas.

It’s as if many of the solvers gain an advantage by looking away from the puzzle for a few moments.
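The "looking away" share is just the remainder after summing the five areas in each time bin. A minimal sketch with made-up percentages (not the paper's values):

```python
import numpy as np

# Hypothetical data: rows are time bins, columns are the five areas of
# interest (three numerals, two operators), in percent of time.
aoi = np.array([
    [20.0, 15.0, 25.0, 10.0, 20.0],
    [18.0, 12.0, 22.0, 14.0, 16.0],
    [25.0, 20.0, 20.0, 15.0, 20.0],
])

# Whatever is not accounted for is assumed to be time spent looking away.
away = 100.0 - aoi.sum(axis=1)
print(away)
```

Appending `away` as a sixth series makes each bin total 100%, which is what a stacked-area chart needs (for example, with matplotlib's `stackplot`).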

(The blue uptick at the end is where solvers home in on the area of the solution.)