Staircase moving average mystery

A recent article from The Washington Post’s Department of Data, “When America was ‘great,’ according to data,” explored a few nostalgia-related surveys, including this song rating result from the research paper “The power of nostalgia: Age and preference for popular music.”

The paper’s authors asked over 1000 people to each rate 34 songs, one top song from each even-numbered year from 1950 to 2016. The main result, that people favor songs released during their teen years, is not surprising. However, I was curious about the apparent staircase pattern in the moving average line, most pronounced around a song age of 40, as seen in this close-up.

Looking more closely, you can see there’s also a pattern in the dots, and drawing a connecting line through the dots highlights that zigzag pattern.

In general, the even song ages have higher ratings than the nearby odd song ages. The WaPost trend line is an 11-year moving average. At an even song age, that average will include 5 even ages and 6 odd ages, and vice versa at an odd song age. For instance, the song age 40 average will include the ratings for these 11 song ages: 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45. As a result, the even-age averages will have a negative bias relative to the odd-age averages, resulting in the staircase pattern.
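To make the window arithmetic concrete, here is a minimal Python sketch using synthetic numbers, not the survey data. The flat baseline of 5.0 and the even-age bonus of 0.22 are made-up assumptions chosen only to isolate the parity effect.

```python
# Synthetic illustration only: a flat baseline rating with a small bonus
# at even song ages. Both numbers are invented, not taken from the survey.
BASELINE = 5.0
EVEN_BONUS = 0.22
HALF_WIDTH = 5  # 5 ages on each side of the center = 11-point window

ages = list(range(0, 61))
ratings = {a: BASELINE + (EVEN_BONUS if a % 2 == 0 else 0.0) for a in ages}

def moving_average(center, half_width=HALF_WIDTH):
    """Unweighted mean over ages center-half_width .. center+half_width."""
    window = [ratings[a] for a in range(center - half_width, center + half_width + 1)]
    return sum(window) / len(window)

# Only ages with a full window on both sides get a smoothed value.
smoothed = {a: moving_average(a) for a in ages[HALF_WIDTH:-HALF_WIDTH]}

for age in range(38, 44):
    evens_in_window = sum(1 for a in range(age - 5, age + 6) if a % 2 == 0)
    print(age, evens_in_window, round(smoothed[age], 3))
```

With these made-up inputs, even-centered windows contain 5 even ages and average 5.10, while odd-centered windows contain 6 and average 5.12, so the smoothed line alternates even though the underlying level is flat. Superimposed on a sloping trend, that alternation can read as stair steps.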

That technically explains the staircase pattern but only transfers the mystery to a new question: why do even song ages have higher ratings than odd song ages? And since the song release years for all these songs are even, the parity of the song age depends completely on the parity of the person’s birth year. So an equivalent question is, why do people born in odd years rate songs higher?

Here’s a view of even vs. odd birth years for all the song ratings, with box plots and blue lines marking the means.

It’s hardly significant in any practical sense of the word and I can’t imagine an explanation, but the pattern is so consistent there very well may be a logical explanation. Maybe it is more related to song-age evenness rather than birth-year evenness. Here are the ratings by song age for each parity group:

So while that’s still a mystery to me, I would like to return to another aspect of the original chart: why use a moving average instead of a modern smoother? I often use a spline smoother, but for better comparison, here’s a loess smoother using mean as the local statistic over the same interval size.

It’s practically the same, but avoids the staircase pattern and does better at the peak.
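One way to see why a weighted local mean suppresses the staircase is the sketch below. It is not the smoother used for the chart: it assumes tricube weights (the kernel commonly associated with loess), a hypothetical bandwidth of 6 so that all 11 ages in the window get nonzero weight, and the same kind of synthetic flat-plus-even-bonus ratings, which are invented for illustration.

```python
# Synthetic illustration only: flat baseline plus a made-up even-age bonus.
BASELINE = 5.0
EVEN_BONUS = 0.22
HALF_WIDTH = 5              # same 11-age interval as the moving average
BANDWIDTH = HALF_WIDTH + 1  # assumed, so ages at distance 5 keep some weight

def rating(age):
    return BASELINE + (EVEN_BONUS if age % 2 == 0 else 0.0)

def tricube(distance, bandwidth=BANDWIDTH):
    """Tricube kernel: near 1 at the center, falling to 0 at the bandwidth."""
    u = abs(distance) / bandwidth
    return (1 - u**3) ** 3 if u < 1 else 0.0

def local_mean(center, weighted):
    """Mean over the 11-age window, with tricube or equal weights."""
    offsets = range(-HALF_WIDTH, HALF_WIDTH + 1)
    weights = [tricube(d) if weighted else 1.0 for d in offsets]
    values = [rating(center + d) for d in offsets]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Size of the even/odd alternation under each scheme.
plain_gap = abs(local_mean(41, weighted=False) - local_mean(40, weighted=False))
weighted_gap = abs(local_mean(41, weighted=True) - local_mean(40, weighted=True))
print(f"unweighted even/odd gap: {plain_gap:.4f}")
print(f"tricube-weighted gap:    {weighted_gap:.4f}")
```

Because the tapered weights give the ages at the edge of the window very little influence, the extra odd or even age at the boundary no longer tips the average, and the alternation shrinks to a small fraction of the unweighted gap under these assumptions.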
