To stack or not to stack

The following chart appeared in the May 19 issue of The Morning newsletter from The New York Times. It shows Gallup poll results on abortion legalization since 1994, offered in support of the statement that opinions have a “remarkable stability” over time.

Four overlaid line charts of abortion survey responses over time.

The trends do indeed look mostly flat; however, connecting all the values with line segments adds a lot of noisy detail that you have to look past to see the trend. I’m also wondering about stacking the results so that similar responses can be visually combined as sums.

Smooth it

It wasn’t too hard to find the raw data on the Gallup Abortion page. It’s marked up as an HTML <table> which makes for an easy import. The only data cleaning effort was to calculate a date from the polling period, which appeared in a form such as “2020 May 1-13” and “1999 Apr 30-May 2”. I used the center of the date range for graphing.
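The date cleanup can be sketched in a few lines of Python. This is only an illustration, not the JMP steps I actually used: the function name `poll_midpoint` is made up, and it assumes every period uses a three-letter English month abbreviation and one of the two forms quoted above.

```python
import re
from datetime import date

# Assumption: months always appear as three-letter English abbreviations.
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# year, start month, start day, optional end month, end day
PERIOD = re.compile(
    r"^(\d{4})\s+([A-Za-z]{3})\s+(\d{1,2})\s*[-–]\s*(?:([A-Za-z]{3})\s+)?(\d{1,2})$"
)

def poll_midpoint(period: str) -> date:
    """Return the center date of a polling period such as '2020 May 1-13'."""
    match = PERIOD.match(period.strip())
    if match is None:
        raise ValueError(f"unrecognized polling period: {period!r}")
    year, month1, day1, month2, day2 = match.groups()
    start = date(int(year), MONTHS[month1], int(day1))
    end_month = MONTHS[month2 or month1]
    # Assumption: a period whose end month precedes its start month
    # (e.g. Dec 30-Jan 2) ends in the following year.
    end_year = int(year) + (1 if end_month < start.month else 0)
    end = date(end_year, end_month, int(day2))
    # Integer division rounds the midpoint down to a whole day.
    return start + (end - start) // 2

print(poll_midpoint("2020 May 1-13"))      # 2020-05-07
print(poll_midpoint("1999 Apr 30-May 2"))  # 1999-05-01
```

Rounding the midpoint down to a whole day is an arbitrary choice; at the scale of a 25-year chart it makes no visible difference.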

My go-to alternative to a connected line chart is a smooth trend line with data points in the background. The data points provide both support for the amount of smoothness and a sense of the cardinality and variation. The smoothers are the main focus, so I usually give the data points some transparency, which also helps with overstriking.
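For readers without JMP, here is a rough stand-in for the smoothing step. It is not the smoother JMP uses; it swaps in a simple Gaussian kernel smoother (a Nadaraya-Watson weighted mean), and the `bandwidth` parameter and function name are my own illustrative choices. The x values need to be numeric, so dates would first be converted, for example with `date.toordinal()`.

```python
import math

def kernel_smooth(xs, ys, bandwidth):
    """Smooth ys over xs with a Gaussian-weighted mean at each x.

    A larger bandwidth gives a smoother curve. This is a stand-in for
    the smoother used in the charts, not a reproduction of it.
    """
    smoothed = []
    for x0 in xs:
        # Weight every observation by its distance from x0.
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, ys)) / total)
    return smoothed
```

The raw points would then be drawn behind the smoothed line with partial transparency, so that the line carries the trend and the points show how much the smoother is averaging away.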

Four overlaid smooth trend lines with data dots in matching colors.

I copied the other design choices such as the colors, label wording and label placement. The label color is different only because JMP uses the line color for the label color; one can argue whether that’s good or bad.

The trends were not that challenging to see in the original, so any improvement is only marginal. The climb of the “Illegal in all” response during the first ten years is clearer, I think. My main confusion for this and the original is in grouping the responses, such as comparing legal versus illegal responses. The semantic coloring helps by using shades of green for all the legal responses. (For a different perspective, one could interpret the “Legal only in a few” response as “Illegal under most” and make it a lighter shade of red.)

It might also help to add “circumstances” to the end of each label instead of relying on the chart subtitle for that part, but I’ll stick with the original label text for exploring forms.

Stack it

I’m usually not a fan of stacked area charts since the responses in the middle are hard to perceive due to their uneven baselines (see sine illusion). However, for related responses, stacked areas do make it possible to estimate the sums of adjacent values, and that seems applicable for these responses. There is a definite ordering to the responses and some vagueness between choices. Here’s a quick area chart with the same responses and colors.

Stacked area chart of abortion poll responses over time.

That does make it easier to compare the legal/illegal groupings (which is not necessarily a fault of the original chart if that wasn’t a goal of the original, as explained by Nick Desbarats). However, it does bring back the raggedness of the connected lines, and the ragged top especially suffers from the “no opinion” group not being represented.

Smooth it and stack it

To address those issues, I saved out the smoothers from my original chart as new data table columns and then stacked those along with a fifth column that captures the remainder.
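The arithmetic behind that step is small enough to sketch. The numbers below are made up for illustration and are not Gallup results, and where the remainder layer sits in the stack (`remainder_index`) is a parameter here because it is a design choice rather than something the data dictates.

```python
from itertools import accumulate

def stack_with_remainder(smoothed, remainder_index, total=100.0):
    """Add a remainder layer to one time point and return stacked boundaries.

    smoothed: smoothed percentages for the named responses, bottom to top.
    remainder_index: position at which to insert the remainder layer.
    Returns (layers, tops), where tops[i] is the upper edge of layer i.
    """
    remainder = total - sum(smoothed)
    layers = list(smoothed)
    layers.insert(remainder_index, remainder)
    tops = list(accumulate(layers))
    return layers, tops

# Hypothetical smoothed values for a single poll date, bottom to top.
layers, tops = stack_with_remainder([26, 13, 38, 20], remainder_index=2)
print(layers)  # [26, 13, 3.0, 38, 20]
print(tops)    # [26, 39, 42.0, 80.0, 100.0]
```

Because the remainder is defined as whatever is left over, the top edge always lands exactly on 100%, which is what removes the ragged top.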

Smoothed stacked area chart of abortion poll responses over time.

Having the “Illegal in all” response anchored to the top also makes that trend easier to read over time, thanks to the fixed baseline. Seems good, but we have lost the data points as companions to the smoothers. It’s tricky to do, but here’s an attempt at projecting the original residuals from the smooth fits back onto the stacked areas.

Smoothed stacked area chart of abortion poll responses over time with data points.

It’s not bad, at least for estimating the degree of variation. There is a new twist, though. Since the “Illegal in all” group has a baseline at the top of the graph, maybe those points should be offset from the top, essentially flipping them across the curved line. Maybe some directional affordance is needed before this combination gets real use.
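One way to read the projection and the proposed flip, as a minimal sketch: each point is placed at the edge of its smoothed layer plus the residual (observed minus smoothed), and for a layer whose baseline is the top of the graph the residual is subtracted instead. The function name, the `grows_down` flag, and the example numbers are all hypothetical.

```python
def project_residual(edge, observed, smoothed, grows_down=False):
    """Place a data point relative to the moving edge of its stacked layer.

    edge: y position of the layer's free edge (its upper edge for layers
          stacked from the bottom, its lower edge for a top-anchored layer).
    observed, smoothed: the raw poll value and its smoothed estimate.
    grows_down: True for a layer anchored to the top of the graph, so a
                larger-than-smoothed value pushes the point downward.
    """
    residual = observed - smoothed
    return edge - residual if grows_down else edge + residual

# Hypothetical values: a layer whose upper edge sits at 39,
# with a poll result 2 points above its smoothed value.
print(project_residual(39, observed=15, smoothed=13))                   # 41
# A top-anchored layer whose lower edge sits at 80, also 2 points high.
print(project_residual(80, observed=22, smoothed=20, grows_down=True))  # 78
```

With the flip, a point inside the top band always means "more than the trend" for that response, which matches how the band's thickness is read.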

I just realized the area charts appear to have flatter trends than the line charts because of the scale differences: 0..100% versus 0..50%. I could have made the area charts taller, but that’s another general advantage of the non-stacked lines: higher data resolution.
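The resolution difference is simple to quantify. In this sketch the 400-pixel plot height is an arbitrary assumption; only the ratio matters.

```python
def pixels_per_point(plot_height_px, axis_min, axis_max):
    """Vertical pixels available for each percentage point on the axis."""
    return plot_height_px / (axis_max - axis_min)

# Same hypothetical plot height, two axis ranges.
print(pixels_per_point(400, 0, 50))   # 8.0 for the line charts
print(pixels_per_point(400, 0, 100))  # 4.0 for the stacked area charts
```

At equal heights, every slope in the stacked chart is drawn half as steep, so the area chart would need to be twice as tall to show the same apparent change.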
