The warming stripes representation of long-term temperature trends, created by climatologist Ed Hawkins, has proven to be an effective data visualization despite an intentional lack of precision. Erica Bugden and Francis Gagnon called it “putting the audience before the rules” in “Do you really understand the influential warming stripes?” Often, the stripes appear without a color legend or even year labels.
Recently, Hawkins shared the following version with one stripe per country, arranged by continent, and it reminded me of a previously abandoned project of mine to organize the stripes by latitude.
One complaint about this chart is that, by focusing on countries, its geographic granularity is quite variable. That is, China and Belgium each have one row even though they summarize vastly different-sized regions. In fact, the largest regions in terms of area, Asia and Pacific, get the smallest sections of the graph because they have fewer countries. And I have to wonder if the Americas section size is being inflated by all the Caribbean island nations.
The challenge with removing those cultural connections is that the data collection sites themselves are culturally connected, such as occurring more often in population centers. My earlier project was to disentangle the cultural from the geographic, and Dan Zvinca informed me that the Berkeley Earth organization had already done that hard work. They provide historical climate data on an equally spaced geographic grid. I’m sure they’ve made the hard decisions about how to combine disparate samples in a more intelligent way than I could.
Reading that data, however, is not as easy as with most data. It’s in a format called NetCDF, which has mixed support as a sharing format. It’s binary instead of text, which improves precision, compactness and speed at the cost of other conveniences. Fortunately for me, the current NetCDF format (NetCDF-4) is built on HDF5, which JMP can import. I just had to add “.hdf5” to the file name for JMP to open it.
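The renaming trick only works when the NetCDF file really is HDF5 underneath. As a rough illustration outside JMP, the sketch below inspects a file's leading bytes: HDF5 files begin with an 8-byte signature, while the older "classic" NetCDF files begin with `CDF` followed by a version byte. The function names are made up for this example, and the check only looks at offset 0, which is a simplification.

```python
# HDF5 files start with this 8-byte signature (checked at offset 0 only;
# files with a user block place it later, which this sketch ignores).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
# Classic NetCDF files start with "CDF" plus a version byte.
CLASSIC_MAGIC = (b"CDF\x01", b"CDF\x02", b"CDF\x05")


def flavor_from_header(head):
    """Classify a file from its first bytes: 'hdf5', 'classic', or 'unknown'."""
    if head[:8] == HDF5_SIGNATURE:
        return "hdf5"
    if head[:4] in CLASSIC_MAGIC:
        return "classic"
    return "unknown"


def netcdf_flavor(path):
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return flavor_from_header(f.read(8))
```

If `netcdf_flavor` reports `"hdf5"` for a downloaded file, adding the “.hdf5” extension should be enough for an HDF5 reader to accept it. A `"classic"` result would call for a NetCDF-specific reader instead.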
HDF5 is built for high-dimensional data, and from a traditional data table perspective, HDF5 data looks like many data tables, each representing some slice of the larger multi-dimensional data set. In this case, for 2064 months of temperature data over 15984 grid locations, the individual tables were:
- longitudes: 15984 x 1
- latitudes: 15984 x 1
- land mask: 15984 x 1, what fraction of the grid location’s area was covered by land
- time: 2064 x 1, a fractional year value
- climatology: 12 x 15984, the monthly mean temperature for each grid location
- temperature: 2064 x 15984, the temperature anomaly for each grid location and time combination
With a little data merging and reshaping, I was able to convert those into two tidy tables, one with data on each grid location and another with yearly temperature anomalies. In keeping with the warming stripes goal, I was only interested in the yearly data.
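The reshaping was done in JMP, but the yearly roll-up can be sketched in a few lines of Python. This is a minimal sketch under assumptions: `time` holds the fractional year values, `temperature` is a list of monthly rows with one column per grid location, and missing values appear as NaN or `None`. The function name and the output layout are illustrative, not the tables actually produced for this post.

```python
import math
from collections import defaultdict


def yearly_anomalies(time, temperature):
    """Average monthly anomalies into one value per (year, grid location).

    time: fractional years, one per month (e.g. 1850.04)
    temperature: one row per month, one column per grid location
    Returns tidy rows of (year, location_index, mean_anomaly).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, row in zip(time, temperature):
        year = int(math.floor(t))  # the calendar year containing this month
        for loc, value in enumerate(row):
            if value is None or math.isnan(value):
                continue  # skip missing months (assumed NaN/None encoding)
            sums[(year, loc)] += value
            counts[(year, loc)] += 1
    return [(year, loc, sums[(year, loc)] / counts[(year, loc)])
            for (year, loc) in sorted(sums)]
```

A location with no valid months in a year simply produces no row for that year. A stricter version might require a minimum number of months before reporting a yearly mean.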
To check the equally spaced grid locations, I made this graph with each location colored according to its land coverage.
Looks reasonable. Some islands and narrow land masses (like Florida and Italy) are too small to get a land-only grid location. The unevenness of the dots is due to the map projection, which is not classified as “equal area.” It’s the Kavrayskiy VII “compromise” projection, but it does a decent job with area evenness, except near the poles.
Here’s a close-up on Europe using an equal-area projection (Albers):
Before looking at the time-based warming stripes, we can apply the recent (last five year average) temperature data to each grid point on the map.
I sized the dots relative to their latitude and made them large enough to touch in most places, which mimics a contour plot in appearance. It’s interesting how the land masses still show up in most places. The overall picture is alarmingly red, of course, and the light spot in the North Atlantic is not encouraging either since I’m guessing it’s Greenland ice melt.
Here’s the same scatterplot using an equal-area projection (Hammer). The dots are more evenly spaced, and you can even make out Antarctica now.
I should note that I’m following the Berkeley Earth definition of temperature anomaly, which uses a baseline period of 1951-1980. The “standard” warming stripes use a baseline period of 1971-2000. The standard version also uses a color scale that’s based on the standard deviation at the location being shown, but I’m using the same color scale for all locations.
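Switching between baseline periods is a simple shift for any one series of yearly values: subtract the series' own mean over the new baseline years. The sketch below illustrates that arithmetic only. It is not something done for the charts in this post, and the function name is hypothetical.

```python
def rebaseline(years, anomalies, start, end):
    """Shift anomalies so their mean over the years start..end (inclusive) is zero."""
    base = [a for y, a in zip(years, anomalies) if start <= y <= end]
    if not base:
        raise ValueError("no data in the requested baseline period")
    offset = sum(base) / len(base)
    return [a - offset for a in anomalies]


# Example: re-express a 1951-1980 based series against 1971-2000
# shifted = rebaseline(years, anomalies, 1971, 2000)
```

Because recent decades are warmer, moving to the later baseline lowers every value by the same amount, which would shift where the white midpoint of the color scale falls without changing the shape of the trend.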
Linking the grid locations with the historical temperature data, we can get to the goal of warming stripes by latitude.
To make it a little less noisy, I grouped the latitudes into 10° intervals. I got what I wanted, but there is still one noticeable misrepresentation: the surface area represented by each band decreases away from the equator, yet each has the same area in the graph. I can try to address that with proportionally sized bands.
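For the proportional sizing, the area of a latitude band on a sphere is proportional to the difference in the sines of its bounding latitudes. Here is a minimal sketch of that calculation, assuming a spherical Earth and a band width that divides 180 evenly; the function name is illustrative.

```python
import math


def band_area_fractions(width_deg=10):
    """Fraction of a sphere's surface in each latitude band, south to north.

    Returns a list of (low_deg, high_deg, fraction) tuples.
    Assumes width_deg divides 180 evenly.
    """
    edges = [-90 + i * width_deg for i in range(180 // width_deg + 1)]
    bands = []
    for lo, hi in zip(edges, edges[1:]):
        # Area between two latitudes is proportional to sin(hi) - sin(lo);
        # dividing by 2 normalizes the whole sphere to 1.
        frac = (math.sin(math.radians(hi)) - math.sin(math.radians(lo))) / 2
        bands.append((lo, hi, frac))
    return bands
```

With 10° bands, the band from 0° to 10° covers about 8.7% of the surface while the band from 80° to 90° covers about 0.76%, so equal-height rows overstate the polar bands by more than a factor of ten.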
Another downside of my chart is that by using a single global color scale, the larger changes at the poles swamp the other latitudes, which are still experiencing significant changes. Perhaps if I were a climatologist, I would know how to make a fair, global relative scale. But I’ll make an effort following the use of standard deviations in the standard warming stripes coloring. Here I color by the average z score across each latitude interval.
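One way to compute such a value is sketched below. It assumes each location's z scores are taken relative to that location's own mean and standard deviation over the full series, then averaged across the locations in a band for each year. The exact centering and scaling behind the chart may differ, and the function and argument names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(series):
    """Standardize one location's yearly anomalies against its own history."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]  # a flat series has no variation to scale by
    return [(x - mu) / sigma for x in series]


def band_z_scores(series_by_location, band_of_location):
    """Average per-location z scores within each latitude band, year by year.

    series_by_location: {location: [anomaly per year]}, all the same length
    band_of_location: {location: band label}
    """
    grouped = {}
    for loc, series in series_by_location.items():
        grouped.setdefault(band_of_location[loc], []).append(z_scores(series))
    return {band: [mean(vals) for vals in zip(*zs)]
            for band, zs in grouped.items()}
```

Standardizing before averaging puts a tropical location that warmed by a modest amount on the same footing as a polar one that warmed by several times as much, as long as each change is large relative to that location's usual variability.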
Getting back to the inspirational chart which had country and continent breakdowns, what if I applied the latitude breakdown across continents? To get that, I filtered on mostly-land grid locations and split them into four longitudinal regions.
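The filtering and splitting step can be sketched as follows. The land threshold of 0.5 and the longitude cut points at -90°, 0° and 90° are placeholders chosen for illustration, not the values behind the chart, and the record layout is hypothetical. Longitudes are assumed to run from -180 to 180.

```python
from bisect import bisect_right


def land_regions(locations, land_threshold=0.5, lon_cuts=(-90, 0, 90)):
    """Keep mostly-land grid locations and tag each with a longitude region index.

    locations: dicts with 'longitude', 'latitude' and 'land_mask' keys
    land_threshold: minimum land fraction to count as mostly land (assumed)
    lon_cuts: ascending cut points; three cuts give four regions (assumed)
    """
    kept = []
    for loc in locations:
        if loc["land_mask"] > land_threshold:
            # bisect_right counts how many cut points are <= the longitude,
            # which is the zero-based index of the region it falls in.
            region = bisect_right(lon_cuts, loc["longitude"])
            kept.append({**loc, "region": region})
    return kept
```

Combining the region index with a 10° latitude band gives the cells of the grid of heat maps, and any cell with too few mostly-land locations would be left blank.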
The blank areas correspond to areas without enough land-based grid locations. Unfortunately, I think the arrangement of heat maps looks too much like a map, which makes it hard to remember the x axis is time (not longitude). Still fun to make, and now I have a better understanding of the challenges around the seemingly simple warming stripes.