I’m a long-time fan and supporter of Neil Sloane’s On-Line Encyclopedia of Integer Sequences (OEIS), and I recently learned, via a Numberphile video, about a graphical oddity called Sloane’s Gap. The gap is a visual artifact seen when plotting the number of occurrences of each number appearing in the OEIS. The 2011 paper that introduced the gap included this figure.

I don’t know that the gap’s existence has been well explained. The numbers above the gap tend to be more interesting, such as prime numbers, perfect powers, and numbers with many divisors. Numberphile suggested the gap may reflect a tendency of people to record sequences containing such interesting numbers, rather than an underlying mathematical explanation.
Since the OEIS has grown since 2011, I downloaded the latest sequences to see the current state of the gap.

Using newer data means the counts are higher than before, but the overall shape is the same, including the gap. Other differences:
- I applied some transparency to the dots.
- I only counted each number once per sequence. That mainly affects the low numbers. Otherwise the count of 1’s would be 1.4 million instead of 238,727.
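The once-per-sequence counting can be sketched roughly as below. This is only an illustration, not the script I actually used: it assumes each data line looks like `A000045 ,0,1,1,2,3,5,` (an A-number, a space, then comma-separated terms) with `#` marking comment lines, and the names `count_occurrences` and `sample` are made up for the example.

```python
from collections import Counter

def count_occurrences(lines, max_n=10_000):
    """For each n in 1..max_n, count how many sequences contain n at least once."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        # Assumed layout: "A000045 ,0,1,1,2,3,5,"
        _, _, terms = line.partition(" ")
        # A set drops repeats, so each number counts once per sequence.
        values = {int(t) for t in terms.split(",") if t.strip()}
        counts.update(v for v in values if 1 <= v <= max_n)
    return counts

# Hypothetical two-sequence excerpt, just to show the behavior.
sample = [
    "# hypothetical excerpt",
    "A000012 ,1,1,1,1,1,",
    "A000045 ,0,1,1,2,3,5,8,",
]
counts = count_occurrences(sample)
print(counts[1], counts[2])  # 2 1
```

Dropping the `set` and counting every term instead would give the inflated totals for small numbers such as 1.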
Here’s a version with primes, powers, and superabundant numbers colored differently, and with a few outliers labeled.

It’s eerie how all the primes are above the gap.
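For anyone wanting to reproduce the coloring, here is a minimal sketch of how the three categories could be computed. It makes some assumptions of its own: "powers" is taken to mean perfect powers a^b with a ≥ 2 and b ≥ 2, and a superabundant number is one whose σ(n)/n exceeds that of every smaller number. The function name `classify` is hypothetical.

```python
def classify(limit=10_000):
    """Return (primes, perfect_powers, superabundant) for 1..limit."""
    # Primes via the sieve of Eratosthenes.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    primes = {n for n in range(2, limit + 1) if is_prime[n]}

    # Perfect powers a**b with a >= 2, b >= 2 (assumed definition).
    powers = set()
    a = 2
    while a * a <= limit:
        v = a * a
        while v <= limit:
            powers.add(v)
            v *= a
        a += 1

    # sigma[n] = sum of divisors of n, built with a divisor sieve.
    sigma = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(d, limit + 1, d):
            sigma[m] += d

    # Superabundant: sigma(n)/n strictly beats every smaller number.
    # Ratios are compared by cross-multiplication to stay in integers.
    superabundant = []
    best_num, best_den = 0, 1
    for n in range(1, limit + 1):
        if sigma[n] * best_den > best_num * n:
            superabundant.append(n)
            best_num, best_den = sigma[n], n
    return primes, powers, superabundant

primes, powers, superabundant = classify()
print(superabundant[:9])  # [1, 2, 4, 6, 12, 24, 36, 48, 60]
```

The categories overlap (4 is both a perfect power and superabundant), so a plot has to pick a priority order for the colors.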
It’s hard to compare the relative specialness of outliers. I believe it was Howard Wainer who emphasized the need to “flatten” data for better visual comparison. To do that, we can compute an equation for the curve as the paper does (log-log regression ignoring the first 300 numbers), and graph the residuals as ratios against the curve.
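A rough sketch of that flattening step follows, assuming a plain least-squares line through (log n, log count) for n > 300 and a residual defined as the observed count divided by the fitted count. The function names and the synthetic data are made up for illustration; real input would be the occurrence counts from the OEIS download.

```python
import math

def fit_loglog(counts, skip=300):
    """Least-squares fit of log(count) = a * log(n) + b, using only n > skip."""
    pts = [(math.log(n), math.log(c))
           for n, c in counts.items() if n > skip and c > 0]
    k = len(pts)
    mean_x = sum(x for x, _ in pts) / k
    mean_y = sum(y for _, y in pts) / k
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def residual_ratios(counts, a, b):
    """Observed count divided by the fitted curve exp(b) * n**a."""
    return {n: c / math.exp(a * math.log(n) + b)
            for n, c in counts.items() if c > 0}

# Synthetic stand-in data: an exact power law, NOT real OEIS counts.
synthetic = {n: 1e6 * n ** -1.3 for n in range(1, 2001)}
a, b = fit_loglog(synthetic)
ratios = residual_ratios(synthetic, a, b)
print(round(a, 6), round(ratios[500], 6))
```

On real data a ratio well above 1 marks a number that shows up more often than the curve predicts, which is what the flattened plot makes easy to compare.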

8191 appears to be quite a special prime. Perhaps not coincidentally, it’s the only Mersenne prime in the 300 – 10,000 range. 10,000 gets a boost because some sequences are related to numbers that can be interpreted in binary and sometimes other bases.
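The Mersenne claim is easy to check by brute force. This small sketch lists every prime of the form 2^p − 1 up to 10,000 using simple trial division:

```python
def is_prime(n):
    """Trial-division primality test; fine for numbers this small."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# 2**13 - 1 = 8191 is the largest value of this form below 10,000.
mersenne = [2 ** p - 1 for p in range(2, 14) if is_prime(2 ** p - 1)]
in_range = [m for m in mersenne if 300 <= m <= 10_000]
print(mersenne)  # [3, 7, 31, 127, 8191]
print(in_range)  # [8191]
```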
Technical detail
I’m using the sequences as downloaded from the site, which means most are truncated to only include the first 50 terms or so. Since most sequences have terms in increasing order, higher numbers are more likely to be undercounted. For instance, A000027 is the sequence of positive integers, but the downloaded encyclopedia (and web view) only includes the first 77 terms.
A proper counting would first extend each sequence to cover the plotted domain. That would be generally impossible since some sequences are not completely understood, but it still might be interesting to extend those sequences that are well-understood and computable.
And, of course, there are numbers outside the 1 – 10,000 range. Counts for those are less reliable, since larger numbers generally occur later in their sequences and are more likely to be omitted from the download.
Personal trivia
Sloane mentioned me by name while displaying a graph I made of the “forest fire” sequence in an older Numberphile video, Amazing Graph II (time-stamped link). I almost didn’t recognize my name since he guessed a Chinese pronunciation of “Xan.” Given that he said I was one of his friends, maybe I can refer to him as just “Neil” now.
