Seeing Uniformity

During 2023, I thought I noticed a tendency of my two-digit authentication challenge codes (10-99) to skew toward higher values, especially those starting with 9. In 2024, I recorded all the challenge codes I saw (135), and I kept it up through 2025 (120 more). My impression was that the skew toward high values continued for a while and then faded, but now that I have two years of data, how can I quantify what I thought I was seeing?

Visual assessment

The simplest visual assessment of uniformity may be a histogram. Binning limits the kinds of pattern we can detect, but it’s at least a good starting point. Here are separate histograms for each year using bins of size 10.
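The figures weren’t made with the code below, but here’s a minimal Python/Matplotlib sketch of the same per-year histograms. The codes_2024 and codes_2025 arrays are placeholders standing in for the recorded values.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
codes_2024 = rng.integers(10, 100, size=135)  # placeholder for the 135 recorded 2024 codes
codes_2025 = rng.integers(10, 100, size=120)  # placeholder for the 120 recorded 2025 codes

bins = np.arange(10, 101, 10)                 # 10-19, 20-29, ..., 90-99
fig, axes = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
for ax, codes, year in zip(axes, (codes_2024, codes_2025), (2024, 2025)):
    ax.hist(codes, bins=bins, edgecolor="black")
    ax.set_title(str(year))
    ax.set_xlabel("challenge code")
axes[0].set_ylabel("count")
plt.tight_layout()
plt.show()
```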

The 2024 distribution looks a little suspicious, with a conspicuous dearth of codes starting with “1”. Interpreting any bias toward higher values is muddied by the unusually large “30s” bin. The 2025 distribution has less variation and appears more compatible with an underlying uniform distribution.

Because these are discrete values rather than continuous measurements, both the graphics and the tests will be coarser than usual.

Another useful and underutilized graphical representation of a distribution is a Cumulative Distribution Function (CDF) plot. A CDF plot shows, for each value, the fraction of observations at or below it. To assess uniformity, we can plot the empirical CDF against the theoretical CDF for each year.
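Here’s a similar sketch for the empirical-vs-theoretical CDF comparison, again with placeholder arrays standing in for the real codes. The theoretical CDF is that of a discrete uniform distribution on 10-99.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
codes_2024 = rng.integers(10, 100, size=135)  # placeholders for the recorded codes
codes_2025 = rng.integers(10, 100, size=120)

values = np.arange(10, 100)                   # the 90 possible codes
uniform_cdf = (values - 9) / 90               # CDF of a discrete uniform on 10..99

fig, axes = plt.subplots(1, 2, sharey=True, figsize=(8, 3))
for ax, codes, year in zip(axes, (codes_2024, codes_2025), (2024, 2025)):
    ecdf = np.searchsorted(np.sort(codes), values, side="right") / len(codes)
    ax.step(values, ecdf, where="post", label="empirical")
    ax.plot(values, uniform_cdf, "--", label="uniform")
    ax.set_title(str(year))
    ax.set_xlabel("challenge code")
    ax.legend()
axes[0].set_ylabel("cumulative fraction")
plt.tight_layout()
plt.show()
```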

Again, 2024 has more deviation from uniformity.

Perhaps a downside of the CDF plot is that it’s not very space efficient. That is, the values being compared (deviation from the diagonal) are very small compared to the graph size. Plus, due to the sine illusion (where we perceive the perpendicular distance rather than the y-axis aligned distance), it’s harder to assess the distance between the curves. Both of those issues can be addressed by applying John Tukey’s advice,

Whatever the data, we can try to gain understanding by straightening or by flattening. When we succeed in doing one or both, we almost always see more clearly what is going on.

In this case, we can just subtract the two curves so our eyes don’t have to work against their nature. That transformation doesn’t change the data or the test, but it puts the relevant variation onto a scale our eyes are better at judging.
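A sketch of that flattened view: subtract the uniform CDF from each empirical CDF and plot the difference around zero. As before, the arrays are placeholders rather than the recorded data.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
codes_2024 = rng.integers(10, 100, size=135)  # placeholders for the recorded codes
codes_2025 = rng.integers(10, 100, size=120)

values = np.arange(10, 100)
uniform_cdf = (values - 9) / 90

plt.axhline(0, color="gray", lw=0.5)
for codes, year in ((codes_2024, 2024), (codes_2025, 2025)):
    ecdf = np.searchsorted(np.sort(codes), values, side="right") / len(codes)
    plt.step(values, ecdf - uniform_cdf, where="post", label=str(year))
plt.xlabel("challenge code")
plt.ylabel("ECDF minus uniform CDF")
plt.legend()
plt.show()
```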

So while the 2024 codes do look suspiciously non-uniform, is there any way to quantify it and detect any transition over time?

Statistical assessment

It turns out that both of the graphical techniques above have matching statistical tests. The binned histograms correspond to a chi-squared test on the count distribution. I get a likelihood-ratio p-value of 0.013 for 2024 and 0.956 for 2025, so for 2024 we can reject uniformity at the usual 0.05 threshold. (Technically, this is a G-test, a more precise formulation of the chi-squared test for sample sizes under 1000.)
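A G-test (likelihood-ratio chi-squared) on the nine decade bins can be sketched with SciPy’s power_divergence. The codes_2024 array here is a placeholder, so the printed p-value won’t match the 0.013 above.

```python
import numpy as np
from scipy.stats import power_divergence

rng = np.random.default_rng(0)
codes_2024 = rng.integers(10, 100, size=135)  # placeholder for the recorded 2024 codes

counts, _ = np.histogram(codes_2024, bins=np.arange(10, 101, 10))
expected = np.full(9, len(codes_2024) / 9)    # uniform expectation across the nine decade bins
g_stat, p_value = power_divergence(counts, f_exp=expected, lambda_="log-likelihood")
print(f"G = {g_stat:.2f}, p = {p_value:.3f}")
```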

For the CDF graph, the corresponding statistic is the Kolmogorov–Smirnov distance: basically (after adjusting for discreteness) the maximum vertical distance between the empirical CDF and the theoretical CDF, which is nicely apparent in the flattened view. The p-values for the K-S distances for the two years are similar to those of the likelihood ratios (0.006 and 1.000). It’s counterintuitive that a measurement at a single position on the curve can summarize the deviation of the whole curve, but it works because of the two-way cumulative nature of the curve (it starts and ends at zero), so each point depends on all the other points.
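The distance itself can be sketched directly as the largest gap between the two CDFs. The p-value below uses the asymptotic continuous-case approximation (which is conservative for discrete data), not the exact discreteness adjustment mentioned above, and the data is again a placeholder.

```python
import numpy as np
from scipy.stats import kstwobign

rng = np.random.default_rng(0)
codes_2024 = rng.integers(10, 100, size=135)  # placeholder for the recorded 2024 codes

values = np.arange(10, 100)
uniform_cdf = (values - 9) / 90
n = len(codes_2024)
ecdf = np.searchsorted(np.sort(codes_2024), values, side="right") / n
d = np.abs(ecdf - uniform_cdf).max()          # the K-S distance
p_approx = kstwobign.sf(d * np.sqrt(n))       # asymptotic continuous-case p-value
print(f"D = {d:.3f}, approximate p = {p_approx:.3f}")
```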

Uniformity over time

So it does seem plausible that the 2024 values were somehow biased. Trying to detect a specific transition to uniformity is a bit trickier. One approach I came up with is to plot a running Kolmogorov–Smirnov distance computed over a window centered on each observation. Here’s what that looks like for a moving window with 50 values on either side (though a side may be truncated at the boundary).
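A rough sketch of that running calculation, assuming a single placeholder array holding the codes in the order they were observed: for each position, take up to 50 codes on either side and compute the K-S distance of that window against the discrete uniform.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
codes = rng.integers(10, 100, size=255)       # placeholder for all 255 codes, in time order

values = np.arange(10, 100)
uniform_cdf = (values - 9) / 90

def ks_distance(window):
    ecdf = np.searchsorted(np.sort(window), values, side="right") / len(window)
    return np.abs(ecdf - uniform_cdf).max()

half = 50
running = [ks_distance(codes[max(0, i - half): i + half + 1]) for i in range(len(codes))]
plt.plot(running)
plt.xlabel("observation index (time order)")
plt.ylabel("windowed K-S distance")
plt.show()
```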

The black line shows the critical threshold for a single distance measure; the threshold gets larger near the edges since the window gets clipped (and thus contains fewer data values). Since the graph contains many distance tests, it’s expected that even truly random sequences exceed the threshold at some point. In other words, the black line is only a rough visual reference, and we shouldn’t read too much into any single excursion.
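One way to approximate that critical line is with the asymptotic one-sample K-S critical value, c(alpha)/sqrt(n), applied to each window’s clipped size; the line in the plot may differ, since it should also account for discreteness.

```python
import numpy as np
from scipy.stats import kstwobign

N, half, alpha = 255, 50, 0.05                # 255 placeholder observations, window of +/-50
idx = np.arange(N)
n_window = np.minimum(idx + half, N - 1) - np.maximum(idx - half, 0) + 1
c_alpha = kstwobign.ppf(1 - alpha)            # about 1.358 for alpha = 0.05
threshold = c_alpha / np.sqrt(n_window)       # larger near the edges, where the window is clipped
print(threshold[0], threshold[N // 2])
```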

However, I can’t help but notice that the running distance value is near or above the critical line for all of 2024 and comfortably below it for most of 2025. Just as I was about to pat myself on the back for proving my original premise of bias, I tried one more diagnostic: seeing how different random sequences would look. Below is the same plot overlaid with the same distance calculation for 100 sequences of random integers.
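The overlay can be sketched the same way: run the identical windowed calculation on 100 freshly generated uniform sequences and draw them behind the observed curve. Since the observed codes here are themselves placeholders, the highlighted curve will just look like one more reference curve.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
codes = rng.integers(10, 100, size=255)       # placeholder for the recorded codes, in time order

values = np.arange(10, 100)
uniform_cdf = (values - 9) / 90
half = 50

def running_ks(seq):
    dists = []
    for i in range(len(seq)):
        window = np.sort(seq[max(0, i - half): i + half + 1])
        ecdf = np.searchsorted(window, values, side="right") / len(window)
        dists.append(np.abs(ecdf - uniform_cdf).max())
    return dists

for _ in range(100):                          # 100 reference sequences of truly uniform codes
    plt.plot(running_ks(rng.integers(10, 100, size=len(codes))), color="gray", alpha=0.2)
plt.plot(running_ks(codes), color="red", lw=2, label="observed (placeholder)")
plt.legend()
plt.xlabel("observation index (time order)")
plt.ylabel("windowed K-S distance")
plt.show()
```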

Now my data doesn’t look so special.

Maybe it’s not so bad. With 100 curves, you’d expect a few to be more extreme than a global 0.05 threshold (not shown) and most of them to exceed the point-wise 0.05 threshold (dotted line). At least the latter part of my curve is securely in the dark central region of curves. Besides, I only suspected the 2024 data to be a little off, not defectively so.

Whether or not there was something off in my data, this investigation has been a good reminder that deviations are surprisingly common, even in truly random data.
