Quordle/Octordle starting triplets

As a follow-up to my Deep Wordle exploration of optimal Wordle play, I investigated a different strategy for variations like Quordle, Octordle and Sedecordle, where you use the same guesses to find multiple solutions in parallel. In those games, I start with a fixed triplet of starting words and then hope to find one solution with each subsequent guess. With three opening guesses plus one guess per board, that would be Quordle in 7, Octordle in 11 and Sedecordle in 19.

I’ve had pretty good success with the triplet “chomp grind salty”, achieving the 7 goal over half the time. It includes 15 distinct common letters and strategically omits “e” and “u”, hoping they can be guessed when needed. How can I quantify how great a triplet I found?

Scoring triplets

While it’s not practical to fully examine every possible triplet of guesses against every possible set of 4 or 8 or 16 solutions, I tried a few different approximate scoring methods:

  1. Average bucket size for each triplet
  2. Maximum bucket size for each triplet
  3. Average game length, from simulated games
  4. Probability of solve-in-seven, from simulated games

Triplets

I constrained the triplets to my set of about 3600 “common” five-letter words [commomn.txt on github]. I made it by merging the 2300 Wordle solution words with other lists of common words. Many of the additions are plurals of four-letter words.

Selecting three words using fifteen distinct letters resulted in 4 million triplets. At one point, I disallowed rare letters “jkqxz” which reduced the triplet count to 1 million, but with only a 4x difference, I stuck with the full letter set.
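For a sense of how that enumeration might look, here is a minimal Python sketch. It is an illustration under assumptions, not the code actually used for the post: `letter_mask` and `distinct_triplets` are made-up names, and the word list is whatever you pass in. Each word becomes a 26-bit mask so that "shares no letter" is a single bitwise AND.

```python
from itertools import combinations

def letter_mask(word):
    """Return a 26-bit mask of the letters in word, or None if a letter repeats."""
    mask = 0
    for ch in word:
        bit = 1 << (ord(ch) - ord("a"))
        if mask & bit:
            return None
        mask |= bit
    return mask

def distinct_triplets(words):
    """Yield alphabetically ordered triplets that together use 15 distinct letters."""
    # Words with a repeated letter can never be part of such a triplet.
    masked = [(w, m) for w in sorted(words) if (m := letter_mask(w)) is not None]
    for i, (w1, m1) in enumerate(masked):
        # Keep only later words that share no letter with w1.
        rest = [(w, m) for w, m in masked[i + 1:] if not (m & m1)]
        for (w2, m2), (w3, m3) in combinations(rest, 2):
            if not (m2 & m3):
                yield (w1, w2, w3)

words = ["chomp", "grind", "salty", "hello", "chimp", "robed", "slant"]
print(list(distinct_triplets(words)))
# [('chimp', 'robed', 'salty'), ('chimp', 'robed', 'slant'), ('chomp', 'grind', 'salty')]
```

Filtering the partners of the first word before pairing them up keeps the inner loop much smaller than a blind triple loop over the whole word list.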

Buckets

By “bucket size” I’m referring to the way each Wordle response partitions the potential solutions into groups I’m calling buckets. There are three responses per letter (gray, yellow and green) and so 3×3×3×3×3 = 243 possible responses per word, creating 243 buckets of potential solutions. For three words, the upper bound for the number of buckets is 243 × 243 × 243 = 14.3 million.
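The partitioning can be sketched in a few lines of Python. This is only an illustration: the names `response` and `buckets` and the choice of 0/1/2 for gray/yellow/green are assumptions, not taken from the original code. The five marks are packed into one base-3 number, which is where the 243 comes from.

```python
from collections import Counter, defaultdict

GRAY, YELLOW, GREEN = 0, 1, 2  # assumed encoding, one base-3 digit per letter

def response(guess, solution):
    """Return the Wordle response to guess as a base-3 integer in range(243)."""
    marks = [GRAY] * 5
    unmatched = Counter()  # solution letters not covered by a green
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            marks[i] = GREEN
        else:
            unmatched[s] += 1
    for i, g in enumerate(guess):
        # A non-green letter is yellow only while unmatched copies remain.
        if marks[i] == GRAY and unmatched[g] > 0:
            marks[i] = YELLOW
            unmatched[g] -= 1
    code = 0
    for m in marks:
        code = code * 3 + m
    return code

def buckets(guess, solutions):
    """Group the solutions by the response they would give to guess."""
    out = defaultdict(list)
    for s in solutions:
        out[response(guess, s)].append(s)
    return dict(out)

print(response("salty", "salty"))  # 242, all green
print(response("slant", "salty"))  # 199, marks 2,1,1,0,1
```

For a triplet, the three codes can be kept as a tuple or combined into a single index below 243 × 243 × 243.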

Since the words have distinct letters, many of those clue combinations can’t happen. For instance, at most 5 of the 15 letter responses can be non-gray. It would be an interesting combinatorial counting problem to determine a better upper bound, but fortunately allowing for all 14.3 million possibilities wasn’t a burden to the scoring. For what it’s worth, I saw 84,357 non-empty buckets using legal Wordle words.
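One easy refinement follows directly from the "at most 5 non-gray" observation, offered here as a rough sketch rather than a tight answer: pick which k of the 15 letters are non-gray (k from 0 to 5), then pick yellow or green for each. The function name `response_bound` is made up for illustration. The count still ignores other constraints, such as two greens never landing in the same position, so it remains an overestimate.

```python
from math import comb

def response_bound(n_letters=15, max_hits=5):
    """Count clue patterns with at most max_hits non-gray letters,
    each non-gray letter being either yellow or green."""
    return sum(comb(n_letters, k) * 2**k for k in range(max_hits + 1))

print(response_bound())  # 122027
```

That brings the ceiling down from 14.3 million to about 122 thousand.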

Average bucket size

A good triplet will split the 2300 possible solution words into many buckets with very few words in each bucket, ideally one (or zero). So as an approximate triplet quality score, I computed the average size of the non-zero buckets. Here’s the distribution of that average over all 4 million triplets.
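As a sketch of that score (with assumed names `clue`, `bucket_sizes` and `average_bucket_size`, and assuming a plain unweighted mean over the non-empty buckets), the average is just the number of solutions divided by the number of distinct responses. Because every guess word in a triplet has five different letters, the response rule simplifies: a letter is green if it matches in place, yellow if it appears anywhere else in the solution, and gray otherwise.

```python
from collections import Counter

def clue(guess, solution):
    """Base-3 response code; valid only for guesses with no repeated letters."""
    code = 0
    for g, s in zip(guess, solution):
        code = code * 3 + (2 if g == s else 1 if g in solution else 0)
    return code

def bucket_sizes(triplet, solutions):
    """Count how many solutions share each combined three-word response."""
    return Counter(tuple(clue(g, s) for g in triplet) for s in solutions)

def average_bucket_size(triplet, solutions):
    """Mean size of the non-empty buckets."""
    return len(solutions) / len(bucket_sizes(triplet, solutions))

triplet = ("chomp", "grind", "salty")
solutions = ["salty", "grind", "jelly", "belly", "fully"]
print(average_bucket_size(triplet, solutions))           # 5 solutions in 3 buckets
print(max(bucket_sizes(triplet, solutions).values()))    # 3
```

In this toy list "jelly", "belly" and "fully" all give the same response to the triplet, so they land in one bucket of size 3. The same `Counter` also gives the maximum bucket size for free.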

Histogram showing distribution of the average number of potential solutions per response to each of 4M triplets of Wordle guesses. A small number are in the 1.25-1.5 bin. Peak is in 2.25-2.5 bin, tapering off quickly.

The triplets with the lowest averages were:

chimp robed slant  1.418
boned clasp right  1.423
chimp gored slant  1.424
bored chimp slant  1.428
birch moped slant  1.428

My “chomp grind salty” came in ranked 107,173. Not so great, after all, but out of 4 million triplets, it’s in the top 2.5%. 😀

Maximum bucket size

Maximum bucket size is probably less interesting: the worst case likely arises from a very obscure solution set, and because the result is an integer there are a lot of ties. Nonetheless, it was nice to see the distribution (clipped to 31 in the histogram), and there was one surprising result. Only a single triplet had a maximum bucket size of 5: charm gifts poled.

Histogram showing distribution of the maximum number of potential solutions per response to each of 4M triplets of Wordle guesses.

Simulated game scores

With such small average bucket sizes after three guesses, it should be possible to simulate a few more moves to get a better sense of the playability of each triplet. We can’t play all 2300 × 2300 × 2300 × 2300 = 28 trillion solution sets (even more for Octordle), but a random sampling should be informative. My Mac is able to run 50,000 game simulations for a single guess triplet in under a second. Still too slow to cover all 4 million triplets, so I ran the game simulation on the top 100,000 triplets.
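The post does not spell out how the simulated player chooses its guesses, so the following Python sketch uses a deliberately simple, hypothetical policy: play the triplet, then always guess the first remaining candidate on the unsolved board with the fewest candidates. The names `clue`, `play` and `simulate`, the seed, and the default game count are all assumptions for illustration.

```python
import random

def clue(guess, solution):
    """Wordle response as a tuple of 0 (gray), 1 (yellow), 2 (green)."""
    marks = [0] * 5
    spare = []  # solution letters not matched by a green
    for i, (g, s) in enumerate(zip(guess, solution)):
        if g == s:
            marks[i] = 2
        else:
            spare.append(s)
    for i, g in enumerate(guess):
        if marks[i] == 0 and g in spare:
            marks[i] = 1
            spare.remove(g)
    return tuple(marks)

def play(triplet, targets, solutions):
    """Play one multi-board game and return the number of turns used."""
    candidates = [list(solutions) for _ in targets]
    solved = [False] * len(targets)
    opening = list(triplet)
    turns = 0
    while not all(solved):
        if opening:
            guess = opening.pop(0)
        else:
            # Assumed policy: attack the unsolved board with the fewest candidates.
            board = min((b for b in range(len(targets)) if not solved[b]),
                        key=lambda b: len(candidates[b]))
            guess = candidates[board][0]
        turns += 1
        for b, target in enumerate(targets):
            if solved[b]:
                continue
            if guess == target:
                solved[b] = True
            else:
                # Keep only words consistent with the clue this board showed.
                seen = clue(guess, target)
                candidates[b] = [w for w in candidates[b]
                                 if w != guess and clue(guess, w) == seen]
    return turns

def simulate(triplet, solutions, boards=4, games=1000, seed=0):
    """Return (average turns, fraction of games solved in 3 + boards turns)."""
    rng = random.Random(seed)
    goal = len(triplet) + boards
    results = [play(triplet, rng.sample(solutions, boards), solutions)
               for _ in range(games)]
    return sum(results) / games, sum(t <= goal for t in results) / games
```

A player this naive never spends a guess purely to split a large bucket, which is one way a simulation could score lower than a thoughtful human playing the same triplet.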

The average game score (in turn count) and the probability of solving in seven turns were highly correlated, especially at the top. Here are the top five triplets by probability, also showing the average turn count.

chimp	robed	slant	53.5%	7.562
bored	chimp	slant	53.3%	7.570
chimp	gored	slant	53.0%	7.575
coped	glint	marsh	52.8%	7.582
birch	moped	slant	52.8%	7.579

Amazingly, “chimp robed slant” is still at the top, by both measures. Only “coped glint marsh” is new to the top five, having ranked 28th in the average bucket size score. 186 triplets achieved a solve-in-seven success rate of at least 50%, but I feel like those values are a bit low. My daily stats for “chomp grind salty” show a 50+% solve-in-seven success rate, but the simulations give it a 41% chance. I’m guessing that’s a result of the simplified solving behavior used in the simulations. Even with that downward bias, the score should be meaningful for relative quality.

I manually added “chomp grind salty” just for kicks, and it did move up 60,000 spots in the rankings, which saves my pride a little bit.

Simulation versus bucket size scoring

The two scoring methods, bucket size and game simulation, are reasonably correlated. Here are the two measures against each other for the top 100,000 that I simulated.

Scatter plot of the bucket size score on the Y and the simulation score on the X. Looks like a linear correlation but widening at the high values.

I had to crank the transparency way down to get a sense of the core density, but then it’s hard to see the outlying points. Here’s a 2-D HDR plot to show both the Highest Density Regions and the outliers.

HDR plot of the bucket size score on the Y and the simulation score on the X. Looks like a linear correlation but widening at the high values.

That’s “chimp robed slant” in the bottom left. In spite of the randomness, I tend to trust the simulated games as a truer measure of the quality of each triplet since it’s more grounded in game play.

It’s all good

Though I’ve highlighted the top scoring triplets, the real take-away is that there were about 1 million triplets of very similar quality (first histogram). So pick one you enjoy. I think I’ll start using “build graph often” which has a respectable 7.88 average turns per game.

Bonus chart: letter sets

Each triplet uses 15 distinct letters, and many of them use the same letters. In other words, there are many anagram triplets. The most common letter set among the top 100,000 triplets was “acdehilmnoprstu” with 1753 anagrams! You might think that anagram triplets would have very similar scores, but there is more variation than I expected. Here are 1-D HDR plots for the best scoring letter sets, denoted by their difference from the most common letter set.
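Grouping triplets by letter set takes only a few lines. In this Python sketch the names `letter_set` and `difference_label`, and the "-removed +added" label format, are illustrative guesses; the chart's actual labelling may differ.

```python
from collections import Counter

def letter_set(triplet):
    """Sorted string of the 15 letters used by a triplet."""
    return "".join(sorted("".join(triplet)))

def difference_label(letters, reference):
    """Describe a letter set by what it drops from and adds to a reference set."""
    removed = "".join(sorted(set(reference) - set(letters)))
    added = "".join(sorted(set(letters) - set(reference)))
    return f"-{removed} +{added}" if removed or added else "same"

triplets = [("chimp", "robed", "slant"), ("bored", "chimp", "slant"),
            ("chomp", "grind", "salty")]
counts = Counter(letter_set(t) for t in triplets)
print(counts.most_common(1))   # [('abcdehilmnoprst', 2)]
print(difference_label(letter_set(triplets[0]), "acdehilmnoprstu"))  # -u +b
```

Here "chimp robed slant" and "bored chimp slant" count as anagram triplets because "robed" and "bored" use the same letters.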

HDR plots of the simulation score on the X for each letter group on the Y.
