Site Update and New Attempt Estimation
The first iteration of J!ometry was a Google Sheet with some graphs I’d post from time to time. The second iteration took that process and put it in a local database, which I could publish to online CSVs and build some graphs around. The third iteration smoothed out the graph process. In each case, the changes were largely organic to whatever I was pushing on at the time, and the underlying calculations remained largely the same. Since those calculations originated in a spreadsheet, they were structurally unsound: contorted to fit into single formulas, unwieldy, and difficult to understand.
Now I am launching the next iteration of the data and the site. Under the covers, there’s some cool stuff:
The data pipeline verifies J! Archive and box score data against each other to catch its own misinterpretations.
The fundamental data structures go from game, round, and sometimes clue to game, round, clue, buzz opportunity, and wager opportunity. This allows the attempt estimation to more easily consider when there are multiple possibilities for attempts on a single clue.
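To make the hierarchy concrete, here is a minimal sketch of what such data structures might look like. This is my own illustration, not the pipeline's actual code; all class and field names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class BuzzOpportunity:
    # One contestant's chance to buzz on one clue.
    clue_id: str
    contestant: str
    buzzed: Optional[bool] = None  # None = unknown, i.e. a possible attempt

@dataclass
class WagerOpportunity:
    # A Daily Double or Final Jeopardy wager for one contestant.
    clue_id: str
    contestant: str
    wager: Optional[int] = None

@dataclass
class Clue:
    row: int      # 1 (top of the board) through 5 (bottom)
    value: int
    buzz_opps: list = field(default_factory=list)
    wager_opps: list = field(default_factory=list)

@dataclass
class Round:
    name: str     # e.g. "Jeopardy!" or "Double Jeopardy!"
    clues: list = field(default_factory=list)

@dataclass
class Game:
    game_id: str
    rounds: list = field(default_factory=list)
```

The key change from the old structure is that a single clue can own multiple buzz opportunities, one per contestant, so attempt estimation can reason about each contestant's possible attempts on the same clue separately.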
It’s easier for the pipeline to accept different forms of attempt estimation that are based on the combination of box score and J! Archive data. Although direct calculation is still supported, there is an emphasis on allowing simulations and averaging results.
Data is delivered to the website in a larger variety of files at different levels of granularity. In particular, the addition of round-based files allows game and round statistics to be treated more like each other, which makes it easier to show more round-level statistics on the site.
But there’s some cool stuff that’s directly visible, too:
The new attempt estimation more directly includes the possibility of multiple attempts on a clue. This has had the general effect of reducing estimated attempt value (although not always).
Some graphs that have thus far only been posted on Substack are now available on the site itself.
Tabular data has been broken into smaller chunks to make it easier to read.
In addition, narrower tables, better table formatting, and graphs redrawn at proper scale make the site look and feel better on mobile devices.
Game pages have some new game state information for Daily Doubles and Final Jeopardy.
Attempt Estimation
As noted above, the data pipeline now leans on simulation to produce its estimates. What exactly does it do?
1. For a particular contestant and round, use J! Archive data to classify all buzz opportunities within the round:
   a. Known attempts the contestant made. These are their buzzes, with correct and incorrect responses.
   b. Known non-attempts. These are any buzz opportunities after the contestant has already given a response (almost always incorrect, but possibly correct if there are later scoring changes) and the last opportunity to buzz on Triple Stumper clues where the contestant doesn’t buzz at all.
   c. All other buzz opportunities, which are possible attempts.
2. Subtract the count of known attempts from the attempt count recorded in the box score. This is the number of attempts to allocate.
3. Send all of the above data to a weighting function that provides a weight for each of the possible attempts in 1c.
4. Choose a possible attempt at random, weighted by the weights from step 3. For this simulation, it becomes a known attempt.
5. Repeat steps 2–4 until there are no attempts left to allocate.
6. Repeat steps 1–5 half a million times, and return the percentage of the time each possible attempt was selected.
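The simulation loop described above can be sketched roughly as follows. This is a minimal illustration under my own assumptions about the interfaces, not the pipeline's actual code; the function and parameter names are hypothetical.

```python
import random
from collections import Counter

def simulate_allocations(known_attempts, possible_attempts,
                         box_score_attempts, weight_fn, n_sims=500_000):
    """Monte Carlo estimate of how likely each possible attempt was real.

    known_attempts: clue identifiers the contestant definitely attempted
    possible_attempts: clue identifiers that may or may not have been attempts
    box_score_attempts: total attempts recorded in the official box score
    weight_fn: maps a clue identifier to a sampling weight
    Returns the fraction of simulations in which each possible attempt
    was allocated as an attempt.
    """
    to_allocate = box_score_attempts - len(known_attempts)
    counts = Counter()
    for _ in range(n_sims):
        remaining = list(possible_attempts)
        # Allocate the leftover attempts one at a time, weighted sampling
        # without replacement within a single simulation.
        for _ in range(to_allocate):
            weights = [weight_fn(c) for c in remaining]
            pick = random.choices(remaining, weights=weights, k=1)[0]
            counts[pick] += 1
            remaining.remove(pick)
    return {c: counts[c] / n_sims for c in possible_attempts}
```

For a quick sanity check: if every possible attempt must be allocated (the box score count equals known plus possible attempts), each one should come back with probability 1 regardless of weights.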
The weighting function used by the core site metrics is simple: the weight returned for each attempt possibility is 8 minus its row on the board (1 for the top, 5 for the bottom). The weights do not take into account what’s already known or selected, even though they could, so each possibility is treated effectively independently. That is a hugely simplifying assumption, since the number of possible dependencies is dizzying and it’s not clear which to prioritize beyond the idea that the bottom of the board is more difficult. As for the weights themselves, my early work on estimating how difficulty changes down the board suggested that, if you assume each contestant’s knowledge is independent of the others’, the correct response rates at each row correspond to roughly a 70% chance of a particular contestant knowing a top row clue, decreasing by roughly 10 percentage points per row to 30% at the bottom. It’s rough, but the whole thing is rough, as there are no actual known attempt allocations to compare against and train on.
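Concretely, the row weights line up with those estimated knowledge rates: 8 minus the row gives 7, 6, 5, 4, 3, which is exactly proportional to 70%, 60%, 50%, 40%, 30%, and proportionality is all that matters for weighted sampling. A small sketch (names are mine, not the pipeline's):

```python
def row_weight(row: int) -> int:
    # Weight used by the core site metrics: 8 minus the board row.
    # Row 1 (top) -> 7, row 5 (bottom) -> 3.
    return 8 - row

# Rough per-row chance a given contestant knows the clue, per the
# estimate above: ~70% at the top, dropping ~10 points per row to ~30%.
knowledge_rate = {row: 0.8 - 0.1 * row for row in range(1, 6)}
```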
The data structures are there to start considering actual attempt allocation if Jeopardy! starts publishing them, though.
Further Notes
Although a lot of things became easier to put on the site with these changes, a few things didn’t fit quite right on the first try. Some of them were things I wasn’t sure I (or anyone) was using anymore, so I opted in favor of pushing the updates out and continuing to adapt. Let me know if something you liked is missing or if you see a spot for improvement. Sometimes those improvements are hard and aren’t done for a reason, but sometimes they’re easier than you might assume.
Finally, you can now find J!ometry on Bluesky as @j-ometry.com (I learned how to do the AT Protocol’s version of verification and it was easy and cost nothing beyond having the domain). You can additionally find me personally at @tkfocht.bsky.social though I’m going to be pushing Jeopardy! content to the J!ometry account.