Mar 20-24, 2023: The Process(es)
Melissa Klapper wins three, and DD wagering takes another turn in the spotlight
Attempt Statistics
Total attempts decreased slightly this week, falling to 112.6 per game from 113.4 per game last week. On the scale of movement this season, this is effectively level. A small increase from the lowest-attempting contestants offset a small decrease from the leading contestants, keeping the total similar but indicating a slightly more competitive game.
The ebbs and flows of the spread between max and min attempts are strange to me. Changes at the max are easy to explain: players who attempt a lot and are successful with it will return as champions, where they will continue to attempt a lot and be successful. I don’t have a sense of what would drive the bottom upwards without the top increasing as well, except for something in contestant selection. March 2021 is interesting as a prolonged time of constrained spread, and we saw that period produce both several 4-game champions and about two weeks of shorter turnover. I do remember that as a particularly exciting time to be watching. The last month or so might be trending towards that constriction again, but I don’t think any trends here are actually systemic.
Difficulty by Row
I’ve seen more chatter over the last couple of weeks about clues not falling in an order reflecting their difficulty, so I ran some numbers. Here’s the get rate (the percentage of clues read that had at least one correct response) for non-DD clues for each row in regular play, split by TOC qualification periods.
Jeopardy Round
Row      2019   2021   2022   2023
$200     96.6   95.8   94.7   97.5
$400     92.1   92.2   91.7   92.3
$600     89.5   87.6   88.3   90.4
$800     86.3   84.1   85.7   88.6
$1000    77.3   74.8   76.3   83.5
Double Jeopardy Round
Row      2019   2021   2022   2023
$400     95.2   93.5   93.5   94.2
$800     89.0   89.5   88.5   89.1
$1200    84.0   84.3   83.7   84.4
$1600    76.3   78.4   78.3   80.9
$2000    66.2   65.8   68.4   77.0
These are all aggregate measures, so any individual category in any individual game might be misordered within this, but every single percentage here decreases as you go down the board. Something is going mostly right overall.
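The "every percentage decreases going down the board" claim can be checked mechanically. Here is a minimal Python sketch with the get rates copied from the two tables above; the names `JEOPARDY`, `DOUBLE_JEOPARDY`, `columns`, and `all_columns_decrease` are just illustrative, not part of any existing tool.

```python
# Get rates (%) by row, copied from the tables above.
# Each list holds the 2019, 2021, 2022, and 2023 qualification periods.
PERIODS = ["2019", "2021", "2022", "2023"]

JEOPARDY = {
    200: [96.6, 95.8, 94.7, 97.5],
    400: [92.1, 92.2, 91.7, 92.3],
    600: [89.5, 87.6, 88.3, 90.4],
    800: [86.3, 84.1, 85.7, 88.6],
    1000: [77.3, 74.8, 76.3, 83.5],
}

DOUBLE_JEOPARDY = {
    400: [95.2, 93.5, 93.5, 94.2],
    800: [89.0, 89.5, 88.5, 89.1],
    1200: [84.0, 84.3, 83.7, 84.4],
    1600: [76.3, 78.4, 78.3, 80.9],
    2000: [66.2, 65.8, 68.4, 77.0],
}


def columns(table):
    """Turn the row-keyed table into one top-to-bottom list per period."""
    rows = [table[value] for value in sorted(table)]
    return {period: [row[i] for row in rows] for i, period in enumerate(PERIODS)}


def is_strictly_decreasing(values):
    """True if each value is lower than the one before it."""
    return all(a > b for a, b in zip(values, values[1:]))


def all_columns_decrease(table):
    """True if every period's get rate falls at each step down the board."""
    return all(is_strictly_decreasing(col) for col in columns(table).values())


print("Jeopardy:", all_columns_decrease(JEOPARDY))
print("Double Jeopardy:", all_columns_decrease(DOUBLE_JEOPARDY))
```

Both checks come back true for the figures above. Keep in mind this only tests the aggregates, so it says nothing about ordering within any single category.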
One possible objection is that clues at the bottom of the board aren’t truly that much more difficult, but that the higher penalty for being incorrect is leading players to be more cautious and not ring in at times when, if they did, they would be correct. I think the increases we see here in 2022 and particularly 2023 (wow) are good circumstantial evidence that this effect exists but has become less pronounced as play grows more aggressive. That said, the get rate still falls steadily going down the board.
The spread of percentages may also be smaller than you’d think, but keep in mind that each of these measures whether any player had a correct response, and the difficulty slope for a single player is different. For instance, if the three players’ get rates are independent of each other (this is highly dubious, as they’re definitely sometimes positively correlated and definitely sometimes negatively correlated), a 97.5% get rate overall reduces to a 70.8% rate for each player, while an 83.5% get rate overall reduces to a 45.2% rate per player. That’s a significantly wider spread than in the overall rates, even for the smallest spread in the dataset. We know from attempt data that most players attempt no more than two-thirds of the time, so I believe in some of this independence rather than “everyone is attempting everything.”
Finally, the get rate increases in the 2023 period are noticeable, and I think this might actually be what people are reacting to. The more the measurable spread between rows shrinks at the bottom of the board, the likelier it is that particular categories will feel misordered. Note that every row’s get rate has increased from the 2022 period to the 2023 period. I think these increases are more likely a reflection of how players approach the game than an overall change in the actual difficulty of the clues.
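For anyone who wants to check that arithmetic: if each of three players independently gets a clue with probability p, the chance that at least one of them gets it is 1 - (1 - p)^3, so p = 1 - (1 - overall)^(1/3). A short Python sketch follows. It assumes three players and the same (dubious) independence as above, and `per_player_rate` is a made-up name for illustration.

```python
def per_player_rate(overall_rate, players=3):
    """Per-player get rate implied by an overall get rate.

    Assumes every player has the same get rate and that players are
    independent of each other, which is a simplification.
    """
    # overall = 1 - (1 - p) ** players, solved for p
    return 1 - (1 - overall_rate) ** (1 / players)


# Top and bottom rows of the 2023 Jeopardy round.
for overall in (0.975, 0.835):
    print(f"{overall:.1%} overall -> {per_player_rate(overall):.1%} per player")
```

This reproduces the 70.8% and 45.2% figures: a 14.0-point gap in the overall rates becomes a gap of roughly 25.6 points per player.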
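One way to put a number on that shrinking spread is to take the gap between the top and bottom rows in each round. That gap is my own choice of measure for this sketch, not a standard stat, and the names `TOP_AND_BOTTOM` and `spread` are illustrative. The figures are copied from the tables above.

```python
# (top row, bottom row) get rates in %, copied from the tables above.
TOP_AND_BOTTOM = {
    "Jeopardy": {"2022": (94.7, 76.3), "2023": (97.5, 83.5)},
    "Double Jeopardy": {"2022": (93.5, 68.4), "2023": (94.2, 77.0)},
}


def spread(top, bottom):
    """Gap in percentage points between the top and bottom rows."""
    return round(top - bottom, 1)


for round_name, periods in TOP_AND_BOTTOM.items():
    for period, (top, bottom) in periods.items():
        print(f"{round_name} {period}: {spread(top, bottom)} points")
```

By this measure the gap narrows from 18.4 to 14.0 points in the Jeopardy round and from 25.1 to 17.2 points in Double Jeopardy.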
Daily Double Wagers
There was also a lot of discussion about Daily Double wagering this week based on a couple of poor wagers. Yes, I think they were poorly made, and I will stand by that. Sometimes bad wagers can work out anyway, and sometimes they don’t, but analytics is largely about process, and evaluating process to understand how results happen. The process here was sometimes bad.
But I think that’s mostly okay! Maybe it’s because I come to analytics from a background in baseball, and baseball is mostly about failing, failing, failing, and trying again. When you play Jeopardy!, you are taking your first hacks at big league pitching, and most people only get one shot. (Karen Morris made a similar metaphor using college football on her Twitter thread this week.) One of my coworkers in scouting has been very into my time on and thoughts about the show, and once when I had pulled up J!ometry stats, he asked me if they could actually be used for prediction.
I told him yes, but only very roughly, because you get so little data on anyone. Andrew He once compared it to watching someone play for four innings and trying to predict the rest of their season. For a lot of people, it’s even less. If you pulled up video of any professional player and watched just two or three at-bats, what conclusions would you actually be able to draw? You’d take what you can glean about the underlying process, not the direct results.
Arguing that wager theory is a skill anyone should be able to perfect doesn’t reflect the nature of being in the moment, under this particular pressure for the first and probably only time, and trying to factor in any and every thought you’ve ever had on it. I know I didn’t get it right every time the first time I was on, and it was really only with hindsight that I could see it. It’s why wagering was the main thing I worked on before the tournament. But there’s no particular reason to fixate on wagering as the perfectible skill. It’s just the one where the seams of the process are most visible. Part of what I want to do here is to make other seams more visible, so that we have a similar sense of the value of attempts, of balancing going for buzzes with reasoned guessing versus maintaining high accuracy, of clue selections for Daily Doubles, of parsing Final Jeopardy clues.
This Week’s Champions
Melissa Klapper
Average buzz score: $14800 (t-13th out of 31 champions this period)
Average attempt value: $34957 (13th)
Average conversion value percent: 42.7% (18th)
Average buzz value percent: 51.4% (25th)
Average accuracy value percent: 82.0% (10th)
Average DD score: $875 (24th)
Average DD+: +0.60 (6th)
Average end DJ score: $15675 (18th)
Melissa won three games for $59100, leading with a crush in each of the first two. She narrowly missed a runaway game on Monday with a DD3 I consider miswagered, but her correct FJ rendered it moot. She was able to use her other correct FJ, on Wednesday, to overtake Karen Morris for the win.
Melissa rates roughly in the middle of champions this year in scoring, with timing difficulty balanced by good accuracy. Melissa also struck an unusual mix of finding Daily Doubles and struggling with them at times, and her three misses on Thursday helped Alec Chao defeat her.
Alec Chao
Average buzz score: $11500 (25th out of 31 champions this period)
Average attempt value: $27303 (28th)
Average conversion value percent: 41.1% (20th)
Average buzz value percent: 71.5% (1st)
Average accuracy value percent: 57.4% (29th)
Average DD score: $0 (28th)
Average DD+: -1.44 (31st)
Average end DJ score: $11500 (29th)
Who knew buzzing behind your back could be so profitable? Maybe Wil Wheaton, but definitely Alec Chao. Alec ranks first among this period’s champions in timing, both by count and by value. Unfortunately, this was tempered by low accuracy, something Friday’s slaughterhouse of a Double Jeopardy round did not help. Alec was also hampered by not finding any Daily Doubles across his two games, which ranks him last among this period’s champions in DD+. Without those opportunities, he could not hold off Tamara Ghattas, who passed him in Friday’s match, leaving him at one win and $15505.
Tamara Ghattas
Average buzz score: $8400 (30th out of 31 champions this period)
Average attempt value: $33647 (19th)
Average conversion value percent: 25.0% (31st)
Average buzz value percent: 54.7% (19th)
Average accuracy value percent: 45.7% (31st)
Average DD score: $5400 (5th)
Average DD+: +0.78 (1st)
Average end DJ score: $13800 (24th)
Tamara was effectively the only player scoring during Double Jeopardy in her game, going from in the hole to a crush lead. In the process, she doubled up on DD2 and extended her lead to that crush with a smaller DD3, but her buzz score was tops for the game as well, even though it doesn’t rank well overall.
Looking forward, Tamara’s attempt rate is competitive, but her willingness to be incorrect (and incur low accuracy as a result) is likely to work against her. Her overall conversion ranks last among champions because of that low accuracy.