I invented a game, and it goes like this. We’re going to pick a 20 digit number by taking turns choosing each digit. I choose the first digit, then you choose the second digit, I choose the third and so on. Once we’ve chosen all the digits, we use our number as the seed to a random number generator. The random number generator picks a number between 0 and 1, and if the number is greater than 0.5 then I win; if it’s less than 0.5 then you win.
Obviously this isn’t meant to be a “fun” game, it’s more of an open-ended math problem. What’s the strategy? Is there a strategy? Who wins?
The idea behind the random number generator is that it’s deterministic, yet opaque. Given any particular seed, the random number generator will consistently pick the same result: either you win, or I do. But there’s no particular pattern to it; it behaves as if the result were randomly chosen. The only way to predict the game’s outcome is to plug each seed into the random number generator individually. That might be intractable, as there are 10^20 possible seeds.
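To make this concrete, here’s a minimal sketch in Python, using Python’s built-in Mersenne Twister as a stand-in for the opaque generator (any deterministic RNG would do; a cryptographic one would be even more opaque):

```python
import random

def first_player_wins(seed: int) -> bool:
    """Deterministic but opaque: a given 20-digit seed always yields
    the same winner, with no evident pattern across seeds."""
    rng = random.Random(seed)  # a fresh generator, fully determined by the seed
    return rng.random() > 0.5  # first player wins on results above 0.5

# The same seed always gives the same outcome:
seed = 12345678901234567890
assert first_player_wins(seed) == first_player_wins(seed)
```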
This game is deterministic, finite, and has perfect information, much like Chess. However, it appears that the only real strategy is brute force: plugging seeds into the random number generator.
Who wins?
Nonetheless, we know with near certainty who will win: the last player. That’s because on the last turn, you have ten options, and you can plug each of them into the random number generator to see which one wins. It’s very likely (99.9%) that at least one of your options wins, so you just pick that one.
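As a sketch, here is that last-move search in Python, reusing a stand-in generator and the convention from the setup that the last player wants a result below 0.5:

```python
import random

def first_player_wins(seed: int) -> bool:
    # Stand-in for the opaque RNG: first player wins on results above 0.5.
    return random.Random(seed).random() > 0.5

def winning_last_digit(prefix: str):
    """Given the first 19 digits, return a final digit that makes the
    last player win (i.e. the first player lose), or None in the
    unlucky ~0.1% of cases where all ten options lose."""
    for d in "0123456789":
        if not first_player_wins(int(prefix + d)):
            return d
    return None
```

Each of the ten candidate seeds loses for the last player with probability about 0.5, so all ten lose with probability about 0.5^10, roughly 0.1%.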
On the other hand, there is still the 0.1% of cases where the last player loses. If the second-to-last player looks ahead two moves, they might be able to engineer that situation. They have ten options, and if each one independently has a 0.1% chance of being a winning move, then there’s about a 1% chance (1 - 0.999^10) that the second-to-last player has a winning move.
And then what happens when the players look 3 moves ahead? 4 moves ahead?
A mathematical solution
Let’s define the quantity p_i, the probability that the first player will win when the game is i turns long. This assumes both players have the ability to look an unlimited number of turns into the future.
Let’s also define the number N, which is the number of digits that you can choose from. We were initially thinking about N=10, since there are ten digits 0-9, but we might want to change N later on.
Another rule in my initial statement of the game was that the winner is decided by whether the generated random number exceeds 0.5. Let’s turn this threshold into a variable: the last player wins if the random number generator produces a number greater than x.
So suppose we have a game with i turns. The first player looks at their N available options, and each option is equivalent to an (i-1)-turn game in which the second player moves first. In each of those (i-1)-turn games, the second player wins with probability p_{i-1} (and these probabilities are assumed to be independent). The first player wins if they can choose a move that causes the second player to lose. So we can calculate the probability p_i in terms of the probability p_{i-1}. The equation looks like this:
p_i = 1 - (p_{i-1})^N
If you think about the 1-turn game, you will find that
p_1 = 1 - x^N
So it is convenient to define p_0 = x.
I don’t have an explicit formula for the probabilities, but it’s easy to input the iterative expression into a spreadsheet. For N = 10 and x = 0.5, here are the first several probabilities:
p_0 = 0.5
p_1 = 0.999023438…
p_2 = 0.009722821…
p_3 = 0.999999999999999999992…
p_4 ≈ 7×10^-20
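Here’s a sketch of that iteration in Python. I use exact rational arithmetic because in ordinary double precision, 1 - 7.5×10^-21 rounds to exactly 1, which would wrongly make p_4 come out as 0:

```python
from fractions import Fraction

def win_probabilities(turns: int, N: int = 10, x=Fraction(1, 2)) -> list:
    """Iterate p_i = 1 - (p_{i-1})^N exactly, starting from p_0 = x."""
    ps = [Fraction(x)]
    for _ in range(turns):
        ps.append(1 - ps[-1] ** N)
    return ps

ps = win_probabilities(4)
print(float(ps[1]))  # ≈ 0.999023438
print(float(ps[2]))  # ≈ 0.009722821
print(float(ps[4]))  # ≈ 7.5e-20
```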
This confirms our intuition that the last player almost certainly wins, and it gets more lopsided the longer the game is. However, you can level the playing field by increasing x or by decreasing N. I found that if you let N=2 and x=(sqrt(5)-1)/2, then neither player clearly comes out ahead, no matter how long the game is.
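You can check that claim directly: x = (sqrt(5)-1)/2 solves 1 - x^2 = x, so with N = 2 the recurrence maps it to itself. A quick sketch:

```python
import math

x = (math.sqrt(5) - 1) / 2  # ≈ 0.618, the positive root of x^2 + x - 1 = 0
assert abs((1 - x**2) - x) < 1e-12  # fixed point of p -> 1 - p^2

# In exact arithmetic p_i = x for every i, so neither player's
# winning probability drifts, no matter how long the game is.
p = x
for _ in range(50):
    p = 1 - p**2
print(p)  # still ≈ 0.618
```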
Imperfect information
Obviously, in practice the players can’t look an infinite number of moves ahead. So we can consider the case where players can only look M moves ahead. If there are more than M turns left, then the players can do no better than choose randomly. So the first turn of the game that actually matters is when there are M turns remaining. At that point, the current player will have probability p_M of winning.
What happens when one player can look more turns into the future than can the other player?
Here’s another scenario. What if each player can check 100 random numbers per turn? So when there are two turns left (with N=10), they can look ahead all the way to the end. But when there are 3 turns left, they can only see 10% of the possible outcomes. There must be a way to sample outcomes to maximize the probability of winning.
Unfortunately I’m going to leave these questions as exercises to the reader. I think they’re solvable problems, it just might take longer than I’m willing to write out.
Discussion
I’m mostly interested in this game as a pure math problem. However, I do think it’s worth pointing out that this is a game that doesn’t really involve any chance, since we’re using a deterministic random number generator. And yet, the best way to analyze the game is in terms of probabilities. The probabilities don’t really represent chance, but rather they represent our lack of knowledge of the behavior of the random number generator.
In a way, it’s rather analogous to Chess. Although Chess is deterministic, we lack knowledge about the gamestate 10 moves in advance. In that sense, Chess is not just a game of skill, but also a game of luck.
But unlike Chess, I wouldn’t say the Random Number Game is a game of skill at all. In Chess, I think you can look at a board and tell whether it’s a good position for white or black. In the Random Number Game, there really isn’t any way to tell who’s winning until there are only a few turns left.
jenorafeuer says
Back when I was young, I read one of Martin Gardner’s books (looking it up, it was The Unexpected Hanging and Other Mathematical Diversions, and was a reprint of a 1962 ‘Mathematical Games’ column). It described a little game called ‘hexapawn’ with a 3×3 chessboard and chess pawns, with the winning condition being successfully getting a pawn to the final rank.
It also then described a ‘matchbox computer’ based on an earlier tic-tac-toe device called MENACE (Matchbox Educable Noughts And Crosses Engine) where the idea was that you had a matchbox for every possible board position, and a coloured bead inside the box for every possible move. For the ‘computer’ to make a move, you would find the matchbox that had the picture of the current board configuration on the top, shake it, take out one random bead, and make that move.
To make the machine ‘learn’, whenever it loses, the last bead taken out would be removed permanently. Optionally, if the machine wins, the last bead taken out would have an extra one put back in. If the machine gets to a box that has no beads in its decision tree, it resigns, and counts that as a loss, taking out the previous bead. Eventually, the machine will eliminate all losing move sequences and will always win, draw, or resign on its first move. It’s really brute force machine learning, and I think it should be pretty obvious why your article reminded me of that.
Also, I think your comment about chess and skill at the end pretty much boils down to the fact that chess is rather more amenable to heuristics that allow one to make do with incomplete future information, while the random number game isn’t. In fact, the better the random number generator is as a source of ‘randomness’, the less amenable it should be to heuristics. Anything that qualifies as ‘cryptographically random’ should essentially have no possible heuristics less computationally expensive than a brute force search.
Perfect Number says
“I found that if you let N=2 and x=(sqrt(5)-1)/2, then neither player clearly comes out ahead no matter how long the game.”
Oh this is cool. How did you come up with the number (sqrt(5)-1)/2? (Seems like the kind of thing one could spend a lot of time playing around with in matlab.)
This reminds me of another problem (maybe a Putnam problem?) where there’s an empty 2010×2010 matrix, and Bob first fills in a number, and then Alice fills in a number, and they take turns until the whole thing is filled. Bob wins if the determinant is nonzero, Alice wins if it is 0. And the winning strategy for Alice is based on the fact that a matrix with 2 identical columns will have a determinant of 0, so she just has to always fill in the same number as Bob just did, in the column next to him. And when my friends and I were discussing this, we were like “wow Alice is no fun to play with.”
jenorafeuer says
@Perfect Number:
(sqrt(5)-1)/2 is the reciprocal of the classic ‘golden ratio’ (equivalently, the golden ratio minus 1), and it’s the solution to x=1/(x+1). Take a rectangle with x:1 ratio, cut a square off one end, and you get a smaller rectangle with the same ratio.
Siggy says
@jenorafeuer,
I think that’s what’s called a model-free reinforcement learning algorithm. However, it’s usually intractable to have a matchbox for every possible gamestate, so gamestates need to be grouped by various attributes, using heuristics. As you pointed out, the random number game is not amenable to heuristics.
@Perfect Number,
N=2 and x=(sqrt(5)-1)/2 give a fixed point of the iterative equation. With these values of N and x, we find that
p_i = 1 - (p_{i-1})^2 = 1 - x^2 = x
for all i.
When writing this up, I tried a handful of parameters in a spreadsheet. The fixed point is a repulsor: for any other choice of x, the more turns in the game, the more one player is assured of a winning position.
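Here’s a quick numerical sketch of that repulsion (with N=2, so the map is p -> 1 - p^2):

```python
import math

x = (math.sqrt(5) - 1) / 2  # the fixed point of p -> 1 - p^2

def orbit(start: float, steps: int = 100) -> float:
    """Iterate p -> 1 - p^2 from the given starting value."""
    p = start
    for _ in range(steps):
        p = 1 - p**2  # each step magnifies a small deviation by |f'(x)| = 2x ≈ 1.24
    return p

print(orbit(x))          # stays near 0.618 (float error grows, but stays small)
print(orbit(x - 1e-6))   # ends up far from 0.618
print(orbit(x + 1e-6))   # ends up far from 0.618
```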
Perfect Number says
The golden ratio! Ah, this makes sense.