DonaldRauscher.com

A Blog About D4T4 & M47H

538 Riddler: Dice Poker Riddler

01 January ’17

In this week's Riddler, we have another game theory problem. We can describe each player's strategy with a 6 number tuple. For player A, represents the probability that player A raises given a roll of i. For player B, represents the probability that player B calls a raise from player A given a roll of i. Each player's expected winnings can be expressed as follows:

We can start by analyzing the pure strategies. Pure strategies explicitly define how a player will play a game (e.g. do X if opponent does Y). In the above definition, a pure strategy is one where and are binary. For our game, each player has potential pure strategies. Using the above formula, we can calculate player A's winnings for a every pair of pure strategies, then search the resulting 64x64 grid () for Nash equilibria. Because this is a zero-sum game, potential pure Nash equilibria will be "saddle" point(s) satisfying the following conditions:

In other words, they will be column maximums (meaning player A will have no reason to deviate) and row minimums (meaning player B will have no reason to deviate). Interestingly, as is seen in the visual below, there are no pure Nash equilibria! We can see this by looking at a few common strategies. Player A's most common row-maximizing strategy is always raising: (1,1,1,1,1). However, if player B knew player A was using this strategy, they would respond by only calling if they had at least a 2, netting them $0.11 of expected winnings. And if player A knew player B was using this strategy, they would respond by only raising when they had a 4 or higher, netting them $0.17 of expected winnings and triggering yet another change to player B's strategy.

Though there is not a pure strategy Nash equilibrium, there must exist a mixed strategy Nash equilibrium. A mixed strategy is simply a linear combination of pure strategies, the coefficients representing how often that strategy is to be used. Von Neumann's minimax theorem tells us that this equilibrium is at the minimax. We can construct a pair of linear programs to find the first and second player strategies ( and respectively):

Solving these linear programs, we find that player A's optimal strategy is to always raise when they have a 5 or 6 and raise of the time when they have a 1 (i.e. bluff). Very cool! Player B's optimal strategy is to always call when they have a 5 or 6 and call , , and of the time when they have a 2, 3, and 4 respectively. At this equilibrium, player A's expected winnings are $0.093.

Calculations below in R. Also need to credit Laurent Lessard's blog post on this problem; as always, he did an awesome job laying out the underlying math.

suppressMessages(library(dplyr))
suppressMessages(library(tidyr))
suppressMessages(library(ggplot2))
suppressMessages(library(lpSolve))

# win / lose / split matrix
E <- matrix(0, nrow=6, ncol=6)
E[upper.tri(E)] <- 1
E[lower.tri(E)] <- -1

# a is probability of raise, b is probability of call if raised; both are vectors of length 6
winnings <- function(a, b){
  a_matrix <- matrix(rep(a, times=6), ncol=6, byrow=TRUE)
  b_matrix <- matrix(rep(b, times=6), ncol=6, byrow=FALSE)
  temp <- (1-a_matrix)*E + a_matrix*(2*b_matrix*E + 1*(1-b_matrix))
  return(mean(temp))
}

# make exhaustive pure strategy state space
pure_strat <- t(expand.grid(lapply(1:6, function(x) 0:1)))
n_pure_strat <- ncol(pure_strat)
pure_strat_cross <- expand.grid(A=1:n_pure_strat, B=1:n_pure_strat)
pure_strat_cross$W <- with(pure_strat_cross, mapply(function(a, b) winnings(pure_strat[,a], pure_strat[,b]), A, B))

# find pure nash (if exists)
pure_strat_cross <- pure_strat_cross %>%
  group_by(A) %>% mutate(min_W_per_A = min(W)) %>% ungroup() %>%
  group_by(B) %>% mutate(max_W_per_B = max(W)) %>% ungroup() %>%
  mutate(is_nash_eq = (W == min_W_per_A & W == max_W_per_B))

sum(pure_strat_cross$is_nash_eq)
[1] 0
ggplot() +
  geom_tile(data=pure_strat_cross, aes(x = A, y = B, fill = W)) +
  scale_fill_gradient(low="red", high="green") +
  geom_tile(data=filter(pure_strat_cross, W == max_W_per_B), aes(x = A, y = B), colour="darkgreen", alpha=0, size=0.75) +
  geom_tile(data=filter(pure_strat_cross, W == min_W_per_A), aes(x = A, y = B), colour="darkred", alpha=0, size=0.75) +
  geom_tile(data=filter(pure_strat_cross, is_nash_eq), aes(x = A, y = B), fill="gold")

# most used strategies and opposite player responses
top_A_strategy <- pure_strat_cross %>% filter(W == max_W_per_B) %>%
  group_by(A) %>% summarise(num_B = n_distinct(B)) %>%
  arrange(-num_B) %>% head(1) %>% .$A

t(pure_strat[,top_A_strategy])
     Var1 Var2 Var3 Var4 Var5 Var6
[1,]    1    1    1    1    1    1
t(pure_strat[,pure_strat_cross %>% filter(A == top_A_strategy & W == min_W_per_A) %>% .$B])
     Var1 Var2 Var3 Var4 Var5 Var6
[1,]    0    0    1    1    1    1
[2,]    0    1    1    1    1    1
top_B_strategy <- pure_strat_cross %>% filter(W == min_W_per_A) %>%
  group_by(B) %>% summarise(num_A = n_distinct(A)) %>%
  arrange(-num_A) %>% head(1) %>% .$B

t(pure_strat[,top_B_strategy])
     Var1 Var2 Var3 Var4 Var5 Var6
[1,]    0    1    1    1    1    1
t(pure_strat[,pure_strat_cross %>% filter(B == top_B_strategy & W == max_W_per_B) %>% .$A])
     Var1 Var2 Var3 Var4 Var5 Var6
[1,]    0    0    0    0    1    1
[2,]    0    0    0    1    1    1
# mixed strategy nash equilibrium
P <- matrix(pure_strat_cross %>% arrange(A, B) %>% .$W, byrow=TRUE, ncol=n_pure_strat)

obj <- c(rep(0, n_pure_strat), 1)
A_base <- rbind(
  matrix(c(rep(1, n_pure_strat), 0), nrow=1), # sum of probabilities equals 1
  cbind(diag(n_pure_strat), 0) # probabilities >= 0
)
A1 <- rbind(A_base, cbind(t(P), -1)) # minimax constraints
A2 <- rbind(A_base, cbind(-P, 1)) # minimax constraints
dir <- c("=", rep(">=", 2*n_pure_strat))
b <- c(1, rep(0, 2*n_pure_strat))

lp_A <- lp(direction="max", objective.in=obj, const.mat=A1, const.dir=dir, const.rhs=b)
lp_B <- lp(direction="min", objective.in=obj, const.mat=A2, const.dir=dir, const.rhs=b)

# solutions and expected winnings
head(lp_A$solution, -1) %*% t(pure_strat)
          Var1 Var2 Var3 Var4 Var5 Var6
[1,] 0.6666667    0    0    0    1    1
head(lp_B$solution, -1) %*% t(pure_strat)
     Var1 Var2      Var3      Var4 Var5 Var6
[1,]    0  0.5 0.8333333 0.3333333    1    1
lp_A$objval
[1] 0.09259259