Can You Solve Dead Drop in the First Round? Pt. 2

If you haven’t read the first post on this topic, “Can You Solve Dead Drop in the First Round?,” here’s the short version: I looked at the card game Dead Drop and ran some simulations to determine whether you can use conditional probability to successfully identify the drop card in the first round. It was long and full of math. I figured, how can I top that?

First, here’s a quick refresher on my last post.

  • I simulated 100 games of Dead Drop in order to determine if you could approximate or successfully guess the drop card in the first round using conditional probability.
  • I also tested the method of picking a random number to successfully guess the drop.
  • I found that 57% of the time you could approximate which card is the drop (when cards have equal probabilities), but successfully guessing the card with conditional probability was just as likely as picking a random number.

What I want to explore in this post is:

  • Did the outcomes during my 100 simulations match mathematical expectations?
  • Was my sample size big enough to make those conclusions?
  • Is it really equally likely to successfully guess a card using conditional probability as it is to pick a random number? That’s crazy.
  • How does the setup for a two-player game change all of this?

This post dives into the weeds of hypothesis testing, which is not for the faint of heart, so if you’d rather just read my conclusions, click here.

Great Expectations

First, I wanted to look at the number of times certain cards were the drop during my simulations and see if the outcomes matched their mathematical expectations.

[Figure fiq1-dead-drop-pt2: drop card outcomes over 100 simulations]

It should look more like this:

[Figure fiq2-dead-drop-pt2: mathematically expected drop card outcomes]

So maybe my sample size was too small.

What if I were to run 500 simulations?

[Figure fiq3-dead-drop-pt2: drop card outcomes over 500 simulations]

That looks about right. It’s still a little different from the mathematical expectations (the outcomes of 2 and 3 are uneven despite being equally likely), but that’s the Law of Large Numbers at work: the more simulations I run, the closer the observed outcomes get to their expected frequencies.
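If you want to reproduce a tally like this, here’s a minimal, self-contained sketch. It shuffles the deck directly rather than calling the full simulate.game() function that appears later in this post, and the seed is arbitrary:

# the 13-card deck: four 0s, three 1s, two 2s, two 3s, one 4, and one 5
deck <- c(rep(0,4), rep(1,3), rep(2,2), rep(3,2), 4, 5)

set.seed(1)  # arbitrary seed, just so the tally is reproducible
drops <- replicate(500, {
  shuffled <- sample(deck)  # deal the whole deck in a random order
  shuffled[4]               # in a three-player game the first three cards
})                          # are the stash, so the fourth dealt is the drop
table(drops)

The mathematical expectation for each card is just its count in the deck divided by 13: 0 should be the drop about 30.8% of the time, while 4 and 5 should each be the drop only about 7.7% of the time. Knowing this, how do these new findings affect the results of my experiments in the last post?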

Let’s revisit these questions:

What is the probability of success to approximate the drop (in cases of equal probabilities)?

Out of 500 games, using conditional probability correctly pointed toward the potential candidates for the drop 270 times. That means you’ll be able to narrow down the drop card in the first round only 54% of the time.

What is the probability of success for guessing the drop correctly?

Out of 500 games, using probability to guess the drop succeeded 116 times. That means only 23.2% of the time will you be able to guess the drop card in the first round.

Finally, what is the probability of success using the method of picking a random number to successfully guess the dead drop?

Out of 500 games, picking a random number successfully guessed the drop 101 times, or 20.2% of the time. In the last post, I found that picking a random number was just as likely to succeed as using probability to guess.

Let’s look at the new findings in a good ol’ two-way cross-classification table.

Success of Methods in Predicting the Drop

                          Failed   Succeeded
Probability Prediction       384         116
Random Prediction            399         101

Hypothesis Testing

Now for some statistics! Say we want to test if the proportion of successes in my simulation is associated with the method of using probability for prediction. In other words, is there a significant difference between using one method over another? In order to test a hypothesis, I need a null hypothesis – the opposite of what I’m trying to prove. In statistics, a null hypothesis is represented by \(H_0\) and the alternative, my hypothesis, is represented by \(H_1\).

\(H_0\): There is no difference between the proportions.

\(H_1\): My hunch is right. There is a difference.

Let’s say I want to have 95% confidence that I’m right.

Now that we have our hypotheses, how do we actually test them? For comparing proportions in a two-way table like this, statistics gives us the Chi-Square Test. It’s pretty simple and it looks like this:

\[\sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}\]

That big E-looking thing (a capital Greek Sigma) just means you sum up everything that comes after it. “Observed” is each number in the table above, and “expected” is calculated for each cell by multiplying its row sum by its column sum and dividing by the grand total of the table. The result of the whole sum is called a test statistic.
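Here’s a compact sketch of that calculation in R, using the table above (the variable names are mine, not from the repo):

# observed counts from the two-way table above
observed <- matrix(c(384, 116,
                     399, 101), nrow = 2, byrow = TRUE)

# expected count for each cell: (row sum * column sum) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# sum over all four cells
test.statistic <- sum((observed - expected)^2 / expected)
test.statistic
## [1] 1.324223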

If you want to see the long-hand process in R for calculating this test statistic, check out my GitHub repository for this post. There is an easy way of doing it, but I like doing things the hard way. Either way, all you have to know is that our test statistic is 1.3242227.
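For the curious, the easy way is R’s built-in chisq.test(); you just have to turn off the Yates continuity correction it applies to 2x2 tables by default so the statistic matches the hand calculation:

# the built-in test; correct = FALSE disables the Yates continuity
# correction so the statistic matches the sum computed above
chisq.test(observed, correct = FALSE)
## 
##  Pearson's Chi-squared test
## 
## data:  observed
## X-squared = 1.3242, df = 1, p-value = 0.2498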

Now let’s consult a Chi-Square Distribution Table. We have to compare our test statistic to a table value that matches our degrees of freedom and \(\alpha\) level of 0.05.

  • Degrees of freedom: Remember the two-way table up there? We get our degrees of freedom using the following formula: (number of rows – 1) * (number of columns – 1). With two rows and two columns, that’s (2 – 1) * (2 – 1) = 1.
  • \(\alpha\) level: What the heck is an \(\alpha\) (alpha) level and why is 0.05 important? I said I wanted 95% confidence, and \(\alpha = 1 - 0.95 = 0.05\). The table value is a number on the same scale as my test statistic. If my test statistic is larger than that number, I reject the null hypothesis and can be confident there’s a real difference. However, if it’s smaller, I can’t claim a difference: I fail to reject the null hypothesis. So 0.05 is the probability level (think of it as 5%). If my test statistic is greater than the table value, it means a result this extreme would occur by chance less than 5% of the time if the null hypothesis were true.
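Incidentally, you don’t need a printed table at all: R’s qchisq() function gives the critical value directly.

# critical value for alpha = 0.05 with one degree of freedom
qchisq(0.95, df = 1)
## [1] 3.841459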

Looking at the table, it appears the value is 3.841, which is larger than my test statistic. I can calculate what’s called a p-value. Here I’m looking for a value smaller than 0.05 to see if there is any statistical significance. But alas, I already know the answer…

# p-value: probability of a test statistic at least this large if H0 is true
1-pchisq(test.statistic,df=df)
## [1] 0.2498356

It looks like my hunch was wrong: I cannot reject the null hypothesis. So even with a larger sample size, my original findings hold up. Guessing the drop with a random number is just as likely to succeed as guessing the card with the highest probability of occurring.

But What About a Two-Player Game?

Does a two-player game change the results of my experiment? In both the three-player and four-player versions, each player starts the game knowing the identities of six cards (a hand of three plus a stash of three, or a hand of two plus a stash of four). But in the two-player version, each player holds a hand of five cards and the stash contains two cards, so each player starts out knowing seven.

I rewrote the simulate.game() function in R to take the number of players as an argument.

# function to simulate the game
simulate.game <- function(x) {
  
  # stop unless x is a valid player count (2, 3, or 4)
  if(x < 2 | x > 4) { 
    stop("x must be a number between 2 and 4") 
  }

  # the 13-card deck: four 0s, three 1s, two 2s, two 3s, one 4, and one 5
  deck <- c(rep(0,4),rep(1,3),rep(2,2),rep(3,2),4,5)

  # the stash: one card per player
  # (remove.values() is a helper from the post's GitHub repo; it returns the
  # indices of the drawn cards so they can be removed from the deck)
  stash <- sample(deck,x,replace = F)
  deck <- deck[-remove.values(stash,deck)]

  # the drop
  drop <- sample(deck,1,replace = F)
  deck <- deck[-remove.values(drop,deck)]
  
  # remaining cards in deck
  r <- length(deck)
  
  # player1's hand
  player1 <- sample(deck,floor(r/x),replace = F)
  deck <- deck[-remove.values(player1,deck)]
  
  # player2's hand
  player2 <- sample(deck,floor(r/x),replace = F)
  deck <- deck[-remove.values(player2,deck)]
  
  # player 3's hand
  if(length(deck) > 0) {
    player3 <- sample(deck,floor(r/x),replace = F)
    deck <- deck[-remove.values(player3,deck)]
  } else { player3 <- NA }
  
  # player 4's hand
  if(length(deck) > 0){
    player4 <- sample(deck,floor(r/x),replace = F)
    deck <- deck[-remove.values(player4,deck)]
  } else { player4 <- NA }
  
  # function outcome is a list of all results
  list(player1 = player1, player2 = player2, player3 = player3, player4 = player4,
       stash = stash, drop = drop)
}
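As a quick usage sketch (the output depends on remove.values(), a helper in the GitHub repo, so I’ll just show the calls):

# one simulated two-player game; player3 and player4 come back as NA
game <- simulate.game(2)
game$player1  # player 1's hand of five cards
game$stash    # the two shared stash cards
game$drop     # the card we're trying to identify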

Here are the outcomes for 500 simulations of a two-player game. Again, not perfectly matching the mathematical expectations, but close.

[Figure fiq4-dead-drop-pt2: drop card outcomes over 500 two-player simulations]

Let’s revisit these questions for the last time (promise!):

What is the probability of success to approximate the drop (in cases of equal probabilities)?

Out of 500 simulations of a two-player game, using conditional probability correctly pointed toward the potential candidates for the drop 273 times. That means in a two-player game, you’ll be able to narrow down the drop card in the first round only 54.6% of the time.

What is the probability of success for guessing the drop correctly?

Out of 500 games, using probability to guess the drop succeeded 129 times. That means only 25.8% of the time will you be able to guess the drop card in the first round.

Finally, what is the probability of success using the method of picking a random number to successfully guess the dead drop?

Out of 500 games, picking a random number successfully guessed the drop 92 times, or 18.4% of the time. Compared with the 500 simulations of a three-player game, knowing one extra card at the beginning of the game seems to make a difference.

Success of Methods in Predicting the Drop

                          Failed   Succeeded
Probability Prediction       371         129
Random Prediction            408          92

But is it statistically significant?
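As a sketch, the same chisq.test() shortcut from before (Yates correction off, as above) reproduces the numbers we need:

# two-player results; correct = FALSE to match the hand calculation
two.player <- matrix(c(371, 129,
                       408,  92), nrow = 2, byrow = TRUE)
chisq.test(two.player, correct = FALSE)
## 
##  Pearson's Chi-squared test
## 
## data:  two.player
## X-squared = 7.952, df = 1, p-value = 0.004804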

This time, our test statistic is 7.9519514. The degrees of freedom and \(\alpha\) level remain the same as above, and if you recall, the table value was 3.841. So our test statistic is larger than the table value! What about our p-value?

# p-value for the two-player comparison
1-pchisq(test.statistic2,df=1)
## [1] 0.004803555

With a p-value well under 0.05, there is a statistically significant difference, at the 95% confidence level, between probability-based and random-number prediction in a two-player game.

Conclusions

The sample size in my original post may have been a little small, but after increasing the number of simulations to 500, I still reached the same conclusions (after a little testing to be sure).

However, in a two-player game, where the players have access to one additional card, successfully guessing the drop with conditional probability was a little more likely than using a random number to predict it.

So go nuts, and impress your friends 20% (or 25%) of the time. But remember, the whole point is gathering more intelligence and updating your assumptions. Oh, and having fun.