Statistical Anomaly - need a little help here


PiedPiper
21st January 2011, 01:52 PM
I apologize if this thread is off topic for this forum; I'm still quite new here (see post count) and I read the Membership Agreement carefully before posting. Feel free to nuke this thread from orbit if necessary :).

I need a little help (for my own sanity) puzzling out a statistical anomaly. It's been bugging me for years, ever since I took undergraduate statistics, and I haven't found an answer yet. I even wrote a computer program (in Fortran, no less - how retro of me) to simulate 10,000,000 iterations of the "game" described below, and the simulation came out with the exact same results as the math. What I don't understand is the "why", as it flies in the face of logic, and while I accept that some statistics do that, it would be fantastic to have some input and maybe even some "closure" on this issue.

Stated simply, the "game" goes like this:

You take ten marbles, and put them into a bag. The marbles are either red or blue, a random number of at least one each, but adding up to ten marbles total. The bag is shaken up, and someone takes out a single marble at random. If the marble is red, the game stops. If the marble is blue, it's put aside (not back in the bag), and another marble is taken out. If it's red, the game stops. If it's blue, it's put aside and another marble is taken out. Etc, etc. The game is over when a red marble has been drawn.

At the end of the game, you count up the number of blue marbles out of the bag, and that is your score for the game. So let's look at some examples of gameplay. For the sake of argument, we're going to say that there are 9 blue marbles in the bag, and one red marble.

Game #1: Boom, red marble right away. Game stops. No blue marbles are out of the bag, so the score is 0.

Game #2:
Draw 1: Blue marble, game continues.
Draw 2: Blue marble, game continues.
Draw 3: Red marble. Game stops. Two blue marbles are out of the bag, so the score is 2.
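
For anyone who wants to try the game themselves, here is a minimal simulation sketch in Python. The Fortran program mentioned above isn't shown in this thread, so this is not that program; the names `play_game` and `simulate` are made up for illustration. It assumes that shuffling the bag and reading it left to right is equivalent to drawing marbles at random without replacement.

```python
import random
from collections import Counter

def play_game(blue, red, rng):
    """Play one game; return the score (blue marbles drawn before the first red)."""
    bag = ["blue"] * blue + ["red"] * red
    rng.shuffle(bag)           # shuffled order stands in for random draws without replacement
    return bag.index("red")    # number of blues ahead of the first red

def simulate(blue, red, games, seed=0):
    """Return {score: observed frequency} over `games` plays."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    counts = Counter(play_game(blue, red, rng) for _ in range(games))
    return {score: counts[score] / games for score in range(blue + 1)}

if __name__ == "__main__":
    for score, freq in simulate(blue=9, red=1, games=100_000).items():
        print(f"score {score}: {freq:.3f}")
```

With 9 blue and 1 red, every score from 0 to 9 should come out close to 0.100, give or take sampling noise.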

So let's take a look at the odds here, the statistics of this game. There are 9 blue marbles and 1 red marble.

Odds of scoring 0 in the game (hitting the 1 red marble right away) are 1 in 10, 10%.

Odds of scoring 1 in the game (drawing one blue marble first, and then hitting the red on the second draw): (9/10) - which is the number of blue marbles divided by the number of marbles total - multiplied by (1/9) - the odds of hitting the (one) red marble in the remaining pool of 9. (9/10)*(1/9) = (1/10) = 1 in 10 = 10%. Strange, no? The odds of straight out hitting that red marble first draw are 10%, 1 in 10. To achieve a score of 1, you had to avoid hitting the red marble that first time, leaving it in the bag, and draw a blue first, then the red.

Taking this trend to the extreme, scoring 9, how would that work?

Draws:
Blue, blue, blue, blue, blue, blue, blue, blue, blue, red.

Math (probabilities):
(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(3/4)*(2/3)*(1/2)*(1/1) - the last factor being the red marble, the only one left - all of which cancels out to... 1 in 10.
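
The same chain of fractions can be multiplied out exactly for every score. This is only a sketch: `score_probability` is a name invented here, and it is meant for scores from 0 up to the number of blue marbles.

```python
from fractions import Fraction

def score_probability(blue, red, score):
    """Exact probability of drawing `score` blues and then a red."""
    total = blue + red
    p = Fraction(1)
    for drawn in range(score):
        # `drawn` blues are already out: (blue - drawn) blues among (total - drawn) marbles
        p *= Fraction(blue - drawn, total - drawn)
    # finally, a red from the (total - score) marbles still in the bag
    return p * Fraction(red, total - score)

if __name__ == "__main__":
    for score in range(10):
        print(score, score_probability(blue=9, red=1, score=score))
```

For 9 blue and 1 red, each line prints 1/10: the denominator of every factor is cancelled by the numerator of the one before it.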

So I could go on at length about the different iterations of this game, and how the odds "make sense to me" if there are for example 2 red marbles and 8 blue, instead of 1 and 9, but what it boils down to is this:

According to the math (and the computer simulation I did), it's equally likely that:

1. Someone will hit the red marble with their first draw.

AND

2. Someone will avoid the red marble, drawing a blue; again, they'll avoid the red marble, drawing a blue; they'll do this again; and again; and again; and again; until they've "avoided" drawing that single red marble 9 times and it's finally the only thing left in the bag for them to draw out.

How can these two scenarios be equally likely?

Help :(

drkitten
21st January 2011, 02:01 PM
Think of it this way. Instead of drawing a marble from a bag, you take all the marbles out in random order and put them in ten boxes, labelled 1..10.

What are the odds that the single red marble is in any given box? 1 in 10.
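
The box picture can be checked by brute force. The sketch below (with the made-up name `score_counts`) lists every way of choosing which boxes hold red marbles, on the assumption that each such layout is equally likely, and tallies the score, which is the number of boxes in front of the first red one. Boxes are numbered from 0 here rather than 1, so the box number is the score.

```python
from collections import Counter
from itertools import combinations

def score_counts(total, red):
    """Tally equally likely red/blue layouts by score (0-based index of the first red box)."""
    counts = Counter()
    for red_boxes in combinations(range(total), red):
        counts[min(red_boxes)] += 1   # blues in front of the first red = its index
    return counts

if __name__ == "__main__":
    print(sorted(score_counts(total=10, red=1).items()))
```

With one red marble there are ten layouts, one per box, so each score is counted exactly once.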

PiedPiper
21st January 2011, 02:08 PM
Well, that makes sense to me, sure. Here's the weird thing tho: let's change the game a little, just to show you what I mean by it "making sense to me" at different ratios of red to blue.

Let's say you have 8 red marbles and 2 blue marbles; same rules.
Odds of scoring zero: 8/10 = 80%.
Odds of scoring 1: (2/10)*(8/9) = 17.78%.
Odds of scoring 2: (2/10)*(1/9)*(8/8) = 2.22%.
Total of 100%, which is satisfying.

So in this version of the game, getting that higher score became more difficult. You had to avoid hitting all those red marbles that second draw around, pluck out that single remaining blue one, and then finish up by pulling out one of the red marbles that were the only color left in the bag.

I don't have the spreadsheet here to calculate this all the way down, but the trend (going from 9 red marbles to 1 red marble) is this:

Getting higher scores has lower probability. This trend continues as the number of red marbles decreases, until you reach the limit of the game - which is where there is only one red marble, the example I gave in my original post. Then, all of a sudden, all the odds equal out.
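
One way to look at the trend is to compare each score's probability with the next score up. Multiplying out the chain of fractions, the ratio P(score k+1) / P(score k) reduces to (blue - k) / (total - k - 1). The sketch below just evaluates that ratio for a ten-marble bag; `next_score_ratio` is a name made up for this example.

```python
from fractions import Fraction

def next_score_ratio(blue, red, score):
    """P(score + 1) / P(score) for a bag of `blue` blue and `red` red marbles."""
    total = blue + red
    return Fraction(blue - score, total - score - 1)

if __name__ == "__main__":
    for red in (8, 5, 2, 1):
        blue = 10 - red
        ratios = [str(next_score_ratio(blue, red, k)) for k in range(blue)]
        print(f"{red} red: {', '.join(ratios)}")
```

With more than one red marble every ratio is below 1, so each higher score is less likely than the one before it. The ratios creep towards 1 as red marbles are taken away, and with a single red marble the numerator and denominator are the same number, so every ratio is exactly 1. The change is gradual rather than sudden.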

That's what I'm having trouble with.

drkitten
21st January 2011, 02:14 PM
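Right. That's because each individual marble's position is equally (uniformly) distributed, but the position of the first red marble is not. With a single red marble, the "first red" is simply that one marble, so every score is equally likely. With several red marbles, the first red is whichever of them happens to come out earliest, and that skews towards the early draws.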

You can get the same effect -- I believe the old Wizards game by Avalon Hill used this mechanic -- by rolling a pair of dice and counting the higher of the two dice.

There's one chance in 36 of rolling a '1', one chance in 12 (3 in 36) of rolling a '2', and 11 chances in 36 of rolling a '6'.

The odds against rolling a '1' get longer as you add more dice, and the odds against rolling a '6' get shorter, until you're rolling a handful of dice and getting a '6' is almost a sure thing.
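
The highest-die effect is easy to tabulate by listing every possible roll. A rough sketch, assuming fair six-sided dice (the function name `highest_die_distribution` is invented for this example):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def highest_die_distribution(dice, sides=6):
    """Exact distribution of the highest face when rolling `dice` fair dice."""
    faces = range(1, sides + 1)
    # every ordered roll is equally likely, so count rolls by their highest face
    counts = Counter(max(roll) for roll in product(faces, repeat=dice))
    outcomes = sides ** dice
    return {face: Fraction(counts[face], outcomes) for face in faces}

if __name__ == "__main__":
    for dice in (1, 2, 3, 5):
        dist = highest_die_distribution(dice)
        row = ", ".join(f"{face}: {float(p):.3f}" for face, p in dist.items())
        print(f"{dice} dice -> {row}")
```

One die gives a flat distribution. Each extra die shifts more of the weight onto the high faces, the same way extra red marbles shift the weight onto the low scores.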

rjh01
21st January 2011, 06:17 PM
Change the game again. There is still only one red marble, but this time there is only one blue one. 50/50 chance of which one you pick out. No problems there.

One red, two blue. 1/3 chance of red on the first draw. Or blue then red: 2/3 * 1/2 = 1/3. Or blue, blue, then red: 2/3 * 1/2 * 1/1 = 1/3. Understand these simplified games and you understand the more complex game in the OP.