James Randi Educational Foundation JREF Forum
Old 2nd May 2012, 06:30 PM   #1
PiedPiper
Scholar
 
Join Date: Nov 2010
Posts: 98
Statistics anomaly that I just can't figure out - any math experts?

Hello all,

I really could use some help finding a satisfying reason *why* this apparent anomaly (described below) occurs. The question takes the form of a game, and one of the statistical results sticks out to me as inaccurate.

I posted something similar to this a while back, and some of the members here were very nice and tried to explain the issue to me. While I thought (at the time) that I had finally understood the issue, I realize now that I still have no clue why this is happening, and it's driving me crazy.

I have a background in mathematics, but it's only at a very low level; I have a high level of background in organic chemistry, however, so I'm comfortable with science.

I've done the math by hand (as you'll see below), and I've even gone so far as to write a FORTRAN program (hey, this was a long time ago) to simulate 10 million "plays" of the "game", and compiled all of the results. It matched the long-hand math predictions, but I just don't understand *why* it's happening. I know common sense sometimes doesn't go hand in hand with some scientific results, but this is so weird...

If anyone can help me out (PMs are fine, or replies to this thread) I'd really appreciate it. No, this isn't a homework problem, it's just something I've wondered about for 25+ years now and I'd love to know the reasoning behind it.

So, it's a game. You take a bag, and you put 10 marbles in the bag. The marbles can either be white or black, but there has to be at least one of each color. You reach in and take out a random marble, without peeking when you draw. If it's white, the game ends, and you win. If it's black, you put it to one side, and draw again. You keep this up until you pull out a white marble, which stops the game (you "win").

So let's take an example: 9 white marbles, 1 black.

Probability of someone pulling a white on the first draw: 9/10, or 90%.
Probability of someone pulling a black marble first, then pulling a white:
(1/10)*(9/9) = 10%. Total of 100%, as you'd expect.

Go on down a little bit, let's put 5 white marbles and 5 black.

Probability of someone pulling a white on the first draw: 5/10, or 50%.
Probability of someone pulling a white on the *third* draw: (5/10)*(4/9)*(5/8) = 13.89%.

It's more "difficult" to pull a white marble on the third draw than it was on the first draw. You have to avoid pulling a white on the first draw; you have to hit a black, so you can keep on going. You then have to avoid all the whites again, and hit another black. Then you have to avoid drawing any of the remaining blacks, and hit one of the whites.

Going to the extreme here, for this example:

With 5 white and 5 black marbles, probability of someone pulling all the blacks first and then "winning" on the sixth draw by pulling out one of the remaining marbles (which are all white at this point):

Probability = (5/10)*(4/9)*(3/8)*(2/7)*(1/6)*(5/5) = 0.397%. It gets *really* difficult to pull this off, because you have to keep avoiding all those whites in the bag, and you have to pick off the dwindling blacks, getting all of them (while avoiding all the white marbles) before finally the bag has nothing but whites, and you pull one out to end the game.

This makes good sense to me. And the math and computer simulations match it. The logic of "having to avoid pulling a white too early" making the probability less likely for a certain combination makes sense to me.
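
For anyone who wants to check the long-hand numbers without a calculator, here is a minimal Python sketch of the same product of fractions. The function name `first_white_on_draw` and its parameters are made up for illustration, and it assumes the bag holds at least one white marble, as the rules of the game require. It multiplies the chance of a black on each of the first k-1 pulls by the chance of a white on pull k.

```python
from fractions import Fraction


def first_white_on_draw(white: int, black: int, k: int) -> Fraction:
    """Exact probability that the first white marble appears on draw k (1-based)."""
    total = white + black
    if k < 1 or k > black + 1:
        return Fraction(0)  # cannot draw more than `black` blacks before a white
    p = Fraction(1)
    # Draws 1 .. k-1 must all be black, each from a bag one marble smaller.
    for i in range(k - 1):
        p *= Fraction(black - i, total - i)
    # Draw k must be white; k-1 marbles have already left the bag.
    p *= Fraction(white, total - (k - 1))
    return p


if __name__ == "__main__":
    for k in range(1, 7):
        p = first_white_on_draw(5, 5, k)
        print(f"5 white / 5 black, first white on draw {k}: {p} = {float(p):.3%}")
```

With 5 white and 5 black this gives exactly 1/2, 5/36 and 1/252 for draws one, three and six, and the probabilities over all possible draws add up to 1.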

Here's the difficulty:

Take the example of 9 black marbles, and 1 white.

Probability of drawing white on 1st pull: 1/10, or 10%.

Probability of drawing white on second pull: (9/10)*(1/9). You have to pull a black one first, and there's 9 of them out of the 10 total, and then you have to pull the single white marble out of the group of 9. So it's (9/10)*(1/9) = ... it equals 10%. The same chance as pulling the white on the 1st pull.

Taking it to the extreme: probability that with 9 black marbles and 1 white marble, that you hit the white on your very last pull. You have to avoid "hitting" that white marble for 9 pulls in a row, and according to the math for all the other marble combinations, it gets more and more difficult to do this. Common sense tells me that is correct as well.

Let's see here: for the extreme example, white on last pull:

(9/10)*(8/9)*(7/8)*(6/7)*(5/6)*(4/5)*(3/4)*(2/3)*(1/2)*(1/1).

That equals 10%.

In fact, when you have one white marble in the bag and 9 black, every possibility is 10%. Yet for any other marble ratio (5 white 5 black, 3 white 7 black, 8 white 2 black, etc), you see this pattern where it gets progressively more difficult (probability goes down) to successfully pull off that extreme example.

In this example (1 white, 9 black) - the math is telling me that it's just as likely that I hit white the first draw out, as it is that I have to pull 10 times and keep avoiding the white marble nine draws in a row, pulling only blacks. In the examples with the other marble ratios, this probability became less likely as you went further down the line. Yet in what should be the most difficult example (only 1 white in the bag), all the logic seems to go out the window.
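
The simulation mentioned earlier was written in FORTRAN and isn't shown here, so the following is only a rough Python stand-in, not the original program. The function name, the fixed seed and the trial count are all assumptions chosen for illustration. It shuffles the bag, which is equivalent to drawing without replacement, and records the draw on which the first white turns up.

```python
import random
from collections import Counter


def simulate_first_white(white: int, black: int, trials: int, seed: int = 0) -> dict:
    """Estimate how often the first white marble turns up on each draw."""
    rng = random.Random(seed)  # fixed seed so reruns give the same numbers
    bag = ["W"] * white + ["B"] * black
    counts = Counter()
    for _ in range(trials):
        rng.shuffle(bag)  # a random draw order without replacement
        counts[bag.index("W") + 1] += 1  # 1-based draw on which the game ends
    return {draw: counts[draw] / trials for draw in range(1, black + 2)}


if __name__ == "__main__":
    for draw, freq in simulate_first_white(white=1, black=9, trials=100_000).items():
        print(f"draw {draw:2d}: {freq:.4f}")
```

For 1 white and 9 black, every draw should come out close to 0.10, which agrees with the hand calculation.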

For the sake of my sanity, can someone please explain to me why this is happening?
Old 2nd May 2012, 06:48 PM   #2
Kevin_Lowe
Penultimate Amazing
 
Join Date: Feb 2003
Location: Queensland
Posts: 10,266
It might be easier to visualise if you imagine the marbles laid out in a line rather than jumbled up in a sack.

There are ten possible arrangements of marbles, from the arrangement with the white marble first and then nine black marbles, to the arrangement with the nine black marbles first and the white one at the end.

What are the odds that you'll get that last arrangement with the white marble last? One in ten. So what are the odds you'll pick up exactly nine black marbles? The same, exactly one in ten.

What are the odds that you'll get the arrangement with the white marble second last? One in ten. So what are the odds you'll pick up exactly eight black marbles? The same, exactly one in ten.

The same goes for the other eight possible arrangements. Each has exactly a 10% chance of occurring.

With only one marble, the chance of hitting a white marble is distributed perfectly evenly amongst the ten spots available. Each has exactly the same chance of being the white marble.
__________________
Thinking is skilled work....People with untrained minds should no more expect to think clearly and logically than people who have never learned and never practiced can expect to find themselves good carpenters, golfers, bridge-players, or pianists.
-- Alfred Mander
Old 2nd May 2012, 07:26 PM   #3
jasonpatterson
Philanthropic Misanthrope
 
Join Date: Apr 2009
Location: Space, The Final Frontier
Posts: 2,180
Great explanation Kevin. I was trying to think of a way to explain this using permutations but didn't come up with a useful visual.

Using your example for explaining why the probability decreases with multiple marbles helps as well. Supposing two white balls out of 10, there are far more ways of arranging the balls if the first ball is white than if the first 4 are black and the 5th is white. In the first case we've got 1 ball pinned down and 9 possible arrangements for the other balls. In the second, we've got 5 balls pinned down and only 5 possible arrangements for the other balls. The probability of the first event should be 9/5ths the probability of the second.

When we do the math:
First ball is white for 2/10 white balls : P1 = 2/10 = 1/5
Fifth ball is white for 2/10 white balls : P5 = 8/10 * 7/9 * 6/8 * 5/7 * 2/6 = 10/90 = 1/9
P1/P5 = 9/5

This gets messier with more than 2 white balls, of course, but the idea is the same: the more possible arrangements of balls exist for a given draw, the higher the probability, and the farther along the draw order we move the first white ball's spot, the fewer arrangements remain.
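
The counting argument can be sketched in a few lines of Python. This is just one way to write it, and the function name `p_first_white_at` is invented for the example. It counts the arrangements that put the first white in position k and divides by the number of ways to place all the whites, assuming every arrangement is equally likely.

```python
from fractions import Fraction
from math import comb


def p_first_white_at(k: int, white: int, total: int) -> Fraction:
    """P(first white is in position k) by counting arrangements."""
    # Positions 1..k-1 are black and position k is white, so the other
    # white-1 whites must sit somewhere in the total-k later positions.
    favourable = comb(total - k, white - 1)
    all_arrangements = comb(total, white)  # ways to place the whites overall
    return Fraction(favourable, all_arrangements)


if __name__ == "__main__":
    p1 = p_first_white_at(1, white=2, total=10)
    p5 = p_first_white_at(5, white=2, total=10)
    print(p1, p5, p1 / p5)
```

For 2 whites out of 10 this reproduces P1 = 1/5, P5 = 1/9 and the ratio 9/5. With a single white, the favourable count is always 1, so every position gets 1/10.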

I don't know if this makes any sense to anyone but me, but there ya go.
__________________
Sandra's seen a leprechaun, Eddie touched a troll, Laurie danced with witches once, Charlie found some goblins' gold.
Donald heard a mermaid sing, Susie spied an elf, But all the magic I have known I've had to make myself.
- Shel Silverstein

Last edited by jasonpatterson; 2nd May 2012 at 07:27 PM.
Old 2nd May 2012, 11:07 PM   #4
xtifr
Muse
 
Join Date: Apr 2012
Location: Sol III
Posts: 563
My brother recently spent a frustrating afternoon trying to convince a friend that the odds of rolling any doubles on a pair of dice was the same as the odds of rolling a 6 with one die. The guy just would not believe it. It seemed like too much of a coincidence for him.

Sometimes things in statistics really do turn out to be simpler than they look.
__________________
"Those who learn from history are doomed to watch others repeat it."
-- Anonymous Slashdot poster
"The problem with defending the purity of the English language is that English is about as pure as a cribhouse whore."
-- James Nicoll
Old 2nd May 2012, 11:45 PM   #5
bjornart
Master Poster
 
Join Date: Nov 2002
Posts: 2,395
Let's go all the way and visualise it visually.
4 balls total, 3 B(lack) and 1 W(hite)

Possible results are:

WBBB
BWBB
BBWB
BBBW

1/4 for a win in any of the four positions.

4 balls total, 2 B(lack) and 2 W(hite)

Possible results are:

WWBB
WBWB
WBBW
BWWB
BWBW
BBWW

Of course you'd stop once you drew a white, but the full sequences still exist in potentia.
1/2 for a win on the first draw, 1/3 for a win on the second and 1/6 for a win on the third.
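
The same listing can be generated rather than typed out. Below is a small Python sketch (the function name is hypothetical) that enumerates every distinct arrangement by choosing which positions hold the whites, then tallies where the first white sits. It relies on all arrangements being equally likely, which holds when the draw order is random.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def first_white_distribution(white: int, black: int) -> dict:
    """Enumerate every distinct arrangement and tally where the first white sits."""
    total = white + black
    tally = Counter()
    arrangements = 0
    # Choosing which positions hold the whites defines one arrangement.
    for white_positions in combinations(range(1, total + 1), white):
        tally[white_positions[0]] += 1  # combinations() yields sorted tuples
        arrangements += 1
    return {pos: Fraction(n, arrangements) for pos, n in sorted(tally.items())}


if __name__ == "__main__":
    print(first_white_distribution(1, 3))  # 1 white, 3 black
    print(first_white_distribution(2, 2))  # 2 white, 2 black
```

Calling it with 1 white and 9 black gives ten entries of 1/10 each, which is the original puzzle.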
__________________
Well, I DON'T CARE WHAT YOU LIKE TO BELIEVE, GODDAMMIT! I DEAL IN THE FACTS!
-Cecil Adams
Old 3rd May 2012, 04:54 AM   #6
sol invictus
Philosopher
 
Join Date: Oct 2007
Location: Nova Roma
Posts: 8,417
Originally Posted by PiedPiper
For the sake of my sanity, can someone please explain to me why this is happening?
To summarize your question: you're asking about the probability P of pulling all the blacks before you pull a white. You're saying P=.1 for 9:1 (9 blacks and 1 white), P=.00397 for 5:5, but then back up to P=.1 for 1:9. How can this be?

When there are many blacks, pulling them all first is hard because that's many in a row, but easy because most marbles are black. Those offset to give a not-so-small result. You're doing something likely (p>1/2) many times in a row, not very hard.

When there are very few blacks, pulling them all first is easy because that's only a few in a row, but hard because most marbles are white. Those offset to give a not-so-small result. You're doing something fairly unlikely (p<<1/2) only once or a few times, not very hard.

When there are roughly equal numbers, pulling each black has a probability of 1/2 or less, and gets smaller and smaller as you go. Pulling all the blacks in a row is then very unlikely, because you've got to do something relatively unlikely (P<1/2) many times in a row. It's like flipping tails many times in a row, only even harder (since p decreases the more blacks you've pulled).
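
To see the whole curve at once, here is a rough Python sketch that tabulates P for every split of 10 marbles. The function name and the default of 10 marbles are assumptions for the example. It uses the fact that only one of the equally likely placements of the blacks puts them all at the front of the draw order.

```python
from fractions import Fraction
from math import comb


def p_all_blacks_first(black: int, total: int = 10) -> Fraction:
    """P(every black marble is drawn before the first white)."""
    # Only one of the comb(total, black) equally likely placements of the
    # blacks puts them all in the first `black` positions.
    return Fraction(1, comb(total, black))


if __name__ == "__main__":
    for black in range(1, 10):
        p = p_all_blacks_first(black)
        print(f"{black} black : {10 - black} white -> {p} = {float(p):.5f}")
```

The table is symmetric: 1/10 at both ends (1 black or 9 blacks) and smallest in the middle, 1/252 for the 5:5 split.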

Last edited by sol invictus; 3rd May 2012 at 05:05 AM.
Old 3rd May 2012, 04:54 AM   #7
Dancing David
Penultimate Amazing
 
Join Date: Mar 2003
Location: Central Illinois
Posts: 34,702
Originally Posted by xtifr
My brother recently spent a frustrating afternoon trying to convince a friend that the odds of rolling any doubles on a pair of dice was the same as the odds of rolling a 6 with one die. The guy just would not believe it. It seemed like too much of a coincidence for him.

Sometimes things in statistics really do turn out to be simpler than they look.
number of two dice combinations, 6x6=36
number of doubles, 6
odds of rolling double 6/36=1/6

Hmmm, looks fishy to me!

__________________
Hell, dynamiting fish in a barrel is more challenging. - Ladewig
I suspect you are a sandwich, metaphorically speaking. -Donn
And a shot rang out. Now Space is doing time... -Ben Burch
You built the toilet - don't complain when people crap in it. _Kid Eager
Old 3rd May 2012, 05:03 AM   #8
Tubbythin
Illuminator
 
Join Date: Mar 2008
Posts: 3,206
Originally Posted by Dancing David
number of two dice combinations, 6x6=36
number of doubles, 6
odds of rolling double 6/36=1/6

Hmmm, looks fishy to me!

Number on first dice = x.
Probability number on second dice will be x = number of x's on second dice/number of numbers on second dice =1/6.
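
For the unconvinced friend, a brute-force check is short enough to type in. This is only an illustrative Python sketch that assumes two fair six-sided dice; it lists all 36 ordered rolls and counts the doubles.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
rolls = list(product(faces, repeat=2))  # all 36 ordered outcomes of two dice
doubles = [r for r in rolls if r[0] == r[1]]  # (1,1), (2,2), ..., (6,6)

p_doubles = Fraction(len(doubles), len(rolls))
p_single_six = Fraction(1, 6)  # one face out of six on a single fair die
print(p_doubles, p_single_six, p_doubles == p_single_six)
```

Both come out to 1/6, matching the counting argument and the "second die must match the first" argument.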
Old 3rd May 2012, 05:16 AM   #9
jojonete
 
Join Date: Sep 2007
Posts: 149
Ok, it's explained, but I'll try a somewhat different wording. After 25+ years of thinking about this, I think it well deserves many different wordings, don't you think?

The OP insists a lot on stopping after finding the first white ball. I say: just don't stop. Pull out all the balls, place them in a line in the order they were pulled, and take note of the first white ball only.

In the 9-black 1-white case, the OP says that, intuitively, it should be easier to "hit" the white ball on (say) the first attempt than to "keep avoiding" it for nine attempts. However, "hitting" the white ball on the first attempt automatically means "avoiding" it on the remaining nine attempts.

In a way, this is somewhat similar to the question: take a deck of cards and deal half the cards to player 1 and the other half to player 2. Which is more probable, that player 1 has all the clubs or that he/she has no clubs at all? The question is easily answered once one realizes that player 1 getting no clubs means player 2 gets all of them. In the same way, hitting the white ball in the n-th position means "avoiding" it in all nine other positions. It is equally hard to avoid the white ball in any given 9 positions, and it is no harder when those positions happen to be the first nine.
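
The card question can be checked numerically too. The sketch below assumes a standard 52-card deck with 13 clubs and 26 cards per player, none of which is spelled out in the post, so treat those constants as assumptions.

```python
from fractions import Fraction
from math import comb

DECK, HAND, CLUBS = 52, 26, 13  # assumed: standard deck, split evenly

# Player 1 holds all 13 clubs: the other 13 cards in the hand come from the 39 non-clubs.
p_all_clubs = Fraction(comb(DECK - CLUBS, HAND - CLUBS), comb(DECK, HAND))
# Player 1 holds no clubs: all 26 cards come from the 39 non-clubs.
p_no_clubs = Fraction(comb(DECK - CLUBS, HAND), comb(DECK, HAND))

print(p_all_clubs == p_no_clubs, float(p_all_clubs))
```

The two probabilities are identical because choosing 13 of the 39 non-clubs to keep is the same as choosing 26 of them to keep.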

Not sure whether this is easier or harder to understand; as I said, I'm just trying a different wording.
Old 3rd May 2012, 01:16 PM   #10
Meridian
Thinker
 
Join Date: Aug 2007
Posts: 131
My understanding is that the question is the following: why is it that when there is only one white marble, then, unlike in any other case, the point where you first get a white marble is equally likely to be any draw? (Whereas, with more than one, it's more likely to be an early draw.)

A way of seeing the answer is as follows: as suggested many times above, imagine putting all the marbles in a row in a random order (corresponding to the order you draw them in if you keep going). For the first white marble to be in position n, two things have to happen: (a) the marble in position n is white, and (b) the marbles in positions 1..n-1 are black. Now (a) always has the same chance for all n - the order is random, so any position is as likely to be white as any other. If there are at least two white marbles, then (b) becomes less and less likely as n increases. But if there is only one, then condition (a) already implies condition (b) - if the nth marble is white, the ones before it are always black. So with one white, only condition (a) matters, and it has the same probability for all n.

[To be precise, with more than one white, what matters is that the conditional probability of (b) given (a) decreases with n.]
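
The split into (a) and (b) given (a) can be written out as a short Python sketch. The function names are mine and purely illustrative. The first function is the chance that position n is white, the second is the chance that everything before it is black once position n is known to be white, and their product is the chance that the first white sits at n.

```python
from fractions import Fraction
from math import comb


def p_a(white: int, total: int) -> Fraction:
    """(a): position n is white. The same for every n."""
    return Fraction(white, total)


def p_b_given_a(n: int, white: int, total: int) -> Fraction:
    """(b) given (a): positions 1..n-1 are all black, given position n is white."""
    # The other white-1 whites are spread over the other total-1 positions;
    # they must all land in the total-n positions after n.
    return Fraction(comb(total - n, white - 1), comb(total - 1, white - 1))


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, p_b_given_a(n, white=1, total=10), p_b_given_a(n, white=2, total=10))
```

With one white the conditional factor is always 1, so only (a) is left. With two whites out of 10 it is (10-n)/9, which shrinks steadily as n grows.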

A continuous version of this is the observation that if you take n independent random numbers each uniformly distributed on the interval from 0 to 1, then for n at least 2, the minimum of these numbers has a distribution that is biased towards small numbers. But for n=1 it is just uniform.
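
A quick way to see the continuous version is to simulate it. The Python sketch below is illustrative only, with an invented function name, an arbitrary seed and an arbitrary trial count. It averages the minimum of n uniform random numbers; the known expected value is 1/(n+1), so the average drifts toward zero as n grows and sits at 1/2 for n=1.

```python
import random


def mean_of_minimum(n: int, trials: int = 100_000, seed: int = 0) -> float:
    """Average of min(U_1..U_n) over many trials, with U_i uniform on [0, 1)."""
    rng = random.Random(seed)  # fixed seed for repeatable output
    total = 0.0
    for _ in range(trials):
        total += min(rng.random() for _ in range(n))
    return total / trials


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"n={n}: simulated mean {mean_of_minimum(n):.4f}, theory {1 / (n + 1):.4f}")
```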
Old 4th May 2012, 09:24 AM   #11
PiedPiper
Scholar
 
Join Date: Nov 2010
Posts: 98
Thanks to everyone for all the careful, thoughtful replies. My sanity is restored. Thanks JREF!
Old 4th May 2012, 02:21 PM   #12
kalen
Your Daddy
 
Join Date: Mar 2004
Location: Classified
Posts: 933
delete
__________________
No way! Yahweh!

Last edited by kalen; 4th May 2012 at 02:22 PM. Reason: wrong
Old 4th May 2012, 04:19 PM   #13
xtifr
Muse
 
Join Date: Apr 2012
Location: Sol III
Posts: 563
Originally Posted by jojonete
The OP insists a lot in stopping after finding the first white ball. I say: just don't stop.
Aha! Before you said that, I was struggling to figure out what the OP was confused about, which made it hard for me to answer him. I'm so used to analyzing this sort of problem in terms of the complete set of possible results that I'd forgotten what sort of confusion can arise if you neglect to do so.
Old 4th May 2012, 04:21 PM   #14
jasonpatterson
Philanthropic Misanthrope
 
Join Date: Apr 2009
Location: Space, The Final Frontier
Posts: 2,180
Originally Posted by PiedPiper
My sanity is restored. Thanks JREF!
Go over to the CT subforum, JREF can fix that sanity for you too.
Old 4th May 2012, 04:41 PM   #15
CapelDodger
Penultimate Amazing
 
Join Date: Sep 2001
Location: Cardiff, South Wales
Posts: 16,740
Originally Posted by Tubbythin
Number on first dice = x.
Probability number on second dice will be x = number of x's on second dice/number of numbers on second dice =1/6.
And the two dice are independent. I think what people tend to miss is independence. They implicitly think that rolling three sixes in a row makes it less likely that you'll roll a six next, because four sixes in a row is unlikely from a start. They don't factor in that three have already happened.
__________________
It's a poor sort of memory that only works backward - Lewis Carroll (1832-1898)

God can make a cow out of a tree, but has He ever done so? Therefore show some reason why a thing is so, or cease to hold that it is so - William of Conches, c1150
Old 8th May 2012, 07:32 AM   #16
Tubbythin
Illuminator
 
Join Date: Mar 2008
Posts: 3,206
Originally Posted by CapelDodger
And the two dice are independent. I think what people tend to miss is independence. They implicitly think that rolling three sixes in a row makes it less likely that you'll roll a six next, because four sixes in a row is unlikely from a start. They don't factor in that three have already happened.
Bayes would probably tell you it was more likely than any other number... unless you are 100% certain it isn't biased.
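
As a toy illustration of that Bayesian point, here is a Python sketch with made-up numbers. The two-hypothesis model (the die is either fair or loaded toward six), the 1% prior and the 1/2 loaded-six chance are all assumptions picked for the example, not anything stated in the thread.

```python
from fractions import Fraction


def p_next_six(prior_loaded: Fraction, p_six_if_loaded: Fraction, sixes_seen: int) -> Fraction:
    """Posterior-predictive chance of a six after seeing `sixes_seen` sixes in a row."""
    fair = Fraction(1, 6)
    # Bayes' rule: weight each hypothesis by prior times likelihood of the run.
    w_loaded = prior_loaded * p_six_if_loaded ** sixes_seen
    w_fair = (1 - prior_loaded) * fair ** sixes_seen
    post_loaded = w_loaded / (w_loaded + w_fair)
    return post_loaded * p_six_if_loaded + (1 - post_loaded) * fair


if __name__ == "__main__":
    print(p_next_six(Fraction(1, 100), Fraction(1, 2), 3))  # slight doubt about the die
    print(p_next_six(Fraction(0), Fraction(1, 2), 3))       # 100% certain it is fair
```

Under these assumptions, a 1% suspicion that the die is loaded lifts the chance of a fourth six from 1/6 to 5/21, while a prior of exactly zero leaves it at 1/6 no matter how many sixes have come up.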
All times are GMT -7.
© 2001-2012, James Randi Educational Foundation. All Rights Reserved.

Disclaimer: Messages posted in the Forum are solely the opinion of their authors.