
View Full Version : [Merged] Odds Standard for Preliminary Test



Gr8wight
26th August 2008, 12:51 PM
Not necessarily. If, indeed, Pavel can perform at a 70% hit rate where 50% would be expected by chance, that wouldn't be much of a nightclub act.


No, but it'd be good enough to make a killing at the tables in Vegas. See, Rodney, nightclub acts are just that: acts. Applicants for the MDC are claiming their "abilities" are not acts. Yet, when those things that nightclub performers do in their acts are controlled for in an MDC challenge test, those "abilities" always - ALWAYS* - disappear. That is the whole point of the JREF One Million Dollar Challenge, and why having it presided over by an accomplished stage magician is the best possible scenario. It demonstrates that all these people* who claim paranormal abilities are really doing nothing more than a cheesy nightclub act. Even if they aren't aware of that fact for themselves.









*so far

Rodney
26th August 2008, 01:13 PM
No, but it'd be good enough to make a killing at the tables in Vegas.
How so?

fls
26th August 2008, 01:27 PM
How so?

Seeing what card is coming up next for you in Blackjack or Poker would be very useful.

Linda

Startz
26th August 2008, 01:38 PM
Which protocol was that, and what was the reason for the rejection?

http://forums.randi.org/showthread.php?postid=3874502#post3874502

My memory is that the main reason for rejection was that the protocol was "too complicated."

Gr8wight
26th August 2008, 01:51 PM
No, but it'd be good enough to make a killing at the tables in Vegas.

How so?

Seeing what card is coming up next for you in Blackjack or Poker would be very useful.

Rodney didn't really need that spelled out for him, Linda. He only asked that question to allow him to conveniently ignore the rest of my post. Standard Rodney tactics. He can't address the meat of my post, so he chooses to address the throw-away lead-in comment, and hopes everyone will miss the fact that he completely ignored the most salient point.

Coveredinbeeees
26th August 2008, 02:06 PM
Seeing what card is coming up next for you in Blackjack or Poker would be very useful.

Linda

Ah, but if you're distracted by strong odours, bright colours and similar images to the one on the card in question then you're out of luck. Also the dealer might not let you hold the shoe in your hand and write down guesses for 3 minutes before deciding whether to hit or stick.

Pavel's ability, as described, would be a bit of a chocolate teapot in the casino.

Rasmus
26th August 2008, 02:14 PM
How so?

Playing roulette gives you a 1:37 (or 1:38) chance of guessing the right number, and the payout is 1:36 (1:37).

Now imagine you could predict the number in 7 of 10 cases, i.e. about 26 times as good as chance! Walk into a casino with 1000$ -heck even 100 or 50 would make you rich pretty damned fast. Just bet 10% of what you came with on whichever number you think ought to come up first. On most days, you will lose 3 times and win 7 times, i.e. you'd reach 2560$ if you started off with 100. (And if I got the payouts right ...)

Of course, you could simply bet half or a third of your winnings on any subsequent bet. Let's go with a third, shall we?

You bet 10$ and earn 360$. Of these, you bet 120$ and get 4320$. (If you lose, just repeat the bet up to two times, or start over with your 10$ from the beginning...). If you win just one more game for 1440$ you get kicked out of the casino 51,840$ richer. After just three bets! (And that's just what you get in the last bet. There's a few thousand bucks in the previous ones, of course. I am too lazy to add those up.)

Now, chances of the first three bets working out at 0.7 each are just over 0.3. So it probably wouldn't be too hard to distribute the bets over a few days and several different casinos to max out your winnings before being banned, right?

I would quit my day jobs if I had odds like that, and Mr Randi and his cheapo challenge would be the least of my worries.
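
For anyone who wants to check the arithmetic above, here is a minimal sketch. It assumes the usual single-number roulette payout of 35 to 1, so a winning bet returns 36 times the stake, and it takes the 70% hit rate at face value. The names `flat_betting` and `parlay` are made up for this illustration. Under that assumption the flat-betting example ends at $2,520 rather than the $2,560 quoted, while the three-bet sequence and the "just over 0.3" probability come out as stated.

```python
from fractions import Fraction

HIT_RATE = Fraction(7, 10)  # prediction accuracy assumed in the post
GROSS_RETURN = 36           # assumption: 35-to-1 payout plus the returned stake


def flat_betting(bankroll, stake, wins, losses):
    """Bankroll after betting the same stake on every spin."""
    profit_per_win = stake * (GROSS_RETURN - 1)
    return bankroll + wins * profit_per_win - losses * stake


def parlay(first_stake, fraction_rebet, wins):
    """Gross return of each winning bet when a fixed fraction is re-bet."""
    returns = []
    stake = first_stake
    for _ in range(wins):
        gross = stake * GROSS_RETURN
        returns.append(gross)
        stake = gross * fraction_rebet  # next stake is a share of the last return
    return returns


# Ten $10 bets from a $100 bankroll: 7 wins, 3 losses.
flat_result = flat_betting(100, 10, wins=7, losses=3)

# Bet $10, then a third of each return, winning three times in a row.
parlay_returns = parlay(10, Fraction(1, 3), wins=3)

# Chance that all three parlay bets win at a 70% hit rate.
three_in_a_row = HIT_RATE ** 3

print("flat betting ends at:", flat_result)
print("parlay returns:", [int(r) for r in parlay_returns])
print("P(three wins in a row):", float(three_in_a_row))
```

Exact fractions are used so the re-bet amounts ($120, $1,440) stay whole numbers instead of picking up floating-point noise.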

Jackalgirl
26th August 2008, 03:55 PM
The fundamental point that most here are missing is that the preliminary test should not be so rigid as to eliminate applicants who actually do have a paranormal ability.

I know that this has been pointed out before, but the ultimate point of the Challenge is to challenge people who claim regular and consistent paranormal ability. In other words, it's aimed squarely at the Sylvia Brownes and Uri Gellers of the world.

I know that the wording says that it's open to anyone who can demonstrate a paranormal power under controlled testing conditions. But as has also been pointed out, there's a limit. This isn't a long-term research vehicle. There's a limit on the amount of time that the JREF is willing to pour into a single individual (especially when they have other applicants to consider), and that time limit controls the amount of testing that is available in order to reach the 1:1000 chance bar.

This means that some individuals -- those individuals whose purported psychic power is very subtle or inconsistent and who therefore need a really long time in order to demonstrate that their abilities are functioning significantly above chance -- are not going to be suitable candidates for the JREF Challenge.

As for Pavel: I totally agree with you that Pavel should not accept a protocol he does not feel absolutely confident that he could pass -- if he cannot accept a protocol that is designed according to JREF's requirements in terms of time vs. probability, then he is not a good candidate for the JREF Challenge.

This doesn't mean, however, that he would be unsuitable for other skeptical societies' challenges, or that someone who is interested in the proper research (by this I mean "properly controlled") of these sorts of things should be discouraged from working with Pavel to test his abilities over the long term.
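
The trade-off between testing time and the 1:1000 bar can be put into rough numbers. The sketch below is only an illustration under stated assumptions: each trial is an independent two-way guess with a 50% chance rate, the applicant's true hit rate is taken to be 70%, and the trial counts are arbitrary. The function names are hypothetical and nothing here is a JREF protocol.

```python
from math import comb


def tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pass_mark(n, alpha=0.001, chance=0.5):
    """Smallest hit count whose probability under pure chance is at most alpha."""
    for k in range(n + 1):
        if tail(n, k, chance) <= alpha:
            return k
    return None  # the bar cannot be reached in n trials


def pass_probability(n, true_rate=0.7, alpha=0.001, chance=0.5):
    """Chance that someone with the given true hit rate reaches the pass mark."""
    k = pass_mark(n, alpha, chance)
    return 0.0 if k is None else tail(n, k, true_rate)


for n in (20, 40, 80, 160):
    print(f"{n} trials: need {pass_mark(n)} hits, "
          f"a 70% performer passes with probability {pass_probability(n):.3f}")
```

With few trials, even a genuine 70% performer would usually fall short of the pass mark; the probability of passing rises as trials are added, which is the sense in which a subtle effect needs a long test.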

Rodney
26th August 2008, 05:38 PM
Seeing what card is coming up next for you in Blackjack or Poker would be very useful.

Linda

As far as I know, Pavel hasn't claimed that he can see what card is coming up in Blackjack or Poker, nor has he claimed he can see where a Roulette wheel will land. Rather, he has claimed that, given two envelopes, one containing a photograph and one containing a blank sheet of photographic paper, he can identify the photograph significantly more often than half the time.

Rodney
26th August 2008, 05:47 PM
I know that this has been pointed out before, but the ultimate point of the Challenge is to challenge people who claim regular and consistent paranormal ability. In other words, it's aimed squarely at the Sylvia Brownes and Uri Gellers of the world.

I know that the wording says that it's open to anyone who can demonstrate a paranormal power under controlled testing conditions. But as has also been pointed out, there's a limit. This isn't a long-term research vehicle. There's a limit on the amount of time that the JREF is willing to pour into a single individual (especially when they have other applicants to consider), and that time limit controls the amount of testing that is available in order to reach the 1:1000 chance bar.

This means that some individuals -- those individuals whose purported psychic power is very subtle or inconsistent and who therefore need a really long time in order to demonstrate that their abilities are functioning significantly above chance -- are not going to be suitable candidates for the JREF Challenge.
If you're right, the following sentence should be corrected:

"At JREF, we offer a one-million-dollar prize to anyone who can show, under proper observing conditions, evidence of any paranormal, supernatural, or occult power or event."

See http://www.randi.org/joom/content/view/38/31/

Gr8wight
26th August 2008, 06:10 PM
As far as I know, Pavel hasn't claimed that he can see what card is coming up in Blackjack or Poker, nor has he claimed he can see where a Roulette wheel will land. Rather, he has claimed that, given two envelopes, one containing a photograph and one containing a blank sheet of photographic paper, he can identify the photograph significantly more often than half the time.

Perfect! Let's test that, then.

Rodney
26th August 2008, 06:12 PM
Perfect! Let's test that, then.

Let's. See post #249 above.

fls
26th August 2008, 06:20 PM
As far as I know, Pavel hasn't claimed that he can see what card is coming up in Blackjack or Poker, nor has he claimed he can see where a Roulette wheel will land. Rather, he has claimed that, given two envelopes, one containing a photograph and one containing a blank sheet of photographic paper, he can identify the photograph significantly more often than half the time.

It is consistent with what he said in the Challenge thread - he visualizes what he will find when he opens the envelope in the near future. The idea of visualizing what is revealed when a card is turned over isn't much different.

Linda

Rodney
26th August 2008, 06:46 PM
It is consistent with what he said in the Challenge thread - he visualizes what he will find when he opens the envelope in the near future. The idea of visualizing what is revealed when a card is turned over isn't much different.

Linda
I suppose you could argue that kicking a soccer ball isn't much different than kicking an American football, but most soccer players don't make it as kickers in American football. Maybe Pavel can tell us whether he thinks he has any ability to see what card is coming up in Blackjack or Poker.

Gr8wight
26th August 2008, 07:27 PM
Let's. See post #249 above.

Well, yes, I've read that, and the 248 posts that preceded it, and the 150 or so posts in the other thread about Pavel's application. It is clear that Pavel needlessly complicates his trials in order to confuse the statistical landscape and fool himself into thinking he is performing at a higher rate of success than he actually is. If Pavel tests himself strictly following some of the protocols that have been suggested here, he is going to find that he performs at no better than chance. The result will be that he never goes through with the challenge test.

On the other hand, if he refuses to self test honestly, he will go into the JREF challenge test full of optimism, and be either embarrassed, or unbelieving, or hostile when he fails miserably.

I would love for a different result, but I am not prepared to hold my breath waiting for one. I don't look good in blue.

steenkh
27th August 2008, 12:19 AM
http://forums.randi.org/showthread.php?postid=3874502#post3874502

My memory is that the main reason for rejection was that the protocol was "too complicated."
I certainly understand why they rejected it. My head spins when I try to figure out just what is going on in this protocol! And the odds for false positives have been calculated for this?

steenkh
27th August 2008, 12:23 AM
As far as I know, Pavel hasn't claimed that he can see what card is coming up in Blackjack or Poker, nor has he claimed he can see where a Roulette wheel will land. Rather, he has claimed that, given two envelopes, one containing a photograph and one containing a blank sheet of photographic paper, he can identify the photograph significantly more often than half the time.
He has vacillated a bit about this, but I think he ended up with the claim that he can "see" the picture in the future as it is revealed. It does not seem so much different to "see" a playing card as it is revealed in the future in a card game.

But anyway, you and I know that he invented this explanation solely in order to have an argument for why he needs to know how well he is doing as the test proceeds.

William Smith
27th August 2008, 05:06 AM
If you're right, the following sentence should be corrected:

"At JREF, we offer a one-million-dollar prize to anyone who can show, under proper observing conditions, evidence of any paranormal, supernatural, or occult power or event."

See http://www.randi.org/joom/content/view/38/31/

I have said it before: Submit your proposal of a correction to the JREF and tell us how it went.

EHocking
27th August 2008, 05:10 AM
Well, yes, I've read that, and the 248 posts that preceded it, and the 150 or so posts in the other thread about Pavel's application. It is clear that Pavel needlessly complicates his trials in order to confuse the statistical landscape and fool himself into thinking he is performing at a higher rate of success than he actually is.
I have to disagree with you on Pavel's behalf. Pavel is not deliberately complicating the protocols, Pavel clearly did not know how to put one together nor did he know what was required to meet JREF's 1:1000 odds pass rate.
He stated this in a number of posts prior to applying to the Challenge.
Post 51 from his first thread (http://forums.randi.org/showpost.php?p=2859837&postcount=51).
WHAT would be self evident prove? how many times i have to perform it.. let say its an 1 hour test session..
so ill know what is minimum has to be performed..
Post 241 after much discussion (http://forums.randi.org/showpost.php?p=3738821&postcount=241)
As to the accuracy with which I can perform it, I think I have no choice but to claim that I can perform the results that beat 1 to 1000 odds as that what JREF want to be beaten. If I am not mistaken, or I have a choice? Like to claim I can beat 1 to 300 or even 400 odds? 200-300 odds that is way more than just by chance but that will not be considered as a success isn’t it? Or I misunderstood something from requirements?
It's quite clear that he came to the Forum to determine at what level he needed to perform his skill to, to pass the Challenge. Others who now claim that JREF is doing hard by him overlook the point that he never claimed a performance % but asked JREF what the threshold % was.
If Pavel tests himself strictly following some of the protocols that have been suggested here, he is going to find that he performs at no better than chance. The result will be that he never goes through with the challenge test.
Again, in his defence, Pavel has done some semi-blinded tests (http://forums.randi.org/showpost.php?p=3740931&postcount=259) at Forum member's urgings and shared them in the Forum. We (surprise, surprise) were quick to demonstrate to him that his then current (pre application) success rate was not better than chance (his 70% claim (http://forums.randi.org/showpost.php?p=3740800&postcount=254)) and that he'd have to perform better to pass the Challenge.

On the other hand, if he refuses to self test honestly, he will go into the JREF challenge test full of optimism, and be either embarrassed, or unbelieving, or hostile when he fails miserably.

I would love for a different result, but I am not prepared to hold my breath waiting for one. I don't look good in blue.
I haven't seen any of Pavel's posts that indicate he is doing this in anything but good faith, so I have to disagree with your pessimism here. I do think that he will fail, but have confidence in him and the Forum members helping him that he'll end up getting tested.

Just to reiterate, since it is relevant to the thread.

Pavel came to the Forum to find out what success rate was required for him to pass the Challenge.

The accusations that JREF is forcing him to attempt something at a level of success that he didn't claim he could do are unfounded.

Rodney
27th August 2008, 07:05 AM
I have said it before: Submit your proposal of a correction to the JREF and tell us how it went.
Been there, done that. On May 10, 2008, I e-mailed the JREF as follows:

"I recently initiated the following thread on the Million Dollar
Challenge Forum --
http://forums.randi.org/showthread.php?p=3692318#post3692318

"What I argue on that thread is that: (a) In tests where the odds of
success can be readily calculated, it is unclear what odds standard
must be met; and (b) It is unclear whether time-consuming protocols,
such as Ganzfeld experiments, are eligible for the Challenge.
Therefore, I recommend that something along the lines of the following
be added to the Challenge Rules:

"'(1) An applicant must pass a preliminary test, in which the general
criterion for success will be that the applicant must perform at
significantly above the chance level. In tests where the odds of
success can be readily calculated -- such as numbers guessing -- the
applicant must perform at least at the P=.001 level; that is, the odds
must be only one in one thousand that the applicant could have
achieved that performance level by random chance. (However, if the
applicant achieves a lesser, but above chance, performance level in a
limited number of tests -- for example, if the applicant performs at
the P=.05 level in 20 trials -- the preliminary test may be extended
on a different day or days to include more trials.) If the applicant
passes the preliminary test, a final test will be administered, in
which the performance level must meet a significantly more stringent
criterion for the million dollar prize to be awarded. In tests where
the odds of success can be readily calculated, the applicant must
perform at least at the P=.000001 level; that is, for the prize to be
awarded, the odds must be only one in one million that the applicant
could have achieved that performance level by random chance.

"'(2) All protocols, including time-consuming ones such as Ganzfeld
experiments, are eligible for the Challenge; or

"'(2a) Some time-consuming protocols, such as Ganzfeld experiments, are
not eligible for the Challenge due to the impact on JREF resources."

"If you wish, you may respond to these questions on the above thread."

Still no response more than three months later.

fls
27th August 2008, 07:26 AM
Been there, done that. On May 10, 2008, I e-mailed the JREF as follows:

"I recently initiated the following thread on the Million Dollar
Challenge Forum --
http://forums.randi.org/showthread.php?p=3692318#post3692318

"What I argue on that thread is that: (a) In tests where the odds of
success can be readily calculated, it is unclear what odds standard
must be met; and (b) It is unclear whether time-consuming protocols,
such as Ganzfeld experiments, are eligible for the Challenge.
Therefore, I recommend that something along the lines of the following
be added to the Challenge Rules:

"'(1) An applicant must pass a preliminary test, in which the general
criterion for success will be that the applicant must perform at
significantly above the chance level. In tests where the odds of
success can be readily calculated -- such as numbers guessing -- the
applicant must perform at least at the P=.001 level; that is, the odds
must be only one in one thousand that the applicant could have
achieved that performance level by random chance. (However, if the
applicant achieves a lesser, but above chance, performance level in a
limited number of tests -- for example, if the applicant performs at
the P=.05 level in 20 trials -- the preliminary test may be extended
on a different day or days to include more trials.) If the applicant
passes the preliminary test, a final test will be administered, in
which the performance level must meet a significantly more stringent
criterion for the million dollar prize to be awarded. In tests where
the odds of success can be readily calculated, the applicant must
perform at least at the P=.000001 level; that is, for the prize to be
awarded, the odds must be only one in one million that the applicant
could have achieved that performance level by random chance.

"'(2) All protocols, including time-consuming ones such as Ganzfeld
experiments, are eligible for the Challenge; or

"'(2a) Some time-consuming protocols, such as Ganzfeld experiments, are
not eligible for the Challenge due to the impact on JREF resources."

"If you wish, you may respond to these questions on the above thread."

Still no response more than three months later.

Awww, poor Rodney. They ignored your unreasonably complicated and exclusionarily specific proposal. ;)

What about:

"At JREF, we offer a one-million-dollar prize to anyone who can show, under proper and feasible observing conditions, evidence of any paranormal, supernatural, or occult power or event."?

Linda

petre
27th August 2008, 07:29 AM
Even I don't think the million dollar should be given away that easily. I was under the impression that the odds standard for the final test would be much higher -- on the order of 1 in a million.

The formal test is carried out using the same protocol as the preliminary test. Number of trials, required performance, testing conditions all remain the same. It is likely that additional measures may be taken to demonstrate any fraud, but such measures must abide by any requirements in the protocol (i.e. if the applicant insists on having only people directly involved in the testing present, the JREF cannot suddenly request a live audience of academics witness the formal test). This is why passing a preliminary test would be so exciting, a simple repeat performance is worth $1 million!

The real purpose of having two tests is to allow the possibility of catching Randi sleeping and sneak through a clever way to make something appear paranormal. You can bet if someone passes the preliminary, he would certainly give the matter his full attention to ensure nothing short of the paranormal would pass in the formal test.

It is a bit of an unanswered question how this might be achieved in cases where the preliminary test was a one-off prediction (e.g. it will snow in a California valley on a particular day in July, an earthquake will strike a certain city on a certain date between 11 and noon, etc). It has never been an issue, since all predictions made for an accepted preliminary test have proven incorrect.

Rodney
27th August 2008, 05:28 PM
Awww, poor Rodney. They ignored your unreasonably complicated and exclusionarily specific proposal. ;)
I guess hitting the reply button and stating "Your specific proposal is unreasonably complicated and exclusionary" is too much effort for the JREF.

What about:

"At JREF, we offer a one-million-dollar prize to anyone who can show, under proper and feasible observing conditions, evidence of any paranormal, supernatural, or occult power or event."?

Linda
I can just see it now. Randi: "You mean there is an applicant who has a paranormal power? Uh, what do I do now? Wait, I've got it: Tell him that the observing conditions are not feasible!"

Rodney
27th August 2008, 05:29 PM
The formal test is carried out using the same protocol as the preliminary test. Number of trials, required performance, testing conditions all remain the same.
What do you base the above on?

fls
27th August 2008, 06:36 PM
I can just see it now. Randi: "You mean there is an applicant who has a paranormal power? Uh, what do I do now? Wait, I've got it: Tell him that the observing conditions are not feasible!"

If you think Randi is simply a liar, isn't it silly for you to waste your time trying to understand or improve upon the MDC?

Linda

Rodney
27th August 2008, 06:45 PM
If you think Randi is simply a liar, isn't it silly for you to waste your time trying to understand or improve upon the MDC?

Linda

Linda, Linda, Linda! It was a joke. :)

fls
27th August 2008, 07:13 PM
Linda, Linda, Linda! It was a joke. :)

Damn. I thought I'd figured out a way to get you off this obsession.

Linda

petre
28th August 2008, 09:06 AM
What do you base the above on?

It's been mentioned a number of times in the forums (some helpful person with a more complete link library may be able to help out here with a post from Remiev, Darat, or heck Kramer maybe), but it's possible to conclude this fact from the challenge application.


This offer is administered by the JREF, and no one may negotiate or make any changes, except as set forth
in writing by James Randi (JR). All correspondence must be written, and will be answered, in English only,
except that properly-prepared translations into English accompanied by certification of the qualifications of
the translator, will be accepted. Upon properly completing this document and agreeing upon the test protocol,
applicant will receive the application back, signed by JR. At that point, the applicant becomes eligible for the
preliminary test, which, if successful, will result in the formal test.
...

5. After an agreement is reached on the protocol, no part of the testing procedure may be changed in any
way without the further agreement – in writing – of all parties concerned. JR may or may not be present at
some preliminary or some formal tests, but he will not interact with the materials used, nor with the
protocol, unless specifically requested to do so by the applicant.
6. In all cases, applicant will be required to perform a preliminary test either before an appointed representative,
if distance and time dictate that need, or in a location where a member or representative of the JREF
staff can attend. This preliminary test is to determine if the applicant is likely to perform as promised
during a formal test, using the agreed-upon protocol. To date, no applicant has passed the preliminary test,
and this has eliminated the need for formal testing in those cases. There is no limit on the number of times
an applicant may re-apply, but re-application can take place only after 12 months have elapsed since the
completion of the preliminary test.


Emphasis mine of course. Note that in the first bolded section it identifies the steps to the challenge:
1. Apply
2. Agree on protocol
3. Take preliminary test
4. Take final test

Note that no "renegotiate protocol for final test" step is indicated. Further on in the second bolded section it notes that no changes to the protocol are allowed unless both parties agree. Finally, in the third bolded section it notes a link between both tests and the "agreed-upon protocol" (not protocols).

It is still possible to abuse syntax and scope in the English language and argue the challenge, as worded, does not preclude the JREF from demanding a change in the protocol before the final test. It's convincing to me though just from the wording that passing the preliminary test makes the applicant eligible for the formal test.

Rodney
28th August 2008, 10:13 AM
Emphasis mine of course. Note that in the first bolded section it identifies the steps to the challenge:
1. Apply
2. Agree on protocol
3. Take preliminary test
4. Take final test

Note that no "renegotiate protocol for final test" step is indicated. Further on in the second bolded section it notes that no changes to the protocol are allowed unless both parties agree. Finally, in the third bolded section it notes a link between both tests and the "agreed-upon protocol" (not protocols).

It is still possible to abuse syntax and scope in the English language and argue the challenge, as worded, does not preclude the JREF from demanding a change in the protocol before the final test. It's convincing to me though just from the wording that passing the preliminary test makes the applicant eligible for the formal test.
I'm not talking about a change in the protocol, I'm talking about the odds standard that must be met. I've always been under the impression that the formal test would require a higher odds standard than the preliminary test. For example, in Pavel's case, he might have to get 30 of 40 to pass the preliminary test, but 60 out of 80 to pass the formal test.

Gr8wight
28th August 2008, 10:20 AM
I'm not talking about a change in the protocol, I'm talking about the odds standard that must be met. I've always been under the impression that the formal test would require a higher odds standard than the preliminary test. For example, in Pavel's case, he might have to get 30 of 40 to pass the preliminary test, but 60 out of 80 to pass the formal test.

Being required to pass the same test twice in a row results in the higher odds standard. That's the way I have always read that requirement.
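
To put rough numbers on this, the sketch below compares the two hypothetical thresholds from the quoted post (30 of 40 and 60 of 80) with passing the 30-of-40 test twice. It assumes independent 50/50 trials and that the two runs are independent of each other; the thresholds are only the ones floated in this thread, not anything the JREF has specified, and `chance_tail` is a name invented for the example.

```python
from math import comb


def chance_tail(n, k):
    """P(at least k hits in n fair 50/50 guesses), from exact integer counts."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


p_30_of_40 = chance_tail(40, 30)  # one run at the preliminary threshold
p_60_of_80 = chance_tail(80, 60)  # one longer run at the same 75% hit rate
p_twice = p_30_of_40 ** 2         # same test passed twice, runs assumed independent

for label, p in [
    ("30 of 40, once", p_30_of_40),
    ("60 of 80, once", p_60_of_80),
    ("30 of 40, twice in a row", p_twice),
]:
    print(f"{label}: about 1 in {1 / p:,.0f} by chance")
```

On these assumptions, repeating the same test is the stricter requirement of the two: the chance of passing 30 of 40 twice is smaller than the chance of a single 60 of 80.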

pavel_do
28th August 2008, 11:12 AM
I have to disagree with you on Pavel's behalf. Pavel is not deliberately complicating the protocols, Pavel clearly did not know how to put one together nor did he know what was required to meet JREF's 1:1000 odds pass rate.
He stated this in a number of posts prior to applying to the Challenge.

It's quite clear that he came to the Forum to determine at what level he needed to perform his skill to, to pass the Challenge. Others who now claim that JREF is doing hard by him overlook the point that he never claimed a performance % but asked JREF what the threshold % was.
Again, in his defence, Pavel has done some semi-blinded tests (http://forums.randi.org/showpost.php?p=3740931&postcount=259) at Forum member's urgings and shared them in the Forum. We (surprise, surprise) were quick to demonstrate to him that his then current (pre application) success rate was not better than chance (his 70% claim (http://forums.randi.org/showpost.php?p=3740800&postcount=254)) and that he'd have to perform better to pass the Challenge.

I haven't seen any of Pavel's posts that indicate he is doing this in anything but good faith, so I have to disagree with your pessimism here. I do think that he will fail, but have confidence in him and the Forum members helping him that he'll end up getting tested.

Just to reiterate, since it is relevant to the thread.

Pavel came to the Forum to find out what success rate was required for him to pass the Challenge.

The accusations that JREF is forcing him to attempt something at a level of success that he didn't claim he could do are unfounded.

Thank you.
Just one comment here. When I tried to find out the odds and related details by emailing the JREF, I was refused any answers, on the grounds that I needed to apply first and only then would any negotiation start. Even when I asked whether I could claim a certain % as a minimum rate (which the JREF asks applicants to state in the application and the description of the claim), that was ignored.
here is the letter and the PM to Remiev that I have sent a few days before I sent my application to JREF.
"Dear Alison,



Thank you very much for the reply,

Here is the text that I will attach to my application maybe it will be edited slightly but mainly will be the same. Please review it as to avoid the application to be returned to me due to some insignificant mistakes or disagreements from the side of JREF. As to the accuracy, I understand what you were saying, the only thing as the number of the successful photos for me will depend on the final agreed protocol. As it still can be set of 20 photos where 5 pulled out and I would say I will be right with 3 minimum or let say set of 20 pairs where I would have to identify just one photo from each pair, that makes numbers different. Can I state in my claim the ODDS or a % instead of definite number? As JREF demand 1 to 1000 odds to be beaten irrelevantly from the way, it performed. So I would like to claim ODDS as well. In my opinion it will be a success if I will beat say 150 odds as it is itself way more to the chance, taking to the considerations that it will be not 100 runs and just a few like in case with 20 pairs set I would prefer 3-5 runs by 20 sets. Or with 20 photos where 5 will be taken I would prefer have 3 runs only. Will it be fine for the claim as the accuracy to state 70% minimum from my side?


If so, then the following text ( still might be slightly edited) will be attached to the application form together with media coverage (photo copy of the news papers with the articles about me) and the letter signed by 2 professors who tested me and witnessed the results.


"I am Pavel, psychic clairvoyant. As a part of my gift, I have the ability to identify photographs that are sealed in double envelopes without seeing them first, just by holding the envelope between my hands and concentrating for a few minutes. I would prefer to use sets of photos that are completely different from each other, for example, a horse, the statue of liberty, ship, mask etc. The photos will be known to me and the set that is used for testing will be made of the same pictures that I usually practice with , and will be provided to the JREF beforehand together with the name for each photo. However, no one can speak while I perform and there should be no distraction of any kind. For that reason, I prefer that my eyes are covered with a mask, and I will use earplugs, to eliminate light and sound distractions. It is important that in the room where the test will be held, there should be NO photos or pictures of any kind on the walls or anywhere near me. I will need a small break between the runs if there will be a few runs of the testing. I need to come 1 hour before the test, to the place where it will be handled (as I need to get used to place and the people around to be calmer and make sure that I will be comfortable in my place during the test). Such an ability I can perform with the accuracy of a minimum 70% with the success rate significantly better than random chance. ”

Regards,

Pavel

P.s.

I would appreciate your answer, as I need to send my application A.S.A.P., especially if it takes more time. I would like to have my test in the US while I am here, and I am here until September 1 at the latest… due to protocol negotiation etc. it will take time… so please just answer my last letter and I can send my application."

here is the reply....

"
Re: this is PAVEL

--------------------------------------------------------------------------------

No, Pavel. I will not read and reply to it. As previously stated, the application and associated materials must be received before I will look over any protocol information. No matter what you include with the Challenge Application, it will not be your full protocol. That is negotiated over time.

Please submit the application via postal mail, and do not send any more queries via e-mail or PM.

Thank you "

petre
28th August 2008, 11:44 AM
I'm not talking about a change in the protocol, I'm talking about the odds standard that must be met. I've always been under the impression that the formal test would require a higher odds standard than the preliminary test. For example, in Pavel's case, he might have to get 30 of 40 to pass the preliminary test, but 60 out of 80 to pass the formal test.

The performance requirement for success is as much a part of the protocol as any other part. Therefore the odds requirement cannot be increased for the final test (well, without agreement from both parties, and it would hardly seem in the applicant's interest to raise the bar for a chance at a $million).

You were expecting something like: "Very good job passing the preliminary. Now for the formal test we require you to succeed 60 million out of 80 million trials! Muhahahaha!" ? The contract says the formal test WILL proceed, not conditional on the applicant agreeing to new performance requirements. Insisting on any change to the protocol that the applicant does not agree to before administering a formal test would be a breach of the contract as worded.

Of course since it is an agreement between two parties they certainly could agree to a different standard between the two tests in the original protocol. Maybe the best Pavel can do in one day is a test with 1:500 chance of success (by luck alone) and is willing to take on 1:2000 odds in a longer formal test if the JREF will agree to this relaxation of the unwritten standard (or perhaps 1:2500 odds even, just performing the preliminary test twice more for the formal test). The JREF might find such a suggestion agreeable, since there is still a high probability of by-luck failure in the preliminary and the formal would never need to be given.

Having the odds standard unwritten allows this flexibility. In general, once a rule is in the formal application, it is indeed a RULE. No exceptions to the explicit rules have ever been granted to an applicant (to my knowledge). Randi has occasionally offered to waive the preliminary test for certain high-profile paranormal claimants, but even that provision may be in the rules (I haven't read it over fully recently).

Rodney
28th August 2008, 04:03 PM
The performance requirement for success is as much a part of the protocol as any other part. Therefore the odds requirement cannot be increased for the final test.
But isn't only the preliminary -- and not the final -- test being negotiated right now for Pavel?

steenkh
28th August 2008, 10:19 PM
What do you base the above on?
From the official challenge page (http://www.randi.org/joom/content/view/40/32/):
This preliminary test is to determine if the applicant is likely to perform as promised during a formal test, using the agreed-upon protocol.
There is only a single protocol. There is not one protocol for the preliminary test and another one for the final test.

steenkh
28th August 2008, 10:25 PM
just one comment here.. when I really tried to find out the odds and things, by emailing the JREF.. I was refused any answers, the reasoning being that I need to apply first and after that we start any negotiation.. even when I asked.. if I claim a certain % as a minimum rate (which the JREF asks to be stated in the application and description of the claim) even that was ignored..
Is this not because you are not stating what your ability is? You are asking for odds that you will have to beat, which will vary wildly according to your stated ability. If you had said from the beginning: "I can identify the pictures with an accuracy of 70%", the JREF might have had a chance to work out how many tests you would have to do at what success rate, but instead you are asking them something that is impossible to answer.

You should only worry about your own ability, and the JREF can worry about the odds for a false positive.
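
To illustrate why a stated accuracy makes the question answerable, here is a minimal sketch of how a test length could be worked out from a claimed hit rate. This is not the JREF's actual procedure. The 50% chance rate, the 1-in-1000 false-positive target, and the 90% chance that a genuine 70% performer passes are all assumptions picked for illustration, and the function names are made up.

```python
from math import comb

def tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_hits(n, chance, alpha):
    """Smallest hit count whose probability under pure guessing is <= alpha."""
    for k in range(n + 1):
        if tail(k, n, chance) <= alpha:
            return k
    return None  # even a perfect score is not rare enough at this n

def plan_test(claimed, chance=0.5, alpha=1 / 1000, power=0.9, max_n=500):
    """Shortest test where a guesser rarely passes but the claimant usually does."""
    for n in range(1, max_n + 1):
        k = min_hits(n, chance, alpha)
        if k is not None and tail(k, n, claimed) >= power:
            return n, k
    return None  # nothing found within max_n trials

n, k = plan_test(claimed=0.7)
print(f"{n} trials, pass mark {k} hits")
```

Under these assumptions the applicant only supplies the claimed rate; the number of trials and the pass mark follow from the tolerance for a lucky pass, which is the tester's side of the problem.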

steenkh
28th August 2008, 10:30 PM
But isn't only the preliminary -- and not the final -- test being negotiated right now for Pavel?
As I said in my previous post: there is only a single protocol, so this protocol will also be the protocol for the final test. However, the final test may still differ.

The Challenge FAQ (http://www.randi.org/joom/content/view/47/37/) states
5.2 What happens between the preliminary test and the official test?

The protocol itself will not be changed, and neither will any of the documents you and the JREF have agreed upon. The final test may be longer, or require more conclusive results through more sets of the test to ensure that the preliminary test was not a fluke.

steenkh
28th August 2008, 10:33 PM
Being required to pass the same test twice in a row results in the higher odds standard. That's the way I have always read that requirement.
As my quote from the FAQ in my previous post shows, James Randi thinks that the final test may require different odds.

petre
29th August 2008, 07:12 AM
As I said in my previous post: there is only a single protocol, so this protocol will also be the protocol for the final test. However, the final test may still differ.

The Challenge FAQ (http://www.randi.org/joom/content/view/47/37/) states

I'm curious how the distinction is made then between the agreed-upon protocol and the performance requirement.

drkitten
29th August 2008, 08:27 AM
I'm curious how the distinction is made then between the agreed-upon protocol and the performance requirement.

Reason and good faith, applied jointly.

If your claim is that you can perform at the 80% accuracy level, and the protocol is negotiated to require you to perform at the 60% accuracy level to pass the preliminary, it is not unreasonable to also require you to perform at the 60% accuracy level to pass the final.

The difference is that you will need to perform more repetitions of an identical task. Instead of needing to get 6 right out of 10, you may need to get twelve right out of 20 (or whatever the numbers work out to be).
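
A quick sketch of why more repetitions at the same accuracy level are more conclusive. It assumes each trial is a 50/50 guess, which the example above does not actually specify, and the function name is just for illustration.

```python
from math import comb

def chance_of_passing(hits_needed, trials, chance=0.5):
    """Probability that pure guessing scores at least hits_needed out of trials."""
    return sum(
        comb(trials, i) * chance**i * (1 - chance) ** (trials - i)
        for i in range(hits_needed, trials + 1)
    )

# Same 60% pass mark, progressively longer tests.
for hits, trials in [(6, 10), (12, 20), (24, 40)]:
    p = chance_of_passing(hits, trials)
    print(f"{hits}/{trials}: chance of a lucky pass = {p:.3f}")
```

The pass mark stays at 60% in every row, but the chance of reaching it by luck shrinks as the number of trials grows.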

Jackalgirl
30th August 2008, 03:59 PM
"At JREF, we offer a one-million-dollar prize to anyone who can show, under proper and feasible observing conditions, evidence of any paranormal, supernatural, or occult power or event."?

Linda


Simpler is better, of course. But may I add that Randi already stipulates a control for time and effort:

I, James Randi, through the JREF, will pay US$1,000,000 [One Million Dollars/US] to any person who can demonstrate any psychic, supernatural or paranormal ability under satisfactory observing conditions.

(Emphasis mine.) The conditions have to be satisfactory to both JREF and the claimant, as is made clear later by the "mutually agreeable" protocol stipulation.

Could JREF go into nauseating detail about what "satisfactory" means? I suppose they could, but as you can see, trying to do so (and cover all the possibilities) gets into a highly complicated what-if game. It's simpler -- and better, in my opinion -- to simply say "satisfactory" and "mutually agreeable" and work that out via discussion with the claimant, especially since the actual definition of what all of that means hangs from the actual claim. Without knowing what the claim is, trying to define the parameters of what is and isn't acceptable is very difficult.

Jackalgirl
30th August 2008, 04:06 PM
It is not relevant who prescribed the trials or what the total number is. I'm just pointing out that your proposed accommodation introduces an error.

Linda

I just got finished reading Mr. Randi's Flim-Flam! -- he calls what you're calling "early stopping" "optional stopping". He says:

Briefly stated, if the subject is allowed to stop whenever he or she wants, there is no value to the experiment, since the subject can stop or be stopped when ahead, and the total result is a win, regardless of what would have happened had the test continued. For this reason the experiment must have an announced number of trials determined firmly in advance, as was done. But optional stopping can also be optional continuing. It's the same problem. If results don't look too good...it is easy to throw in another few dozen trials to see if we can get ahead before stopping. (Flim-Flam!, Randi, Prometheus Books, Buffalo New York, 1982, p 236)

Given this, I think it's safe to say that Pavel will have to specify a specific number of trials and he will have to complete them all, to the level of accuracy that he claims. (In other words, JREF will not allow early optional stopping or optional continuing.) If that set of trials does not meet the 1:1000 requirement, he'll have to do more until that requirement is met.
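
The "optional continuing" problem can be put into numbers. The sketch below assumes a 50/50 guessing task with a first look after 40 trials and, if that look fails, a second look after 80 trials in total, each look held to a 1-in-1000 pass mark. These figures are illustrative only and do not come from any JREF protocol.

```python
from math import comb

def pmf(k, n, p):
    """P(X == k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def tail(k, n, p):
    """P(X >= k); an empty range (k > n) correctly gives 0."""
    return sum(pmf(i, n, p) for i in range(max(k, 0), n + 1))

def min_hits(n, p, alpha):
    """Smallest pass mark a guesser reaches with probability <= alpha."""
    return next(k for k in range(n + 1) if tail(k, n, p) <= alpha)

def lucky_pass_rates(n1, n2, p=0.5, alpha=1 / 1000):
    """Return (fixed-length rate, rate when a failed first look may continue)."""
    k1 = min_hits(n1, p, alpha)       # pass mark at the first look
    k2 = min_hits(n1 + n2, p, alpha)  # pass mark on the cumulative total
    first = tail(k1, n1, p)
    # Fail the first look with x hits, then make up the difference in n2 more trials.
    second = sum(pmf(x, n1, p) * tail(k2 - x, n2, p) for x in range(k1))
    return first, first + second

fixed, with_continuing = lucky_pass_rates(40, 40)
print(f"fixed length: {fixed:.6f}   with optional continuing: {with_continuing:.6f}")
```

Each look on its own stays under 1 in 1000, but giving the guesser two chances pushes the overall rate of a lucky pass above the single-look rate, which is why the number of trials has to be fixed in advance.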

I really hope that JREF begins to dialogue with Pavel and Startz via email in a robust and efficient manner. I'm really interested in seeing how this develops.

Rodney
31st August 2008, 07:17 AM
I just got finished reading Mr. Randi's Flim-Flam! -- he calls what you're calling "early stopping" "optional stopping". He says:

(Flim-Flam!, Randi, Prometheus Books, Buffalo New York, 1982, p 236)

Given this, I think it's safe to say that Pavel will have to specify a specific number of trials and he will have to complete them all, to the level of accuracy that he claims. (In other words, JREF will not allow early optional stopping or optional continuing.) If that set of trials does not meet the 1:1000 requirement, he'll have to do more until that requirement is met.
Two points: (1) Beating odds of 1:1000 by random chance is very unlikely, even if early stopping is allowed; (2) My proposal is not to allow early stopping, but to allow the preliminary test to continue if Pavel (or anyone else) initially performs significantly above chance, but short of the 1:1000 level.

I really hope that JREF begins to dialogue with Pavel and Startz via email in a robust and efficient manner. I'm really interested in seeing how this develops.
Yes, and it's hard to understand why the JREF is moving at such a snail's pace when Pavel says that he is ready to take the preliminary test.

EHocking
31st August 2008, 08:42 AM
Been there, done that. On May 10, 2008, I e-mailed the JREF as follows:

"I recently initiated the following thread on the Million Dollar
Challenge Forum --
http://forums.randi.org/showthread.php?p=3692318#post3692318

"What I argue on that thread is that: (a) In tests where the odds of
success can be readily calculated, it is unclear what odds standard
must be met; ...
Just trying to cut back to the OP of this thread, perhaps it would be useful to review those protocols that have been successfully negotiated to determine how the odds of success are determined. The application list is at the end of this post.

Of the 11 applications, the required success rates to pass the Challenge were all given by the applicants, and JREF accepted every single one of the applicants' success rates. One exception was that JREF suggested an 80% score instead of 100% (Carey).

So, for the successfully negotiated protocols so far, JREF has accepted the Applicants' claimed success rates or less for the Challenge.

The spread of odds is large.

Of the 11 Applicants 4 were "self evident", that is a phenomenon occurred or didn't (i.e. yes/no) and odds/statistics weren't discussed.
Of the remaining 7, 2 were to be tested at 1:100 odds, 2 at 1:10,000 odds, 1 at 1:1,000,000+ odds and 2 at greater than 1:3,000,000.

This clearly demonstrates to me that the odds for each Challenge is determined by the Applicant's claim - just as the Challenge Rules declare.

Setting an absolute statistical success rate is as important to the Challenge procedure as it is impractical.

Each claim determines its own success rate.
Just so long as it fits the Applicant's claimed success rate and JREF is satisfied that their claim is sufficiently better than that which would occur due to random chance (i.e. "paranormal"), taking the claimant's proposed success rate is obviously quite sufficient.

Carina Landin (http://forums.randi.org/showthread.php?t=32779)
Claim : to identify dead person's identity for a sitter
Success Rate : 80% by applicant
Protocol : guess gender of letter writer - 16/20.
Odds : <1:10,000 (http://www.automeasure.com/chance.html)

Ian Conger (http://forums.randi.org/showthread.php?t=43349) (not tested)
Claim : determine 5 letter words "sent" to Ouija board
Success Rate : 2 of 3 words by applicant
Odds: 1: 62,500,000,000 (by applicant)

Achau Nguyen (http://forums.randi.org/showthread.php?t=28936)
Claim : psychically send words to receiver
Success Rate : 19+/20 by applicant
Odds: better than 1:1,000,000 (http://www.automeasure.com/chance.html)

Hans Peter Borer (http://forums.randi.org/showthread.php?t=29864)
Claim : Dowse mobile phone from 10 boxes
Success Rate : implied 100% by applicant
Protocol : 13 trials by GWUP tester
Odds: 1:100 (at 6 from 15)

Angela Patel (http://forums.randi.org/showthread.php?t=34609)
Claim: dowse a person's address from A-Z
Success rate: 1 from 1 - 100% from the applicant
Agreed : 3 from 3 (JREF suggested 5).
Odds : none

Cameron Johnson (http://forums.randi.org/showthread.php?t=38741)
Claim: psychically transmit 10 cards to a receiver
Success rate: 10/10 applicant's claim (from previous test)
Odds: 1:3,600,000 (determined by applicant)
Tested by carolina skeptics (http://web.archive.org/web/20050311002011/http://www.wfu.edu/~ecarlson/tasc/investigations/cards.html)

Paul Carey (http://forums.randi.org/showthread.php?t=30676) (not tested)
Claim: Transmit phrases psychically.
Success rate: claimed 100%, JREF suggested 4/5
Odds: 1:100 (http://www.automeasure.com/chance.html)

Yellow Bamboo (http://forums.randi.org/showthread.php?t=34334) (not tested)
Claim: Knock down an attacker without touching them.
Success Rate: yes/no i.e. 100% by applicant
Odds: none

Jim Dunn (http://forums.randi.org/showthread.php?t=32528)
Claim: Prevent hospital deaths over 24+hr period
Success Rate : y/n i.e. 100% by applicant
Odds: none

Russell Shipp (http://forums.randi.org/showthread.php?t=30119) (not tested by JREF)
Claim : Telekinesis. Spin a suspended object.
Success Rate : yes/no i.e. 100% by applicant
Odds: none

James Blunt (http://forums.randi.org/showthread.php?t=28322) (not tested by JREF)
Claim : identify 5 materials in 5 bags by dowsing, 3 trials
Success Rate : 100% by applicant
Odds: in the region of 1:10,000? (http://www.automeasure.com/chance.html)

William Smith
31st August 2008, 08:59 AM
...
Yes, and it's hard to understand why the JREF is moving at such a snail's pace when Pavel says that he is ready to take the preliminary test.

Agreed. That is hard to understand, even with the usually limited resources available.


Nice work, EHocking.



Could someone calculate the odds for the Yellow Bamboo claim? I know, but anyway.

EHocking
31st August 2008, 09:28 AM
Could someone calculate the odds for the Yellow Bamboo claim? I know, but anyway.
Grrr :mad:, one of the points of my post was to show Rodney that in some applications odds are irrelevant.

The claimant either demonstrates a phenomenon or doesn't.
Odds in a claim of that nature are irrelevant.

EHocking
31st August 2008, 09:34 AM
...Yes, and it's hard to understand why the JREF is moving at such a snail's pace when Pavel says that he is ready to take the preliminary test.
One of the reasons, IMHO, is that Pavel is too readily led into complicating a protocol that should have been straightforward. Initially this was because he was not clear on what he could do, and then he got into detailed protocol discussions without a clear goal in mind.

As can be seen from the successful protocols, when the applicant clearly sets out what they can do the process really only bogs down on logistics.

I still suggest that Pavel goes back to his original claim (per my previous post) and simplifies the protocol. It is a simple one: guess one or two cards from a choice of three, conducted ten times, goal being 70%.

William Smith
31st August 2008, 09:35 AM
Grrr:mad:, one of the points of my post was to show Rodney that in some applications odds are irrelevant.

The claimant either demonstrates a phenomenon or doesn't.
Odds in a claim of that nature are irrelevant.

I know. Still, it seems an interesting challenge for a stat wiz to calculate the odds for e.g. Prophet Yahweh's claim. (http://forums.randi.org/showthread.php?t=40507)

Coveredinbeeees
31st August 2008, 10:32 AM
Just trying to cut back to the OP of this thread, perhaps it would be useful to review those protocols that have been successfully negotiated to determine how the odds of success are determined. The application list is at the end of this post.

Carina Landin (http://forums.randi.org/showthread.php?t=32779)
Claim : to identify dead person's identity for a sitter
Success Rate : 80% by applicant
Protocol : guess gender of letter writer - 16/20.
Odds : <1:10,000 (http://www.automeasure.com/chance.html)

Hans Peter Borer (http://forums.randi.org/showthread.php?t=29864)
Claim : Dowse mobile phone from 10 boxes
Success Rate : implied 100% by applicant
Protocol : 13 trials by GWUP tester
Odds: 1:100 (at 6 from 15)



You might want to recheck your calculations here. I had a quick look at 2 of these and found mistakes.

For Carina Landin, the odds of getting at least 16 out of 20 genders correct purely by chance would be 0.0059 or 1:170.

For Hans Peter Borer, his protocol was actually to find which of the 10 boxes held the phone at least 7 times in 13 tests. Odds of achieving this by chance alone are 0.0001 or 1:10,000.

Here's (http://www.stat.tamu.edu/~west/applets/binomialdemo.html) a handy tool if you don't want to make your own.
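
For anyone who would rather make their own, here is a minimal sketch of the same cumulative binomial calculation. The chance rates are the ones assumed in the two examples above (1 in 2 for guessing a gender, 1 in 10 for picking a box), and the function name is just a placeholder.

```python
from math import comb

def at_least(hits, trials, chance):
    """Probability of scoring hits or more out of trials by guessing alone."""
    return sum(
        comb(trials, i) * chance**i * (1 - chance) ** (trials - i)
        for i in range(hits, trials + 1)
    )

landin = at_least(16, 20, 1 / 2)   # 16 or more genders right out of 20
borer = at_least(7, 13, 1 / 10)    # right box 7 or more times out of 13

print(f"Landin: {landin:.4f}, about 1 in {1 / landin:.0f}")
print(f"Borer:  {borer:.4f}, about 1 in {1 / borer:.0f}")
```

The sum runs over every score from the pass mark up to a perfect result, since "at least 16" includes 17, 18, 19 and 20 as well.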

It was interesting to see all of the tests laid out this way. I read through the applications section when I first came across these forums and it was fun to be reminded of some of the highlights, or lowlights, of the challenge.

Cheers,

'Beeees

EHocking
31st August 2008, 11:01 AM
You might want to recheck your calculations here. I Had a quick look at 2 of these and found mistakes.
I had made a mistake, I seem to have mixed the two examples when typing the list out. Where possible, I used this set of tables (http://www.automeasure.com/chance.html) as a reference, to get a feel for the odds.

For Carina Landin, the odds of getting at least 16 out of 20 genders correct purely by chance would be 0.0059 or 1:170.
Indeed, I was surprised that, from the above tables, 16/20 with a 1 from 2 choice is to be expected to occur purely by random chance at odds of 1:100.

The point was that JREF were satisfied that 16/20 (the 80% of the claimant) was sufficient for this applicant to pass the Preliminary Challenge.

For Hans Peter Borer, his protocol was actually to find which of the 10 boxes held the phone at least 7 times in 13 tests. Odds of achieving this by chance alone are 0.0001 or 1:10,000

Here's (http://www.stat.tamu.edu/~west/applets/binomialdemo.html) a handy tool if you don't want to make your own.
See previous explanation. The tables I referenced support that range of odds.

It was interesting to see all of the tests laid out this way. I read through the applications section when I first came across these forums and it was fun to be reminded of some of the highlights, or lowlights, of the challenge.

Cheers,

'Beeees
I compiled a spreadsheet of the entire list (about 6 months out of date now) to determine just how many applicants get to a final protocol and the reasons for not getting there. It's actually difficult to analyse statistically because of the number of applications that aren't or don't need to be determined by odds compared to random chance. A number of them are "it either happens or it doesn't".

Point I was trying to get across to the OP is that his fixation on JREF stating a standard "pass" rate is quite irrelevant and is a strawman constructed by him to discredit the Challenge.

I will admit, though, that I had always thought that the Preliminary stage's "pass" threshold was 1:10,000 and the Final was 1:1,000,000. It may well have been when the Challenge was only for a $10,000 prize, but I've not been able to find evidence for that, and quickly reviewing the Challenge Application section will show that it's not applicable in quite a number of applications to the Challenge.

Coveredinbeeees
31st August 2008, 12:03 PM
I had made a mistake, I seem to have mixed the two examples when typing the list out.

Ah, the dreaded transcription error. I've fallen afoul of it on countless occasions myself.

The point was that JREF were satisfied that 16/20 (the 80% of the claimant) was sufficient for this applicant to pass the Preliminary Challenge.

Yes, finding enough letters in time to put together a more rigorous test would have been tricky, I imagine.

I compiled a spreadsheet of the entire list (about 6 months out of date now) to determine just how many applicants get to a final protocol and the reasons for not getting there. It's actually difficult to analyse statistically because of the number of applications that aren't or don't need to be determined by odds compared to random chance. A number of them are "it either happens or it doesn't".

Did anything come of your research? It sounds interesting.

Point I was trying to get across to the OP is that his fixation on JREF stating a standard "pass" rate is quite irrelevant and is a strawman constructed by him to discredit the Challenge.

It does seem that the challenge needs to be considered on a purely case by case basis. If only the applicants' abilities were so self evident as to make knowing the odds of passing by chance superfluous.

Rodney
31st August 2008, 04:15 PM
The point was that JREF were satisfied that 16/20 (the 80% of the claimant) was sufficient for this applicant to pass the Preliminary Challenge.
But why only 20 trials for her and a proposed 40 for Pavel? Even if the JREF insists that Pavel has to achieve at least 75% hits (when he is claiming only a 70% hit rate), 24 hits in 32 trials would defy odds of 1:250, whereas Carina Landin could have passed by defying odds of only 1:170.
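
For reference, the figures being compared can be checked with a few lines. This sketch assumes every trial is a 50/50 guess, which is an assumption for Pavel's case since his protocol is not settled, and the labels are only shorthand for the scenarios mentioned in this thread.

```python
from math import comb

def one_in(hits, trials, chance=0.5):
    """Return N such that guessing reaches hits/trials or better about 1 time in N."""
    p = sum(
        comb(trials, i) * chance**i * (1 - chance) ** (trials - i)
        for i in range(hits, trials + 1)
    )
    return 1 / p

scenarios = {
    "Landin, 16 of 20": (16, 20),
    "Pavel, 24 of 32": (24, 32),
    "Pavel, 30 of 40": (30, 40),
}
for label, (hits, trials) in scenarios.items():
    print(f"{label}: about 1 in {one_in(hits, trials):.0f}")
```

Under that assumption the three come out at roughly 1:169, 1:286 and 1:900, so the 1:250 figure for 24 of 32 appears to be a rounded estimate.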

EHocking
1st September 2008, 05:50 AM
But why only 20 trials for her and a proposed 40 for Pavel? Even if the JREF insists that Pavel has to achieve at least 75% hits (when he is claiming only a 70% hit rate), 24 hits in 32 trials would defy odds of 1:250, whereas Carina Landin could have passed by defying odds of only 1:170.
Because their claims are quite different from each other.

Because their claimed success rates are quite different from each other.

Because each of the protocols is individualised to cover both of the above - so will be quite different from each other.

I stated that quite clearly in my previous post, obviously I need to repeat myself, so:

"Each claim determines its own success rate.
Just so long as it fits the Applicant's claimed success rate and JREF is satisfied that their claim is sufficiently better than that which would occur due to random chance (i.e. "paranormal"), taking the claimant's proposed success rate is obviously quite sufficient."

EHocking
1st September 2008, 05:55 AM
I compiled a spreadsheet of the entire list (about 6 months out of date now) to determine just how many applicants get to a final protocol and the reasons for not getting there. It's actually difficult to analyse statistically because of the number of applications that aren't or don't need to be determined by odds compared to random chance. A number of them are "it either happens or it doesn't".
Did anything come of your research? It sounds interesting.
Nothing much beyond the fairly basic. 143 applications, 11 agreed protocols, 8 actual Prelim tests, 8 failed Prelim tests.

The difficulty in building any decent stats from the population is that the population is so varied. I had nearly 30 categories describing different paranormal "abilities".

I really should go back and review and refine the study now that I'm more familiar with all the claims. It probably won't be this month though, I've got some web pages for another forum thread (see sig) that I've got behind in maintaining in the last few weeks.

Rodney
1st September 2008, 07:36 AM
Because their claims are quite different from each other.

Because their claimed success rates are quite different from each other.

Because each of the protocols is individualised to cover both of the above - so will be quite different from each other.

I stated that quite clearly in my previous post, obviously I need to repeat myself, so:

"Each claim determines its own success rate.
Just so long as it fits the Applicant's claimed success rate and JREF is satisfied that their claim is sufficiently better than that which would occur due to random chance (i.e. "paranormal"), taking the claimant's proposed success rate is obviously quite sufficient."
I appreciate your research, but I'm still puzzled as to why Carina Landin was held to an odds standard of only 1:170 in the preliminary test while the current proposal being floated for Pavel would hold him to an odds standard of 1:900. Bear in mind that Landin was claiming a higher success rate than Pavel (80% vs. 70%) so, if anything, it could be argued that she should have been held to a higher odds standard, not a lower one.

steenkh
1st September 2008, 08:04 AM
Carina Landin's claim also involved digging out a number of hard-to-find diaries, which seems to have caused the JREF to suggest a compromise.

Gr8wight
1st September 2008, 08:54 AM
I appreciate your research, but I'm still puzzled as to why Carina Landin was held to an odds standard of only 1:170 in the preliminary test while the current proposal being floated for Pavel would hold him to an odds standard of 1:900. Bear in mind that Landin was claiming a higher success rate than Pavel (80% vs. 70%) so, if anything, it could be argued that she should have been held to a higher odds standard, not a lower one.

Thinking error. The odds standard is not necessarily related to the applicant's claimed success rate. The odds standard the JREF requires is more related to the protocol design, and in the case of Ms. Landin, as was pointed out, can be relaxed in some instances in order to facilitate a test when one might not otherwise occur. This is not one of those instances because a protocol to test Pavel does not run up against any logistical challenges, as the one for Carina did.

Rodney
1st September 2008, 10:11 AM
Thinking error. The odds standard is not necessarily related to the applicant's claimed success rate. The odds standard the JREF requires is more related to the protocol design, and in the case of Ms. Landin, as was pointed out, can be relaxed in some instances in order to facilitate a test when one might not otherwise occur.
So if it's even more difficult to facilitate a test for an applicant than it was for Ms. Landin, the odds standard would be even lower?

Thabiguy
1st September 2008, 12:04 PM
So if it's even more difficult to facilitate a test for an applicant than it was for Ms. Landin, the odds standard would be even lower?

There is no odds standard.

If you mean the odds of false positive, sure, JREF might be willing to accept higher odds of false positive for the preliminaries, depending on the particular case. Randi even announced that JREF may decide to skip the preliminary test altogether in certain cases, i.e. reduce the required odds to 1 in 1.

Why do you keep asking questions that you know perfectly well the answers to?

Czarcasm
1st September 2008, 12:28 PM
Because he wishes to muddy the waters for anyone who is late in joining this conversation?

Rodney
1st September 2008, 01:14 PM
There is no odds standard.
I agree that there is none in the official JREF MDC rules, but for several applicants, odds standards have been used in the preliminary tests.

If you mean the odds of false positive, sure, JREF might be willing to accept higher odds of false positive for the preliminaries, depending on the particular case.
Except that there appears to be no rhyme or reason for the higher or lower odds that each applicant must meet.

Randi even announced that JREF may decide to skip the preliminary test altogether in certain cases, i.e. reduce the required odds to 1 in 1.
When did he announce that? According to Rule 6 of the MDC: "In all cases, applicant will be required to perform a preliminary test either before an appointed representative, if distance and time dictate that need, or in a location where a member or representative of the JREF staff can attend."

Why do you keep asking questions that you know perfectly well the answers to?
How can I know the answers when the MDC rules are as muddled as they are?

Czarcasm
1st September 2008, 01:29 PM
Just because the rules aren't as lax as you wish them to be doesn't make them muddled.

EHocking
1st September 2008, 02:57 PM
I agree that there is none in the official JREF MDC rules, but for several applicants, odds standards have been used in the preliminary tests.
All of that is covered by a single sentence in Rule 3.
3. We will consult competent statisticians when an evaluation of the experimental design, is required.
Except that there appears to be no rhyme or reason for the higher or lower odds that each applicant must meet.
All of that is covered by a single sentence in Rule 3.
3. We will consult competent statisticians when an evaluation of the experimental design, is required.

Also note the very first point in the application itself (my bolding),
1. This is the primary and most important of these rules: Applicant must state clearly in advance, and
applicant and JREF will agree upon, what powers and/or abilities will be demonstrated, the limits of
the proposed demonstration (so far as time, location and other variables are concerned) and what will
constitute both a positive and a negative result.

So. Again.
The applicant declares what constitutes a success or failure.
If the declared results require, JREF will consult statisticians to check whether such a result falls under the realm of random chance and if so require the applicant to perform to a level that does NOT fall within the bounds of random chance.

Finally and most importantly.

All applicants that have agreed to a protocol and/or been tested for the Preliminary Challenge have been tested at or below their declared "pass rate".

Your argument on lack of odds standards is irrelevant in light of that last fact. And fact it is.

Pavel's application is made difficult because he was unable to declare what his potential success rate was because his "skill", by his own admission, is hit and miss.

Landin declared emphatically that she could perform at 80%.
How can I know the answers when the MDC rules are as muddled as they are?
They are pretty clear.
As a friend of mine is wont to say, "I can 'splain it for you, but I can't understand it for you".

drkitten
1st September 2008, 03:47 PM
So if it's even more difficult to facilitate a test for an applicant than it was for Ms. Landin, the odds standard would be even lower?

Or alternatively, the test would not happen at all because JREF would judge the test as being too difficult to run.

Rodney
1st September 2008, 03:56 PM
All of that is covered by a single sentence in Rule 3.
3. We will consult competent statisticians when an evaluation of the experimental design, is required.
That's an evasion -- there should be a clear standard applicable to all odds-based applications. Up to this point, every such application appears to have been handled on an ad hoc "make it up as you go along" basis.

Pavel's application is made difficult because he was unable to declare what his potential success rate was because his "skill", by his own admission, is hit and miss.
Try reading (Pavel's) post #281 on this thread:

"When I really tried to find out the odds and things, by emailing to JREF I was refused any answers, reasoning that I need to apply firs and after that we start any negotiation.. even when I have asked .. if I claim sirtain % of a minimum rate (that JREF ask to state in application and description of the claim) even that was ignored." Pavel then inquired: "Will it be fine for the claim as the accuracy to state 70% minimum from my side?"

Rodney
1st September 2008, 04:03 PM
Or alternatively, the test would not happen at all because JREF would judge the test as being too difficult to run.
So what do you suppose is the lowest standard that the JREF would accept in an odds-based preliminary test? In Carina Landin's case, it was 1:170. Would they go as low as 1:100? 1:50? 1:10? Or what???

Paul2
1st September 2008, 04:20 PM
May I speak in favor of Rodney? I don't endorse every word he's posted, but I can see the issue he's raising. Even though it's JREF's challenge, they set the rules, and they must agree on the protocol (as must the applicant), I don't see why an odds-based test should have different standards for different applicants, as a *substantive* matter. That JREF *can* put forward different odds for different applicants is not the issue.

It seems to be a matter of substance that a different odds standard (1/500, 1/1000, to ensure results beyond chance) should not be applied to different applicants. At minimum, and in the case of applicants claiming different success rates, a minimum standard would then apply, as EHocking implies in post 302, "Each claim determines its own success rate. Just so long as it fits the Applicant's claimed success rate and JREF is satisfied that their claim is sufficiently better than that which would occur due to random chance. . . ."

That minimum standard should be an intellectual, substantive issue, related to what is necessary to ensure results beyond chance, not related to the pragmatics of protocol negotiations.

Thabiguy
1st September 2008, 04:35 PM
When did he announce that?
In January 2007. See here (http://www.randi.org/jr/2007-01/011207challenge.html), look for the word "waive".

Rodney
1st September 2008, 05:21 PM
In January 2007. See here (http://www.randi.org/jr/2007-01/011207challenge.html), look for the word "waive".
Thanks. However, I note that Randi stated: "We may be prepared to possibly waive the requirement for a preliminary test as soon as these two qualifications have been validated. In such a case we will be prepared to move right into the second phase: the formal test." Certainly, if the JREF is seriously considering a waiver, they should say so, instead of continuing to state in Rule 6 of the MDC: "In all cases, applicant will be required to perform a preliminary test . . ."

drkitten
1st September 2008, 07:23 PM
That's an evasion -- there should be a clear standard applicable to all odds-based applications.

You keep saying this. You are also the ONLY person who keeps saying this, which suggests to me that, no, there is no necessity for such a standard.

Tell you what. When you get a million dollars, you can make whatever rules you like. Until then, STFU.

drkitten
1st September 2008, 07:25 PM
So what do you suppose is the lowest standard that the JREF would accept in an odds-based preliminary test? In Carina Landin's case, it was 1:170. Would they go as low as 1:100? 1:50? 1:10? Or what???

Demonstrably, 1 in 1. They've already offered to waive the preliminary test for some applications in an effort to get them to be tested at all.

steenkh
2nd September 2008, 02:43 AM
That minimum standard should be an intellectual, substantive issue, related to what is necessary to ensure results beyond chance, not related to the pragmatics of protocol negotiations.
Why? The MDC, while using scientific methods, is not a scientific study. It is simply a challenge for people to do what they claim they can do. The JREF is betting a million dollars that this will not happen, and it is only reasonable to have low odds for a false positive.

Now it also seems a problem for some that the JREF sometimes accepts higher odds for false positives. Why?

It seems that rather than concentrate on keeping the odds for a false negative low, people are trying to arrange for increasingly higher odds for false positives. Is this a sign of desperation?

The million is for the JREF to give, and the rules are clear: just perform as claimed. I have little respect for people who claim they can do something, and then it turns out that they can actually not do it with any certainty, and I do not see why they should be catered for with extra tests and more chances for success.

Cuddles
2nd September 2008, 03:25 AM
Firstly, what Steenkh said. This has been said many times already, but it bears repeating as often as is necessary. The probability of winning by chance alone is irrelevant to anyone who genuinely believes they have a paranormal ability. If they can actually do what they claim, chance doesn't come into it. The only time chance is relevant is if they can't do what they claim, in which case the JREF wants there to be as small a chance of a win as is practical. As has been pointed out, what is practical varies from test to test. Carina Landin was allowed a very high chance of winning by chance because her claim relied on fairly rare items, so much so that there were problems with the test and the JREF will allow her to retake it. On the other hand, Pavel's test uses easily obtained envelopes and photos, and it is therefore very easy to greatly reduce the probability of winning by chance.

I think one problem here is that there seems to be some confusion about the different probabilities involved. There are at least three very different probabilities:
1) The success rate of the applicant's ability;
2) The chance of a false positive;
3) The chance of a false negative.

People seem to regularly conflate 1) with both 2) and 3). The 70% (or whatever) rate given by Pavel is not the chance of him winning. It is simply the chance, given a single envelope, that he will identify what is inside. The 1:1000 odds (or whatever the actual value is for a test) is the chance of a false positive. The test can be arranged to have a 1:1000 chance no matter what the rate claimed in 1) actually is. 70%, 50%, 1%, whatever. Of course, the lower the success rate, the bigger and longer a test needs to be, so there reaches a point where it is no longer practical to test, even if you accept a higher chance of a false positive.

Obviously the JREF is most concerned about 2), since they don't want to give their money to someone who only won by chance. 1) is the applicant's claim, and as long as it is testable, the JREF really doesn't care what they claim about it. 3) is important for the applicant and should be what they focus their attention on, but from the applications I've read, it seems only the JREF ever thinks about this, hence suggesting that an 80% success rate would win when 100% was claimed.

The point is, these are three very different things. 1) is completely independent of 2) and 3), and it is in the interest of both the JREF and any honest applicant to have both 2) and 3) as low as possible. If you are truly interested in demonstrating an ability you believe you have, you should want the chance of winning without an ability to be as low as possible, as long as that doesn't make the chance of losing despite the ability too high. Exactly what "as low as possible" and "too high" are will vary depending on circumstances, but general arguments that the JREF is being unfair by trying to reduce both of them, as Rodney argues, just make no sense at all from either the JREF's or the applicants' point of view.
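
The three probabilities can be made concrete with a minimal sketch. It assumes each trial is an independent hit-or-miss guess (a binomial model) and that the applicant passes by scoring at least k hits out of n trials. The function names and the example numbers are illustrative assumptions, not anything taken from a JREF protocol.

```python
from math import comb


def binom_tail(n, k, p):
    """P(X >= k) when X counts hits in n independent trials with hit rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def error_rates(n, k, p_chance, p_claimed):
    """Return (false_positive, false_negative) for a 'k or more hits out of n' test.

    false_positive: chance of passing with no ability (hit rate p_chance).
    false_negative: chance of failing despite performing at the claimed rate.
    """
    false_positive = binom_tail(n, k, p_chance)
    false_negative = 1.0 - binom_tail(n, k, p_claimed)
    return false_positive, false_negative


if __name__ == "__main__":
    # Hypothetical design: 20 envelopes, pass mark 16, chance rate 50%, claimed rate 70%.
    fp, fn = error_rates(n=20, k=16, p_chance=0.5, p_claimed=0.7)
    print(f"false positive: {fp:.5f}  false negative: {fn:.5f}")
```

The claimed rate in 1) only enters the false-negative calculation. The false-positive figure in 2) depends on the chance rate, the number of trials and the pass mark, which is why it can be set separately from whatever the applicant claims.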

drkitten
2nd September 2008, 07:00 AM
Even though it's JREF's challenge, they set the rules, and they must agree on the protocol (as must the applicant), I don't see why an odds-based test should have different standards for different applicants, as a *substantive* matter.

That minimum standard should be an intellectual, substantive issue, related to what is necessary to ensure results beyond chance, not related to the pragmatics of protocol negotiations.

Don't do much bench science, do you? I believe it was Sir Peter Medawar (a Nobel laureate) who pointed out that "if politics is 'the art of the possible,' science is 'the art of the soluble.'" If you know beforehand that practical considerations prevent you from running an experiment, there's not much point in wasting time and energy on optimizing the theoretical design.

Scarcity of data, for example, is a very real problem for any experiment; if there are only 100 people in the world with a particular disease, it will be impossible to conduct large-scale clinical trials of your shiny new drug. If you need to run an experiment on the ground at an elevation of more than 7000 meters, there are fewer than 500 places world-wide that you can do it. If you need to run an experiment under 15,000 meters of water, you can't do it at all.

If you know that you can only get 200 points of data, you design the test to squeeze as much out of those 200 points as you can --- and if you need to accept slightly larger error bars, that's a substantive limitation.
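
As a rough sketch of that trade-off, assume again that each data point is an independent hit-or-miss trial. With the number of trials fixed, the only free choice is the pass mark, and the code below picks the strictest pass mark that still keeps the false-negative rate under a chosen limit. The function names, the 10% limit and the 50%/70% rates are assumptions made for illustration only.

```python
from math import comb


def pmf(n, p):
    """Probability of exactly i hits, for i = 0..n, in n trials with hit rate p."""
    return [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]


def strictest_threshold(n, p_chance, p_claimed, max_fn):
    """Largest pass mark k whose false-negative rate stays at or below max_fn.

    Returns (k, false_positive, false_negative) for a 'k or more hits' test.
    """
    chance, claimed = pmf(n, p_chance), pmf(n, p_claimed)
    for k in range(n, -1, -1):
        false_negative = sum(claimed[:k])  # P(fewer than k hits | claimed rate)
        if false_negative <= max_fn:
            false_positive = sum(chance[k:])  # P(k or more hits | chance alone)
            return k, false_positive, false_negative
    raise ValueError("max_fn must be non-negative")


if __name__ == "__main__":
    # Same hypothetical claim, scarce data versus plentiful data.
    for n in (20, 200):
        k, fp, fn = strictest_threshold(n, p_chance=0.5, p_claimed=0.7, max_fn=0.10)
        print(f"n={n}: pass mark {k}, false positive {fp:.2e}, false negative {fn:.3f}")
```

Running it for a small and a large n shows how the achievable false-positive rate is driven by the amount of data available, which is the practical limit described above.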

Paul2
2nd September 2008, 07:10 AM
Why? The MDC, while using scientific methods, is not a scientific study.
I didn't say it was scientific.

It is simply a challenge for people to do what they claim they can do. The JREF is betting a million dollars that this will not happen, and it is only reasonable to have low odds for a false positive.
I didn't say low odds were a problem. I'm only arguing for the *same* minimum odds for any applicant.

It seems that rather than concentrate on keeping the odds for a false negative low, people are trying to arrange for increasingly higher odds for false positives. Is this a sign of desperation?
That's not me, that's not what I said.

The million is for the JREF to give, and the rules are clear: just perform as claimed. I have little respect for people who claim they can do something, and then it turns out that they can actually not do it with any certainty, and I do not see why they should be catered for with extra tests and more chances for success.
I agree, I'm not arguing for extra tests or more chances for success.

EHocking
2nd September 2008, 07:23 AM
That's an evasion -- there should be a clear standard applicable to all odds-based applications.
I noticed that you ignored the rest of my post on this issue. Perhaps because it explains the situation?

Let me repeat it.

Rule 1.
Applicant must state clearly in advance, and applicant and JREF will agree upon, what powers and/or abilities will be demonstrated, the limits of the proposed demonstration (so far as time, location and other variables are concerned) and what will constitute both a positive and a negative result.

Rule 3 qualifies the above with.
We will consult competent statisticians when an evaluation of the experimental design, is required

The rules are not meant to be taken in isolation (quote mining) as you are doing, they are to be taken in their entirety as they explain the process and JREF's stance on protocols and applications.
Up to this point, every such application appears to have been handled on an ad hoc "make it up as you go along" basis.
The Rules are not meant to be taken in isolation but referred to as a whole explanation of the Challenge.

You are quote mining here. Read ALL the rules in proper context.

Try reading (Pavel's) post #281 on this thread:

"When I really tried to find out the odds and things, by emailing to JREF I was refused any answers, reasoning that I need to apply firs and after that we start any negotiation.. even when I have asked .. if I claim sirtain % of a minimum rate (that JREF ask to state in application and description of the claim) even that was ignored." Pavel then inquired: "Will it be fine for the claim as the accuracy to state 70% minimum from my side?"
Fie on your post #281, I have already quoted his posts from the start of this thread before he applied for the Challenge.
From my post (#269 (http://forums.randi.org/showpost.php?p=3980924&postcount=269)) which prompted Pavel's response in #281:

He stated this in a number of posts prior to applying to the Challenge.

Quote: (http://forums.randi.org/showpost.php?p=3980924&postcount=269)
Post 51 from his first thread (http://forums.randi.org/showpost.php?p=2859837&postcount=51).
WHAT would be self evident prove? how many times i have to perform it.. let say its an 1 hour test session..
so ill know what is minimum has to be performed..

Quote:
Post 241 after much discussion (http://forums.randi.org/showpost.php?p=3738821&postcount=241)
As to the accuracy with which I can perform it, I think I have no choice but to claim that I can perform the results that beat 1 to 1000 odds as that what JREF want to be beaten. If I am not mistaken, or I have a choice? Like to claim I can beat 1 to 300 or even 400 odds? 200-300 odds that is way more than just by chance but that will not be considered as a success isn’t it? Or I misunderstood something from requirements?

It's quite clear that he came to the Forum to determine the level at which he needed to perform his skill to pass the Challenge.

Paul2
2nd September 2008, 07:31 AM
The probability of winning by chance alone is irrelevant to anyone who genuinely believes they have a paranormal ability. If they can actually do what they claim, chance doesn't come into it. The only time chance is relevant is if they can't do what they claim. . . .
I can claim that I can hit a bullseye with a bow and arrow, but I can't do it 100% of the time. And, Pavel doesn't claim he can see inside envelopes with 100% accuracy. There are many activities that we commonly think people can do but they can't do 100% of the time.

. . . in which case the JREF wants there to be as small a chance of a win as is practical.
Huh? Can you explain this further? It sounds like something I don't think you mean.

As has been pointed out, what is practical varies from test to test. Carina Landin was allowed a very high chance of winning by chance because her claim relied on fairly rare items, so much so that there were problems with the test and the JREF will allow her to retake it. On the other hand, Pavel's test uses easily obtained envelopes and photos, and it is therefore very easy to greatly reduce the probability of winning by chance.
But surely there must be some minimum odds *that are independent of the particulars of a test* that, by themselves, argue that the results were or were not due to chance.

Cuddles
2nd September 2008, 07:49 AM
I can claim that I can hit a bullseye with a bow and arrow, but I can't do it 100% of the time. And, Pavel doesn't claim he can see inside envelopes with 100% accuracy. There are many activities that we commonly think people can do but they can't do 100% of the time.

Did you even bother reading my post? You've done exactly what I said people were doing and conflated points 1) and 3). As I said, that is utterly irrelevant. The whole point of my post was that the claimed success rate of an ability is independent of the probability of a false positive or negative. A test can be designed to give any false positive chance you like, no matter what success rate is claimed. Of course, whether such a test is practical or not depends on the specific claim.
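
To illustrate that last point, here is a minimal sketch that searches for the shortest test meeting a chosen false-positive limit and a chosen false-negative limit. It assumes independent hit-or-miss trials and a "k or more hits out of n" pass rule; `smallest_test`, its parameters and the example limits are hypothetical names and numbers, not part of any agreed protocol.

```python
from math import comb


def smallest_test(p_chance, p_claimed, max_fp, max_fn, n_limit=500):
    """Find the fewest trials n, and pass mark k, meeting both error limits.

    Returns (n, k), or None if no design with n <= n_limit works.
    n_limit is kept modest so the float arithmetic below cannot overflow.
    """
    for n in range(1, n_limit + 1):
        chance = [comb(n, i) * p_chance**i * (1 - p_chance) ** (n - i) for i in range(n + 1)]
        claimed = [comb(n, i) * p_claimed**i * (1 - p_claimed) ** (n - i) for i in range(n + 1)]
        tail_chance = 0.0   # P(k or more hits) by chance alone
        tail_claimed = 0.0  # P(k or more hits) at the claimed rate
        best_k = None
        # Lower the pass mark until the false-positive limit would be exceeded.
        for k in range(n, -1, -1):
            if tail_chance + chance[k] > max_fp:
                break
            tail_chance += chance[k]
            tail_claimed += claimed[k]
            best_k = k
        # The lowest allowed pass mark gives the applicant the best odds of passing.
        if best_k is not None and 1.0 - tail_claimed <= max_fn:
            return n, best_k
    return None


if __name__ == "__main__":
    # Same 1-in-1000 false-positive limit, different claimed success rates.
    for claimed_rate in (0.9, 0.8, 0.7):
        print(claimed_rate, smallest_test(0.5, claimed_rate, max_fp=0.001, max_fn=0.10))
```

Under these assumptions the false-positive limit can be held fixed while the claimed rate varies; what changes is how many trials are needed, which is where practicality comes in.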

Huh? Can you explain this further? It sounds like something I don't think you mean.

It means exactly what it says. If an applicant doesn't have an ability, the JREF doesn't want them to win. What part of that do you find hard to understand?

But surely there must be some minimum odds *that are independent of the particulars of a test* that, by themselves, argue that the results were or were not due to chance.

Why? I see people saying that there should be the same chance of false positive for all tests, but I don't see anyone giving a reason. From the applicant's point of view, the chance of false positive is irrelevant. From the JREF's point of view, the chance of false positive should be as small as possible, but since it's the JREF's money, they are free to accept whatever odds they like.

Paul2
2nd September 2008, 09:12 AM
Did you even bother reading my post? You've done exactly what I said people were doing and conflated points 1) and 3).
I see your point.
Why? I see people saying that there should be the same chance of false positive for all tests, but I don't see anyone giving a reason. From the applicant's point of view, the chance of false positive is irrelevant. From the JREF's point of view, the chance of false positive should be as small as possible, but since it's the JREF's money, they are free to accept whatever odds they like.

I think the issue crops up with the inevitable interpretation of someone passing the challenge. Admittedly, interpreting a successful challenge is beyond the terms of the challenge. But it sure would be inevitable.

Passing the challenge would not prove the existence of the paranormal, but it would be one piece of evidence, good or bad. In order to consider how good that piece of evidence would be for arguing that the paranormal existed, one question would be the odds of a false positive. So should there be a different standard for one test versus another *in these terms, in this context?*

So arguing for some inherent standard for a false positive versus the pragmatic, variable approach is the difference between looking at the challenge narrowly (just one protocol between JREF and an applicant, on such and such terms) versus the larger implications of the challenge, especially if someone ever passed it.

I would accept the argument that the narrow, pragmatic approach is perfectly fine, and let interpretation take care of itself. But I think that's where the inherent standard idea comes from.

drkitten
2nd September 2008, 11:07 AM
Passing the challenge would not prove the existence of the paranormal, but it would be one piece of evidence, good or bad. In order to consider how good that piece of evidence would be for arguing that the paranormal existed, one question would be the odds of a false positive. So should there be a different standard for one test versus another *in these terms, in this context?*

Absolutely, for several reasons.

First, it is in the JREF's best interest to have the test be as conclusive as practical. That's not the same thing, at all, as "as conclusive as possible." If the costs or effort involved in making an absolutely ironclad test are unreasonable, then they will (justifiably) settle for a merely zinc-clad or copper-clad test, on the grounds that they can't get ironclad. And they're certainly not going to go for unobtainium-clad when unobtainium is, by definition, not available.

In the case of the Landin test, for example, the number of diaries available limited the strength of the test that could be performed. I'm not sure that the Landin test would even qualify as "duct-tape-clad," but it was the best that could be arranged, and both Landin and the JREF were willing to accept that limitation. When we're dealing with things as omnipresent as photographs and envelopes, it's hard to argue about material limitation.

Second, the JREF is not a scientific institution, but an educational one -- and if there is educational value in running a more lightly controlled test (as there would be, for example, if one of the "name" psychics were to attempt the challenge), then the JREF can, quite justifiably, tell the scientific purists to go hang. Especially since the difference between 1:200 and 1:1000 is not likely to make a difference in any one particular instance. But again, I don't see any particular educational value in testing Pavel per se; he hasn't got the huge following of Sylvia or Uri, and showing him to be misled would not have nearly as much impact.

Paul2
2nd September 2008, 11:59 AM
Absolutely, for several reasons.

First, it is in the JREF's best interest to have the test be as conclusive as practical.
I think I get it now. Sorry, Rodney, I tried to see your side, but it just ain't gonna work.

William Smith
2nd September 2008, 12:07 PM
Absolutely, for several reasons.

First, it is in the JREF's best interest to have the test be as conclusive as practical. That's not the same thing, at all, as "as conclusive as possible." If the costs or effort involved in making an absolutely ironclad test are unreasonable, then they will (justifiably) settle for a merely zinc-clad or copper-clad test, on the grounds that they can't get ironclad. And they're certainly not going to go for unobtainium-clad when unobtainium is, by definition, not available.

In the case of the Landin test, for example, the number of diaries available limited the strength of the test that could be performed. I'm not sure that the Landin test would even qualify as "duct-tape-clad," but it was the best that could be arranged, and both Landin and the JREF were willing to accept that limitation. When we're dealing with things as omnipresent as photographs and envelopes, it's hard to argue about material limitation.

Second, the JREF is not a scientific institution, but an educational one -- and if there is educational value in running a more lightly controlled test (as there would be, for example, if one of the "name" psychics were to attempt the challenge), then the JREF can, quite justifiably, tell the scientific purists to go hang. Especially since the difference between 1:200 and 1:1000 is not likely to make a difference in any one particular instance. But again, I don't see any particular educational value in testing Pavel per se; he hasn't got the huge following of Sylvia or Uri, and showing him to be misled would not have nearly as much impact.

Well said.

Rodney
2nd September 2008, 03:44 PM
I think I get it now. Sorry, Rodney, I tried to see your side, but it just ain't gonna work.
You were right the first time. ;) But let's hope that the JREF can finally muster the energy to test Pavel, even if it can't explain its odds standard.

Moochie
2nd September 2008, 05:27 PM
Did you even bother reading my post? You've done exactly what I said people were doing and conflated points 1) and 3). As I said, that is utterly irrelevant. The whole point of my post was that the claimed success rate of an ability is independent of the probability of a false positive or negative. A test can be designed to give any false positive chance you like, no matter what success rate is claimed. Of course, whether such a test is practical or not depends on the specific claim.



It means exactly what it says. If an applicant doesn't have an ability, the JREF doesn't want them to win. What part of that do you find hard to understand?



Why? I see people saying that there should be the same chance of false positive for all tests, but I don't see anyone giving a reason. From the applicant's point of view, the chance of false positive is irrelevant. From the JREF's point of view, the chance of false positive should be as small as possible, but since it's the JREF's money, they are free to accept whatever odds they like.

I get really annoyed by the specious arguments put forward to re-jig the stats. It's perfectly obvious to me that what we are after is an actual demonstration of a "paranormal" ability, one that we can confidently say did not happen by chance.

If I were the JREF, I wouldn't even countenance testing the kind of "ability" being claimed here, any more than I would test the ability of a master gambler. It's looking more and more like there is no actual "paranormal ability" on offer here, but only a calculated risk of successful guesswork.


M.

William Smith
2nd September 2008, 09:22 PM
I get really annoyed by the specious arguments put forward to re-jig the stats. It's perfectly obvious to me that what we are after is an actual demonstration of a "paranormal" ability, one that we can confidently say did not happen by chance.

If I were the JREF, I wouldn't even countenance testing the kind of "ability" being claimed here, any more than I would test the ability of a master gambler. It's looking more and more like there is no actual "paranormal ability" on offer here, but only a calculated risk of successful guesswork.


M.

...which would of course perfectly illustrate one reason why the MDC was discontinued: It has served its purpose. By attracting people making claims like the one Pavel makes.

catbasket
3rd September 2008, 02:18 AM
You were right the first time. ;) But let's hope that the JREF can finally muster the energy to test Pavel, even if it can't explain its odds standard.

Strange, could have sworn I recently read a thread which showed there is no odds standard. If only I could recall where that was ...

EHocking
3rd September 2008, 05:02 AM
Strange, could have sworn I recently read a thread which showed there is no odds standard. If only I could recall where that was ...
er... I think it was the very first response to Rodney's OP in this thread?

Post 2 (http://forums.randi.org/showpost.php?p=3689771&postcount=2) Quote: GzuzKryzt
Experience suggests that in most cases, one in 1000 chances are reflected in the protocols.
Since every claim seems to be different, however, the JREF has not set the standard at a certain probability mark. It depends on the claim.
No idea why it has taken 332 subsequent posts for Rodney not to see this point.

drkitten
3rd September 2008, 07:39 AM
I get really annoyed by the specious arguments put forward to re-jig the stats. It's perfectly obvious to me that what we are after is an actual demonstration of a "paranormal" ability, one that we can confidently say did not happen by chance.

I think I have more sympathy for the pro-paranormalists (in general) than you have. I think it's obvious to anyone with the sense of a cucumber that if the paranormal exists, it's a very subtle thing. I can't simply scratch names on onions and see whom I'm going to marry or spin a set of coins to tell me if I will get a better job next year.

But it's also obvious to those same cucumbers that the real world is very subtle; I can't just walk into a pharmacy and grab a drug to make me feel better, which is why medical school takes several years. So it's obvious to any thinking paranormalist (which I admit is a very small group) that demonstrating that the paranormal exists will require a very deft touch and a very sensitive experiment.

A major research project, in fact.

The flip side of that is that such projects are expensive to run. I could easily burn through a million bucks in a year testing remote vision. Well, I could, if I had that million bucks -- but the NSF wouldn't touch that proposal with a hayfork. The only group that appears willing to front large amounts of research money for this purpose is the JREF.

Having said that, that's not what the JREF does, and it's no more willing to shell out a million for psychic research than the NSF. But I think people get more upset about their misunderstanding of the JREF than their (true) understanding of the NSF.

I think Rodney, in particular, is misunderstanding what the JREF does. And it ticks him off that an organization run by a magician is more into showmanship and fraud detection than it is into bench science. Poor thing.

petre
3rd September 2008, 07:44 AM
So what do you suppose is the lowest standard that the JREF would accept in an odds-based preliminary test? In Carina Landin's case, it was 1:170. Would they go as low as 1:100? 1:50? 1:10? Or what???

Though I have little in the way of evidence (most of what is available in that direction has been presented already), I suggest it is quite likely the JREF is willing to lower the bar (either on odds for the prelim test, or skipping it entirely) based on the media presence and employment history of the applicant. For someone like Pavel, who has been pretty honest and fair in dealing with other people, the bar's a bit rigid and unlikely to move. He appears to be almost unknown (at least as far as his ability) outside of people he's had personal contact with. Ms. Landin (from memory) did a fair bit of reading for folks and had a small, but established, following. For someone like Sylvia Browne, I imagine JREF would be willing to go as far as skipping the preliminary and maybe even accepting 1:100 odds, though only they could say for sure (Sylvia has only to ask...).

If Pavel had spent the last 10 or 30 years attempting to make a living from his ability, perhaps even overstating it often in order to dupe foolish people into giving him lots of money, writing (or ghost-writing) several "best-selling" books with a lot of pages but not much content, and appearing on many media outlets that feel using words like "maybe" and "perhaps" absolves them of any journalistic responsibility when reporting sensationalism as fact, Pavel might find it easier now to get JREF to give a little for the chance to expose such a history for the fraud it is.

You may find it an unfortunate inequality, but when the world looks at Sylvia Browne more like it regards Pavel, the JREF may be more willing to treat them as equals as well.

Rodney
3rd September 2008, 10:00 AM
I think Rodney, in particular, is misunderstanding what the JREF does.
The JREF contributes to that misunderstanding by stating:

"The Foundation is committed to providing reliable information about paranormal claims. It both supports and conducts original research into such claims.

"At JREF, we offer a one-million-dollar prize to anyone who can show, under proper observing conditions, evidence of any paranormal, supernatural, or occult power or event."

If you're right, that language should be changed to:

"The Foundation is committed to providing reliable information about some paranormal claims, such as the claims of psychic celebrities who appear on TV. It both supports and conducts original research into such claims, but does not have the time or inclination to more seriously investigate the paranormal.

"At JREF, we offer a one-million-dollar prize to anyone who can show, under proper observing conditions, evidence of a limited number of paranormal, supernatural, or occult powers or events."

Rodney
3rd September 2008, 10:03 AM
For someone like Sylvia Browne, I imagine JREF would be willing to go as far as skipping the preliminary and maybe even accepting 1:100 odds, though only they could say for sure (Sylvia has only to ask...)
Let's suppose the JREF were dumb enough to do that. What would it prove that someone managed to beat 1:100 odds?

Moochie
3rd September 2008, 10:06 AM
I think I have more sympathy for the pro-paranormalists (in general) than you have. I think it's obvious to anyone with the sense of a cucumber that if the paranormal exists, it's a very subtle thing. I can't simply scratch names on onions and see whom I'm going to marry or spin a set of coins to tell me if I will get a better job next year.

It's a claim some people make. Opinion.

But it's also obvious to those same cucumbers that the real world is very subtle; I can't just walk into a pharmacy and grab a drug to make me feel better, which is why medical school takes several years. So it's obvious to any thinking paranormalist (which I admit is a very small group) that demonstrating that the paranormal exists will require a very deft touch and a very sensitive experiment.

A major research project, in fact.
Good luck with that.

The flip side of that is that such projects are expensive to run. I could easily burn through a million bucks in a year testing remote vision. Well, I could, if I had that million bucks -- but the NSF wouldn't touch that proposal with a hayfork. The only group that appears willing to front large amounts of research money for this purpose is the JREF.
The JREF is researching the paranormal? Since when?

Having said that, that's not what the JREF does, and it's no more willing to shell out a million for psychic research than the NSF. But I think people get more upset about their misunderstanding of the JREF than their (true) understanding of the NSF.
Glad you clarified that. :)

I think Rodney, in particular, is misunderstanding what the JREF does. And it ticks him off that an organization run by a magician is more into showmanship and fraud detection than it is into bench science. Poor thing.
Not sure I agree with that summation.


M.

pavel_do
3rd September 2008, 10:49 AM
...which would of course perfectly illustrate one reason why the MDC was discontinued: It has served its purpose. By attracting people making claims like the one Pavel makes.


Ye ye... this Pavel is such pain in the neck..:rolleyes: cause of him and others a like him. MDC has to be closed down...:dqueen

:wink8:

drkitten
3rd September 2008, 12:53 PM
Let's suppose the JREF were dumb enough to do that. What would it prove that someone managed to beat 1:100 odds?

The same thing that it proved if she managed to beat 1:10000000000000000000000 odds. That either something strange is going on, or she got really lucky. No statistical test will prove beyond a doubt that the paranormal exists.

The question is whether the risk of Sylvia being able to win (by chance) and walk away with JREF's money and bragging rights is worth the chance of being able to demonstrate with relative firmness that she has no paranormal powers. And that question is both risk-dependent and "not my call to make."

William Smith
3rd September 2008, 01:05 PM
Ye ye... this Pavel is such pain in the neck..:rolleyes: cause of him and others a like him. MDC has to be closed down...:dqueen

:wink8:

Obviously, you are no pain in the neck. You might wish, but you aren't.

As obviously, you are a semi-clever self-promoter who will very likely fail the MDC preliminary test and not follow up your wild claims with any substantial evidence.

I dare you to conduct proper research at an established facility. No odds discussion there.

Also: No million dollars for a false-positive, longshot, Hail Mary lucky punch.

fls
3rd September 2008, 01:11 PM
I dare you to conduct proper research at an established facility. No odds discussion there.

I was under the impression that Pavel already had been tested at a psychology department and had submitted those results (along with the affidavits of the researchers) to the JREF as part of the application?

Linda

Rodney
3rd September 2008, 01:19 PM
The same thing that it proved if she managed to beat 1:10000000000000000000000 odds. That either something strange is going on, or she got really lucky.
You really don't see the difference between 1:100 and 1:10000000000000000000000 odds?

No statistical test will prove beyond a doubt that the paranormal exists.
By that logic, no statistical test can prove beyond a doubt that anything exists. In the real world, however, once a phenomenon defies extraordinarily high odds, it is conceded by rational people to exist.

The question is whether the risk of Sylvia being able to win (by chance) and walk away with JREF's money and bragging rights is worth the chance of being able to demonstrate with relative firmness that she has no paranormal powers. And that question is both risk-dependent and "not my call to make."
The problem with that logic is that one test cannot possibly demonstrate that she has no paranormal powers and -- while beating odds of 1:100 by pure luck is unlikely -- it is not at all mind-boggling. Beating odds of at least 1:1000000 would be mind-boggling.

William Smith
3rd September 2008, 02:04 PM
I was under the impression that Pavel already had been tested at a psychology department and had submitted those results (along with the affidavits of the researchers) to the JREF as part of the application?

Linda

Since the JREF does not publish the content of said affidavits, I assume the tests that led to said conclusions did not meet rigorous controls. I base this assumption on experience with similar situations - not the least of which involved Uri Geller - and I am willing to be convinced otherwise.
However, I would not wish to drain the JREF's resources by submitting tangential inquiries.

By proper research I was first and foremost hinting strongly: Something with no million dollars as a dangling carrot. Only a Nobel Prize.

drkitten
3rd September 2008, 02:48 PM
You really don't see the difference between 1:100 and 1:10000000000000000000000 odds?

You've already demonstrated to my satisfaction that you have no knowledge of statistics. You needn't keep trying.


By that logic, no statistical test can prove beyond a doubt that anything exists.

Got it in one. That's one of the problems of statistics. A null hypothesis can never be proven, and it can never be disproven absolutely.


The problem with that logic is that one test cannot possibly demonstrate that she has no paranormal powers and -- while beating odds of 1:100 by pure luck is unlikely -- it is not at all mind-boggling. Beating odds of at least 1:1000000 would be mind-boggling.

Perhaps -- but that says more about your mind than it does about the world. Lots of people beat odds of 1:1000000. Is your mind boggled by the fact that lottery winners exist?

drkitten
3rd September 2008, 02:50 PM
The problem with that logic is that one test cannot possibly demonstrate that she has no paranormal powers

And here we see Rodney's fundamental problem. He admits that he can never be convinced that Sylvia has no paranormal powers, and he wonders why we won't simply let her adjust the odds until she can win?

Rodney
3rd September 2008, 06:43 PM
You've already demonstrated to my satisfaction that you have no knowledge of statistics. You needn't keep trying.
Feel free to submit this thread to an unbiased statistician and see if s/he thinks your arguments make more sense than mine.

Got it in one. That's one of the problems of statistics. A null hypothesis can never be proven, and it can never be disproven absolutely.
Which is technically true, but meaningless. Science -- and life in general -- is based on statistics, and when the odds become overwhelming enough, rational people accept or reject a hypothesis, instead of arguing something like: "We still don't know for sure whether Tiger Woods is a top golfer. He may just be extraordinarily lucky."

Perhaps -- but that says more about your mind than it does about the world. Lots of people beat odds of 1:1000000. Is your mind boggled by the fact that lottery winners exist?
No, because if there are only 1 million combinations in a lottery and tickets are sold for each of those combinations, it is inevitable that there will be a lottery winner. But that's completely different than a JREF applicant beating odds of one in a million.

drkitten
4th September 2008, 07:22 AM
Feel free to submit this thread to an unbiased statistician and see if s/he thinks your arguments make more sense than mine.

I have. I work in a university, in very close cooperation with several professional statisticians.

They think, not to put too fine a point on it, that you are ignorant to the point of dishonesty, although they use more professional language.



No, because if there are only 1 million combinations in a lottery and tickets are sold for each of those combinations, it is inevitable that there will be a lottery winner.

I see. So if only 999,999 tickets were sold, then it would be mind-blowing? (Excuse me, mind-boggling.)

But that's completely different than a JREF applicant beating odds of one in a million.

Shall I run this statement by the statisticians as well? It's a slow day and they would probably appreciate how badly they are needed in educating the general public.

Rodney
4th September 2008, 07:54 AM
I have. I work in a university, in very close cooperation with several professional statisticians.

They think, not to put too fine a point on it, that you are ignorant to the point of dishonesty, although they use more professional language.
So how about a specific critique by them? Better yet, have them join this forum and weigh in here, so everyone can benefit from their wisdom.

I see. So if only 999,999 tickets were sold, then it would be mind-blowing? (Excuse me, mind-boggling.)
If 999,999 different combinations were sold, leaving only 1 that wasn't sold, it would be far more mind-boggling if there were no lottery winner.

Shall I run this statement by the statisticians as well? It's a slow day and they would probably appreciate how badly they are needed in educating the general public.
Bring it on. :)
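
For anyone following the lottery exchange above, here is a minimal sketch of the arithmetic in Python. The function names and the one-million-combination lottery are illustrative assumptions only, not a description of any real lottery. It contrasts tickets that each cover a different combination with tickets picked independently at random.

```python
def p_winner_distinct(tickets_sold: int, combinations: int) -> float:
    """Chance that some ticket wins when every ticket covers a different combination."""
    return min(tickets_sold, combinations) / combinations


def p_winner_independent(tickets_sold: int, combinations: int) -> float:
    """Chance that some ticket wins when each buyer picks a combination at random."""
    return 1.0 - (1.0 - 1.0 / combinations) ** tickets_sold


COMBINATIONS = 1_000_000  # assumed size of the hypothetical lottery

# 999,999 distinct combinations sold: a winner is almost, but not quite, certain.
print(p_winner_distinct(999_999, COMBINATIONS))

# One million random picks: duplicates leave some combinations uncovered.
print(p_winner_independent(1_000_000, COMBINATIONS))
```

In both cases any single ticket still faces odds of one in a million; what changes is the number of attempts being made against those odds.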

petre
4th September 2008, 08:27 AM
Let's suppose the JREF were dumb enough to do that. What would it prove that someone managed to beat 1:100 odds?

To who, me? That more investigation may be worthwhile? I believe JREF would take the 99:100 odds of showing Sylvia can't even beat a simple test. It would never happen, though: Sylvia has far more to lose on that 99:100 side than she could gain on the 1:100 side.

drkitten
4th September 2008, 10:54 AM
So how about a specific critique by them? Better yet, have them join this forum and weigh in here, so everyone can benefit from their wisdom.

And, once again, the woos insist it is science's job to drop everything and disprove each individual nut-case idea for free.

If you want a professional statistician to critique your ideas, the American Statistical Association maintains a list of consultants; I believe the current rate is around $250/hour. If you have something useful, like an interesting (read: publishable) research project to bring to the table, they may be willing to work for less.

Rodney
4th September 2008, 05:59 PM
And, once again, the woos insist it is science's job to drop everything and disprove each individual nut-case idea for free.

If you want a professional statistician to critique your ideas, the American Statistical Association maintains a list of consultants; I believe the current rate is around $250/hour. If you have something useful, like an interesting (read: publishable) research project to bring to the table, they may be willing to work for less.
You might try re-reading your previous post (#351 on this thread). I was responding to what you said there.

drkitten
4th September 2008, 06:35 PM
I was responding to what you said there.

I know. Without thinking, as is usual.

You asked what a professional statistician would think of your "argument." Having shared your illiterate and innumerate rantings with some of my colleagues, I am in a position to answer you -- they think you're an idiot, an opinion they volunteered over the coffee pot.

You then asked for a "specific critique." If you want a consultancy report, consultancy costs money. They're busy professionals and have requests for critiques piling up on their desks; in fact, consultancy is a major supplement to their faculty salaries. But if you want a professional statistician to explain type I and type II errors to you (and how no statistical study can be entirely free of either type, and how the thresholds are set by the study designer based on externalities like perceived risk), I'm sure you can find one.

In fact, I even gave you the name of the relevant professional organization from which you can try to hire one. But, as I warned you, an hour's consultation will typically cost around $250.

Rodney
4th September 2008, 07:33 PM
I know. Without thinking, as is usual.

You asked what a professional statistician would think of your "argument." Having shared your illiterate and innumerate rantings with some of my colleagues, I am in a position to answer you -- they think you're an idiot, an opinion they volunteered over the coffee pot.

You then asked for a "specific critique." If you want a consultancy report, consultancy costs money. They're busy professionals and have requests for critiques piling up on their desks; in fact, consultancy is a major supplement to their faculty salaries. But if you want a professional statistician to explain type I and type II errors to you (and how no statistical study can be entirely free of either type, and how the thresholds are set by the study designer based on externalities like perceived risk), I'm sure you can find one.

In fact, I even gave you the name of the relevant professional organization from which you can try to hire one. But, as I warned you, an hour's consultation will typically cost around $250.
Before I pay even 25 cents, I would want to know whether the consultant believes -- as you evidently do -- that there is still some uncertainty about things that are obvious even to people who have never taken a statistics course -- such as whether Tiger Woods is a top golfer.

In any event, the subject of this thread is "Odds Standard for Preliminary Test", and it's rather obvious to anyone who has been reading that the JREF does not have a consistent position on that subject.

drkitten
4th September 2008, 08:26 PM
Before I pay even 25 cents, I would want to know whether the consultant believes -- as you evidently do -- that there is still some uncertainty about things that are obvious even to people who have never taken a statistics course -- such as whether Tiger Woods is a top golfer.

Then you'll never find a consultant to pay, because any qualified statistician will point out that every statistical test involves non-zero alpha and beta cutoffs, and thus there is uncertainty involved.

In fact, you don't even need to get the professionals involved for that one. Anyone who took and passed baby stats can give you the same answer for a lot less money.
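
To make the alpha and beta point concrete, here is a rough sketch in Python. It assumes a 40-trial guessing test with a 50% chance rate and a claimed 70% hit rate (the figures used for Pavel elsewhere in this thread); the function names and the pass marks tried are hypothetical choices for illustration, not anything the JREF has specified.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k hits in n independent trials with hit rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def alpha_level(n: int, pass_mark: int, p_chance: float = 0.5) -> float:
    """Type I error: a pure guesser reaches the pass mark by luck."""
    return sum(binom_pmf(k, n, p_chance) for k in range(pass_mark, n + 1))


def beta_level(n: int, pass_mark: int, p_claimed: float) -> float:
    """Type II error: someone with the claimed hit rate falls short of the pass mark."""
    return sum(binom_pmf(k, n, p_claimed) for k in range(pass_mark))


TRIALS = 40  # assumed test length
for pass_mark in (24, 28, 32):
    a = alpha_level(TRIALS, pass_mark)
    b = beta_level(TRIALS, pass_mark, p_claimed=0.7)
    print(f"pass mark {pass_mark}/{TRIALS}: alpha={a:.5f}  beta={b:.3f}")
```

Raising the pass mark pushes alpha down and beta up, and no pass mark makes both zero. Where to put the cutoff is a judgment about which mistake is more costly, which is the part the test designer decides.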


In any event, the subject of this thread is "Odds Standard for Preliminary Test", and it's rather obvious to anyone who has been reading that the JREF does not have a consistent position on that subject.

On the contrary, they have a very consistent position on that subject. Their position --- forthrightly stated --- is that there are no standards and that odds are negotiated on a case-by-case basis.

Has it really taken you nearly four hundred posts before you realized that the JREF are serious about that?

EHocking
4th September 2008, 11:41 PM
Will it really take you over four hundred posts before you realize that the JREF are serious about that?
Small correction that I think better reflects his position.

Moochie
5th September 2008, 07:16 AM
<snip>



In any event, the subject of this thread is "Odds Standard for Preliminary Test", and it's rather obvious to anyone who has been reading that the JREF does not have a consistent position on that subject.

And the reasons for that have been explained to you.


M.

Garrette
5th September 2008, 08:02 AM
And the reasons for that have been explained to you.


M.
I think drkitten's distinction is crucial here, and we shouldn't muddy it.

The JREF certainly does have a consistent position. They do not, however, have a consistent standard. They are being entirely reasonable, fair, and scientific in approaching it this way.

MattC
8th September 2008, 12:19 AM
The issue seems much simpler than nine pages of mathematical analysis would convey.

Rodney's original question was "what is needed to pass the preliminary test?", and that seems to have morphed into "what percentage successes are needed to pass the preliminary test?" Questions like these, which ask for a general rule where the answer is really specific to each case, tend to be notoriously difficult to answer effectively, hence nine pages of bewildering mathematical arguments.

The odds the JREF applies seem to be related to the odds of success of the applicant. If there are odds involved, then the JREF naturally desires to make them as high as possible - they want to be certain that the applicant can, in fact, do what he says he can. The applicant desires to make them as loose as possible, wanting to win a million dollars (as Mr. Randi once put it, "a million dollars for less than an hour's work"). Through negotiation, both sides arrive at odds each considers acceptable.

It is important to note that odds are dependent upon the maximum amount of possible trials - saying things like "one in a thousand" implies that one thousand trials can be rigorously conducted, which in some cases is impractical (re: Landin, Jagannathan, Hubinsky) and in others not necessary (re: Yahweh, Hunter, Williams).

After an analysis of what is required for the test and what is available, both sides will work together to determine what odds the candidate must meet to pass the preliminary test. Placing a formal "one in one thousand" meter upon all trials implies that this is practical to do, which in several of the cases listed above was simply not the case (anyone desiring to conduct a trial along these lines may freely find one thousand "letters written by persons now dead, whose receiver must be alive").

(I confess to being very new here, as my post count surely suggests, but as far as I understand the JREF's system of setting odds is listed above.)

~ MattC

drkitten
8th September 2008, 06:44 AM
It is important to note that odds are dependent upon the maximum amount of possible trials - saying things like "one in a thousand" implies that one thousand trials can be rigorously conducted, which in some cases is impractical (re: Landin, Jagannathan, Hubinsky) and in others not necessary (re: Yahweh, Hunter, Williams).

Er, not at all.

Saying things like "one in a thousand" simply implies a calculated probability of 0.001; it makes no implications on the number of trials that can be rigorously conducted. If I remember Yahweh's claim correctly --- he was the one that was going to produce snow in Berkeley in August? -- we can calculate the odds of that happening simply by examining weather records. Since we've got something like 100 years of weather records for Berkeley CA, and 31 days of August weather for each year, a statistician can easily calculate the maximum probability of snow-in-August (and as you expect, it works out to be something less than 1 in 3100).

Similarly, if I claim to be able to tell you the exact order a (shuffled) deck of cards is in by pendulum dowsing or something like that, my odds are 1 in 52!, again by direct calculation. But I only need one deck to do that.

In the case of "letters written by persons living or dead," assuming I can only tell you "living" or "dead" about the letter, I can hit odds of 1:1000 with only ten letters if I can get them all right. The problem is when I can't get them all right; if I can only get 70% correct, it will take many many more letters to allow for the possibility of errors.
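
The direct calculations above can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the post (a fair 50/50 guess per letter, a fully shuffled 52-card deck); the helper name `p_at_least` is made up for the example.

```python
from math import comb, factorial

# Naming the exact order of one shuffled deck: one chance in 52!.
deck_orderings = factorial(52)
print(f"deck order: 1 in {deck_orderings:.3e}")


def p_at_least(hits: int, letters: int, p: float = 0.5) -> float:
    """Chance of `hits` or more correct living/dead calls out of `letters` by guessing."""
    return sum(
        comb(letters, k) * p**k * (1 - p) ** (letters - k)
        for k in range(hits, letters + 1)
    )


# Ten letters, all ten right: 1 in 2**10 = 1 in 1024.
print(f"10 of 10: 1 in {1 / p_at_least(10, 10):.0f}")

# Ten letters, only seven right: nowhere near 1:1000.
print(f" 7 of 10: p = {p_at_least(7, 10):.4f}")
```

A perfect score gets to 1:1000 with very few items; once misses are allowed, the same number of items says much less, which is why a 70% claim needs a longer test.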

MattC
8th September 2008, 07:13 AM
I'll take your word for it and edit the post accordingly, you know more about statistics than I and it's best that my ramblings not be permitted to color the issue. Thank you for your clarification.

(edit: ... well, it seems I can't edit my own post. drkitten is right and should be considered so.)

~ MattC

Rodney
8th September 2008, 06:02 PM
If there are odds involved, then the JREF naturally desires to make them as high as possible - they want to be certain that the applicant can, in fact, do what he says he can.
The crucial point you (and so many others here) are missing is that, if an applicant is not claiming spectacular results, to even tentatively establish that (s)he has a paranormal power will require a longer preliminary test than the JREF seems inclined to give. In Pavel's case, for example, he is claiming 70% accuracy when 50% accuracy would be expected by chance. However, if he performs at exactly a 70% hit rate over 40 trials, that will not beat odds of 1:1000, and so the JREF apparently would regard that as a failure of the preliminary test.
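
The 40-trial figure is easy to check with an exact binomial tail. A minimal sketch in Python, assuming independent trials with a fair 50/50 chance rate; the function name is hypothetical.

```python
from math import comb


def chance_of_at_least(hits: int, trials: int) -> float:
    """Exact probability of `hits` or more correct out of `trials` fair 50/50 guesses."""
    favourable = sum(comb(trials, k) for k in range(hits, trials + 1))
    return favourable / 2**trials


# A 70% hit rate over 40 trials means 28 hits.
p = chance_of_at_least(28, 40)
print(f"p = {p:.4f}, roughly 1 in {1 / p:.0f}")
print("beats 1:1000" if p <= 0.001 else "does not beat 1:1000")
```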

drkitten
8th September 2008, 07:10 PM
The crucial point you (and so many others here) are missing is that, if an applicant is not claiming spectacular results, to even tentatively establish that (s)he has a paranormal power will require a longer preliminary test than the JREF seems inclined to give.

We're not missing it.

We simply don't care. It would be one thing if Pavel (for example) had some sort of colorable right to attempt to win Randi's million, and he were being unjustly deprived of that right. But Randi is not in the business of testing subtle results and tentatively establishing the existence of paranormal powers.

Randi is in the business of exposing frauds and charlatans.

You might as well complain that there's something wrong with the local bus company because it won't take me from Atlanta to Toronto. It's not something wrong with the local bus company, but with the person making the complaint. If you want to go from Atlanta to Toronto, there are long-distance bus companies, there are airplanes, there are trains and there are car rental agencies.

Pavel is fundamentally standing at the wrong counter.

fls
8th September 2008, 07:33 PM
The crucial point you (and so many others here) are missing is that, if an applicant is not claiming spectacular results, to even tentatively establish that (s)he has a paranormal power will require a longer preliminary test than the JREF seems inclined to give.

I think that what you are missing, is that that is the intention. The claims that prompted the MDC weren't those that are almost imperceptible or not particularly spectacular. It was those individuals held up as remarkable that drew Randi's attention. This translates to at least a large effect size instead of the medium effect size Pavel is claiming.

Linda

Rodney
9th September 2008, 08:22 AM
I think that what you are missing, is that that is the intention. The claims that prompted the MDC weren't those that are almost imperceptible or not particularly spectacular. It was those individuals held up as remarkable that drew Randi's attention. This translates to at least a large effect size instead of the medium effect size Pavel is claiming.

Linda
What both you and drkitten seem to be saying is: "Yes, there may well be such a thing as the paranormal, but the JREF really doesn't care about that -- it only cares about exposing well-known psychic frauds." And yet, as recently as this year, Randi stated:

"It was March 6th, 1998, when the JREF Million-Dollar Challenge first came into existence. That’s almost ten years ago. It’s always been a simple, direct, matter: do what you claim you can do of a paranormal nature, and walk away with the prize." See http://www.randi.org/joom/swift/swift-january-4-2008-7.html#i4

So, if you two are correct, why doesn't the JREF stop the charade and admit that the MDC is designed only for a very limited purpose and folks like Pavel should not apply?

fls
9th September 2008, 09:09 AM
What both you and drkitten seem to be saying is: "Yes, there may well be such a thing as the paranormal, but the JREF really doesn't care about that -- it only cares about exposing well-known psychic frauds." And yet, as recently as this year, Randi stated:

"It was March 6th, 1998, when the JREF Million-Dollar Challenge first came into existence. That’s almost ten years ago. It’s always been a simple, direct, matter: do what you claim you can do of a paranormal nature, and walk away with the prize." See http://www.randi.org/joom/swift/swift-january-4-2008-7.html#i4

So, if you two are correct, why doesn't the JREF stop the charade and admit that the MDC is designed only for a very limited purpose and folks like Pavel should not apply?

How is that a charade? It's always had a very limited purpose - challenging those who claim they can do something that is obviously amazing, not those who claim to get lucky every once in a while. Pavel thought he could do something amazing and it turned out that he over-estimated his abilities. That's not Randi's fault.

Linda

steenkh
9th September 2008, 09:09 AM
And yet, as recently as this year, Randi stated:

"It was March 6th, 1998, when the JREF Million-Dollar Challenge first came into existence. That’s almost ten years ago. It’s always been a simple, direct, matter: do what you claim you can do of a paranormal nature, and walk away with the prize." See http://www.randi.org/joom/swift/swift-january-4-2008-7.html#i4

So, if you two are correct, why doesn't the JREF stop the charade and admit that the MDC is designed only for a very limited purpose and folks like Pavel should not apply?
I stressed the important part of Randi's statement. You are proposing that the MDC should be everything but a "simple, direct, matter".

People whose abilities tend to get lost in statistical noise should perhaps stop the charade of claiming that they have these abilities.

drkitten
9th September 2008, 09:19 AM
What both you and drkitten seem to be saying is: "Yes, there may well be such a thing as the paranormal, but the JREF really doesn't care about that -- it only cares about exposing well-known psychic frauds." And yet, as recently as this year, Randi stated:

"It was March 6th, 1998, when the JREF Million-Dollar Challenge first came into existence. That’s almost ten years ago. It’s always been a simple, direct, matter: do what you claim you can do of a paranormal nature, and walk away with the prize." See http://www.randi.org/joom/swift/swift-january-4-2008-7.html#i4

So, if you two are correct, why doesn't the JREF stop the charade and admit that the MDC is designed only for a very limited purpose and folks like Pavel should not apply?

Because it's not a charade.

Why are you trying to make Randi and the JREF into something that they have never been, never wanted to be, and never claimed to be?

(Answer : because it makes you feel good to lie about the JREF, because it lets you keep your delusions intact that the paranormal exists. The simple fact is that no competent researcher has ever found reliable evidence for the paranormal, but it looks better if you pick on Randi, who isn't a researcher at all and has never claimed to be.)

Czarcasm
9th September 2008, 11:02 AM
What both you and drkitten seem to be saying is: "Yes, there may well be such a thing as the paranormal, but the JREF really doesn't care about that -- it only cares about exposing well-known psychic frauds." And yet, as recently as this year, Randi stated:

"It was March 6th, 1998, when the JREF Million-Dollar Challenge first came into existence. That’s almost ten years ago. It’s always been a simple, direct, matter: do what you claim you can do of a paranormal nature, and walk away with the prize." See http://www.randi.org/joom/swift/swift-january-4-2008-7.html#i4

So, if you two are correct, why doesn't the JREF stop the charade and admit that the MDC is designed only for a very limited purpose and folks like Pavel should not apply?
Pavel's original claim certainly was incredible enough to qualify - it's only when he was confronted by the reality that he would have to back up his claims in a specific fashion that he slowly started muddying the waters with vague and complicated changes. Compare his originally simple claim on the first page with the protocol (such as it is) as it stands now.

Rodney
9th September 2008, 06:41 PM
Pavel's original claim certainly was incredible enough to qualify - it's only when he was confronted by the reality that he would have to back up his claims in a specific fashion that he slowly started muddying the waters with vague and complicated changes. Compare his originally simple claim on the first page with the protocol (such as it is) as it stands now.
Are you talking about his claim that ". . . i Have 80 PHOTOS ( not cards) its 80 different symbols like picture of horse..key..cross..etc.. they all unique .. so i have 80 of them.. and i am proposing to 'guess' minimum 3 out of ten that will be puled out of 80"? In any event, getting 70% correct when 50% would be expected by chance (as Pavel is now claiming) may not appear spectacular, but it would take fewer than 70 trials to beat odds of 1:1000 at that hit rate.
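
The "fewer than 70 trials" figure can be checked by brute force. This is a sketch under stated assumptions, not the JREF's method: trials are independent, chance is 50/50, "performing at 70%" is read as scoring at least 70% of the trials (rounded up to a whole number of hits), and the function names are made up.

```python
from math import comb


def tail(hits: int, trials: int) -> float:
    """Exact probability of `hits` or more correct out of `trials` fair 50/50 guesses."""
    return sum(comb(trials, k) for k in range(hits, trials + 1)) / 2**trials


def smallest_test(odds: int = 1000, max_trials: int = 200):
    """Shortest test where a score of at least 70% has chance probability <= 1/odds."""
    for n in range(1, max_trials + 1):
        hits = (7 * n + 9) // 10  # ceil(0.7 * n) in integer arithmetic
        if tail(hits, n) <= 1 / odds:
            return n, hits
    return None


result = smallest_test()
print(result)
```

Because 70% of the trial count is usually not a whole number, the required hits jump in steps, so the answer is somewhat sensitive to how "70%" is rounded. This also only asks whether a 70% score would be convincing; it says nothing about how likely a genuine 70% performer is to reach that score on the day.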

fls
10th September 2008, 03:34 AM
Are you talking about his claim that ". . . i Have 80 PHOTOS ( not cards) its 80 different symbols like picture of horse..key..cross..etc.. they all unique .. so i have 80 of them.. and i am proposing to 'guess' minimum 3 out of ten that will be puled out of 80"? In any event, getting 70% correct when 50% would be expected by chance (as Pavel is now claiming) may not appear spectacular, but it would take fewer than 70 trials to beat odds of 1:1000 at that hit rate.

His first contact was through the YouTube videos (http://www.youtube.com/profile?user=pavelprorok). What he chose to show there was a 100% hit rate.

Linda

DevilsAdvocate
21st September 2008, 10:58 PM
His first contact was through the YouTube videos (http://www.youtube.com/profile?user=pavelprorok). What he chose to show there was a 100% hit rate.

Linda

Thanks for the link to that video. I had not seen it. It was surprising to me. It came across to me far, far, far more as a magic trick than the paranormal demonstration I was expecting.

Why does he shuffle empty envelopes? Why does he look directly in front of him while handling the photos and envelopes instead of looking at his hands like a normal person would? It is almost as though he were looking into a mirror or a screen showing back the live video.

And most importantly, as he places the photos in the envelopes, why does he twist each and every one to expose a significant portion of the image to the camera (or mirror or person or whatever or whoever is in front of him)? :boggled:

Rodney
29th November 2009, 07:28 AM
On May 10, 2008, I sent the following e-mail to challenge@randi.org

I recently initiated the following thread on the Million Dollar
Challenge Forum --
http://forums.randi.org/showthread.p...18#post3692318

What I argue on that thread is that: (a) In tests where the odds of
success can be readily calculated, it is unclear what odds standard
must be met; and (b) It is unclear whether time-consuming protocols,
such as Ganzfeld experiments, are eligible for the Challenge.
Therefore, I recommend that something along the lines of the following
be added to the Challenge Rules:

"(1) An applicant must pass a preliminary test, in which the general
criterion for success will be that the applicant must perform at
significantly above the chance level. In tests where the odds of
success can be readily calculated -- such as numbers guessing -- the
applicant must perform at least at the P=.001 level; that is, the odds
must be only one in one thousand that the applicant could have
achieved that performance level by random chance. (However, if the
applicant achieves a lesser, but above chance, performance level in a
limited number of tests -- for example, if the applicant performs at
the P=.05 level in 20 trials -- the preliminary test may be extended
on a different day or days to include more trials.) If the applicant
passes the preliminary test, a final test will be administered, in
which the performance level must meet a significantly more stringent
criterion for the million dollar prize to be awarded. In tests where
the odds of success can be readily calculated, the applicant must
perform at least at the P=.000001 level; that is, for the prize to be
awarded, the odds must be only one in one million that the applicant
could have achieved that performance level by random chance.

"(2) All protocols, including time-consuming ones such as Ganzfeld
experiments, are eligible for the Challenge; or

"(2a) Some time-consuming protocols, such as Ganzfeld experiments, are
not eligible for the Challenge due to the impact on JREF resources."

If you wish, you may respond to these questions on the above thread.

Thank you,

Rodney
__________________________________________________________________________

On February 3, 2009, I attached the above e-mail to a follow-up e-mail:

I did not receive a response to the below inquiry. Some Randi Forum participants suggested that I jog your memory about it.

Regards,

Rodney
__________________________________________________________________________

The next day Jeff Wagg responded as follows:

Hello Rodney,

Thanks for the suggestions. So you know, the challenge rules are being reconsidered, and we'll take your suggestions into account.

If we do make changes, they'll be posted publicly.

Jeff
__________________________________________________________________________
While it was nice of Jeff to respond to my follow-up e-mail, my suggestions were never taken into account. The JREF Challenge Rules were not changed to accommodate those concerns (See http://www.randi.org/site/index.php/...plication.html), nor did I ever receive a "detailed and official answer" as to why the JREF Challenge Rules could not be so clarified. Thus, to this day, it remains unclear what odds standard must be met and whether time-consuming protocols are eligible for the Challenge.

Darat
29th November 2009, 07:30 AM
...snip...

While it was nice of Jeff to respond to my follow-up e-mail, my suggestions were never taken into account.

...snip...

How do you know this?

Rodney
29th November 2009, 07:40 AM
How do you know this?
Because the MDC rules weren't changed in accord with my suggestions, and I received no explanation as to why not.

Darat
29th November 2009, 07:48 AM
Because the MDC rules weren't changed in accord with my suggestions, and I received no explanation as to why not.

Your reasoning is incorrect: that they did not make the changes you suggested does not mean that they did not take your suggestions into account.

William Smith
29th November 2009, 09:33 AM
Because the MDC rules weren't changed in accord with my suggestions, and I received no explanation as to why not.

Why do you think the JREF should explain to you why the MDC rules weren't changed in accord with your suggestions?

fls
29th November 2009, 11:55 AM
Thus, to this day, it remains unclear what odds standard must be met and whether time-consuming protocols are eligible for the Challenge.

I think the problem is that you look at the Challenge as a test for paranormal effects, so you make suggestions that are suitable for such a test. Instead, think of the Challenge as a publicity stunt. The purpose is not to discover paranormal abilities, but rather to put a public face on what it means to be skeptical and to examine claims.

If you look at it in that regard, it becomes obvious that tests which are tedious, uninteresting to watch, and show effects which, even if established to be remarkable at greater than 1000 to 1 odds, are so small that no one goes "Wow" when they see it, will not be suitable for the Challenge. It needs to be set up so that it is obvious to the casual observer when something remarkable or unremarkable has happened. And also for this reason, the odds cannot be set in advance so that Randi can maintain flexibility in the design of the experiment.

Anita's recent test is a good example of something which isn't suitable for the Challenge. The test was set up in such a way that she had a good chance of getting at least one correct answer and of providing answers that would be perceived as hits by the casual observer. You can see that some people were unable to resist the temptation of considering her choice of the right person and her purported sensation of certainty as somehow indicative of an effect (even though neither was part of the formal test). On the other hand, the PEAR tests of influence on random outcomes showed results which were far more unlikely than Anita's, yet who would look at one excess hit for every 10,000 trials and argue that they saw an effect?
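
To put rough numbers on the contrast between a tiny effect and a visible one, here is a minimal sketch using the normal approximation to the binomial. It assumes that "one excess hit for every 10,000 trials" means a hit rate of 0.5001 against a 50% chance level, and that the observed rate sits exactly at its expected value; both are simplifying assumptions, and `trials_needed` is an illustrative name rather than anything used by PEAR or the JREF.

```python
from statistics import NormalDist


def trials_needed(excess_rate, alpha, chance=0.5):
    """Approximate number of trials at which a hit rate of chance + excess_rate
    reaches a one-sided significance level of alpha (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha)        # z-score matching the alpha level
    sigma = (chance * (1 - chance)) ** 0.5     # per-trial standard deviation
    return (z * sigma / excess_rate) ** 2


tiny_effect = trials_needed(excess_rate=1 / 10_000, alpha=0.001)
large_effect = trials_needed(excess_rate=0.20, alpha=0.001)  # 70% vs 50% chance

print(f"excess of 1 in 10,000: about {tiny_effect:.3g} trials")
print(f"70% hit rate:          about {large_effect:.3g} trials")
```

Under these assumptions the small effect needs on the order of hundreds of millions of trials to reach 1:1000, while a 70% hit rate needs only a few dozen, which is the gap between a result that is statistically extreme and one a casual observer can see.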

Think of it this way - it has to play well on YouTube.

Linda

MattC
29th November 2009, 12:25 PM
Bear in mind, Mr. Rodney, when finagling with the rules that it is not JREF resources which are unduly taxed by the application procedure outside of the negotiation stage - the applicant pays the costs of the procedure, while some other organization handles the actual test. I don't know what fees those organizations charge, if any, but from personal history with non-skeptical organizations of about the same size these fees would be in the four digits and would certainly present a roadblock to any expedient test being conducted.

I think ultimately it is this difficulty that prohibits your emendations from being codified - the difficulty placed on other organizations (not to mention the applicant) would be too great for many of them to bear. Creating a list of organizations capable of bearing the costs of any extended procedure would both dramatically limit the scope of the Challenge and greatly restrict the available applicant pool, hardly ideal for a comparatively small organization focused upon education. Skeptics and their organizations seem to be judged upon knowledge accumulated rather than financial wherewithal.

Your point about more stringent codification of the odds is one I am in favor of at least on the theoretical level, but it seems practically difficult to enforce adequately. It would certainly be more scientific, but generally the more stringent the controls the more expensive the test - I don't think it reasonable for an applicant to bear the costs of a fully scientific procedure (which would be quite expensive, judging by some of the grants my university obtains to do that degree of research we're looking in the mid-upper six figures). Also, bear in mind that perfect scientific exclusion is not altogether practical in many of these experiments for ethical concerns - I'd love to forcibly isolate a Ganzfeld participant for the duration of a p=0.001 test, for example, but I imagine that few people would volunteer for it once the requirements were explained to them.


I think the problem is that you look at the Challenge as a test for paranormal effects, so you make suggestions that are suitable for such a test. Instead, think of the Challenge as a publicity stunt. The purpose is not to discover paranormal abilities, but rather to put a public face on what it means to be skeptical and to examine claims.

No. If it isn't a test for the paranormal, what purpose does it serve? A magician doesn't need to put a million dollars on the line to garner what is essentially cheap publicity available for the cost of a webcam - judging by the proliferation of amateur magicians and skeptics on YouTube, many have come up with this same idea (and have found parents willing to buy them the webcam as well). Given that many of the actual tests conducted for the Challenge were not widely publicized (whether on YouTube or otherwise) I am not sure this proclamation holds water. Ms. Hunter's case possessed some absurdist elements that made it a worthy spectacle, while Ms. Sonne's test was broadcast as part of a larger event - the decision to broadcast may depend upon noteworthy features, but the decision to test is not.

Science ultimately tries to prove things by exclusion, meaning that implicit or potential effects are removed by experimentation over the long term. If, for example, I set up a test to determine whether or not I can psychically detect someone's hair color when they were in an adjoining room, it's possible that I could simply guess right each time. If we do three trials of this and I get three hits, blind guessing is both a very normal solution and a very possible one. If we increase the number of trials, guesswork becomes increasingly unlikely and will eventually pass the point of statistical likelihood. This doesn't mean it isn't possible (over a thousand trials it's still possible that I could guess a significant percentage right), merely that it's so improbable that it is a functionally insignificant probability - but, because it is a probability, there's always that slim chance of success. Blind luck cannot be controlled for.

The ultimate goal of the Challenge is to control out the mundane explanations and trickery that could be causing a supposedly paranormal event. The decision to broadcast a Challenge test to the world at large is not made upon test design criteria; rather, it is made upon features of the claim (as in Ms. Hunter's case) or done as a subsidiary of a larger broadcasting initiative (Ms. Sonne's test at TAM).

~ Matt

William Smith
29th November 2009, 01:57 PM
Reading MattC's post, would it be completely out of line to ask you, Mr. Rodney, if you could accept the limitations of the JREF Challenge and just set up a Ganzfeld test elsewhere?

Yes, we know it's not easy.

Perhaps you could draw courage from the fact that you would be actually doing something, creating reality, if you will, other than meandering about the limitations you very well know and understand.

Rodney
29th November 2009, 03:37 PM
I think the problem is that you look at the Challenge as a test for paranormal effects, so you make suggestions that are suitable for such a test. Instead, think of the Challenge as a publicity stunt.
I commend you for your astuteness. I think MDC Rule Number 1 should be: "The first and most important rule is that the Challenge is only a publicity stunt."

fromdownunder
29th November 2009, 08:43 PM
I commend you for your astuteness. I think MDC Rule Number 1 should be: "The first and most important rule is that the Challenge is only a publicity stunt."

Well it is a marketing device. I doubt that anyone thinks otherwise. But that being said, how are "odds" relevant for all claims?

Let's say my claim is "I can levitate." This is a paranormal claim, and the odds are irrelevant. I either can or I can't. (In fact, a can/can't preliminary test actually took place last year.)

There are no odds to calculate. So, in the rules for the claim, specific odds become irrelevant. Because all claims are different, there can be no specific "beat the odds" thingy included. Some of the proposed challenges simply do not work that way.

And as has been said, perhaps your suggestions were considered, found wanting, and not included for that reason.

Norm

MattC
29th November 2009, 09:22 PM
Let's say my claim is "I can levitate." This is a paranormal claim, and the odds are irrelevant. I either can or I can't. (In fact, a can/can't preliminary test actually took place last year.)

There are no odds to calculate. So, in the rules for the claim, specific odds become irrelevant. Because all claims are different, there can be no specific "beat the odds" thingy included. Some of the proposed challenges simply do not work that way.

It can become more complex (e.g. "I can't do it all the time"), but in general there are some cases where odds generally aren't required nor can they be specific, particularly as Mr. Rodney seems to be pushing for. If this levitation process requires three hours of intensive meditation before you can manage it, doing more than one trial seems quite an imposition upon whomever we ask to observe the event. While undoubtedly someone could be found to shamelessly observe twenty repetitions of the same levitation, there comes a limit as to what most people are willing to volunteer for. Given that neutrality of volunteers is important it makes sense to reduce the demands of the protocol in favor of getting an actual test off the ground so long as experimental efficacy is not compromised.


Reading MattC's post, would it be completely out of line to ask you, Mr. Rodney, if you could accept the limitations of the JREF Challenge and just set up a Ganzfeld test elsewhere?

The only real stumbling block I can see to establishing a Ganzfeld test would be to find the space - you'd need at least two rooms and permission to soundproof them (if they weren't already). More importantly you'd need them for a fair bit of time, which leads me to think that getting two motel rooms some distance apart (at least four units I'd think) and setting up the equipment there wouldn't be so difficult. Paying for them would naturally be the responsibility of the applicant.

~ Matt

Rodney
30th November 2009, 06:17 AM
It can become more complex (e.g. "I can't do it all the time"), but in general there are some cases where odds generally aren't required nor can they be specific, particularly as Mr. Rodney seems to be pushing for.
I agree, which is why I suggested the following language: "An applicant must pass a preliminary test, in which the general criterion for success will be that the applicant must perform at significantly above the chance level. In tests where the odds of success can be readily calculated (emphasis added) -- such as numbers guessing -- the applicant must perform at least at the P=.001 level; that is, the odds must be only one in one thousand that the applicant could have achieved that performance level by random chance. (However, if the applicant achieves a lesser, but above chance, performance level in a limited number of tests -- for example, if the applicant performs at the P=.05 level in 20 trials -- the preliminary test may be extended on a different day or days to include more trials.)"
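
For a test where the odds can be readily calculated, the thresholds in this proposed wording translate into a minimum number of hits. A minimal sketch, assuming 20 independent trials with a 50% chance of a hit on each (for example, calling a fair coin); the trial count and the helper names are illustrative assumptions, not part of the proposal.

```python
from math import comb


def binom_tail(n, k, p):
    """Probability of k or more hits in n independent trials with hit chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_hits(n, chance, alpha):
    """Fewest hits in n trials whose chance probability is <= alpha, or None."""
    for k in range(n + 1):
        if binom_tail(n, k, chance) <= alpha:
            return k
    return None  # even a perfect score is not improbable enough


# Extension level, preliminary level, and final level from the proposed rule.
for alpha in (0.05, 0.001, 0.000001):
    print(alpha, min_hits(20, 0.5, alpha))
# 0.05 -> 15 hits, 0.001 -> 18 hits, 1e-06 -> 20 hits

# With only five 50/50 trials, no score reaches the 1-in-1000 level,
# since a perfect 5/5 still has a chance probability of 1/32.
print(min_hits(5, 0.5, 0.001))  # None
```

The same function shows why very short tests cannot meet a fixed odds standard: the number of trials has to be large enough for a perfect score to be rarer than the threshold.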

MattC
30th November 2009, 07:02 AM
http://www.automeasure.com/chance.html

This web site suggests that performing at a 1/1000 level in a test with five trials requires 0-2 successes - mighty good odds. I'd be willing to apply for a number guessing trial given that hit list, a pack of cards is cheap and the potential rewards quite worth it.

~ Matt

steenkh
30th November 2009, 07:12 AM
In tests where the odds of success can be readily calculated (emphasis added) -- such as numbers guessing -- the applicant must perform at least at the P=.001 level; that is, the odds must be only one in one thousand that the applicant could have achieved that performance level by random chance.
The recorded history of the MDC seems to indicate that the JREF practically always offers better chances for success than this.

(However, if the applicant achieves a lesser, but above chance, performance level in a limited number of tests -- for example, if the applicant performs at the P=.05 level in 20 trials -- the preliminary test may be extended on a different day or days to include more trials.)"
A prime focus for the JREF has been to keep tests short and simple. It would be counterproductive to extend short tests until the P=.001 had been reached. It must be up to the claimant to demand a sufficient number of tests if he thinks his level of certainty is close to random chance.

William Smith
30th November 2009, 07:18 AM
Rodney, why do you think the JREF should explain to you why the MDC rules weren't changed in accord with your suggestions?

Rodney
30th November 2009, 08:03 AM
http://www.automeasure.com/chance.html
Your link didn't work for me.

Rodney
30th November 2009, 08:10 AM
Rodney, why do you think the JREF should explain to you why the MDC rules weren't changed in accord with your suggestions?
For one thing, you asserted in another context: "If you want a detailed and official answer, please contact challenge@randi.org." See http://forums.randi.org/showthread.php?t=160637

Second, I doubt if the JREF receives more than a few e-mails each year suggesting general modifications to the rules. So why not a brief response to the few inquiries that they do receive?

MattC
30th November 2009, 02:44 PM
http://www.automeasure.com/chance.html

With the best of fortune, this link should work. Without that, having enjoyed several conversations about this issue, pray let me expound on why I think it's theoretically good (having some sort of standard on hand is good) but practically unsound (it couldn't be applied all the time, which is contrary to the definition of "standard").

I think a large part of the difficulty here between both parties can be focused entirely upon the "random chance" phrase and its persistent inclusion. If we were to enforce a 1/1000 standard, breaking this would be quite simple within a few trials precisely because we wouldn't be randomly guessing - guessing according to a pattern or figures is certainly practical, but perhaps an example might aid my case.

I claim that I can tell someone's hair color at a better-than-chance rate if they are in an adjoining room. Over 10 trials of this, the web site I keep attempting to link suggests that any number of successes greater than two (table 3 has this information) would not be attributable to "random chance," but this operates under the presumption that I am actually randomly guessing. Were I to be located in Europe, where according to Frost (http://cogweb.ucla.edu/ep/Frost_06.html) black hair (or some shade thereof) is comparatively common, beating these odds by simply guessing "black" on every trial becomes a real possibility - and certainly not a paranormal one. Were the JREF unaware of this predominance, dire circumstances could result from this adherence to a set odds protocol. Further, were I actually inclined to apply, I would view the ability to set my own criteria for success as a sign of the JREF's intention to fairly investigate my claim.
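
The base-rate problem in this example can be made concrete. A minimal sketch, assuming four hair-color categories (so a nominal 25% chance per guess) and a population in which 60% of people have black hair; both figures are hypothetical values chosen for illustration and are not taken from the linked Frost article or the automeasure tables.

```python
from math import comb


def binom_tail(n, k, p):
    """Probability of k or more hits in n independent trials with hit chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_hits(n, chance, alpha):
    """Fewest hits in n trials whose chance probability is <= alpha, or None."""
    for k in range(n + 1):
        if binom_tail(n, k, chance) <= alpha:
            return k
    return None


TRIALS = 10
NOMINAL_CHANCE = 0.25   # assumed: four equally likely colors
BLACK_BASE_RATE = 0.60  # assumed: true share of black hair among subjects

# Pass mark calibrated as if every guess were a uniform random pick.
pass_mark = min_hits(TRIALS, NOMINAL_CHANCE, 0.001)

# Probability of reaching that pass mark under each model.
nominal_pass = binom_tail(TRIALS, pass_mark, NOMINAL_CHANCE)
always_black_pass = binom_tail(TRIALS, pass_mark, BLACK_BASE_RATE)

print(f"pass mark: {pass_mark} of {TRIALS}")
print(f"uniform guessing passes with probability {nominal_pass:.4f}")
print(f"always guessing 'black' passes with probability {always_black_pass:.4f}")
```

With these assumed numbers, a pass mark meant to allow less than a 1-in-1000 fluke is cleared roughly one time in six by someone who simply says "black" every time. The chance level therefore has to be computed from the actual distribution of targets, or the targets have to be selected so that each answer is equally likely.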

The ultimate problem with mathematical determinations of "random chance" is that mathematical purity rarely translates very well to muddied reality. In the sciences it is commonly considered to be a guide that something is causing a significant result, but what that "something" is cannot be determined by simply "beating the odds" - luck can never be controlled for. If you truly desire to test something for the paranormal, set the odds for success much higher to forcibly exclude most cases of percentage-based guessing (as I employed above) and a majority of luck-oriented factors.

Wisely, in my opinion, the JREF does so.


Second, I doubt if the JREF receives more than a few e-mails each year suggesting general modifications to the rules. So why not a brief response to the few inquiries that they do receive?

This is a forum ultimately designed to facilitate debate, and the participants here are not hesitant to engage in such behavior. A "brief response" would, judging by a brief analysis of the forums, snowball into a much bigger response that would serve little purpose.

~ Matt

Rodney
30th November 2009, 06:20 PM
http://www.automeasure.com/chance.html
Thanks. The link works now.

With the best of fortune, this link should work. Without that, having enjoyed several conversations about this issue, pray let me expound on why I think it's theoretically good (having some sort of standard on hand is good) but practically unsound (it couldn't be applied all the time, which is contrary to the definition of "standard").

I think a large part of the difficulty here between both parties can be focused entirely upon the "random chance" phrase and its persistent inclusion. If we were to enforce a 1/1000 standard, breaking this would be quite simple within a few trials precisely because we wouldn't be randomly guessing - guessing according to a pattern or figures is certainly practical, but perhaps an example might aid my case.

I claim that I can tell someone's hair color at a better-than-chance rate if they are in an adjoining room. Over 10 trials of this, the web site I keep attempting to link suggests that any number of successes greater than two (table 3 has this information) would not be attributable to "random chance," but this operates under the presumption that I am actually randomly guessing. Were I to be located in Europe, where according to Frost (http://cogweb.ucla.edu/ep/Frost_06.html) black hair (or some shade thereof) is comparatively common, beating these odds by simply guessing "black" on every trial becomes a real possibility - and certainly not a paranormal one. Were the JREF unaware of this predominance, dire circumstances could result from this adherence to a set odds protocol. Further, were I actually inclined to apply, I would view the ability to set my own criteria for success as a sign of the JREF's intention to fairly investigate my claim.

The ultimate problem with mathematical determinations of "random chance" is that mathematical purity rarely translates very well to muddied reality. In the sciences it is commonly considered to be a guide that something is causing a significant result, but what that "something" is cannot be determined by simply "beating the odds" - luck can never be controlled for. If you truly desire to test something for the paranormal, set the odds for success much higher to forcibly exclude most cases of percentage-based guessing (as I employed above) and a majority of luck-oriented factors.
Individually-designed protocols should be able to ensure that each applicant is meeting a .001 odds standard in tests where the odds of success can be readily calculated. The problem with things the way they are now is that there is no uniformity. In Pavel's case, after undergoing endless negotiations with the JREF, he was summarily informed that he must score 100% on his preliminary test to pass -- when Pavel himself had consistently stated that his paranormal ability is less than perfect.

Wisely, in my opinion, the JREF does so.

This is a forum ultimately designed to facilitate debate, and the participants here are not hesitant to engage in such behavior. A "brief response" would, judging by a brief analysis of the forums, snowball into a much bigger response that would serve little purpose.

~ Matt
If you're right, GzuzKryzt is wrong when he recommends: "If you want a detailed and official answer, please contact challenge@randi.org."

William Smith
30th November 2009, 08:47 PM
...
Second, I doubt if the JREF receives more than a few e-mails each year suggesting general modifications to the rules. So why not a brief response to the few inquiries that they do receive?

You said this was Jeff Wagg's response:



"Hello Rodney,

Thanks for the suggestions. So you know, the challenge rules are being reconsidered, and we'll take your suggestions into account.

If we do make changes, they'll be posted publicly.

Jeff"



A brief response to an inquiry. What am I missing?

MattC
30th November 2009, 11:25 PM
Individually-designed protocols should be able to ensure that each applicant is meeting a .001 odds standard in tests where the odds of success can be readily calculated. The problem with things the way they are now is that there is no uniformity. In Pavel's case, after undergoing endless negotiations with the JREF, he was summarily informed that he must score 100% on his preliminary test to pass -- when Pavel himself had consistently stated that his paranormal ability is less than perfect.

I do not know much about Mr. Pavel's case aside from the involvement of Mr. Startz, someone whom I respect.

0.001 standards are calculated against random - truly random - chance. If you employ nonrandom chance these odds are quite beatable as I have persistently attempted to show. Employing a set standard like this serves no benefit.

~ Matt

fls
1st December 2009, 07:06 AM
No. If it isn't a test for the paranormal, what purpose does it serve? A magician doesn't need to put a million dollars on the line to garner what is essentially cheap publicity available for the cost of a webcam - judging by the proliferation of amateur magicians and skeptics on YouTube, many have come up with this same idea (and have found parents willing to buy them the webcam as well). Given that many of the actual tests conducted for the Challenge were not widely publicized (whether on YouTube or otherwise) I am not sure this proclamation holds water. Ms. Hunter's case possessed some absurdist elements that made it a worthy spectacle, while Ms. Sonne's test was broadcast as part of a larger event - the decision to broadcast may depend upon noteworthy features, but the decision to test is not.

What explanation do you offer for Randi's capricious and abrupt rejection of Pavel's claim?

Science ultimately tries to prove things by exclusion, meaning that implicit or potential effects are removed by experimentation over the long term. If, for example, I set up a test to determine whether or not I can psychically detect someone's hair color when they were in an adjoining room, it's possible that I could simply guess right each time. If we do three trials of this and I get three hits, blind guessing is both a very normal solution and a very possible one. If we increase the number of trials, guesswork becomes increasingly unlikely and will eventually pass the point of statistical likelihood. This doesn't mean it isn't possible (over a thousand trials it's still possible that I could guess a significant percentage right), merely that it's so improbable that it is a functionally insignificant probability - but, because it is a probability, there's always that slim chance of success. Blind luck cannot be controlled for.

It seems clear that what you have just described does not resemble the Challenge.

The ultimate goal of the Challenge is to control out the mundane explanations and trickery that could be causing a supposedly paranormal event. The decision to broadcast a Challenge test to the world at large is not made upon test design criteria; rather, it is made upon features of the claim (as in Ms. Hunter's case) or done as a subsidiary of a larger broadcasting initiative (Ms. Sonne's test at TAM).

~ Matt

The decision to broadcast is irrelevant to what I said earlier.

It is very important to Randi's educational mission that he has a body of work, known collectively as The Challenge, to refer to. Each piece of this body of work should be somewhat representative, so that any piece can serve as an example of what The Challenge represents - unqualified failure. Whether any individual test is broadcast or whether it subsequently shows up on YouTube, it should be viewable as an unqualified failure to any casual observer, in order to provide support for the rest of Randi's message.

Linda

fls
1st December 2009, 07:14 AM
0.001 standards are calculated against random - truly random - chance. If you employ nonrandom chance these odds are quite beatable as I have persistently attempted to show.

You have shown that calculating odds that are not based upon the situation at hand would be foolish. However, I have not seen anyone, particularly Rodney, suggest this. In fact, it is a quite bizarre suggestion and I am puzzled as to why you even brought it up.

Employing a set standard like this serves no benefit.

~ Matt

One benefit that I could see would be to make Dean Radin and others look foolish when they tried to claim that it would take thousands of ganzfeld trials to pass the Challenge.

Linda

Rodney
1st December 2009, 07:24 AM
You said this was Jeff Wagg's response:

"Hello Rodney,

Thanks for the suggestions. So you know, the challenge rules are being reconsidered, and we'll take your suggestions into account.

If we do make changes, they'll be posted publicly.

Jeff"

A brief response to an inquiry. What am I missing?
The "detailed and official answer" that you suggested NWO Sentryman would receive if he were to inquire about the applicability of the JREF MDC to bomb detectors. That is what I hoped to receive in May 2008 when I first made my inquiry. Instead, I received no response until I followed up in February 2009, when I received the above response from Jeff. What I then expected to happen is that either: (a) the MDC rules would be modified in accord with my suggestions, or (b) I would receive an explanation as to why my suggestions were rejected.

fls
1st December 2009, 07:34 AM
I commend you for your astuteness. I think MDC Rule Number 1 should be: "The first and most important rule is that the Challenge is only a publicity stunt."

Just out of curiosity, since you have been so persistent on this point...is your real purpose to get the JREF to change the rules in line with your suggestions, or is it to force them to be more explicit in their real intentions by their refusal to change them? Or is it simply to show them up as inherently unfair and unscientific?

Linda

Rodney
1st December 2009, 08:01 AM
Just out of curiosity, since you have been so persistent on this point...is your real purpose to get the JREF to change the rules in line with your suggestions, or is it to force them to be more explicit in their real intentions by their refusal to change them? Or is it simply to show them up as inherently unfair and unscientific?

Linda
The former, but the failure of the JREF to even explicitly respond to my suggestions -- let alone implement them -- more than 18 months down the road from when I first made them suggests that the latter is true.

Moochie
1st December 2009, 08:58 AM
The former, but the failure of the JREF to even explicitly respond to my suggestions -- let alone implement them -- more than 18 months down the road from when I first made them suggests that the latter is true.


Case not proven.


M.

William Smith
1st December 2009, 10:07 AM
...
What I then expected to happen is that either: (a) the MDC rules would be modified in accord with my suggestions, or (b) I would receive an explanation as to why my suggestions were rejected.

I somehow admire you for your persistence. I kid you not.

If you seriously expected "the MDC rules would be modified in accord with [your] suggestions", consider you may have a warped view of the MDC, as per the definition of its rules. Plus, you might have huge cojones.

I can't see why the JREF should give you "an explanation as to why [your] suggestions were rejected."

fls
1st December 2009, 10:37 AM
The former, but the failure of the JREF to even explicitly respond to my suggestions -- let alone implement them -- more than 18 months down the road from when I first made them suggests that the latter is true.

Well, I don't think either of them makes your case.

It's pretty clear that the JREF, and Randi in particular, don't give any priority to communication. This is a consistent theme on any subject - from responding to requests for references, to transparency in moderation, to timely follow-up on Challenge applications - whether or not you hear back from them is pretty much happenstance. And this has nothing to do with whether or not you are on their side. So there is nothing remarkable about the absence of a reply.

And there are several good reasons for failing to be specific when it comes to odds or time limits - reasons that you have ignored or dismissed in these discussions. And some of your suggestions (as has already been pointed out to you) have the effect of introducing a bias (the finding of a result when none should be found) or of making the test very unfair to the subject.

Linda

MattC
1st December 2009, 12:07 PM
What explanation do you offer for Randi's capricious and abrupt rejection of Pavel's claim?

I seem to recall mentioning that I had very little (if anything) to do with the case, paying it only the most cursory attention because I enjoy matching my responses against Startz's considerably more educated and sensible ones as a benchmark of personal success. I am further not certain how this question is relevant to what you quoted?


It seems clear that what you have just described does not resemble the Challenge.

Certainly, which is why I declared it "science" and invoked terms like "long term" in an attempt to illustrate the difference. Mr. Randi is quite rightly not attempting to do pure science, he seems to attempt to be as scientific as possible with the resources available to both him and the applicant. Bear in mind when evaluating the Challenge that a magician is not an expert in science but rather in trickery - it is sufficient for a challenge offered by a magician to control out the forms of trickery that magicians would perceive, something that holds many similarities to science but is not perfectly equivalent.


You have shown that calculating odds that are not based upon the situation at hand would be foolish. However, I have not seen anyone, particularly Rodney, suggest this. In fact, it is a quite bizarre suggestion and I am puzzled as to why you even brought it up.

Oh, certainly he has - his attempt to emend the rules to incorporate an odds standard based upon mathematical purity rather than physical reality is doing just that - calculating odds not based upon the situation at hand. It is possible (as I also attempted to show) to design a claim where the odds are easy to calculate, thereby qualifying Mr. Rodney's arbitrary standard, and further to have those odds quite beatable by nonrandom guessing.


It's pretty clear that the JREF, and Randi in particular, don't give any priority to communication. This is a consistent theme on any subject - from responding to requests for references, to transparency in moderation, to timely follow-up on Challenge applications - whether or not you hear back from them is pretty much happenstance. And this has nothing to do with whether or not you are on their side. So there is nothing remarkable about the absence of a reply.

Small organizations seldom have the luxury of specialization - they cannot afford to staff call centers with trained "account representatives" who are paid to take and respond to calls. Frankly I'd regard most of the things you mentioned as being of very low priority for a response, with the "timely follow-up on Challenge applications" as a medium priority.

~ Matt

Rodney
1st December 2009, 12:14 PM
And there are several good reasons for failing to be specific when it comes to odds or time limits
Let's focus on the former for the time being. If there had been a P=.001 standard for the preliminary test, don't you think Pavel would have been tested by now? If not, why not?

fls
1st December 2009, 01:02 PM
I seem to recall mentioning that I had very little (if anything) to do with the case, paying it only the most cursory attention because I enjoy matching my responses against Startz's considerably more educated and sensible ones as a benchmark of personal success. I am further not certain how this question is relevant to what you quoted?

You asked, if not a test of the paranormal, what purpose does it serve? Well, Pavel's claim was a paranormal claim, he sent a proposal to the MDC which fulfilled all the explicit requirements, and he was rejected. If Randi is interested in testing paranormal claims and he is presented with an opportunity for doing so, why would he reject that opportunity?

Certainly, which is why I declared it "science" and invoked terms like "long term" in an attempt to illustrate the difference. Mr. Randi is quite rightly not attempting to do pure science, he seems to attempt to be as scientific as possible with the resources available to both him and the applicant. Bear in mind when evaluating the Challenge that a magician is not an expert in science but rather in trickery - it is sufficient for a challenge offered by a magician to control out the forms of trickery that magicians would perceive, something that holds many similarities to science but is not perfectly equivalent.

I just think that it should be distinguished from science, which is about the process of discovery and following up interesting leads. The Challenge is more about demonstrating a claim to be fraudulent or mistaken, rather than any attempt at discovery.

Oh, certainly he has - his attempt to emend the rules to incorporate an odds standard based upon mathematical purity

The idea of mathematical purity seems to have come from you, rather than anyone else. The only person who has suggested odds based on mathematical purity is you. Rodney referred to "tests where the odds of success can be readily calculated", not to some unrelated distribution for the sake of mathematical purity.

rather than physical reality is doing just that - calculating odds not based upon the situation at hand. It is possible (as I also attempted to show) to design a claim where the odds are easy to calculate, thereby qualifying Mr. Rodney's arbitrary standard, and further to have those odds quite beatable by nonrandom guessing.

Except that nobody would have calculated the odds in your test to be 1:1000, nor would they have set success at the particular standard you chose, if they knew anything at all about probability. If the guidelines that Rodney proposed were followed, then the odds would be no more beatable by random guessing than any other 1:1000 guess.

Small organizations seldom have the luxury of specialization - they cannot afford to staff call centers with trained "account representatives" who are paid to take and respond to calls. Frankly I'd regard most of the things you mentioned as being of very low priority for a response, with the "timely follow-up on Challenge applications" as a medium priority.

~ Matt

I understand that. Which is why I specifically stated that the lack of a response cannot be taken to mean anything, since the JREF seems to be busy not responding to anyone.

Linda

fls
1st December 2009, 01:06 PM
Let's focus on the former for the time being. If there had been a P=.001 standard for the preliminary test, don't you think Pavel would have been tested by now? If not, why not?

No, because Pavel's test is not the kind that serves Randi's covert purposes. It would still have been rejected as too long.

Linda

MattC
1st December 2009, 02:38 PM
You asked, if not a test of the paranormal, what purpose does it serve? Well, Pavel's claim was a paranormal claim, he sent a proposal to the MDC which fulfilled all the explicit requirements, and he was rejected. If Randi is interested in testing paranormal claims and he is presented with an opportunity for doing so, why would he reject that opportunity?

I will not theorize nor answer questions on something I know nothing about.


The idea of mathematical purity seems to have come from you, rather than anyone else. The only person who has suggested odds based on mathematical purity is you. Rodney referred to "tests where the odds of success can be readily calculated", not to some unrelated distribution for the sake of mathematical purity.


Oh, but we can readily calculate the odds of success in the case I proposed, and by Rodney's proposed emendation we would have been forced to use 1:1000 odds - it's not a choice under his restrictions, the JREF would have to do it.


Except that nobody would have calculated the odds in your test to be 1:1000, nor would they have set success at the particular standard you chose, if they knew anything at all about probability. If the guidelines that Rodney proposed were followed, then the odds would be no more beatable by random guessing than any other 1:1000 guess.

First, he is by no means proposing a guideline - he is suggesting a rule change that must be followed in all cases. According to him, if you can readily calculate the odds of success - which we can - you must use 1:1000 odds. Whatever you might want to set them at is no good under his proposed emendations, you must use 1:1000 - and that is why I have disagreed with it ever since he proposed it.

Second, in the plan I suggested, I would not be randomly guessing - far from it. Guessing certainly, but guessing very much according to the numbers. Accordingly, I must restate my previous supposition that using odds based upon mathematical purity (which is exactly what you are doing if you refer to totally random guessing and my odds of beating the test thereon) has no merit in cases of nonrandom guessing and, in fact, is entirely harmful given Rodney's proposed restrictions.

~ Matt

fls
1st December 2009, 03:09 PM
Oh, but we can readily calculate the odds of success in the case I proposed, and by Rodney's proposed emendation we would have been forced to use 1:1000 odds - it's not a choice under his restrictions, the JREF would have to do it.

Yes, but the JREF would do so correctly, or some of us would step in and do it for them.

First, he is by no means proposing a guideline - he is suggesting a rule change that must be followed in all cases. According to him, if you can readily calculate the odds of success - which we can - you must use 1:1000 odds. Whatever you might want to set them at is no good under his proposed emendations, you must use 1:1000 - and that is why I have disagreed with it ever since he proposed it.

That is also why I disagreed with it. It is your reason for disagreeing with it, "to have those odds quite beatable by nonrandom guessing", which is incorrect. It is not beatable by non-random guessing.

Second, in the plan I suggested, I would not be randomly guessing - far from it. Guessing certainly, but guessing very much according to the numbers.

The probabilities you quoted have nothing to do with whether or not the guessing follows a random distribution. In fact, guessing is rarely random. Rather, they reflect the distribution of the sample about which the guesses are made. So if you are guessing hair-colour and the underlying distribution has only one black-haired person for every 50 people (which is what you would need for 3 correct guesses out of 10 trials to reflect less than 1:1000 odds), it doesn't matter if you guess "black hair" every single time. You can only be right one time in 50 on each trial.
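For anyone who wants to check the hair-colour arithmetic, here is a minimal sketch. It assumes ten independent trials and a fixed 1-in-50 chance of a correct "black hair" call on each one; the function name `binomial_tail` is just an illustrative label, not anything from the Challenge rules.

```python
from math import comb

def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials,
    each with success probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Assumed base rate: one black-haired person in every 50.
p_hit = 1 / 50

# Chance of 3 or more correct calls out of 10 by guessing alone.
chance = binomial_tail(10, 3, p_hit)
print(f"P(at least 3 of 10 correct) = {chance:.5f}")
print(f"Below the 1:1000 threshold? {chance < 0.001}")
```

Under those assumptions the tail probability comes out a little under 1 in 1000, whatever fixed guess is used on each trial.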

Accordingly, I must restate my previous supposition that using odds based upon mathematical purity (which is exactly what you are doing if you refer to totally random guessing and my odds of beating the test thereon) has no merit in cases of nonrandom guessing and, in fact, is entirely harmful given Rodney's proposed restrictions.

~ Matt

You are incorrect. The odds do not refer to guessing, but to the distribution of the answers. If you have a test with equal numbers of A's, B's, C's and D's as answers, guessing C every time will still only get you 25% on the test.
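A small sketch of the multiple-choice point, with made-up answer keys. The balanced key has 25 of each letter; the skewed key (70 C's) is a hypothetical contrast showing when a fixed guess does pay off, namely when the distribution of answers itself is lopsided.

```python
import random

def score(answers, guess):
    """Fraction of questions answered correctly when the same
    letter is guessed for every question."""
    hits = sum(1 for a in answers if a == guess)
    return hits / len(answers)

rng = random.Random(0)

# Balanced key: 100 questions, 25 each of A, B, C and D, in shuffled order.
balanced = list("ABCD") * 25
rng.shuffle(balanced)

# Hypothetical skewed key: 70 C's and 10 each of the other letters.
skewed = list("C" * 70 + "A" * 10 + "B" * 10 + "D" * 10)
rng.shuffle(skewed)

balanced_score = score(balanced, "C")
skewed_score = score(skewed, "C")
print(f"Always C, balanced key: {balanced_score:.0%}")
print(f"Always C, skewed key:   {skewed_score:.0%}")
```

The order of the questions makes no difference to either score; only the makeup of the answer key does.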

Linda

Uncayimmy
1st December 2009, 08:45 PM
fls,

I think what Matt is suggesting and what Rodney is ignoring is that while we can certainly calculate The Odds, those odds are based on the premise that the protocol ensures things are random. In some cases, that's simply not the case.

Take the IIG test for VisionFromFeeling where she was supposed to detect who was missing a kidney. She could see the people well enough to determine things like gender and especially age. Based on my research, it seems like each year kidney donations are roughly the same in number across the age range of about 18 to 55. Thus, the group of people born in 1954 will have donated a buttload more kidneys than the group born in 1991. Couple this with reading nonverbal cues like fidgeting, and I think the case can be made that the theoretical odds don't match with the reality of what's practical to assemble for a protocol.

By not having any set odds in the rules, the JREF can require one odds requirement for a Connie Sonne type protocol (no conceivable way for her to take an educated guess) and a more stringent requirement for a VFF type protocol so that the JREF can have a comfort zone, so to speak.

Matt, if I'm not stating your position correctly, I apologize. However, it's also my position, so no time was wasted.

fls
2nd December 2009, 04:35 AM
fls,

I think what Matt is suggesting and what Rodney is ignoring is that while we can certainly calculate The Odds, those odds are based on the premise that the protocol ensures things are random. In some cases, that's simply not the case.

Take the IIG test for VisionFromFeeling where she was supposed to detect who was missing a kidney. She could see the people well enough to determine things like gender and especially age. Based on my research, it seems like each year kidney donations are roughly the same in number across the age range of about 18 to 55. Thus, the group of people born in 1954 will have donated a buttload more kidneys than the group born in 1991. Couple this with reading nonverbal cues like fidgeting, and I think the case can be made that the theoretical odds don't match with the reality of what's practical to assemble for a protocol.

Why not simply recognize when you can or cannot readily calculate odds?

By not having any set odds in the rules, the JREF can require one odds requirement for a Connie Sonne type protocol (no conceivable way for her to take an educated guess) and a more stringent requirement for a VFF type protocol so that the JREF can have a comfort zone, so to speak.

I'm not disagreeing that there are good reasons to maintain flexibility in protocol design, but in this case it would have made more sense to me to subject VFF to a well-designed test, rather than trying to put a bandaid on a poorly-designed test by making the threshold more stringent.

Linda

Rodney
2nd December 2009, 06:41 AM
I'm not disagreeing that there are good reasons to maintain flexibility in protocol design, but in this case it would have made more sense to me to subject VFF to a well-designed test, rather than trying to put a bandaid on a poorly-designed test by making the threshold more stringent.
I couldn't agree more, and yet many on the VFF discussion thread want to blame Anita for the test deficiencies, even though she was the one who paid $1,000 to fly across the country and stay overnight to take what she thought was a well-designed test of her claim. See, for example, http://forums.randi.org/showpost.php?p=5358380&postcount=1766

petre
2nd December 2009, 07:08 AM
While I don't feel such an addition would harm the purpose any, I expect if JREF were to add such a clause and for any later protocol suggest the odds are not readily calculable (tests where the applicant may get significant hints from mundane observations), people (perhaps some already involved in this discussion) would then request the addition of a definition of "readily calculable". Ad infinitum.

fls
2nd December 2009, 07:24 AM
While I don't feel such an addition would harm the purpose any, I expect if JREF were to add such a clause and for any later protocol suggest the odds are not readily calculable (tests where the applicant may get significant hints from mundane observations), people (perhaps some already involved in this discussion) would then request the addition of a definition of "readily calculable". Ad infinitum.

Well, it's a bit less vague than "proper observing conditions" or "feasible", I suppose. :)

Linda

Uncayimmy
2nd December 2009, 10:10 AM
Why not simply recognize when you can or cannot readily calculate odds?

We don't "simply recognize" that because the world is not black and white. We can calculate the odds if we make the assumption that everything is random. We can also acknowledge that things are not random but estimate that the factors working in favor of a claimant, whom we assume to be without a sooper power, still give us enough of an edge. At the end of the day it's just a bet: money vs failure.

I'm not disagreeing that there are good reasons to maintain flexibility in protocol design, but in this case it would have made more sense to me to subject VFF to a well-designed test, rather than trying to put a bandaid on a poorly-designed test by making the threshold more stringent.

I am going to start a thread in GS&P about the VFF protocol, and I hope you participate. Until then, you have to remember that a "well designed test" is actually a negotiation between two parties and subject to practical limitations. It's a value judgment to proceed with a challenge that you know is not as good as you'd like but still sufficient to demonstrate the point.

Take the age factor in kidney donations. Finding people missing a kidney and who are willing to volunteer their time is a pain in the ass. Ideally, for each target I would try to assemble a group of controls of the same sex with the same general physical characteristics including age. That's a lot of work.

Sometimes the claimant refuses to budge on certain issues (VFF insisted on 4.5 minutes per person), so again, it's a value judgment whether to concede the point or hold fast. At the end of the day it's still just a challenge, not a scientific test. If you're confident that your money is safe and the claimant is confident she can demonstrate her abilities, then many would argue that there's no reason not to go through with it.

fls
2nd December 2009, 01:08 PM
We don't "simply recognize" that because the world is not black and white. We can calculate the odds if we make the assumption that everything is random.

I don't think that that is the assumption which is made. Or maybe I don't know what you mean by "random".

Anyway, you seemed able to come up with two clear examples earlier.

Linda

MattC
2nd December 2009, 03:09 PM
I think what Matt is suggesting and what Rodney is ignoring is that while we can certainly calculate The Odds, those odds are based on the premise that the protocol ensures things are random. In some cases, that's simply not the case.

Yes. Mathematical randomness and the controlled randomness inherent in a protocol design seldom perfectly intersect. Your example of the VFF test is a very good one to demonstrate the point - if the applicant knows anything more about the target pool or can discover this information it is no longer a perfectly random test. Certainly some elements of randomness are preserved, but the goal of any protocol agreement should be to ensure that this randomness preserved is related to the applicant's ability.


By not having any set odds in the rules, the JREF can require one odds requirement for a Connie Sonne type protocol (no conceivable way for her to take an educated guess) and a more stringent requirement for a VFF type protocol so that the JREF can have a comfort zone, so to speak.

A better way to state it might be that all odds requirements are based upon normal knowledge available to the applicant throughout the procedure. If we're testing for paranormal knowledge it makes sense to identify and exclude the normal throughout the process of protocol design. I wonder if this sort of information-oriented explanation might be a better way to clarify protocol designs in the future, specifically the denial of normal sources of relevant information through controls. If this is impractical for some reason (as it was in the VFF case), the odds requirement will be higher than it would be in other cases - your ability to find examples is quite exemplary.

~ Matt

Uncayimmy
2nd December 2009, 08:09 PM
I don't think that that is the assumption which is made. Or maybe I don't know what you mean by "random".

Anyway, you seemed able to come up with two clear examples earlier.

Linda

I'll try to clarify.

If the self-evident results have a set number of choices and/or answers, then you can pretty much calculate The Odds. In a perfect world these worst case odds indicate how likely a person flipping a coin or rolling dice is to pass the test. Let's call this scenario Blind Luck.

Unfortunately, the world's not perfect. There are going to be occasions where due to time, space, money, personalities, or whatever, the protocol is not going to whittle it down to Success by Ability vs Success by Blind Luck. Our odds calculation remains the same, but our confidence level is reduced.

It sounds to me like you're saying that if we don't have complete confidence that we have made it Ability vs. Blind Luck that we have no right to be discussing odds. Is that your stance?

I say that this is a challenge and not scientific research. We're perfectly entitled to say, "We acknowledge that in this protocol an ordinary person without a special ability will outperform an ordinary person flipping a coin." We can do that because the more important statement we're making is, "We're so confident that only a person who has an ability could pass this test that we're gonna put up our money."

The blind luck odds are just one factor to consider when setting up the challenge.

Cuddles
3rd December 2009, 03:42 AM
I think the most important point is one I've made before - there's no reason to have a consistent probability of winning by chance, because as far as the applicant is concerned it's completely irrelevant. Remember, the odds we're talking about here are the odds of someone winning if they can't actually do what they claim. Obviously applicants believe they can do whatever it is, so they have no reason to care what the odds actually are. The odds of winning by chance only matter to the JREF, because it represents the chance of them losing their money without actually demonstrating anything interesting.

The only way these odds affect an applicant in any way is in the length of time, via the number of separate rounds, required for a test. However, since pretty much every ability claimed takes a different length of time per trial, and usually has different restrictions on how long people claim to be able to take part in a test for, there's no way of setting any consistent standard based on that. For example, Pavel_do was happy to go on for as long as the JREF might have wanted, while VisionFromFeeling claims to be unable to do more than two rounds before being unable to continue.

Perhaps the most important question to ask here is - is consistency really necessary? Does it matter if one applicant has to beat 1:500, another 1:1000 and another 1:10000? There can hardly be accusations of favouritism, since the JREF obviously doesn't think anyone will ever win, and anyone who actually did have an ability would win regardless of what their odds of winning by chance were. So why all the fuss here? Why should the JREF bother coming up with a hard answer on what odds they'll accept? As long as they are happy that the odds in each case are high enough that someone without an ability isn't going to win, what difference does it make?
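To put rough numbers on how the odds threshold translates into the number of rounds, here is a small sketch. It assumes every round must be answered correctly, that rounds are independent, and that each has the same chance of a lucky hit; `rounds_needed` is a made-up helper name, and the 1-in-12 and 1-in-2 per-round chances are only examples.

```python
from fractions import Fraction

def rounds_needed(per_round_chance, target_odds):
    """Smallest number of all-correct rounds whose combined chance
    probability is no greater than 1 in target_odds."""
    threshold = Fraction(1, target_odds)
    rounds = 1
    prob = per_round_chance
    while prob > threshold:
        rounds += 1
        prob *= per_round_chance
    return rounds

for label, chance in [("1-in-12 rounds", Fraction(1, 12)),
                      ("coin-flip rounds", Fraction(1, 2))]:
    for target in (500, 1000, 10000):
        n = rounds_needed(chance, target)
        print(f"{label}: 1:{target} needs {n} consecutive hits")
```

With a 1-in-12 round the three thresholds differ by at most one round, whereas a coin-flip task needs several more rounds to move from 1:500 to 1:10000, which is one way of seeing why the time cost varies so much from claim to claim.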

fls
3rd December 2009, 08:24 AM
I'll try to clarify.

If the self-evident results have a set number of choices and/or answers, then you can pretty much calculate The Odds. In a perfect world these worst case odds indicate how likely a person flipping a coin or rolling dice is to pass the test. Let's call this scenario Blind Luck.

Unfortunately, the world's not perfect. There are going to be occasions where due to time, space, money, personalities, or whatever, the protocol is not going to whittle it down to Success by Ability vs Success by Blind Luck. Our odds calculation remains the same, but our confidence level is reduced.

Unless the subject is making their guesses based on a random-like process, such as rolling a die, it has to be assumed that their guesses will not be random, but will follow some sort of pattern. So unless there is no pattern to the placement of the target, the distribution of guesses, based on the pattern of guessing will be different than the distribution of guesses based on random sampling.

We typically see two kinds of guessing games - one where the placement of the target is random, and one where the placement is not random. Connie Sonne's test is an example of the former, and the VFF test is an example of the latter. Just think how easy it would have been to test VFF if the presence or absence of a kidney was determined randomly. The distribution of blind guesses in the former, even though those guesses are not likely to be random, will still correspond to the distribution of guesses based on random sampling. The distribution of guesses in the latter would have to be determined empirically. And that is the stumbling block in the Challenge, because it is not set up to empirically measure the distribution (and thereby provide us with a way to calculate the odds), preferring to depend upon a theoretical distribution.
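A toy illustration of the two kinds of guessing game, using exact fractions rather than a simulation. The weights are entirely hypothetical: they stand in for how likely each of six people is to be the target, with the skewed set loosely imagining that older-looking people are more likely to be missing a kidney. None of these numbers come from the actual test.

```python
from fractions import Fraction

def hit_rate(target_weights, guess_index):
    """Chance that a fixed guess lands on the target when the target
    is drawn in proportion to target_weights."""
    return Fraction(target_weights[guess_index], sum(target_weights))

# Random placement: every one of the six people is equally likely to be the target.
random_placement = [1, 1, 1, 1, 1, 1]

# Non-random placement (hypothetical weights): some people are visibly
# more likely to be the target than others.
age_skewed = [1, 1, 2, 3, 5, 8]

# A guesser with no special ability who simply picks the likeliest-looking person.
best_guess = max(range(len(age_skewed)), key=lambda i: age_skewed[i])

print("Random placement:    ", hit_rate(random_placement, best_guess))
print("Non-random placement:", hit_rate(age_skewed, best_guess))
```

With random placement the educated guesser does no better than 1 in 6; with the skewed weights the same strategy hits 2 times in 5. The real size of any such edge would have to be measured, not assumed.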

So what we can try to do, instead, is to make the subject as blind as possible when it comes to the application of any sort of pattern to their guessing. For VFF this means that the subjects are covered in clothing which hides anything that could be used as a clue, such as age. The subjects are presented one at a time, chosen randomly with replacement (which means that some subjects may be read more than once, and some not at all). And she is not told beforehand how many missing kidneys there are. This is what you referred to earlier as Blind Luck vs. Ability.

It sounds to me like you're saying that if we don't have complete confidence that we have made it Ability vs. Blind Luck that we have no right to be discussing odds. Is that your stance?

My stance is that you are using the wrong distribution in order to calculate odds simply because to do otherwise is inconvenient.

I say that this is a challenge and not scientific research. We're perfectly entitled to say, "We acknowledge that in this protocol an ordinary person without a special ability will outperform an ordinary person flipping a coin."

And that is where empirical measurement would be necessary if we want to have some way of quantifying this for comparison to a person with a special ability. Right now, we can guess that both a person without a special ability and a person with a special ability will outperform random sampling, but we don't know by how much. This works against us if we don't make very good guesses as to how much and against the claimant if we way over-compensate. I think my main complaint is that the odds based on random sampling are used as though they are meaningful. :)

We can do that because the more important statement we're making is, "We're so confident that only a person who has an ability could pass this test that we're gonna put up our money."

Yeah, I'd just like to see some indication that our confidence does not reflect the odds based on random sampling, but rather our certainty that we've made a good guess as to how a person without special abilities would perform.

Linda

fls
3rd December 2009, 08:33 AM
So why all the fuss here? Why should the JREF bother coming up with a hard answer on what odds they'll accept? As long as they are happy that the odds in each case are high enough that someone without an ability isn't going to win, what difference does it make?

I think that the fuss is to remove some of the apparent capriciousness as to whose application and/or protocol is accepted. I'm not sure that setting an odds standard will prevent that, though.

Linda

fls
3rd December 2009, 09:02 AM
Yeah, I'd just like to see some indication that our confidence does not reflect the odds based on random sampling, but rather our certainty that we've made a good guess as to how a person without special abilities would perform.

Linda

Just to elaborate on this a bit...the odds will not reflect our confidence that the protocol has removed any bias. For example, the odds for the meta-analyses of the ganzfeld tests and for the PEAR random-number generator tests are billions to one, yet the results aren't persuasive because we aren't particularly confident that a tiny amount of residual bias doesn't account for the effect (i.e. the person without ability performing better than random sampling). On the other hand, when we are confident that we have removed all bias, lower odds would command our attention. Connie Sonne's test would have been impressive if she had passed (or maybe even if she had guessed two out of three correctly).

Linda

fls
3rd December 2009, 09:10 AM
So why all the fuss here? Why should the JREF bother coming up with a hard answer on what odds they'll accept? As long as they are happy that the odds in each case are high enough that someone without an ability isn't going to win, what difference does it make?

I thought of something else.

Sometimes the organizations running the preliminary tests set standards higher than those that would be set if Randi were running the test. For example (I don't have the exact details on this), Suitbert Ertel told me of a test which was run by a skeptical group in Europe using a couple of his psi-stars which passed the 1 in 1,000 threshold, but did not pass the 1 in 10,000 threshold set by the group. Under other circumstances (say undertaking the test at TAM and broadcasting it to a wider audience) this could have been good enough to pass and to proceed to the final. Setting a standard could prevent this sort of situation, where the circumstances, rather than the actual performance, mean the difference between qualifying for a million dollar prize or receding into obscurity.

Linda

Uncayimmy
3rd December 2009, 10:30 AM
fls, I think your starting premises are flawed. Here's how I, and I believe the JREF and IIG, approach these challenges.

* It's a challenge, not a scientific experiment.

* The protocol has to be accepted by the subject, which means the subject has to believe their powers will work under those conditions. This is usually irrational, so if the test is to happen, some rather odd requests might need to be honored.

* The protocol has to greatly reduce, if not eliminate, the chances of ordinary means resulting in success.

* The theoretical odds combined with a degree of confidence in the above has to be sufficient for the organization to believe their money is safe.

* The protocol has to be practical in terms of time, logistics and expense.

* And here's the one that you seem to be missing, the entire thing must be designed in such a way that it can be easily explained to and convincing enough for the Average Joe on the street. What the skeptics, True Believers and the claimant think about it is irrelevant because this publicity stunt won't change any of their minds.

Your suggestions seem to be based on the idea that this publicity stunt should be treated like an experiment that will be reviewed by scientists or deemed "consistent" from claimant to claimant. That's not the case.

The VFF test is easily explained to the Average Joe: "Anita claims to be able to detect through means unknown to science that a person is missing a kidney and which kidney is missing just by looking at a person. We have set up three trials with six people each and with one person in each missing a kidney. We have taken steps to reduce visual clues available to Anita. Since there are 12 possible locations for kidneys (two per person) in each trial, Anita has a 1 in 12 chance of being right by just guessing. The odds of guessing all three are 1/12 x 1/12 x 1/12, which means the overall odds are 1/1,728. This is the level she must reach to pass the test. Probability says that with three guesses there's a 1 in 4 chance she'll get one right and a 1 in 51 chance she'll get two right. Her selections will be verified by an on-site sonogram technician."
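
As a quick check on the figures in that blurb, here is a minimal Python sketch. It assumes three independent trials with a 1 in 12 chance of a correct blind guess in each, and it reads "one right" and "two right" as "at least one" and "at least two". The function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def prob_exact(hits, trials=3, p=Fraction(1, 12)):
    """Chance of exactly `hits` correct blind guesses in `trials` trials."""
    return comb(trials, hits) * p**hits * (1 - p) ** (trials - hits)


def prob_at_least(hits, trials=3, p=Fraction(1, 12)):
    """Chance of `hits` or more correct blind guesses."""
    return sum(prob_exact(k, trials, p) for k in range(hits, trials + 1))


for label, value in [
    ("all three", prob_exact(3)),
    ("at least two", prob_at_least(2)),
    ("at least one", prob_at_least(1)),
]:
    print(f"{label}: {value} (about 1 in {float(1 / value):.1f})")
```

On this reading the blurb's numbers hold up: all three is 1 in 1,728, at least two is about 1 in 51, and at least one is a little worse than 1 in 4.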

While I could write it better, the point is that it's short enough and simple enough for any media outlet to write a blurb about it, which is the ultimate goal. She failed, and it was plain as day to reasonable people that she failed. The challenge served its purpose.

Your suggestions for the VFF protocol would not have been accepted by Anita. Even if they were, your test is overly complicated to explain. I'm no expert in stats, but I'm way above the average layman, and I'm not sure how to calculate the odds. The protocol is just not accessible to the people you need to reach.

The three trials of six people is dirt simple to understand. The 1 in 4 chance of getting one right is easy to understand if you think, "Okay, 12 choices. 3 guesses. 12 divided by 3 is 4. Okay, I get it." The test is useless if the audience is saying, "Oh, she got one right. What does that mean? What? She picked the same person again, but this time guessed the other kidney? You can pick the same person twice? What if she could tell it was the same person because the guy was a smoker?"

fls
3rd December 2009, 12:59 PM
fls, I think your starting premises are flawed. Here's how I, and I believe the JREF and IIG, approach these challenges.

* It's a challenge, not a scientific experiment.

* The protocol has to be accepted by the subject, which means the subject has to believe their powers will work under those conditions. This is usually irrational, so if the test is to happen, some rather odd requests might need to be honored.

* The protocol has to greatly reduce, if not eliminate, the chances of ordinary means resulting in success.

* The theoretical odds combined with a degree of confidence in the above has to be sufficient for the organization to believe their money is safe.

* The protocol has to be practical in terms of time, logistics and expense.

* And here's the one that you seem to be missing, the entire thing must be designed in such a way that it can be easily explained to and convincing enough for the Average Joe on the street. What the skeptics, True Believers and the claimant think about it is irrelevant because this publicity stunt won't change any of their minds.

Your suggestions seem to be based on the idea that this publicity stunt should be treated like an experiment that will be reviewed by scientists or deemed "consistent" from claimant to claimant. That's not the case.

No, I'm of the opinion that this is a challenge, not a scientific experiment, that it should be acceptable to the claimant, that it should be easily understandable to a casual observer, and that failure or success should be easily observed. I have already explicitly stated each of those points in this thread.

The VFF test is easily explained to the Average Joe: "Anita claims to be able to detect through means unknown to science that a person is missing a kidney and which kidney is missing just by looking at a person. We have set up three trials with six people each and with one person in each missing a kidney. We have taken steps to reduce visual clues available to Anita. Since there are 12 possible locations for kidneys (two per person) in each trial, Anita has a 1 in 12 chance of being right by just guessing. The odds of guessing all three are 1/12 x 1/12 x 1/12, which means the overall odds are 1/1,728. This is the level she must reach to pass the test. Probability says that with three guesses there's a 1 in 4 chance she'll get one right and a 1 in 51 chance she'll get two right. Her selections will be verified by an on-site sonogram technician."

Your calculations should take into consideration that we already know which proportion of missing kidneys were on the left or the right and we already know that there is a pattern to which side she guesses.

While I could write it better, the point is that it's short enough and simple enough for any media outlet to write a blurb about it, which is the ultimate goal. She failed, and it was plain as day to reasonable people that she failed. The challenge served its purpose.

How is it plain that she failed? She gave the appearance of performing better than chance, even if she didn't surpass some unreasonably rigid standard. I realize that it is difficult to step out of the shoes of a non-believer, but the casual observer doesn't necessarily see much of a difference between a very unexpected event and a very, very unexpected event.

Your suggestions for the VFF protocol would not have been accepted by Anita. Even if they were, your test is overly complicated to explain. I'm no expert in stats, but I'm way above the average layman, and I'm not sure how to calculate the odds. The protocol is just not accessible to the people you need to reach.

Why wouldn't it have been accepted? Anita already agreed that they could wear clothes and cover their heads, and her original claim involved an individual reading. And my point with regards to the use of odds and statistics is that failure and success should be obvious, rather than relying upon not only the ability to calculate odds, but upon convincing someone that there is a world of difference between one in nineteen and one in twenty. You shouldn't have to explain the odds, it should be set up so that it doesn't give the appearance of something unexpected unless something unexpected actually happens.

The three trials of six people is dirt simple to understand.

How is "one at a time, Anita indicates whether she sees the right, left or both kidneys" difficult to understand?

The 1 in 4 chance of getting one right is easy to understand if you think, "Okay, 12 choices. 3 guesses. 12 divided by 3 is 4. Okay, I get it." The test is useless if the audience is saying, "Oh, she got one right. What does that mean? What? She picked the same person again, but this time guessed the other kidney? You can pick the same person twice? What if she could tell it was the same person because the guy was a smoker?"

Smell is one of those things you'd try to eliminate.

Eighteen people, 17 right kidneys, 16 left kidneys. Each individual has a one in nine chance of missing a left kidney and a one in 18 chance of missing a right kidney. Getting one left kidney correct (one in nine) and no right kidneys correct, as well as seeing kidneys that aren't there, plus not seeing kidneys that are there, isn't going to look all that remarkable to the casual observer.

Linda

Uncayimmy
3rd December 2009, 02:42 PM
Your calculations should take into consideration that we already know which proportion of missing kidneys were on the left or the right and we already know that there is a pattern to which side she guesses.
To compensate for that, I'm thinking the IIG used six people instead of five.

How is it plain that she failed? She gave the appearance of performing better than chance, even if she didn't surpass some unreasonably rigid standard. I realize that it is difficult to step out of the shoes of a non-believer, but the casual observer doesn't necessarily see much of a difference between a very unexpected event and a very, very unexpected event.
Do you have any evidence that a fence-sitter feels *better* about Anita's claims being true than before? People I have spoken to don't have any problems understanding that she failed, especially when I point out that the lady with the sonogram got 100% correct in about 30 seconds per subject.

Why wouldn't it have been accepted?
Did you follow the kidney protocol thread? Anita rejected burqas. At first she was insisting that she be allowed to observe the entire group and then have an opportunity to dismiss any number she wanted (one time) so that she could then concentrate on the remaining subjects. I am very confident that she would not agree to a one person at a time reading.

Anita already agreed that they could wear clothes and cover their heads, and her original claim involved an individual reading. And my point with regards to the use of odds and statistics is that failure and success should be obvious, rather than relying upon not only the ability to calculate odds, but upon convincing someone that there is a world of difference between one in nineteen and one in twenty. You shouldn't have to explain the odds, it should be set up so that it doesn't give the appearance of something unexpected unless something unexpected actually happens.
Nothing unexpected happened. She got one right, and that was a 1 in 4 chance. It's an excellent springboard to a discussion about confirmation bias and possible reasons why people believe they have special abilities. Of course, you remind them that she agreed in advance that she could do exactly what was tested and that before the test when she was told who was missing a kidney, she verified this in a matter of seconds.


How is "one at a time, Anita indicates whether she sees the right, left or both kidneys" difficult to understand?
Explain to me how I calculate the odds with an unknown number of targets and the possibility of a target being viewed more than once and some others not being viewed at all.

Smell is one of those things you'd try to eliminate.
I am very confident I can do a lot of things, but I would probably not agree to any test where I had to wear nose plugs during the process. Want to put me in a glass cage? No way. That would freak me out. Want to put the subjects in a glass cage? Same thing. Want to put them in a room with a glass window? Well, Anita will say that the glass blocks her perceptions.

You think like a scientist who sets up a protocol and then spends a bunch of research money finding people willing to be tested. A challenge is about a bunch of volunteers trying to organize a test that is acceptable to themselves, unpaid volunteers, and a subject who has already demonstrated a tenuous grasp on reality. Oh, and to do so as cheaply as possible.

Eighteen people, 17 right kidneys, 16 left kidneys. Each individual has a one in nine chance of missing a left kidney and a one in 18 chance of missing a right kidney. Getting one left kidney correct (one in nine) and no right kidneys correct, as well as seeing kidneys that aren't there, plus not seeing kidneys that are there, isn't going to look all that remarkable to the casual observer.
That's what happened, so I'm not seeing your point.

Cuddles
4th December 2009, 04:09 AM
I think that the fuss is to remove some of the apparent capriciousness as to whose application and/or protocol is accepted. I'm not sure that setting an odds standard will prevent that, though.

Linda

That's the thing. Without a fairly major overhaul of the whole procedure, the JREF are free to dismiss an application at any time, up until they've actually signed it. Of course, applicants are free to do likewise. I agree with many of your concerns, but I don't see how this would make any difference. For example, with Pavel's protocol, they were aiming for odds of 1/1000. We can be reasonably sure that a set standard would not allow better odds than that, so it would have made no difference at all in his case.

Sometimes the organizations running the preliminary tests set standards higher than those set if Randi was running the test.

I've not heard of the case you refer to but remember that Randi has to agree to all protocols, regardless of who is running the tests.

Rodney
4th December 2009, 05:49 AM
For example, with Pavel's protocol, they were aiming for odds of 1/1000. We can be reasonably sure that a set standard would not allow better odds than that, so it would have made no difference at all in his case.
I have no idea "what they were aiming for", but Pavel states that in August he was informed by the JREF that to pass the preliminary test he must perform at a 100% success rate in 20 trials, where his probability of success in each trial was 50%. See http://forums.randi.org/showpost.php?p=5027325&postcount=228. The odds of doing that are less than one in a million, not one in a thousand.

steenkh
4th December 2009, 06:15 AM
I have no idea "what they were aiming for", but Pavel states that in August he was informed by the JREF that to pass the preliminary test he must perform at a 100% success rate in 20 trials, where his probability of success in each trial was 50%. See http://forums.randi.org/showpost.php?p=5027325&postcount=228. The odds of doing that are less than one in a million, not one in a thousand.
That was one of Randi's gruff remarks that came after numerous attempts at fixing a protocol had come to nothing. It is not even clear if 20 out of 20 was what was really meant, or if he just wanted 20 attempts. The JREF has done many such tests, and they were never 20 out of 20, but more commonly 16 out of 20.
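
For reference, the chance figures behind "20 out of 20" and "16 out of 20" can be worked out with a short Python sketch. It assumes 20 independent trials at a 50% chance rate, as described above; the helper names and the 1 in 1,000 target passed to `min_hits_for_odds` are illustrative, not anything the JREF has published.

```python
from fractions import Fraction
from math import comb


def tail_probability(min_hits, trials, p=Fraction(1, 2)):
    """Chance of at least `min_hits` successes in `trials` independent guesses."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(min_hits, trials + 1)
    )


def min_hits_for_odds(trials, max_probability, p=Fraction(1, 2)):
    """Smallest pass mark whose chance probability is at or below the target."""
    for hits in range(trials + 1):
        if tail_probability(hits, trials, p) <= max_probability:
            return hits
    return None


for pass_mark in (20, 16):
    chance = tail_probability(pass_mark, 20)
    print(f"{pass_mark} of 20: about 1 in {float(1 / chance):,.0f}")

print("pass mark for 1 in 1,000:", min_hits_for_odds(20, Fraction(1, 1000)))
```

Twenty out of 20 comes to 1 in 1,048,576, while 16 or more out of 20 is only about 1 in 169, so the two readings of Randi's remark are very far apart.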

Rodney
4th December 2009, 06:27 AM
That was one of Randi's gruff remarks that came after numerous attempts at fixing a protocol had come to nothing. It is not even clear if 20 out of 20 was what was really meant, or if he just wanted 20 attempts. The JREF has done many such tests, and they were never 20 out of 20, but more commonly 16 out of 20.
If the JREF is willing to test Pavel under the condition that he get 16 of 20 right in the preliminary test, he might accept that (although he claims less than an 80% hit rate). In any event, if the JREF is willing to back off what it told Pavel in August, why doesn't it clarify what would be an acceptable performance by Pavel?
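
Whether a 16 of 20 bar would be attractive depends on the claimant's real hit rate. The sketch below is a rough illustration in Python; the hit rates of 0.7 and 0.8 are assumed values picked to bracket the "less than 80%" mentioned above, not figures Pavel has stated for this protocol.

```python
from math import comb


def pass_probability(true_hit_rate, min_hits=16, trials=20):
    """Chance of reaching the pass mark given a claimant's true per-trial hit rate."""
    return sum(
        comb(trials, k) * true_hit_rate**k * (1 - true_hit_rate) ** (trials - k)
        for k in range(min_hits, trials + 1)
    )


# 0.5 is pure guessing; 0.7 and 0.8 are assumed hit rates for illustration.
for rate in (0.5, 0.7, 0.8):
    print(f"true hit rate {rate:.0%}: passes {pass_probability(rate):.1%} of the time")
```

Under these assumptions a claimant with a genuine 70% hit rate would clear 16 of 20 only roughly one time in four, which is one reason such a claimant might hesitate.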

fls
4th December 2009, 08:03 AM
Do you have any evidence that a fence-sitter feels *better* about Anita's claims being true than before?

I don't know. I certainly wouldn't feel comfortable assuming that they would see the results the same way as those who are busy crowing about how she got pwned.

Did you follow the kidney protocol thread?

No. I read the protocol for the test.

Anita rejected burqas. At first she was insisting that she be allowed to observe the entire group and then have an opportunity to dismiss any number she wanted (one time) so that she could then concentrate on the remaining subjects. I am very confident that she would not agree to a one person at a time reading.

I didn't suggest burqas. I suggested those things she had agreed to, but most of the subjects chose not to use - head-coverings, for example.

I realize that you will deny that any suggestions I make are feasible. My point is that the perception of what she is able to accomplish will depend more upon the experimental set-up than what she actually does. And that there is no point in making it easy for her to obtain dramatic results.

Nothing unexpected happened. She got one right, and that was a 1 in 4 chance. It's an excellent springboard to a discussion about confirmation bias and possible reasons why people believe they have special abilities. Of course, you remind them that she agreed in advance that she could do exactly what was tested and that before the test when she was told who was missing a kidney, she verified this in a matter of seconds.

It is important to realize that she will be judged on two fronts - whether she did something unexpected and whether she passed the test. You seem to be treating it as though, because she didn't pass the test, her results weren't unexpected. However, she correctly identified two people as missing a kidney, something that if it were due to random sampling, would only happen 8 percent of the time, under the conditions of the test. This intuitively seems to be close to our usual 5 percent cut-off. What's more, the way in which this was revealed was dramatic and obvious, as first the chosen subject is asked to stand and then the person without a kidney is asked to stand. This really emphasizes that it's the same person. And the reveal as to whether or not the correct side was chosen becomes almost an after-thought - especially because that part of the test is unremarkable once the person is identified.
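
The post does not say how the 8 percent figure was derived. One reading, offered here as an assumption, is that in each of the three trials a blind guess lands on the right person with probability 1/6, and that the trials are independent. A minimal Python sketch of that reading, set beside the stricter 1 in 12 requirement of naming the right kidney as well:

```python
from fractions import Fraction
from math import comb


def at_least_two_of_three(p):
    """Chance of two or three successes in three independent trials."""
    return comb(3, 2) * p**2 * (1 - p) + p**3


person_level = at_least_two_of_three(Fraction(1, 6))   # right person
kidney_level = at_least_two_of_three(Fraction(1, 12))  # right person and side

print(f"right person in at least two trials: {float(person_level):.1%}")
print(f"right kidney in at least two trials: {float(kidney_level):.1%}")
```

On this reading the person-level figure is 2/27, about 7.4 percent, which is in the neighbourhood of the number quoted, while the kidney-level figure is about 2 percent.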

Explain to me how I calculate the odds with an unknown number of targets and the possibility of a target being viewed more than once and some others not being viewed at all.

I did in my prior post. If we are given the information that there are 18 subjects with 3 of them missing a kidney, two on the left and one on the right, then any individual, selected randomly, has a one in nine chance of missing a left kidney and a one in eighteen chance of missing a right kidney (ignoring the slight bias introduced by nobody missing two kidneys).
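
Those base rates follow directly from the counts given. A small Python sketch, using only the numbers stated above (18 subjects, two missing a left kidney, one missing a right kidney); the dictionary keys are illustrative labels:

```python
from fractions import Fraction

SUBJECTS = 18
MISSING_LEFT = 2
MISSING_RIGHT = 1


def base_rates(subjects=SUBJECTS, left=MISSING_LEFT, right=MISSING_RIGHT):
    """Chance that one randomly selected subject falls into each category."""
    return {
        "left missing": Fraction(left, subjects),
        "right missing": Fraction(right, subjects),
        "both present": Fraction(subjects - left - right, subjects),
    }


for category, rate in base_rates().items():
    print(f"{category}: {rate} ({float(rate):.1%})")
```

A call of "left missing" on a randomly selected subject is therefore right about 11 percent of the time, and "both present" is right five times in six.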


I am very confident I can do a lot of things, but I would probably not agree to any test where I had to wear nose plugs during the process. Want to put me in a glass cage? No way. That would freak me out. Want to put the subjects in a glass cage? Same thing. Want to put them in a room with a glass window? Well, Anita will say that the glass blocks her perceptions.

Or Anita could be asked what sort of scents she finds pleasing and a bunch of roses or an incense stick (or whatever) could be placed in the room with her.

You think like a scientist who sets up a protocol and then spends a bunch of research money finding people willing to be tested. A challenge is about a bunch of volunteers trying to organize a test that is acceptable to themselves, unpaid volunteers, and a subject who has already demonstrated a tenuous grasp on reality. Oh, and to do so as cheaply as possible.

I think that you simply plan to reject whatever I suggest.

That's what happened, so I'm not seeing your point.

This set-up takes the same results, but changes the extent to which they can be seen as unusual. Instead of making it look like she only made two errors (one of them trivial) on her way to doing something unexpected, all of her errors can be noted - seeing kidneys that aren't there and failing to see kidneys that are there. Plus all of her correct guesses will be expected, except for one. And that one would happen 11 percent of the time if due to random sampling - something that intuitively isn't as close to a remarkable finding.

Linda

steenkh
4th December 2009, 08:31 AM
In any event, if the JREF is willing to back off what it told Pavel in August, why doesn't it clarify what would be an acceptable performance by Pavel?
I think it is because they are fed up with Pavel, and the way he develops his abilities as he goes. If he had stated his ability clearly from the beginning, and if he did not constantly make last-minute changes to protocols, he would have been tested by now.

As I said at the time, I would not have dismissed Pavel's claim, and I would have liked to see him getting an extra chance, but I am more patient than Randi.

William Smith
4th December 2009, 09:04 AM
If the JREF is willing to test Pavel under the condition that he get 16 of 20 right in the preliminary test, he might accept that (although he claims less than an 80% hit rate). In any event, if the JREF is willing to back off what it told Pavel in August, why doesn't it clarify what would be an acceptable performance by Pavel?

I do not know.

What Randi said in his last statement was essentially one finger held up. The JREF's reasoning, "lack of personnel and time", has not cut it for me since then.

Uncayimmy
4th December 2009, 03:00 PM
I didn't suggest burqas. I suggested those things she had agreed to, but most of the subjects chose not to use - head-coverings, for example.
They all wore straw hats and scarves on the backs of their necks.

I realize that you will deny that any suggestions I make are feasible. My point is that the perception of what she is able to accomplish will depend more upon the experimental set-up than what she actually does. And that there is no point in making it easy for her to obtain dramatic results.
I am trying to hammer home the point that when a scientist wants to conduct a study, she comes up with a protocol that she believes will be accepted as evidence by not only herself but by her peers. She then seeks out subjects for testing. If she can't find enough willing subjects, she either gives up because no other protocol is good enough or she makes adjustments. She doesn't negotiate with people.

A challenge is very different. The claimant makes some wild-ass claim that defies our current understandings of science. The organization and the claimant then negotiate a protocol. The claimant wants something acceptable for their "powers" to work, even if it is as irrational as the temperature of the room. The organization merely needs to eliminate ordinary and known means of passing the challenge. Nothing is actually proven if the claimant passes, so that's not an issue.

This is important to understand because while burqas (for example) might make an excellent choice for blinding, the subject has just as much control over the protocol and can reject it. You can't sit in an ivory tower and make proclamations about what is a good protocol. You need to wrestle with the pig to find something you both can accept.


It is important to realize that she will be judged on two fronts - whether she did something unexpected and whether she passed the test. You seem to be treating it as though, because she didn't pass the test, her results weren't unexpected. However, she correctly identified two people as missing a kidney, something that if it were due to random sampling, would only happen 8 percent of the time, under the conditions of the test. This intuitively seems to be close to our usual 5 percent cut-off. What's more, the way in which this was revealed was dramatic and obvious, as first the chosen subject is asked to stand and then the person without a kidney is asked to stand. This really emphasizes that it's the same person. And the reveal as to whether or not the correct side was chosen becomes almost an after-thought - especially because that part of the test is unremarkable once the person is identified.

Whose 5% cut-off? Do you think the general public knows about this number and how it is used? I don't think so. There's always a risk that is balanced against logistics and ease of presentation.


I did in my prior post. If we are given the information that there are 18 subjects with 3 of them missing a kidney, two on the left and one on the right, then any individual, selected randomly, has a one in nine chance of missing a left kidney and a one in eighteen chance of missing a right kidney (ignoring the slight bias introduced by nobody missing two kidneys).

What I am asking is how you actually conduct the test and calculate the odds. I *think* you are saying that people are randomly selected and presented to her. The person selected would then go back into the pool for possible selection again.

So, how does that work? How many people does she see?

Or perhaps you can explain to me how to calculate the odds for another suggestion, which is where each person was presented to Anita one at a time where she doesn't know how many people are missing kidneys. For each person she would indicate both kidneys or which kidney was missing. I figure with dependent events it is 3/36 * 2/35 * 1/34 to calculate the odds of getting all three correct.

Beyond that, I'm hitting a brick wall trying to derive the math and haven't tried to look it up.
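
For what it's worth, here is a rough Python sketch of that calculation and one way to extend it to partial results. It follows the framing above (36 kidney positions, three of them empty) and assumes she names exactly three positions as empty by blind guessing. That is a simplification: it ignores the possibility of her naming more or fewer than three, and the fact that nobody is missing both kidneys.

```python
from fractions import Fraction
from math import comb

POSITIONS = 36  # 18 people, two kidney positions each
EMPTY = 3       # positions where a kidney is actually missing

# The dependent-events product from the post.
all_three = Fraction(3, 36) * Fraction(2, 35) * Fraction(1, 34)


def prob_hits(hits, named=3, positions=POSITIONS, empty=EMPTY):
    """Hypergeometric chance that `hits` of the `named` positions are truly empty."""
    return Fraction(
        comb(empty, hits) * comb(positions - empty, named - hits),
        comb(positions, named),
    )


print("all three correct:", all_three)
for hits in range(4):
    print(f"exactly {hits} correct: {float(prob_hits(hits)):.4f}")
```

The product works out to 1 in 7,140, the same as one over the number of ways to choose 3 positions from 36.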

Or Anita could be asked what sort of scents she finds pleasing and a bunch of roses or an incense stick (or whatever) could be placed in the room with her.
And let's make sure that all 18 people don't have an allergic reaction to the scent.

I think that you simply plan to reject whatever I suggest.
Not at all, because it doesn't matter what you or I suggest. What matters is what the claimant is willing to accept, what is practical for the organization to do, and what the volunteer subjects are willing to endure.

Take your perfume suggestion. First, Anita would have to decide on what scent she likes. Then on the other coast the IIG has to try to find that scent. What if they don't make it anymore and Anita doesn't have enough? We're back to the drawing board.

Assume they can find the scent. Before they agree to it, they have to find 18 people, three of whom are missing a kidney, and see if they can handle being around the scent for 30 minutes without getting visibly annoyed or having an allergic reaction. Even nose plugs, assuming you can get 15 normal people and 3 people missing a kidney willing to wear them for 30 minutes without getting visibly annoyed, don't prevent eye irritation. Of course, if you trot them out one at a time, it's less of an issue.

Assuming you can arrange all that, you still have the problem of the claimant right before the test saying the scent is too strong and bothers her. Or maybe combined with the scent from the freshly shampooed carpets it annoys her so much she won't take the test.

This is the real world of challenge negotiations. To get back on track, what I'm saying is that it's easier to add an extra subject or two to change the odds than it is to go through all this ******** trying to find a perfectly blinded protocol. Therefore, the JREF and the IIG are smart not to cite any fixed odds requirements in advance.

ETA: Unlike ordinary scientific research, a challenge is a one-time event (maybe someday 2-times if anybody ever passes) that is likely to have press coverage. There are no second chances. There are no trial runs you can use to refine the process. That means you really do have to worry about things like people being allergic to perfumes or freaking out about wearing a nose plug.

fls
5th December 2009, 06:14 AM
They all wore straw hats and scarves on the backs of their necks.

I'm sorry. That sentence started out somewhat differently and I didn't edit it properly. It should say "chose to use" (rather than "not to use").

I am trying to hammer home the point that when a scientist wants to conduct a study, she comes up with a protocol that she believes will be accepted as evidence by not only herself but by her peers. She then seeks out subjects for testing. If she can't find enough willing subjects, she either gives up because no other protocol is good enough or she makes adjustments. She doesn't negotiate with people.

But we've already agreed that this is irrelevant, since this is not a scientific study. Why do you keep bringing it up? Are you under the impression that you are the only person here who has ever walked through a protocol negotiation with a claimant?

A challenge is very different. The claimant makes some wild-ass claim that defies our current understandings of science. The organization and the claimant then negotiate a protocol. The claimant wants something acceptable for their "powers" to work, even if it is as irrational as the temperature of the room. The organization merely needs to eliminate ordinary and known means of passing the challenge. Nothing is actually proven if the claimant passes, so that's not an issue.

Exactly. And you don't need to keep making this point, because as far as I can tell, we have been in agreement on this all along. I've been making this point for several years, so unless you came to this realization after we started this conversation and you are now telling me your position, it's a bit of a distraction for you to keep bringing it up.

This is important to understand because while burqas (for example) might make an excellent choice for blinding, the subject has just as much control over the protocol and can reject it. You can't sit in an ivory tower and make proclamations about what is a good protocol. You need to wrestle with the pig to find something you both can accept.

Yes, exactly.

Whose 5% cut-off?

The cut-off which is in general use for research in fields like medicine or parapsychology.

Do you think the general public knows about this number and how it is used?

Anyone with a bit of a science background will be familiar with it, or they might pick it up from press releases about research studies which often include statements about whether the results exceeded a certain p-value.

I don't think so. There's always a risk that is balanced against logistics and ease of presentation.

Yes, my point is that it should be relatively easy for your audience to tell whether or not something is unexpected or expected, without going through the process of calculating odds. If someone declares that they can roll doubles when they roll a pair of dice, I don't need to be able to do the calculation to see that doing so two times out of three would be unusual.
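
For anyone who does want the number behind the dice example, a brute-force Python sketch follows. It assumes two fair six-sided dice and three independent rolls, and simply counts outcomes instead of using a formula.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)


def is_double(roll):
    """True when both dice in a roll show the same face."""
    return roll[0] == roll[1]


# All 36 outcomes of one roll of a pair of dice.
single_rolls = list(product(FACES, repeat=2))
p_double = Fraction(sum(is_double(r) for r in single_rolls), len(single_rolls))

# All 36**3 sequences of three rolls; count those with two or more doubles.
sequences = list(product(single_rolls, repeat=3))
favourable = sum(1 for seq in sequences if sum(is_double(r) for r in seq) >= 2)
at_least_two = Fraction(favourable, len(sequences))

print(f"doubles on one roll: {p_double}")
print(f"doubles on at least two of three rolls: {float(at_least_two):.1%}")
```

Doubles come up one roll in six, and getting them on at least two rolls out of three happens by chance a bit over 7 percent of the time.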

What I am asking is how you actually conduct the test and calculate the odds. I *think* you are saying that people are randomly selected and presented to her. The person selected would then go back into the pool for possible selection again.

So, how does that work? How many people does she see?

Ah, I see - you mean coming up with a threshold and figuring out whether she exceeded that threshold for the purpose of the challenge. I was thinking about whether it would be clear to a casual observer, or whether it could be clear in a media blurb, just how likely or unlikely each guess was. As I mentioned earlier, I think the fence-sitters are more interested in whether Anita can do something interesting and unexpected than whether or not she can pass some particularly rigid standard.

Or perhaps you can explain to me how to calculate the odds for another suggestion, which is where each person was presented to Anita one at a time where she doesn't know how many people are missing kidneys. For each person she would indicate both kidneys or which kidney was missing. I figure with dependent events it is 3/36 * 2/35 * 1/34 to calculate the odds of getting all three correct.

Beyond that, I'm hitting a brick wall trying to derive the math and haven't tried to look it up.

We can sit down and work out the probabilities for various outcomes for a specific protocol, and of course that part would be necessary for the purposes of handing out prize money. I don't know if you're asking for a tutorial, but I see it as a bit of a side issue to the idea of making the results intuitively obvious to the casual observer.

It really seems to me that the reason for the MDC and then smaller challenges like Anita's, is to show that these claims need to be treated with skepticism. And we can't do that if people won't subject their claims to testing (hence the carrot), and we also can't do that if people are able to demonstrate that they are able to perform unusual and unexpected feats. Whether or not they are unusual enough to pass (some might say) an unreasonable standard, is somewhat secondary.

And let's make sure that all 18 people don't have an allergic reaction to the scent.

Not at all, because it doesn't matter what you or I suggest. What matters is what the claimant is willing to accept, what is practical for the organization to do, and what the volunteer subjects are willing to endure.

Take your perfume suggestion. First, Anita would have to decide on what scent she likes. Then on the other coast the IIG has to try to find that scent. What if they don't make it anymore and Anita doesn't have enough? We're back to the drawing board.

Assume they can find the scent. Before they agree to it, they have to find 18 people, three of whom are missing a kidney, and see if they can handle being around the scent for 30 minutes without getting visibly annoyed or having an allergic reaction. Even nose plugs, assuming you can get 15 normal people and 3 people missing a kidney willing to wear them for 30 minutes without getting visibly annoyed, don't prevent eye irritation. Of course, if you trot them out one at a time, it's less of an issue.

Assuming you can arrange all that, you still have the problem of the claimant right before the test saying the scent is too strong and bothers her. Or maybe combined with the scent from the freshly shampooed carpets it annoys her so much she won't take the test.

Ah, I see. The issue of various smells present in the building and carried on or about people was so minor that it didn't deserve any mention in the protocol, but once you need a way to reject whatever I say, it becomes an insurmountable problem.

Silly me.

Linda

Uncayimmy
5th December 2009, 12:33 PM
But we've already agreed that this is irrelevant, since this is not a scientific study. Why do you keep bringing it up? Are you under the impression that you are the only person here who has ever walked through a protocol negotiation with a claimant?

You said, "there's no point in making this any easier" for the claimant as if the JREF or the IIG are sitting around trying to find ways to make things easier. Every time they make it "easier" it's a calculated concession.

The cut-off which is in general use for research in fields like medicine or parapsychology.

Anyone with a bit of a science background will be familiar with it, or they might pick it up from press releases about research studies which often include statements about whether the results exceeded a certain p-value.
I know what it is. By asking "whose" I was pointing out that your use of "our cutoff" presumes a familiarity that just isn't there. I doubt that the general public is even vaguely aware of this number.

Yes, my point is that it should be relatively easy for your audience to tell whether or not something is unexpected or expected, without going through the process of calculating odds.
How should they do it? Intuitively? Isn't that how these beliefs come about in the first place? If the test can serve as a mini-lesson in critical thinking, that's a good thing. That's why it's important to have the probability calculation be as simple as possible.

Ah, I see - you mean coming up with a threshold and figuring out whether she exceeded that threshold for the purpose of the challenge. I was thinking about whether it would be clear to a casual observer, or whether it could be clear in a media blurb, just how likely or unlikely each guess was. As I mentioned earlier, I think the fence-sitters are more interested in whether Anita can do something interesting and unexpected than whether or not she can pass some particularly rigid standard.
Sorry, but that's not what I'm asking. With the "three rounds of six" protocol, it's easy for a layperson to understand how the odds for getting all three correct are calculated. It's also easy to understand how with three guesses a person has about a 25% chance of getting one right.
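
The arithmetic behind those two figures can be sketched as follows. The sketch assumes, based on the 1,728 to 1 figure (12 cubed), that each round offers 12 equally likely answers: one of six people, left or right kidney. That breakdown is an inference from the odds, not something spelled out in this post.

```python
from fractions import Fraction

ROUNDS = 3
# Assumption: 6 people x 2 sides = 12 equally likely answers per round.
p_hit = Fraction(1, 12)

# All three rounds correct by pure guessing.
p_all = p_hit ** ROUNDS

# At least one round correct by pure guessing.
p_at_least_one = 1 - (1 - p_hit) ** ROUNDS

print(f"all three correct: 1 in {p_all.denominator}")
print(f"at least one correct: {float(p_at_least_one):.1%}")
```

Under that assumption the chance of at least one hit comes out a little under one in four.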

Right now, I don't even understand what it is you propose with selecting people and allowing them to be put back in the pool. Is she looking at all 18 people at once? Sequentially? Is she randomly being presented people to read? If so, how many?

I can't comment on whether it's better than what they did if I don't even understand how it works.

We can sit down and work out the probabilities for various outcomes for a specific protocol, and of course that part would be necessary for the purposes of handing out prize money. I don't know if you're asking for a tutorial, but I see it as a bit of a side issue to the idea of making the results intuitively obvious to the casual observer.
Intuition: direct perception of truth, fact, etc., independent of any reasoning process;

I don't want people using intuition. That's what Anita did with the results, and look where that got her. I want to present people with basic facts and simple reasoning so they can think about it critically. The "three trials of six" scenario is pretty easy to understand for a layperson.

What I want to know is your proposal for the test and the probabilities for her getting k right in n guesses. Maybe it's better. I don't know.

Ah, I see. The issue of various smells present in the building and carried on or about people was so minor that it didn't deserve any mention in the protocol, but once you need a way to reject whatever I say, it becomes an insurmountable problem.

Silly me.
You're taking things personally. This has nothing to do with you. What I am trying to show is my reasoning behind not having fixed or consistent odds for challenges.

Suppose the claimant does not use the sense of smell as part of her special abilities. Suppose further that we think there's a small chance that smell might give her a slight edge over chance in our protocol, but at the same time we don't think it's at all possible that she could use it to ace the test.

A) No mention of smell is made during negotiations. Nothing about smell is written into the contract we call a protocol. At the time of the test, the claimant refuses to take the test because she doesn't like a smell that doesn't seem to bother anyone else. Does she get a retest? No, and I doubt any judge would grant her one even if he didn't toss the case.

B) Same as the above except she complains about the smell after the test. Does she get a retest? No way in hell.

C) Smell is discussed during negotiations as a method of blocking ordinary means of detection, and a specific scent is agreed upon. It becomes part of the signed contract where it is the organization's duty to create the testing environment. Right before the test the claimant complains that the scent is too strong or when combined with the scent from the freshly shampooed carpet is too distracting. Does she get a retest? Yep.

So, if I'm the IIG or the JREF, I'm fine with scenarios A & B, but not C. If I normally use 1,000 to 1 odds in situations where I am 99.9% sure of my blinding, then I might go with 1,728 to 1 odds in this challenge and skip trying to deal with the perfume issue. It's a trade-off.

Thus, no fixed odds.

fls
5th December 2009, 02:55 PM
You said, "there's no point in making this any easier" for the claimant as if the JREF or the IIG are sitting around trying to find ways to make things easier. Every time they make it "easier" it's a calculated concession.

I understand that. I'm suggesting that it's valuable to consider not conceding on a point that provides the opportunity for a dramatic result, but to look for a different option instead.

I know what it is. By asking "whose" I was pointing out that your use of "our cutoff" presumes a familiarity that just isn't there. I doubt that the general public is even vaguely aware of this number.

I'm not sure that half the general public is relevant. I'm thinking of the people who pay any attention to the Challenge and the results, or who would be watching a program on Randi or attending one of his lectures.

How should they do it? Intuitively? Isn't that how these beliefs come about in the first place?

Yes, exactly. If they are forming their beliefs based on intuitive assessments, then I suspect that addressing that intuition will be more persuasive than asking them to rely on methods they are not comfortable using.

If the test can serve as a mini-lesson in critical thinking, that's a good thing. That's why it's important to have the probability calculation be as simple as possible.

I agree. I'm especially looking for ways to combine that with something familiar. Research on decision-making shows that we tend to depend upon intuitive assessments, which are sometimes very wrong. But if you present the information in a way that shows a closer match between the way we use our intuition and the actual probability, our assessments become more accurate.

Sorry, but that's not what I'm asking. With the "three rounds of six" protocol, it's easy for a layperson to understand how the odds for getting all three correct are calculated. It's also easy to understand how with three guesses a person has about a 25% chance of getting one right.

It still needs a basic understanding of binomial probabilities, and that is all that is needed for my suggestion as well.

Right now, I don't even understand what it is you propose with selecting people and allowing them to be put back in the pool. Is she looking at all 18 people at once? Sequentially? Is she randomly being presented people to read? If so, how many?

The subjects are presented one at a time, chosen randomly with replacement (which means that some subjects may be read more than once, and some not at all). For each one, Anita indicates whether she sees the right, left or both kidneys.

The number of trials will depend upon how accurate Anita thinks she would be with that test and the p-value threshold that the IIG is looking for.

Intuition: direct perception of truth, fact, etc., independent of any reasoning process;

I don't want people using intuition. That's what Anita did with the results, and look where that got her. I want to present people with basic facts and simple reasoning so they can think about it critically. The "three trials of six" scenario is pretty easy to understand for a layperson.

But intuition is what people use and (as you mentioned) what Anita used. If you had presented Anita with a protocol in which the results she achieved did not intuitively seem all that remarkable, maybe she'd be less of a pain (I realize that's a ROF laughable idea :)).

What I want to know is your proposal for the test and the probabilities for her getting k right in n guesses. Maybe it's better. I don't know.

You would use a binomial distribution to calculate the probabilities. I would go through the work of figuring out the exact numbers and explaining it if this was a real protocol.
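
As a rough illustration only, a binomial calculation of that kind might look like the sketch below. The per-trial chance hit rate `p0` is a placeholder: 1/3 assumes a guesser picks uniformly among right, left and both, which nothing in this thread establishes, and the trial count `n` is hypothetical.

```python
from fractions import Fraction
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k hits in n independent trials with hit rate p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_tail(k, n, p):
    """Probability of k or more hits in n trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

p0 = Fraction(1, 3)   # assumed chance hit rate per trial
n = 10                # hypothetical number of trials

for k in range(n + 1):
    print(f"P(at least {k:2d} of {n}) = {float(binom_tail(k, n, p0)):.6f}")
```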

You're taking things personally. This has nothing to do with you.

Don't worry. I'm aware that others enjoy the same treatment. :)

What I am trying to show is my reasoning behind not having fixed or consistent odds for challenges.

Suppose the claimant does not use the sense of smell as part of her special abilities. Suppose further that we think there's a small chance that smell might give her a slight edge over chance in our protocol, but at the same time we don't think it's at all possible that she could use it to ace the test.

A) No mention of smell is made during negotiations. Nothing about smell is written into the contract we call a protocol. At the time of the test, the claimant refuses to take the test because she doesn't like a smell that doesn't seem to bother anyone else. Does she get a retest? No, and I doubt any judge would grant her one even if he didn't toss the case.

B) Same as the above except she complains about the smell after the test. Does she get a retest? No way in hell.

C) Smell is discussed during negotiations as a method of blocking ordinary means of detection, and a specific scent is agreed upon. It becomes part of the signed contract where it is the organization's duty to create the testing environment. Right before the test the claimant complains that the scent is too strong or when combined with the scent from the freshly shampooed carpet is too distracting. Does she get a retest? Yep.

So, if I'm the IIG or the JREF, I'm fine with scenarios A & B, but not C. If I normally use 1,000 to 1 odds in situations where I am 99.9% sure of my blinding, then I might go with 1,728 to 1 odds in this challenge and skip trying to deal with the perfume issue. It's a trade-off.

Thus, no fixed odds.

I was simply pointing out that since your solution does not actually address the problem - i.e. increasing the odds does not necessarily affect whether or not it will be too easy for her to pass the test when holes are left in the protocol - it may be better to address holes in the protocol instead.

Linda

Uncayimmy
5th December 2009, 06:22 PM
Yes, exactly. If they are forming their beliefs based on intuitive assessments, then I suspect that addressing that intuition will be more persuasive than asking them to rely on methods they are not comfortable using.
Wow. I find that condescending as well as ineffective. The way to teach critical thinking skills is to, well, teach critical thinking skills. Setting up a test so their "intuition" matches the results is a rather ridiculous approach.

The subjects are presented one at a time, chosen randomly with replacement (which means that some subjects may be read more than once, and some not at all). For each one, Anita indicates whether she sees the right, left or both kidneys.

The number of trials will depend upon how accurate Anita thinks she would be with that test and the p-value threshold that the IIG is looking for.

Anita claimed 100% accuracy and we know the IIG was okay with 1,728 to 1 odds, so please continue with your explanation.

* How many trials?
* What are her odds of getting 1 or more correct while still failing?
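
One hedged way to put numbers on those two questions is sketched below, assuming a perfect score is required to pass. The answer depends heavily on the assumed chance hit rate per trial, so two hypothetical guessing strategies are shown: picking uniformly among right, left and both, and always answering "both" when 15 of the 18 subjects have both kidneys.

```python
from fractions import Fraction

def min_trials_all_correct(p0, threshold):
    """Smallest n for which n straight chance hits are no likelier than threshold."""
    n = 1
    while p0**n > threshold:
        n += 1
    return n

def some_but_not_all(p0, n):
    """Chance of at least one hit but not a perfect score in n trials."""
    return 1 - (1 - p0)**n - p0**n

threshold = Fraction(1, 1728)

# Assumption A: a guesser picks uniformly among right, left, both.
# Assumption B: a guesser always says "both" with 15 of 18 subjects intact.
for label, p0 in (("uniform guess", Fraction(1, 3)),
                  ("always 'both'", Fraction(15, 18))):
    n = min_trials_all_correct(p0, threshold)
    miss = some_but_not_all(p0, n)
    print(f"{label}: {n} trials, P(some right but fail) = {float(miss):.4f}")
```

Under these assumptions the sketch gives 7 trials for uniform guessing and 41 for the always-"both" strategy, which shows how sensitive the trial count is to the assumed chance rate.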

You see, Linda, I'm way above the average layman when it comes to understanding stats, but I am at a loss right now how to calculate this. Their version of the test is incredibly simple to calculate and explain even if you've never read a statistics textbook.

You would use a binomial distribution to calculate the probabilities. I would go through the work of figuring out the exact numbers and explaining it if this was a real protocol.
The explanation of the statistics and the chances of getting "some" right are incredibly important factors in setting up a test like this. They don't mean squat in real research but they mean everything in a publicity stunt.

I was simply pointing out that since your solution does not actually address the problem - i.e. increasing the odds does not necessarily affect whether or not it will be too easy for her to pass the test when holes are left in the protocol - it may be better to address holes in the protocol instead.
And thus we're back to wrestling with the pig.

Look, you patch all the holes that money, time, and the irrational person on the other side of the table will let you patch. You then make a judgment call about whether you have enough confidence that your money is completely safe. Sometimes you tweak the odds to make yourself feel more comfortable.

After all, there's nothing inherently "right" about 1,000 to 1, 20 to 1, 10,000 to 1, or 1,728 to 1. It boils down to a judgment call anyway.

fls
7th December 2009, 06:23 AM
Anita claimed 100% accuracy and we know the IIG was okay with 1,728 to 1 odds, so please continue with your explanation.

* How many trials?
* What are her odds of getting 1 or more correct while still failing?

You see, Linda, I'm way above the average layman when it comes to understanding stats, but I am at a loss right now how to calculate this. Their version of the test is incredibly simple to calculate and explain even if you've never read a statistics textbook.

I agree that the results of her test were fairly simple to calculate, which is why the average layman can understand that she did something unexpected.

There are two aspects to consider when looking at the test I suggested. As Anita makes each guess, the average layman will be able to understand the probability that a "missing kidney" guess is correct and the probability that a "kidney" guess is correct. And their focus will be on whether or not she gets the "missing kidney" guesses correct, since those will obviously be the low probability guesses. The layman is looking at the results after they have happened which makes these estimates fairly straightforward.

The IIG, on the other hand, is looking at the results before they have happened. And this is the part which can be fairly straightforward or more complicated depending upon the set-up, because they basically have to guess at what the distribution of results might be, and set criteria based on that. But this part is also relatively unimportant in terms of whether or not your audience understands the test and the results, as the explanation can be more direct.

The explanation of the statistics and the chances of getting "some" right are incredibly important factors in setting up a test like this. They don't mean squat in real research but they mean everything in a publicity stunt.

I wish it was true that background information means squat when setting up real research. It would make my life much easier. :)

And thus we're back to wrestling with the pig.

Look, you patch all the holes that money, time, and the irrational person on the other side of the table will let you patch. You then make a judgment call about whether you have enough confidence that your money is completely safe. Sometimes you tweak the odds to make yourself feel more comfortable.

After all, there's nothing inherently "right" about 1,000 to 1, 20 to 1, 10,000 to 1, or 1,728 to 1. It boils down to a judgment call anyway.

Like I said, I don't disagree that there are valid reasons to maintain flexibility in setting the odds. I just don't think "let's pretend it's a solution even though it doesn't actually solve the problem" should be one of them.

Linda

Uncayimmy
7th December 2009, 02:39 PM
Like I said, I don't disagree that there are valid reasons to maintain flexibility in setting the odds. I just don't think "let's pretend it's a solution even though it doesn't actually solve the problem" should be one of them.

Suppose you design a microwave that has issues with even heating. Do you spend a bunch of time and money redesigning the guts or do you add a simple carousel to turn the food?

fls
7th December 2009, 02:59 PM
Suppose you design a microwave that has issues with even heating. Do you spend a bunch of time and money redesigning the guts or do you add a simple carousel to turn the food?

That example doesn't make any sense, since adding a simple carousel will actually address the problem. What I wouldn't do is spend a bunch of time and money painting the walls red and claim that it helps.

Linda

Uncayimmy
7th December 2009, 06:36 PM
That example doesn't make any sense, since adding a simple carousel will actually address the problem. What I wouldn't do is spend a bunch of time and money painting the walls red and claim that it helps.

I don't recall anyone suggesting anything as useless as painting walls red to fix a defective microwave emitter. I have heard suggestions that, instead of fixing a defective emitter, steps be taken to reduce the chances of it heating unevenly. A carousel doesn't actually fix the problem, it just lessens the effect under most circumstances. What happens when you try to heat an item that is too big to rotate?

That's why I thought it an appropriate analogy.

So, based on our discussions, what do you consider a "paint the walls red" type of suggestion and why?

fls
8th December 2009, 07:08 AM
I don't recall anyone suggesting anything as useless as painting walls red to fix a defective microwave emitter. I have heard suggestions that, instead of fixing a defective emitter, steps be taken to reduce the chances of it heating unevenly. A carousel doesn't actually fix the problem, it just lessens the effect under most circumstances. What happens when you try to heat an item that is too big to rotate?

That's why I thought it an appropriate analogy.

Okay.

So, based on our discussions, what do you consider a "paint the walls red" type of suggestion and why?

It goes back to this post:

http://forums.randi.org/showthread.php?postid=5372144#post5372144

Holes in the protocol mean that there is an effect contributing to the results in addition to chance and paranormal abilities. We do not know the size of that effect. To "increase the odds" means that you have either increased the number of trials or you have increased the proportion of hits necessary to count as a success. Increasing the number of trials increases the likelihood that the effect of the holes will show up in the results (you have increased your power to detect an effect). Alternatively, if you don't know your hole effect size, you don't know to what extent increasing the effect size you are testing for (increasing the hit rate) changes or reduces the likelihood that the test will be passed.
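
To make the first half of that concrete, here is a minimal sketch with made-up numbers. It assumes a pure-chance hit rate of 1/3 and a hypothetical hole that lifts the hit rate to 1/2, holds the chance odds at 1 in 1,728 or better, and asks how often the leak alone would clear the pass mark as trials are added.

```python
from fractions import Fraction
from math import comb

def tail(k, n, p):
    """Probability of k or more hits in n independent trials with hit rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pass_mark(n, p_chance, alpha):
    """Fewest hits whose chance probability is at most alpha (None if impossible)."""
    for k in range(n + 1):
        if tail(k, n, p_chance) <= alpha:
            return k
    return None

alpha = Fraction(1, 1728)
p_chance = Fraction(1, 3)   # assumed pure-chance hit rate
p_leaky = Fraction(1, 2)    # hypothetical hit rate when a hole leaks information

for n in (10, 30, 100):
    k = pass_mark(n, p_chance, alpha)
    print(f"n={n:3d}: need {k} hits, "
          f"pass probability with leak = {float(tail(k, n, p_leaky)):.4f}")
```

With these invented rates, the probability that the leak alone produces a pass grows as the number of trials goes up, even though the odds against a pure-chance pass stay at least as long.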

Linda

Uncayimmy
8th December 2009, 10:34 AM
It goes back to this post:

http://forums.randi.org/showthread.php?postid=5372144#post5372144

Holes in the protocol mean that there is an effect contributing to the results in addition to chance and paranormal abilities. We do not know the size of that effect. To "increase the odds" means that you have either increased the number of trials or you have increased the proportion of hits necessary to count as a success. Increasing the number of trials increases the likelihood that the effect of the holes will show up in the results (you have increased your power to detect an effect). Alternatively, if you don't know your hole effect size, you don't know to what extent increasing the effect size you are testing for (increasing the hit rate) changes or reduces the likelihood that the test will be passed.

You still did not give an example of painting the walls red.

You're in medical research, right? I'm sure your studies have holes in them. In a drug trial, for example, is there an eyewitness to ensure that each subject actually swallows the pills at the specified time intervals? If so, does someone verify that they don't immediately go to the bathroom and puke it up? Does somebody ensure that no other medicines are taken without proper documentation? Is there a financial penalty or any other method for ensuring that reports of subjective side-effects are truthful?

To me those are all holes. I trust the medical establishment takes these issues into account when deciding what effect sizes are deemed significant. I am not seeing the difference with these challenge protocols acknowledging their imperfection and wanting higher odds to give themselves more confidence.

One major difference to medical trials is that a researcher can reject any subject who doesn't agree to the rules. In a challenge, the organization and the subject have to negotiate something acceptable to both parties, and this typically entails compromises. Which is to say the organization trades possible gaps for more significant odds.

fls
8th December 2009, 01:05 PM
You still did not give an example of painting the walls red.

Well, I explained how the solution might not actually make the IIG's money safer. Wouldn't that serve as an example?

You're in medical research, right? I'm sure your studies have holes in them. In a drug trial, for example, is there an eyewitness to ensure that each subject actually swallows the pills at the specified time intervals? If so, does someone verify that they don't immediately go to the bathroom and puke it up? Does somebody ensure that no other medicines are taken without proper documentation? Is there a financial penalty or any other method for ensuring that reports of subjective side-effects are truthful?

To me those are all holes. I trust the medical establishment takes these issues into account when deciding what effect sizes are deemed significant. I am not seeing the difference with these challenge protocols acknowledging their imperfection and wanting higher odds to give themselves more confidence.

The difference is that medical trials have control groups. And this is a big deal when it comes to addressing this issue. Because the things that you describe will of course influence the result. But they will not influence the result differently between the groups, so they are effectively cancelled out. Because you don't have a control group, you don't have a way to cancel out the effect of any holes.

Linda

Uncayimmy
8th December 2009, 08:38 PM
Well, I explained how the solution might not actually make the IIG's money safer. Wouldn't that serve as an example?
Honestly, I'm not clear at all what specific choice the IIG made that you think was painting a wall red to fix a microwave. Can you give a "they did this when they could have done that" example?


The difference is that medical trials have control groups. And this is a big deal when it comes to addressing this issue. Because the things that you describe will of course influence the result. But they will not influence the result differently between the groups, so they are effectively cancelled out. Because you don't have a control group, you don't have a way to cancel out the effect of any holes.

I disagree. Grapefruit juice is known to affect some antibiotics, right? If not, let's just pretend it does. We tell both groups not to drink grapefruit juice, but very clearly we have no way of knowing if they do or not. How does having a control group help that situation?

Or let's say you're testing a heart medication and tell people not to take aspirin because we know it can prevent heart attacks (humor me if I'm wrong). The medication seems to cause headaches, so the test group has a higher percentage of people with an incentive to break protocol and take aspirin.

We also know that it's virtually impossible to control for everything between the two groups. There's no guarantee that both groups will have the same percentage of people with arthritis or stressful jobs, so maybe one group is more likely because of that to break protocol and take aspirin.

To this layman that's one reason why you work with confidence levels and p-values. You use the statistics to give yourself a margin of error. And that's what groups like the IIG and JREF do, only they do it because for various reasons they can't construct "perfect" protocols.

steenkh
9th December 2009, 02:13 AM
I disagree. Grapefruit juice is known to affect some antibiotics, right? If not, let's just pretend it does. We tell both groups not to drink grapefruit juice, but very clearly we have no way of knowing if they do or not. How does having a control group help that situation?
Why would we want two groups not drinking grapefruit juice? The control group would be the one not drinking grapefruit juice, while the other group was drinking the juice. If the groups are big enough, there will be a difference between the two groups, even if some people cheat.

fls
9th December 2009, 05:12 AM
I disagree. Grapefruit juice is known to affect some antibiotics, right? If not, let's just pretend it does. We tell both groups not to drink grapefruit juice, but very clearly we have no way of knowing if they do or not. How does having a control group help that situation?

Because with random group assignment, those who drink grapefruit juice will be distributed into both groups. The 'effect' of drinking grapefruit juice will be present in both groups.

Or let's say you're testing a heart medication and tell people not to take aspirin because we know it can prevent heart attacks (humor me if I'm wrong). The medication seems to cause headaches, so the test group has a higher percentage of people with an incentive to break protocol and take aspirin.

We also know that it's virtually impossible to control for everything between the two groups. There's no guarantee that both groups will have the same percentage of people with arthritis or stressful jobs, so maybe one group is more likely because of that to break protocol and take aspirin.

But there's no particular reason to think that people with arthritis or stressful jobs or who take aspirin, will be distributed differently between the two groups. That is, if any of those things have an effect on the outcome, they will affect the outcome in both groups. If there are differences in the extent to which randomly distributed characteristics can affect the outcome in those who are taking the active treatment, then this falls under the treatment effect, which is the effect of interest.

The really valuable part of this is that we know the distribution of any of these characteristics came about due to random assignment. Which means that our statistics which are based on random sampling actually apply to them. So things like p-values and confidence intervals do actually accurately describe our confidence, instead of the situation in the IIG test where a distribution based on random sampling was applied to what was known to be a non-random distribution.
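
A toy simulation can illustrate the cancelling-out, though it is only a sketch with invented numbers: the effect sizes, the violation rate and the sample size below are all hypothetical. Protocol violators shift their own outcome, but because assignment is random they end up in both arms and the difference between arms still recovers the treatment effect.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

TREATMENT_EFFECT = 0.5   # hypothetical true effect of the drug
VIOLATION_EFFECT = 1.0   # hypothetical effect of breaking protocol
VIOLATION_RATE = 0.2     # hypothetical share of subjects who break protocol

treated, control = [], []
for _ in range(20_000):
    violates = random.random() < VIOLATION_RATE
    outcome = random.gauss(0, 1) + (VIOLATION_EFFECT if violates else 0.0)
    # Random assignment: whether someone violates is independent of the arm.
    if random.random() < 0.5:
        treated.append(outcome + TREATMENT_EFFECT)
    else:
        control.append(outcome)

estimate = mean(treated) - mean(control)
print(f"estimated treatment effect: {estimate:.3f}")
```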

To this layman that's one reason why you work with confidence levels and p-values. You use the statistics to give yourself a margin of error. And that's what groups like the IIG and JREF do, only they do it because for various reasons they can't construct "perfect" protocols.

My point is simply that you don't know the extent to which the confidence intervals and p-values from a known distribution can be transferred to a distribution which is known to be different. Simply ignoring that problem and pretending that alterations to the margin of error based on the known distribution will take care of it is merely wishful thinking.

Linda

Uncayimmy
9th December 2009, 10:37 AM
Because with random group assignment, those who drink grapefruit juice will be distributed into both groups. The 'effect' of drinking grapefruit juice will be present in both groups.

Bolding mine. You say they "will be" distributed among both groups. You don't know that. You're banking on the distribution of grapefruit juice drinkers and protocol violators to be close enough not to skew the results.

But there's no particular reason to think that people with arthritis or stressful jobs or who take aspirin, will be distributed differently between the two groups.
Yes, there is a reason to believe they will be distributed unevenly. It's basic statistics. I'm sure you could tell me with degrees of confidence the likelihood of 50 of those people in a group of 200 being distributed 25-25, 15-35, 5-45 and 0-50.

The only difference I see between these situations and challenge situations is that you can quantify the risk statistically. In the challenges people make judgments regarding weaknesses in design. They figure, "If the claimant only sees their backs while they sit still in a chair, there's a 'very small' chance there might be some information the claimant can use to better the odds beyond pure chance. Therefore, instead of our usual threshold of 1,000 to 1 we are going to use 1,728 to 1."

My point is simply that you don't know the extent to which the confidence intervals and p-values from a known distribution can be transferred to a distribution which is known to be different. Simply ignoring that problem and pretending that alterations to the margin of error based on the known distribution will take care of it is merely wishful thinking.

I'm saying you do it all the time in the medical field because your protocols are often far less strictly enforced than challenge protocols. There are potentially a lot more things people can do to violate protocol, and while you have reasonable estimates of how they are distributed among the groups, you do not know the actual rates. You compensate for this with confidence levels and statistical practices.

I don't have a problem with the practice in the medical field or in the challenges. What I can't understand is why you see one as superior to the other. If anything, the challenge protocols are superior because the tests are so much shorter and tightly controlled.

fls
9th December 2009, 11:41 AM
Bolding mine. You say they "will be" distributed among both groups. You don't know that. You're banking on the distribution of grapefruit juice drinkers and protocol violators to be close enough not to skew the results.

Right, because we have very detailed knowledge about the distribution of random samples.

Yes, there is a reason to believe they will be distributed unevenly. It's basic statistics. I'm sure you could tell me with degrees of confidence the likelihood of 50 of those people in a group of 200 being distributed 25-25, 15-35, 5-45 and 0-50.

Exactly. We can describe exactly how confident we can be about a distribution based on random sampling. It's powerful information.
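
As a rough illustration of that point, here is a minimal sketch of the calculation. It assumes the 200 people are split at random into two groups of 100 (the group sizes are not stated in the discussion, so the even split is an assumption) and that 50 of them drink grapefruit juice. Under those assumptions the number landing in the first group follows a hypergeometric distribution. The name `split_probability` is made up for this example.

```python
from math import comb

# Assumed setup: 200 participants, 50 grapefruit juice drinkers,
# randomized into two equal groups of 100 (the even split is an assumption).
DRINKERS, TOTAL, GROUP = 50, 200, 100

def split_probability(k):
    """Probability that exactly k drinkers land in the first group."""
    ways = comb(DRINKERS, k) * comb(TOTAL - DRINKERS, GROUP - k)
    return ways / comb(TOTAL, GROUP)

# The four splits mentioned above, for the first group only.
for k in (25, 15, 5, 0):
    print(f"{k}-{DRINKERS - k}: {split_probability(k):.3g}")
```

Each uneven split can occur in either direction, so doubling the printed value gives the chance of, say, a 15-35 split regardless of which group gets the 15.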

The only difference I see between these situations and challenge situations is that you can quantify the risk statistically.

Yes. You can quantify the risk. So you can quantify whether or not you have adequately accounted for the risk.

In the challenges people make judgments regarding weaknesses in design. They figure, "If the claimant only sees their backs while they sit still in a chair, there's a 'very small' chance there might be some information the claimant can use to better the odds beyond pure chance. Therefore, instead of our usual threshold of 1,000 to 1 we are going to use 1,728 to 1."

How are you going to change the odds? If you do it by increasing the number of trials, you make it easier for the claimant to pass using their "some information". If you do it by increasing your threshold, what makes you confident that a 1.728-fold increase is adequate given that claimants typically describe abilities (under conditions where they don't think that they are using normal sensing) that would represent a 5 to 10-fold increase?

I'm saying you do it all the time in the medical field because your protocols are often far less strictly enforced than challenge protocols. There are potentially a lot more things people can do to violate protocol, and while you have reasonable estimates of how they are distributed among the groups, you do not know the actual rates. You compensate for this with confidence levels and statistical practices.

We actually do a bunch of other stuff like measure how they are distributed, measure the resultant effect size, randomize on variables which may have an effect, give everyone the same intervention, etc. None of that really matters, though, since (as you mentioned a gazillion times) challenges are performed with different goals than scientific tests.

I don't have a problem with the practice in the medical field or in the challenges. What I can't understand is why you see one as superior to the other. If anything, the challenge protocols are superior because the tests are so much shorter and tightly controlled.

I didn't say that one is superior to the other. I pointed out that having a control group is an incredibly powerful tool when you are unable to eliminate bias - something which just happens to be of interest to both scientific study and challenges.

Linda

Uncayimmy
9th December 2009, 07:41 PM
Yes. You can quantify the risk. So you can quantify whether or not you have adequately accounted for the risk.
Well, sorta. You can quantify how you *might* have adequately accounted for the risk. Look at poker. We can calculate absolute pot odds without a problem and determine that a heavy bet with a full house is a good idea. If the other guy has a straight flush, I did not "adequately" account for this situation. I made the "right" decision but I did the "wrong" thing because I lost the hand.

How are you going to change the odds? If you do it by increasing the number of trials, you make it easier for the claimant to pass using their "some information".
That's an assertion with no evidence and, quite frankly, it's rather counterintuitive unless you mean the likelihood of passing *one* trial. I can't believe you mean that since you *always* increase the likelihood of passing one trial when adding additional trials.

Suppose the "gap" we don't find practical to close is cheating by some form of signaling. With two trials, I stand virtually no chance of determining a pattern in the environment (finding a signal within the noise). With 2,000 trials you can bet I'm going to find that signal.

Suppose the gap is that there's some little thing a decoy might do to reveal that he's not the target. The step I would need to take to entirely prevent this is too expensive. I estimate that there's only a 1 in 100 chance that this might happen. So, the claimant has a 99 in 100 chance of having three trials with a 1 in 12 chance and a 1 in 100 chance of having three trials with a 1 in 10 chance (I'm on the 2 kidney thing).

If I add one more trial, then my worst case scenario is four trials of a 1 in 10 chance. Those odds are more difficult than my original best-case scenario of three trials of 1 in 12 odds.
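
The arithmetic behind that comparison can be sketched as follows. This assumes the claimant has to get every trial right and that the trials are independent. The 1-in-100 chance of a leak is the guessed figure from the scenario, not a measured value, and the function name `all_hits` is only illustrative.

```python
from fractions import Fraction

def all_hits(p_single, trials):
    """Chance of getting every one of `trials` independent guesses right."""
    return p_single ** trials

best_three = all_hits(Fraction(1, 12), 3)   # no leak: 1 in 1,728
leaky_three = all_hits(Fraction(1, 10), 3)  # leak present: 1 in 1,000
leaky_four = all_hits(Fraction(1, 10), 4)   # leak present, one extra trial: 1 in 10,000

# Blend of the two three-trial cases, weighted by the guessed 1-in-100 leak chance.
p_leak = Fraction(1, 100)
blended_three = (1 - p_leak) * best_three + p_leak * leaky_three

print(best_three, leaky_three, leaky_four, float(blended_three))
```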

Clearly I have demonstrated that your assertion is not true in all cases. So, under what scenarios will adding trials increase the likelihood of passing? Please be specific.

fls
10th December 2009, 08:37 AM
Well, sorta. You can quantify how you *might* have adequately accounted for the risk. Look at poker. We can calculate absolute pot odds without a problem and determine that a heavy bet with a full house is a good idea. If the other guy has a straight flush, I did not "adequately" account for this situation. I made the "right" decision but I did the "wrong" thing because I lost the hand.

That's not really a good example, since a simple understanding of how you calculate odds on the probability of any particular poker hand does not actually include the conditions under which you encounter those hands.

That's an assertion with no evidence

It is basic statistics. I have simply described what "power" means in terms of your ability to demonstrate an effect.

and, quite frankly, it's rather counterintuitive unless you mean the likelihood of passing *one* trial.

It is somewhat counter-intuitive, which I suspect is why it usually gets little to no consideration in protocol discussions. It is not the likelihood of passing one trial. It is the likelihood of passing your threshold for success with a given effect size.

I discussed this in Pavel's thread with examples (http://forums.randi.org/showthread.php?postid=5032589#post5032589).

Let's take an effect size of 0.80, which represents a 'large' effect size. For trials with p=0.50, this translates to the following numbers of hits as the number of trials increases:

1/1, 9/10, 22/25, and 43/50.

The p-values for each of those results, if due to chance, are:

1.00, 0.01, 0.0001, and 0.0000001.

The number of hits necessary to exceed a standard of 0.001 would be:

N/A, 10/10, 21/25, and 37/50.

Which translates to success rates of:

N/A, 100%, 84%, and 74%.

Which reflects effect sizes of:

N/A, 1.571, 0.748, and 0.500.

While the person is able to accomplish the same thing in each set of trials, whether or not they will be able to exceed the threshold depends upon the total number of trials. Conversely, the larger the total number of trials, the lower their success rate needs to be in order to pass, and smaller and smaller effect sizes (i.e. the effect of 'holes') will allow them to pass.
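
The thresholds and effect sizes listed above can be recomputed with a short sketch. It rests on two assumptions that the post does not state outright: that the test is a one-sided exact binomial test against a chance rate of 0.50, and that the effect size is Cohen's h (the difference of arcsine-transformed proportions, from the Cohen text cited below). The function names are illustrative.

```python
from math import asin, comb, sqrt

def tail_p(hits, n, p=0.5):
    """One-sided binomial P(X >= hits) for n trials at chance rate p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(hits, n + 1))

def min_hits(n, alpha=0.001, p=0.5):
    """Smallest hit count whose tail probability is below alpha, or None."""
    for hits in range(n + 1):
        if tail_p(hits, n, p) < alpha:
            return hits
    return None  # no result in n trials can beat the threshold

def cohens_h(rate, p=0.5):
    """Cohen's h between an observed success rate and the chance rate p."""
    return 2 * asin(sqrt(rate)) - 2 * asin(sqrt(p))

for n in (1, 10, 25, 50):
    hits = min_hits(n)
    if hits is None:
        print(n, "cannot pass")
    else:
        print(n, hits, round(hits / n, 2), round(cohens_h(hits / n), 3))
```

Under those assumptions the required success rate falls as the number of trials grows, which is the pattern described above.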

I can't believe you mean that since you *always* increase the likelihood of passing one trial when adding additional trials.

Suppose the "gap" we don't find practical to close is cheating by some form of signaling. With two trials, I stand virtually no chance of determining a pattern in the environment (finding a signal within the noise). With 2,000 trials you can bet I'm going to find that signal.

Suppose the gap is that there's some little thing a decoy might do to reveal that he's not the target. The step I would need to take to entirely prevent this is too expensive. I estimate that there's only a 1 in 100 chance that this might happen.

But you realize that you are pulling this number out of your ass, right? What if it's one in ten or one in two?

So, the claimant has a 99 in 100 chance of having three trials with a 1 in 12 chance and a 1 in 100 chance of having three trials with a 1 in 10 chance (I'm on the 2 kidney thing).

If I add one more trial, then my worst case scenario is four trials of a 1 in 10 chance. Those odds are more difficult than my original best-case scenario of three trials of 1 in 12 odds.

This works if the number you have pulled out of your ass is reasonable. How would you go about figuring out whether it is or not?

The amount of residual bias present in good-quality RCT's is estimated to be 0.10. In good-quality studies without control groups, it is estimated to be 0.20. Ray Hyman looked at the amount of bias which may be present in the ganzfeld studies (as in Anita's test, these involve making guesses whilst attempting to remove any possible sources of normal information) and found that there may be at least 0.30. What these numbers indicate is the proportion of studies which should be found to be negative, which will seem to be positive. Now, as you can see, a bias of 0.10 utterly dwarfs the effect of playing around with the odds. If you are worried about the one false-positive result due to chance in 1000 tests, this will be dwarfed by the 100 false positive results due to bias - an effect that won't even be touched by the removal of that one false positive due to chance.

Now, under the conditions of Anita's test, some of those sources of bias will not be present - the effect of multiple testing, flexibility in specifying outcomes (at least for the purpose of passing the test), and publication bias, will not be present. You've never taken other biases, like the bias introduced by the asymmetry in the location of the missing kidney and asymmetry in her guesses, or randomization (this isn't mentioned in the protocol) into consideration. But mostly we worry about the effect of her picking up subconscious or conscious clues from the subjects and examiners. And we've tried to mitigate this through partial blinding. So how successful are we at reducing bias? Is it a thousand-fold less? Is it ten-fold less?

Other situations, where people claim to have eliminated bias (parapsychology studies, claimants performing informal tests), when compared to subsequent testing, can show effect sizes of 0.20 or 0.50 or more due to the sort of bias we are worried about with Anita. And as I illustrated in my example above, increasing numbers of trials allow for smaller and smaller effect sizes to lead to a result which passes the criteria for a successful test.

Linda

References:
Why Most Published Research Findings Are False (http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.0020124).
Statistical Power Analysis for the Behavioral Sciences, Jacob Cohen.
Commentary on John P.A. Ioannidis' 'Why Most Published Research Findings Are False', Ray Hyman, Skeptical Inquirer, Vol. 30, March-April 2006.

Uncayimmy
10th December 2009, 12:19 PM
That's not really a good example, since a simple understanding of how you calculate odds on the probability of any particular poker hand does not actually include the conditions under which you encounter those hands.
We're getting sidetracked, but you obviously don't play poker. Do a Google search for pot odds for poker, and you'll find all sorts of handy charts.

In Texas hold 'em it's not terribly difficult to determine how many "outs" you have to draw to certain hands. From there it's not difficult to look at the size of the pot in terms of how many "bets" there are. If it's 5 to 1 to make your hand and 4 to 1 bets in the pot, you're far less inclined to make the bet than if it's 10 to 1 in the pot.

You also know how likely your hand is to take the pot. Sometimes you absolutely know you might draw to the best hand (or already have it) because you can see 3, then 4, then 5 of the possible 7 cards each opponent has, and you know which two cards they do not have.

My point in bringing that up is that making a bet on a 5 to 1 draw with 12 to 1 pot odds is the "correct" decision but you still might lose the hand. Thus "accounting" for probabilities doesn't mean controlling the outcome.
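
For anyone who wants the arithmetic, here is a minimal sketch of the pot-odds comparison. It assumes a made hand always wins, a missed hand always loses, and no further betting follows, which is a simplification of real play. The function name `call_ev` is made up for this example.

```python
from fractions import Fraction

def call_ev(odds_against, pot_odds):
    """Expected profit, in bets, from calling one bet.

    odds_against: N for an "N to 1" shot at making the hand.
    pot_odds: bets already in the pot for each bet you must call.
    Assumes a made hand always wins and no further betting follows.
    """
    p_hit = Fraction(1, odds_against + 1)
    return p_hit * pot_odds - (1 - p_hit)

# A 5 to 1 draw against 4 to 1, 10 to 1 and 12 to 1 pot odds.
print(call_ev(5, 4), call_ev(5, 10), call_ev(5, 12))
```

A positive value marks the call as the "correct" decision on average, even though any single hand can still be lost.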

It is somewhat counter-intuitive, which I suspect is why it usually gets little to no consideration in protocol discussions. It is not the likelihood of passing one trial. It is the likelihood of passing your threshold for success with a given effect size.

Ah, I see. So you're concerned about effect sizes and the organization not taking that into account. This can be handled just fine in any individual case. I haven't seen any cases where this wasn't properly addressed by the organization. Have you?

But you realize that you are pulling this number out of your ass, right? What if it's one in ten or one in two?
The inability to determine something with precision is not the same as the inability to determine it with degrees of confidence. This is what I am trying to get you to understand about your studies with control groups. For example, you cannot say with any certainty the likelihood of people breaking protocol. You can say that in a random sample you expect with Y degree of confidence N people with arthritis. You can say with Y degree of confidence what the distribution of arthritis sufferers will be between the two groups. But you don't *know* for sure, nor do you know how likely they will be to break protocol and pop some aspirin.

The world of challenges is a lot more messy, especially when human targets are involved. We make observations of the world and discuss the possibilities. The numbers we assign are not pulled out of our asses. They are arrived at through discussion by intelligent people who are very cautious.

I don't need to crunch a bunch of numbers to say I'm more likely to get struck by a car at night while wearing dark clothes and crossing a busy street as opposed to crossing a quiet street at noon.

Where challenges have an advantage is that they can strictly enforce protocol. They have zero doubt that the protocol will be executed 100% correctly because they have people watching at the time, and they review the video later.

The amount of residual bias present in good-quality RCT's is estimated to be 0.10. In good-quality studies without control groups, it is estimated to be 0.20.
Estimated? You mean you pulled it out of your ass.

I'm done with this. I have repeatedly asked you for specific examples regarding gaps/holes, but I have received none. Next time there is a protocol discussion, I hope you'll come wrestle with the pigs. You could have joined the thread on the IIG protocol for VFF, but I think you'd rather argue in the abstract. I'm the opposite.

fls
10th December 2009, 01:27 PM
We're getting sidetracked, but you obviously don't play poker. Do a Google search for pot odds for poker, and you'll find all sorts of handy charts.

I don't think you understood what I meant, but I agree that it's a sidetrack.

Thus "accounting" for probabilities doesn't mean controlling the outcome.

I wasn't suggesting that the outcomes are controlled - that something has 1000 to one odds against happening due to chance doesn't mean that you can guarantee that it won't happen.

Ah, I see. So you're concerned about effect sizes and the organization not taking that into account. This can be handled just fine in any individual case. I haven't seen any cases where this wasn't properly addressed by the organization. Have you?

I have repeatedly pointed out that you have no idea what the effect size is for Anita's ability to pick up subconscious clues, which doesn't seem to be a "just fine" situation. :)

Pavel's protocol is another example where the JREF paid little or no attention to effect size in order to dismiss Pavel's application. Although, some might call that "just fine".

The inability to determine something with precision is not the same as the inability to determine it with degrees of confidence.

It's just that in this case you haven't done either. It's not just a matter of an imprecise measure, it's an inability to narrow it down to something smaller than 3 or 4 orders of magnitude.

This is what I am trying to get you to understand about your studies with control groups. For example, you cannot say with any certainty the likelihood of people breaking protocol. You can say that in a random sample you expect with Y degree of confidence N people with arthritis. You can say with Y degree of confidence what the distribution of arthritis sufferers will be between the two groups. But you don't *know* for sure, nor do you know how likely they will be to break protocol and pop some aspirin.

Well, we can know because we can collect that information from the people in the study. But for the purpose of illustration, let's pretend that we don't. Say that we are interested in whether or not a drug prevents heart attacks, so then the proportion of arthritis sufferers who are taking aspirin becomes important because this will also influence the heart attack rate. If we look at the intervention group (the group that took the drug), we see that they have a lower rate of heart attack than the general population, but we have no idea how much of that was due to the use of aspirin. All of the effect could be due to aspirin, half of the effect, or none of the effect - we have no clue as to the effect size of the intervention. But, if we have a placebo control group, we can accurately describe our confidence that whatever the effect of aspirin-taking arthritis sufferers has on one group, it has the same effect on the other group. So without having to know anything at all about the size of the aspirin effect, we can reasonably confidently state what effect size was due to the intervention.

The world of challenges is a lot more messy, especially when human targets are involved. We make observations of the world and discuss the possibilities. The numbers we assign are not pulled out of our asses. They are arrived at through discussion by intelligent people who are very cautious.

So what data did you use to estimate the effect size?

I don't need to crunch a bunch of numbers to say I'm more likely to get struck by a car at night while wearing dark clothes and crossing a busy street as opposed to crossing a quiet street at noon.

Where challenges have an advantage is that they can strictly enforce protocol. They have zero doubt that the protocol will be executed 100% correctly because they have people watching at the time, and they review the video later.

Yes. As I mentioned, they are able to eliminate some of those factors which contribute to bias.

Estimated? You mean you pulled it out of your ass.

I gave you the references for the numbers. They were based on measured examples.

I'm done with this. I have repeatedly asked you for specific examples regarding gaps/holes, but I have received none.

I'm sorry. I think I misunderstood your request. I thought you were asking for specific examples of how effect size, trial number and threshold change the likelihood of a claimant passing the test when they don't have paranormal abilities.

You are asking about gaps/holes in the protocol for Anita's test? Okay, let's start with one I mentioned in my last post. There is no randomization described in the protocol. There is no description of how the subjects were collected or arranged.

Next time there is a protocol discussion, I hope you'll come wrestle with the pigs.

I have been involved in a number of protocol discussions, both for the Challenge and for more informal tests. I didn't refer to the people I was talking to as 'pigs' though. ;)

You could have joined the thread on the IIG protocol for VFF, but I think you'd rather argue in the abstract. I'm the opposite.

It depends. I am interested in the concrete when it is meaningful, such as when we are working towards a real test. But in this case, the discussions with this claimant have been particularly acrimonious, and they quickly went down a path that I didn't think was useful, so I chose to stay out of it. And judging by the response I've received from you, my input would have been undesirable anyway.

Linda

Furcifer
10th December 2009, 02:04 PM
Humph :( I feel like I've been two timed, this discussion bears a striking resemblance to one I was having in another forum :D

I'm curious what the "bias" and "effect size" is. I'll have to read up on some of these posts.