View Full Version : Poll: Accuracy of Test Interpretation
Wrath of the Swarm
27th April 2004, 10:48 AM
Let's say that I went for an annual medical checkup, and the doctor wanted to know if I had a particular disease that affects one out of every thousand people. To check, he performed a blood test that is known to be about 99% accurate. The test results came back positive. The doctor concluded that I have the disease.
How likely is it that the diagnosis is correct?
It's best if you don't sit down to work it out. Just give your honest opinion about what you think is likely. If you happen to know the formula that gives the correct answer, feel free to use it.
Wrath of the Swarm
27th April 2004, 10:51 AM
S
P
O
I
L
E
R
S
I'll post the correct answer in three days when the poll shuts down.
slimshady2357
27th April 2004, 11:07 AM
Too easy, but what's the chances of you having the disease if you get three positives in a row?
Adam
Wrath of the Swarm
27th April 2004, 11:14 AM
A lot of doctors can't do this problem correctly. I expect the forumites to do better, but I think I'll still make my point.
(After three positive results, the chances of having the disease are about 99.9%.)
[edit] (This assumes, of course, that each test is independent of the others, which is not a realistic assumption. Still, tests with such high degrees of accuracy are generally unrealistic as well.)
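A minimal sketch of where the three-positives figure comes from, assuming (as the post does) that each test is independent and that "99% accurate" means 99% sensitivity and 99% specificity. The function names `update` and `after_repeated_positives` are illustrative only.

```python
def update(prior: float, sensitivity: float, specificity: float) -> float:
    """Probability of disease after one positive result (Bayes' rule)."""
    true_pos = prior * sensitivity
    false_pos = (1.0 - prior) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def after_repeated_positives(prior: float, accuracy: float, n_tests: int) -> float:
    """Feed each posterior back in as the next prior, once per positive test."""
    p = prior
    for _ in range(n_tests):
        # Assumption: sensitivity and specificity both equal `accuracy`.
        p = update(p, accuracy, accuracy)
    return p


for n in (1, 2, 3):
    p = after_repeated_positives(0.001, 0.99, n)
    print(f"{n} positive(s): {p:.2%}")
```

Under these assumptions one positive gives about 9%, two give about 91%, and three give about 99.9%.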
Rolfe
27th April 2004, 11:28 AM
Hmmm, big piece of information missing. Are you showing any clinical signs of the disease, or is your probability of infection that of the general background population?
The thing is, to get the predictive value of a test (which is what Wrath is asking), you need to know the incidence of the condition in the population representative of the individual being tested. This is obviously higher if that population is "sick people with clinical signs typical of the disease in question". In fact, the relevant figure is the clinical probability that this individual is affected.
I also need to know if the figure of "99% accurate" refers particularly to specificity. Specificity is the percentage of unaffected individuals that the test correctly reports as negative (so a poor specificity means false positives) and is the important parameter here. Sensitivity, on the other hand, is the percentage of affected individuals that the test correctly reports as positive (so a poor sensitivity means false negatives). The two together may be loosely combined and referred to as "accuracy", but if we are talking about a positive result, it is the specificity figure one needs to know.
However, if we assume for the moment that Wrath's probability of being affected is the same as the general population, that is 0.1%, and that the specificity of the test is 99% (I don't really care about sensitivity in this situation), a positive result is 90.98% likely to be wrong and 9.02% likely to be correct.
So there is only (approximately) a 9% chance that Wrath actually has the condition.
(I'm sorry, I cheated, I have a spreadsheet on my computer to produce this information and a pretty graph that demonstrates how the predictive value of a positive or a negative result varies with clinical probability of infection, and sensitivity and specificity of the test.)
http://www.b5-dark-mirror.demon.co.uk/graph.jpg
Rolfe.
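The 9.02% figure can be reproduced with a few lines of Python. This is only a sketch under the assumptions stated in the post (prevalence 0.1%, specificity 99%) plus one more that the figure implicitly needs: a sensitivity of 99%. The function name is made up for illustration.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Share of positive results that come from affected individuals."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Assumed inputs: 1 in 1,000 affected, 99% sensitivity, 99% specificity.
ppv = positive_predictive_value(0.001, 0.99, 0.99)
print(f"correct: {ppv:.2%}, wrong: {1 - ppv:.2%}")
# prints "correct: 9.02%, wrong: 90.98%"
```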
geni
27th April 2004, 11:29 AM
It's no good: because of the way Wrath of the Swarm worded the question, there is more than one correct answer
Rolfe
27th April 2004, 11:31 AM
Originally posted by geni
It's no good: because of the way Wrath of the Swarm worded the question, there is more than one correct answer
Well, I did state some assumptions. Did I leave anything out?
Rolfe.
Wrath of the Swarm
27th April 2004, 11:31 AM
No, there's only one correct answer.
Rolfe, you are both an idiot and a jerk. Now that you've posted an answer, the poll is completely useless.
(Oh, and your requests for more information are pointless. The information provided is more than sufficient to answer the question. Besides, this is the question that doctors have such trouble with - I don't think adding more would actually help them.)
Rolfe
27th April 2004, 11:33 AM
Now that it's been explained, you don't get to feel so superior. What a shame.
(And I didn't ask for more information, I asked for better information.)
Rolfe.
Wrath of the Swarm
27th April 2004, 11:34 AM
You mean now that it's been explained, we can no longer demonstrate that most people here have no grasp of probability theory as it applies to medical diagnosis?
Yep.
Deetee
27th April 2004, 11:35 AM
So what is Wrath's "point", anyhow?
Since in his hypothetical scenario he deliberately had the doctor jump to the wrong conclusion, I suspect he is just out to try and rubbish the poor medics again.
I suspect that docs know a bit more about sensitivity, specificity, predictive values, odds ratios etc than WoS gives them credit for, anyway.
geni
27th April 2004, 11:36 AM
Originally posted by Rolfe
Well, I did state some assumptions. Did I leave anything out?
Rolfe.
Doing the maths the way I am suggests that you get Wrath's answers by assuming the error is in false positives. Playing around with false negatives screws the results.
And Wrath, you had already given away way too much data in your posts
geni
27th April 2004, 11:37 AM
wrong button
Rolfe
27th April 2004, 11:38 AM
That was exactly the assumption I made. That the 99% "accuracy" (meaningless term, really) quoted was actually a specificity figure. If it included the sensitivity and the specificity was actually something different, then the result is void.
Anyway, five out of five posters already got it right while I was composing my post and before I hit the send button.
Rolfe.
geni
27th April 2004, 11:41 AM
Originally posted by Wrath of the Swarm
You mean now that it's been explained, we can no longer demonstrate that most people here have no grasp of probability theory as it applies to medical diagnosis?
Yep.
Er, no, because most of the time information won't be presented the way you presented it. (If it is, you find the person responsible and tell them to get into the advertising business; if they already are in the advertising business, then why were you believing them in the first place?)
All you've done is present a waiter pocketing the tip type problem
Brian the Snail
27th April 2004, 11:41 AM
Originally posted by Wrath of the Swarm
You mean now that it's been explained, we can no longer demonstrate that most people here have no grasp of probability theory as it applies to medical diagnosis?
Yep.
Actually, before Rolfe revealed the answer there were at least 4 votes cast, all of which gave the correct answer.
So what makes you think that most people here would have got the answer wrong?
Wrath of the Swarm
27th April 2004, 11:41 AM
The accuracy covers both true positives and true negatives. If I had specified the rate of alpha error only, you would have needed to know the beta error. But since I didn't, you didn't.
Wrath of the Swarm
27th April 2004, 11:45 AM
Originally posted by Deetee
So what is Wrath's "point", anyhow?
Since in his hypothetical scenario he deliberately had the doctor jump to the wrong conclusion, I suspect he is just out to try and rubbish the poor medics again.
It must be nice to be able to psychically determine what someone's position is before they state it and tear apart the holes in the arguments they haven't made yet.
Have you considered trying out for the million? A debating technique like that could make you a very rich person.
Soapy Sam
27th April 2004, 11:48 AM
And to add to the chaos, I just guesstimated 10% before I read the rest of the thread. So it's unanimous.
Wrath- I'd suggest you are playing to the wrong audience with this one. Most folk here will have been reading J.A.Paulos and Martin Gardner for years. You said you expected us to beat GPs and you were right, but I think you underestimated by how much. You would really need a more random test group. (And a larger one of course).
Wrath of the Swarm
27th April 2004, 11:50 AM
Originally posted by Brian the Snail
Actually, before Rolfe revealed the answer there were at least 4 votes cast, all of which gave the correct answer.
So what makes you think that most people here would have got the answer wrong?
Most medical students (and a significant fraction of doctors) get the question wrong.
Given that people here can check the other answers, take the time to work out the answer fully, and generally are better educated in statistics and critical thought, I figured many more people than normal would get the right answer. I expected the numbers of wrong answers to still be significant, though.
Since I clearly stated that I would post the correct answer when the poll cleared, it was extremely rude of Rolfe to post the explanation. After all, if there's the least chance it could be used to support a position she doesn't like, she couldn't permit it to continue unchecked, could she?
Rolfe
27th April 2004, 11:58 AM
Now, to take this on to where it needs to go, can you see the point I'm making, which is perhaps more interesting?
The figure quoted is correct if Wrath's clinical probability of being affected is that of the general population. Actually, that is difficult to imagine unless the condition in question is symptom-free in the vast majority of cases. Most interesting diseases do show some clinical signs at some stage.
Now, if Wrath is clinically symptom-free, then his probability of being affected is the probability that someone showing no symptoms is affected. If only half of all sufferers show some clinical signs, this is already down to 0.05%, as only 1 in 2,000 of the asymptomatic population is affected.
The probability that the doctor is right is in fact only 4.72%, in this situation.
On the other hand, if the reason his doctor wanted to test him is that he came in demonstrating clear clinical signs suggestive of the condition in question, then his probability of being affected is the probability that anyone showing these clinical signs has the condition. This depends a lot on how pathognomonic the clinical signs are for the disease. But let's say he was a very typical case, and that 80% of people with these presenting signs actually have the condition.
Now look at the graph, and what it does over at the right-hand side, at the 80% probability of infection level (hint, it's the line that is almost indistinguishable from the 100% abscissa at that level).
There is a 99.75% probability that the doctor is right.
This explains why it is vital to take the real likelihood that the patient you are looking at is affected into consideration when interpreting tests like this. That is, the conclusion you have come to from your clinical examination and history-taking. Otherwise, if you use a figure for incidence in the general population regardless of the individual's own circumstances, positive results are always judged to be very probably wrong and negative results to be very probably right.
Not much point doing the test if that's how you think.
In fact, it's a good illustration that it's statistically valid to say that if the test gives you the answer you were expecting, it's probably right, but if it gives you a result you weren't expecting, be very cautious. In practice, the unexpected result has to be re-checked by a reference method.
If you are screening well people, it will be the positive results you regard with suspicion, but if you are testing on strong clinical evidence the positive result is pretty safe to accept, and you may well want to check a negative (depending on how suspicious you were in the first place, refer to graph again).
Rolfe.
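As a rough illustration of what the graph shows, the sketch below tabulates the predictive value of a positive and of a negative result for the three clinical probabilities discussed in the post. It assumes 99% sensitivity and 99% specificity throughout; `predictive_values` is a hypothetical helper, not the spreadsheet behind the graph.

```python
def predictive_values(prior: float, sensitivity: float, specificity: float):
    """Return (PPV, NPV) for a given clinical probability of disease."""
    true_pos = prior * sensitivity
    false_neg = prior * (1 - sensitivity)
    false_pos = (1 - prior) * (1 - specificity)
    true_neg = (1 - prior) * specificity
    ppv = true_pos / (true_pos + false_pos)   # positive result is right
    npv = true_neg / (true_neg + false_neg)   # negative result is right
    return ppv, npv


# Assumed priors: asymptomatic (0.05%), background (0.1%), typical case (80%).
for prior in (0.0005, 0.001, 0.8):
    ppv, npv = predictive_values(prior, 0.99, 0.99)
    print(f"prior {prior:8.2%}  positive right {ppv:7.2%}  negative right {npv:7.2%}")
```

The positive predictive value comes out at roughly 4.72%, 9.02% and 99.75% for the three priors, while the negative predictive value drops noticeably only in the 80% case.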
JamesM
27th April 2004, 11:58 AM
Now that the answer's been given, can someone tell me: is the answer exactly 10% or just close to 10%?
Wrath of the Swarm
27th April 2004, 11:59 AM
Originally posted by Rolfe
The thing is, to get the predictive value of a test (which is what Wrath is asking), you need to know the incidence of the condition in the population representative of the individual being tested. This is obviously higher if that population is "sick people with clinical signs typical of the disease in question". In fact, the relevant figure is the clinical probability that this individual is affected.
Actually, this isn't even correct.
Rolfe is confusing the utility of clinical indications with test accuracy. The significance of the test's results to the diagnosis depends upon the proportion of the population that actually has the condition. Its accuracy does not depend on that value.
Wrath of the Swarm
27th April 2004, 12:00 PM
Originally posted by JamesM
Now that the answer's been given, can someone tell me: is the answer exactly 10% or just close to 10%?
The answer is about 9%. Since there were a limited number of poll options, 10% was the correct answer.
Rolfe
27th April 2004, 12:00 PM
Originally posted by JamesM
Now that the answer's been given, can someone tell me: is the answer exactly 10% or just close to 10%?
9.02% with the number of "decimal places" set to 2.
Rolfe.
geni
27th April 2004, 12:02 PM
Originally posted by Wrath of the Swarm
The accuracy covers both true positives and true negatives. If I had specified the rate of alpha error only, you would have needed to know the beta error. But since I didn't, you didn't.
But by playing around with the false positive and false negative numbers I can keep the total accuracy at 99% while getting a number of different answers to your question. Particularly since you put the "about" in, I can get the chances up to 100% (10,000 tested, wrong 9 times, all false negatives). OK, that's quite a bit higher than 99% accuracy, unless you have very big error bounds.
Wrath of the Swarm
27th April 2004, 12:03 PM
Obviously, performing more than one test increases the chance of getting the correct result significantly.
But Rolfe can't distinguish between the accuracy of the test and the usefulness of combining it with another selection procedure. She's also ignoring the important point that many conditions (like certain kinds of cancer) don't have obvious symptoms.
I'm just glad she's a vet in another country instead of a doctor here. She wouldn't have made it through med school, of course, so I suppose the actual risk she would pose is minimal.
Wrath of the Swarm
27th April 2004, 12:05 PM
Originally posted by geni
But by playing around with the false positive and false negative numbers I can keep the total accuracy at 99% while getting a number of different answers to your question. Particularly since you put the "about" in, I can get the chances up to 100% (10,000 tested, wrong 9 times, all false negatives).
Since I didn't specify different values for alpha and beta error, the single value I have for the accuracy holds for both.
Thanks for trying, though.
geni
27th April 2004, 12:07 PM
Originally posted by Wrath of the Swarm
Since I didn't specify different values for alpha and beta error, the single value I have for the accuracy holds for both.
Thanks for trying, though.
Simplify: it's two years since I did stats. I fail to see why it should hold for both rather than the sum of both.
Wrath of the Swarm
27th April 2004, 12:09 PM
Because I gave you an overall accuracy. Given any particular input, the test has a 99% chance of giving the correct answer. That holds whether the person has the disease or not.
In reality, tests don't always have equal chances of false positives and false negatives. That's not the case for the hypothetical test, though.
Rolfe
27th April 2004, 12:17 PM
Originally posted by Wrath of the Swarm
The significance of the test's results to the diagnosis depends upon the proportion of the population that actually has the condition. Its accuracy does not depend on that value.
NO, no and thrice no.
This is the whole point. This is the mistake most likely to be made by young graduates who have been brainwashed by statistics of the sort Wrath is peddling.
Now do you see why I went for the answer early and widened the question?
Predictive value of a test depends on the sensitivity, the specificity, and the prevalence of the condition in a population representative of the patient in question. ("Accuracy" is a meaningless term in the context of a test of this nature.)
The prevalence of the condition in a population representative of the patient in question means the prevalence of the condition in a population presenting clinically exactly as the patient in question presents. That is, the clinical probability that the patient in question has the disease.
For example, people often quote the incidence of FeLV (feline leukaemia virus) as 1% in the population as a whole. But if you confine your FeLV testing to cats presented chalk white with a lymphocyte count of > 20 x 10^9/l, the proportion of incorrect positive results you get will be a hell of a lot less than the 66.89% the sums Wrath would like you to do might suggest. Let us assume that the specificity of the FeLV test is about 98% (as it is). Since these cats are perhaps 70% likely to be infected (that is, if the only cats you test are in that group, 70% of the cats you test will be infected), only 0.87% of your positive results will be wrong.
Conversely, if you spend all your time screening healthy pedigree cats from tested-negative households (people do do this, prior to breeding), where there is maybe only a 1 in 10,000 chance that a cat has slipped the net and become infected, 99.51% of your positives will be wrong.
Experienced clinicians understand this. People who've just blindly read a rather superficial statistical explanation of predictive value don't, and routinely underestimate the reliability of a positive result from a clinically sick individual with suggestive clinical signs.
I know it's hard to get your brain round, Wrath, but do try.
Rolfe.
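The three FeLV percentages can be checked with the odds form of Bayes' rule (posterior odds = prior odds x likelihood ratio). This is a sketch only: the post gives a specificity of 98% but no sensitivity, so the code assumes 98% for that as well, which is the value that reproduces the quoted figures. The function name is illustrative.

```python
def fraction_of_positives_wrong(prior: float,
                                sensitivity: float,
                                specificity: float) -> float:
    """Share of positive results that are false, via odds and likelihood ratio."""
    prior_odds = prior / (1 - prior)
    # Likelihood ratio of a positive result; undefined if specificity is exactly 1.
    lr_positive = sensitivity / (1 - specificity)
    posterior_odds = prior_odds * lr_positive
    return 1 / (1 + posterior_odds)


# Assumed sensitivity of 0.98 (not stated in the post); specificity 0.98 as given.
for label, prior in (("general population", 0.01),
                     ("sick, suggestive signs", 0.70),
                     ("screened pedigree cats", 0.0001)):
    wrong = fraction_of_positives_wrong(prior, 0.98, 0.98)
    print(f"{label:24s} positives wrong: {wrong:.2%}")
```

With these inputs the proportion of wrong positives is about 66.89%, 0.87% and 99.51% respectively.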
yersinia29
27th April 2004, 12:19 PM
Originally posted by Wrath of the Swarm
Most medical students (and a significant fraction of doctors) get the question wrong.
Link please.
Nothing you say can be trusted at all. Nobody should believe a word you say unless you provide links and evidence backing up your claims.
Wrath of the Swarm
27th April 2004, 12:19 PM
Well, since this example has been ruined by that whore Rolfe, let's go over the math.
p = fraction of people who have the condition
x = error rate of the test
(1-p)x = fraction of false positives
p(1-x) = fraction of true positives
When are these values equal?
x - xp = p - xp
x = p
When the error rate of the test is equal to the proportion of the population that has the condition, any positive result has a 50% chance of being correct. The calculation becomes more complicated if we presume there are different error rates for false positives and false negatives, of course.
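The algebra above (with p the fraction affected and x the error rate) can be checked exactly with rational arithmetic. A small sketch, assuming a single error rate x for both false positives and false negatives; the function name is made up for illustration.

```python
from fractions import Fraction


def chance_positive_is_correct(p: Fraction, x: Fraction) -> Fraction:
    """True positives divided by all positives, for prevalence p and error rate x."""
    true_pos = p * (1 - x)
    false_pos = (1 - p) * x
    return true_pos / (true_pos + false_pos)


# Whenever x == p, the two fractions are equal and a positive is a coin flip.
for value in (Fraction(1, 1000), Fraction(1, 100), Fraction(1, 4)):
    assert chance_positive_is_correct(value, value) == Fraction(1, 2)

# The thread's own numbers: p = 1/1000, x = 1/100.
print(chance_positive_is_correct(Fraction(1, 1000), Fraction(1, 100)))  # 11/122
```

The exact value 11/122 is roughly 9.02%.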
Luciana
27th April 2004, 12:21 PM
This thread has been reported but I see no breaking of the forum's rules. No action will be taken.
Civility, however, is always desirable...
Rolfe
27th April 2004, 12:21 PM
Originally posted by Wrath of the Swarm
Because I gave you an overall accuracy. Given any particular input, the test has a 99% chance of giving the correct answer. That holds whether the person has the disease or not.
In reality, tests don't always have equal chances of false positives and false negatives. That's not the case for the hypothetical test, though.
This is meaningless. You didn't say that the test was both 99% sensitive and 99% specific. I had to assume it before I could even begin.
Suppose the test was 100% specific and only 98% sensitive (bloody good test if it managed that). Would you, by that reasoning, still call that "99% accurate"? However, in that case, all positive results are correct, so the doctor knows he's right a priori.
Rolfe.
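To make the disagreement concrete, here is a sketch comparing three hypothetical tests whose sensitivity and specificity all average 99%, at the 1-in-1,000 prevalence from the original question. Treating "accuracy" as the arithmetic mean of the two is an assumption made here purely for illustration.

```python
def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Probability that a positive result is correct."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# (sensitivity, specificity) pairs that all average 0.99.
candidates = [(0.99, 0.99), (0.98, 1.00), (1.00, 0.98)]

for sens, spec in candidates:
    mean_accuracy = (sens + spec) / 2
    print(f"sens {sens:.2f} spec {spec:.2f} "
          f"mean {mean_accuracy:.2f} -> positive correct {ppv(0.001, sens, spec):.2%}")
```

The same nominal "99%" gives a positive result that is right about 9% of the time, 100% of the time, or under 5% of the time, depending on how the errors are split.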
Wrath of the Swarm
27th April 2004, 12:25 PM
Originally posted by Rolfe
NO, no and thrice no.
This is the whole point. This is the mistake most likely to be made by young graduates who have been brainwashed by statistics of the sort Wrath is peddling.
Now do you see why I went for the answer early and widened the question?
Predictive value of a test depends on the sensitivity, the specificity, and the prevelance of the condition in a population representative of the patient in question.
Wrong.
If there are diagnostic criteria that must be met before a test is performed, that's performing two different tests. One just isn't done in a laboratory. We then must consider the error rate of the initial screening by symptoms. After all, surely it's not infallible.
What Rolfe describes (winnowing the population before lab tests are performed) is good medicine, but she's incorrectly asserting what she's doing. We're talking about whether the test is correct or not, but she's talking about whether clinical judgments based on its result are correct, and that's a completely different issue.
Wrath of the Swarm
27th April 2004, 12:28 PM
Originally posted by Rolfe
This is meaningless. You didn't say that the test was both 99% sensitive and 99% specific. I had to assume it before I could even begin.
I did say that. I said the test is 99% accurate. That sets both values. If I said that the test would correctly identify a person with the condition 99% of the time, then there wouldn't be enough information for anyone to answer the question - you'd know the false negative rate, but not the false positive. But that isn't what I said.
It's a good thing you can look up the answers on a chart, because you sure as hell can't handle the concepts involved.
ceptimus
27th April 2004, 12:33 PM
Hmmm. Seeing as the cat is already out of the bag :) let's work this out using a population of 1,000,000 people. As the disease affects one out of every 1,000 people, we know that 1,000 people will be infected.
Of the one thousand people that are infected 990 will be told they have the disease and 10 will test negative.
Of the remaining 999,000 people who don't have the disease, 1% (i.e. 9,990) will be told they tested positive and the remaining 989,010 will be told they tested clear.
So a total of 990 + 9,990 = 10,980 people will be told they tested positive and of those only 990 people will really be ill.
So if you are told you tested positive for the disease, the chances that you actually have it are:
990 / 10,980 = 9.016393443 %
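The head-count above translates directly into integer arithmetic. A minimal sketch, assuming the same 1,000,000 people, 1-in-1,000 prevalence, and a test that is right 99% of the time for both affected and unaffected people; the function and key names are hypothetical.

```python
def natural_frequencies(population: int = 1_000_000,
                        one_in: int = 1000,
                        accuracy_pct: int = 99) -> dict:
    """Count people in each outcome cell using whole numbers only."""
    infected = population // one_in
    healthy = population - infected
    true_pos = infected * accuracy_pct // 100    # infected, told positive
    false_neg = infected - true_pos              # infected, told clear
    true_neg = healthy * accuracy_pct // 100     # healthy, told clear
    false_pos = healthy - true_neg               # healthy, told positive
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "true_neg": true_neg,
        "chance_ill_if_positive": true_pos / (true_pos + false_pos),
    }


counts = natural_frequencies()
print(counts)
```

The counts match the post: 990 true positives, 10 false negatives, 9,990 false positives, 989,010 true negatives.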
Paul C. Anagnostopoulos
27th April 2004, 12:37 PM
Wrath said:
It must be nice to be able to psychically determine what someone's position is before they state it and tear apart the holes in the arguments they haven't made yet.
You mean there wasn't a hidden agenda here? Wow, fooled me, too.
~~ Paul
Wrath of the Swarm
27th April 2004, 12:40 PM
Correct.
This is why people shouldn't be overly concerned about screening tests that return positive results. One positive HIV test doesn't mean very much - which is why when someone is found to be HIV positive, a second round of testing commences with a more expensive but higher-quality test that's less likely to give the wrong answer.
Of course, it's always possible that some unlucky person will get a false positive for multiple tests... but that's not as bad as the poor saps who get a false negative and never go on for more testing.
Anyway, it has been shown that a very large number of medical students have problems with this question - and even doctors interpreting the results of things like mammograms, PSAs, and HIV tests. A lot of research has gone into ways of presenting test data that are less likely to cause people to reach the wrong conclusions. When results are returned in terms of population frequency, people are much less likely to misunderstand what the tests mean.
Rolfe
27th April 2004, 12:43 PM
Hmmm, accuracy determined as the percentage of overall tests performed which are correct, irrespective of whether they are positive or negative.
This depends absolutely on the population you choose to test.
If you are testing overall a population which has a low disease incidence, you will come to the conclusion that virtually all your positives are wrong and virtually all your negatives are right. Thus so long as the test has good specificity, that is not spewing out too many false positives (99% is bloody brilliant), it will seem to have great "accuracy" no matter how bad the sensitivity.
If almost all the patients you test are unaffected, almost all your negative results will be right even if the test is actually missing quite a high proportion of affected individuals. You could have a sensitivity of only 50%, missing half of the true positives, but still claim 99% "accuracy" in this way. And it has been done.
However, such a test will be useless to you if you are testing sick individuals you suspect of having the disease. It will miss half of the real cases.
This is why the term "accuracy" is meaningless. First it is made up of sensitivity and specificity, which will almost certainly be different, and secondly if you're looking at overall numbers of "correct" results, you can get any answer you want just by choosing how you use the test.
Lousy sensitivity - display the figures of how it performs as a well-animal screen. You can't lose.
Lousy specificity - assume that the user will only be using it where the condition is strongly suspected on clinical grounds. It may still not look wonderful, but you can make it a lot rosier than it really is.
I've seen both ploys used to make things look better than they are. I'm wise to it. That's one of the reasons they ask me to scrutineer papers submitted to a number of professional journals.
Oh, hold still for one of the only two jokes I ever invented all by myself.
________________________________________
New! Cutting-edge technology! Statistically proven!
THE NEG-TEST™
Over 99.5% of all negative results guaranteed correct!*
NEVER produces a false positive!
Simple and inexpensive!
Method: Simply take the Neg-Test™ ballpoint pen provided, find the cat's clinical record, and write the words "FeLV negative". That's it. No need to take a blood sample, no messy reagents, no fiddly timing, no laboratory skill required.
Change to the Neg-Test™ in your practice today!
* Statistics only valid when the prevalence of infection in the population being tested is less than 0.5%.
____________________________________________
OK, you can quit with the hysterical laughter now. :D
Rolfe.
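The joke works because overall "accuracy" is a prevalence-weighted mix of sensitivity and specificity. A sketch of that arithmetic follows; the Neg-Test is modelled as sensitivity 0 and specificity 1, and the 0.1% prevalence used for the 50%-sensitivity example is an assumed value, not one given in the post.

```python
def overall_accuracy(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Fraction of all results (positive or negative) that are correct."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


def negative_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Fraction of negative results that are correct."""
    true_neg = (1 - prevalence) * specificity
    false_neg = prevalence * (1 - sensitivity)
    return true_neg / (true_neg + false_neg)


# A test that misses half of all cases still looks about 99% "accurate"
# in a mostly healthy population (assumed prevalence 0.1%).
print(f"{overall_accuracy(0.001, 0.50, 0.99):.2%}")

# The Neg-Test: always says "negative" (sensitivity 0, specificity 1).
print(f"{negative_predictive_value(0.005, 0.0, 1.0):.2%}")
```

At a prevalence of 0.5% the pen-and-paper "test" gets 99.5% of its negatives right, exactly as the small print promises.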
Wrath of the Swarm
27th April 2004, 12:48 PM
Originally posted by Paul C. Anagnostopoulos
You mean there wasn't a hidden agenda here? Wow, fooled me, too.
The "hidden agenda" was to permit me to begin a discussion of why people in general (and sometimes physicians) have problems with the question. Also to demonstrate that certain people aren't nearly as knowledgeable as they think they are.
For the record: I approve of most of modern medicine, and disapprove of most of alternative "medicine". It's the stuff I don't approve of in modern medicine that bothers me - and the unwillingness of some to admit that it could be made better.
pgwenthold
27th April 2004, 12:53 PM
Originally posted by ceptimus
Hmmm. Seeing as the cat is already out of the bag :) let's work this out using a population of 1,000,000 people. As the disease affects one out of every 1,000 people, we know that 1,000 people will be infected.
Of the one thousand people that are infected 990 will be told they have the disease and 10 will test negative.
Of the remaining 999,000 people who don't have the disease, 1% (i.e. 9,990) will be told they tested positive and the remaining 989,010 will be told they tested clear.
So a total of 990 + 9,990 = 10,980 people will be told they tested positive and of those only 990 people will really be ill.
So if you are told you tested positive for the disease, the chances that you actually have it are:
990 / 10,980 = 9.016393443 %
Read what Rolfe has written. This will be the case if the test is administered to everyone. However, if you administer the same test only to those who show other clinical signs, then the incidence of the _tested population_ is far higher than 1/1000.
Let's use a pregnancy test as an example. Suppose the pregnancy test is 99% accurate (in both directions) and that 1/1000 women are pregnant. If every woman is given a preg test, then about 90% of the positive results will be false positives.
OTOH, suppose that the only women who get tested are those who have missed a period. Now, there are lots of reasons to miss a period, but the main one is pregnancy. Let's say that 80% of the time when a woman misses a period, it is because of pregnancy. Thus, if only the women who have missed a period are given an exam, then 800/1000 would be pregnant. For a 99% test, 792 of the pregnant women would test positive, but 2 of the non-pregnant women would also test positive. Thus, the probability of being pregnant, given a missed period and a positive preg test, is 99.75%.
This is the same point that Rolfe has been making. Tests are not carried out in a vacuum.
Now throw in a woman who has not only missed a period but is also suffering morning sickness. At that point, the positive test is even more solid.
Rolfe
27th April 2004, 12:58 PM
Originally posted by Wrath of the Swarm
I said the test is 99% accurate. That sets both values.
Please be more clear, Wrath.
When you say that the test is 99% accurate, do you mean that 99% is the arithmetical mean of the sensitivity and the specificity, or do you mean that 99% of the results you get when you actually do the test are correct?
If the former, then I submit that 99% accurate could describe a test with 98% sensitivity and 100% specificity. In which case the doctor would be right anyway.
If the latter, then it would depend entirely on the percentage of the tested population which is actually being affected, and on the (possibly differing) values of sensitivity and specificity. (In the usual scenario, the majority of the tested population is assumed to be unaffected. This means that a test with great specificity will always look very good, no matter how lousy the sensitivity, while a test with great sensitivity may look diabolical if the specificity is poor.)
Thus one can be misled into thinking that good specificity is what matters and to hell with the sensitivity - especially for screening well patients.
In fact the opposite is true. You need almost perfect sensitivity quite desperately. Because you don't want to have to keep doubting and re-checking all your negative results, which will after all be in the large majority. If you can trust a negative result to be highly unlikely to miss an affected individual, then double-checking all your positives, within reason, isn't too much of a chore.
A test with 99.5% sensitivity and only 95% specificity is much more use to me in a screening situation than one with 99.5% specificity and only 95% sensitivity. That's because, with the former, I can virtually rely on the negatives and only have to recheck 5% (or a bit more) of my results, the positives. With the latter, I can't rely on either the positives or the negatives.
But the latter has better "accuracy" according to the second definition, in a mostly-unaffected population.
However, I'd settle for knowing which definition of "accuracy" you were using, for a start.
Rolfe.
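A sketch of the screening comparison, assuming 100,000 animals screened at a prevalence of 1% (both numbers are illustrative and do not come from the post). It reports how many affected individuals each test misses, how many positives would need rechecking, and the overall fraction of correct results.

```python
def screening_summary(prevalence: float, sensitivity: float,
                      specificity: float, screened: int = 100_000) -> dict:
    """Expected outcome counts when screening a mostly unaffected population."""
    affected = screened * prevalence
    unaffected = screened - affected
    true_pos = affected * sensitivity
    false_neg = affected - true_pos            # affected cases the test misses
    false_pos = unaffected * (1 - specificity)
    true_neg = unaffected - false_pos
    return {
        "missed_cases": false_neg,
        "positives_to_recheck": true_pos + false_pos,
        "overall_accuracy": (true_pos + true_neg) / screened,
    }


# Assumed prevalence of 1%; the two tests are the ones described in the post.
high_sensitivity = screening_summary(0.01, 0.995, 0.95)
high_specificity = screening_summary(0.01, 0.95, 0.995)

for name, result in (("99.5% sens / 95% spec", high_sensitivity),
                     ("95% sens / 99.5% spec", high_specificity)):
    print(name, {key: round(value, 4) for key, value in result.items()})
```

Under these assumptions the high-specificity test scores better on overall accuracy (about 99.5% against about 95%) yet misses roughly ten times as many affected individuals, while the high-sensitivity test sends a little under 6% of results for rechecking.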
Rolfe
27th April 2004, 01:02 PM
Originally posted by pgwenthold
This will be the case if the test is administered to everyone. However, if you administer the same test only to those who show other clinical signs, then the incidence of the _tested population_ is far higher than 1/1000.
Let's use a pregnancy test as an example. Suppose the pregnancy test is 99% accurate (in both directions) and that 1/1000 women are pregnant. If every woman is given a preg test, then 90% will be false positive.
OTOH, suppose that the only women who get tested are those who have missed a period. ....
pgwenthold, I think I'm in love with you.
You know, I have to explain this concept to two groups of people. Those who haven't originally heard it Wrath's way, for whom the light bulb comes on almost at once, and those who have heard the "predictive value" spiel without really thinking about what representative of the population in question actually means. They have a great deal of trouble, usually.
Rolfe.
Mercutio
27th April 2004, 01:03 PM
Originally posted by Wrath of the Swarm
Anyway, it has been shown that a very large number of medical students have problems with this question - and even doctors interpreting the results of things like mammograms, PSAs, and HIV tests. A lot of research has gone into ways of presenting test data that are less likely to cause people to reach the wrong conclusions. When results are returned in terms of population frequency, people are much less likely to misunderstand what the tests mean.
It seems to me you have been asked for your sources for this more than a couple of times in this thread.
There is, of course, a long line of research on cognitive heuristic use (your problem is one example of the "base-rate fallacy" within this literature). I don't know which sources you are referring to, but I am guessing it is probably Kahneman & Tversky, one of several different publication dates...
Anyway, this paper (http://www.amstat.org/publications/jse/v9n3/keeler.html) sums up quite a bit of the research--I don't see your particular claim in it, but I only did a quick once-over of the paper. If you have another source or sources in mind...please cite them.
Rolfe
27th April 2004, 01:07 PM
Originally posted by Wrath of the Swarm
It's a good thing you can look up the answers on a chart, because you sure as hell can't handle the concepts involved.
Sorry, I just realised what that implied.
Wrath, I wrote the spreadsheet. From scratch. In order to produce that graph I posted earlier, in order to demonstrate the absolute importance of assuming the correct value for the x-axis when deciding whether a result can be relied on or not.
Before anyone gets twitchy, yes, the graph was scanned in from a book. But as I am the author of the book, I think this is allowed, yes?
Rolfe.
drkitten
27th April 2004, 01:08 PM
Originally posted by Wrath of the Swarm
I did say that. I said the test is 99% accurate. That sets both values. If I said that the test would correctly identify a person with the condition 99% of the time, then there wouldn't be enough information for anyone to answer the question - you'd know the false negative rate, but not the false positive. But that isn't what I said.
It's a good thing you can look up the answers on a chart, because you sure as hell can't handle the concepts involved.
The situation isn't as clear-cut as you seem to think, Wrath. It's simply sloppy writing to cite one number and to assume that it applies equally to both the alpha and beta error rates. Another equally legitimate interpretation is that the test has a 99% accuracy rate in practice, but that figures aren't available to support breaking them out into false-positive and false-negative rates.
The standard terminology exists for a reason. Use it.
Your central point, however, is well-taken. This is a classic med-student error. I believe, however, that most experienced physicians have seen enough to know about this error. Have you relevant evidence on medical error rates? The JREF forum is hardly typical of medical practitioners in either math sophistication or medical training.....
Wrath of the Swarm
27th April 2004, 01:10 PM
Originally posted by Rolfe
Hmmm, accuracy determined as the percentage of overall tests performed which are correct, irrespective of whether they are positive or negative.
This depends absolutely on the population you choose to test.
No. Think about what you're saying. If all that matters is whether the result is correct, the distribution of the condition in the population is irrelevant unless there are different error rates for positives and negatives.
In this hypothetical 99% accurate test, it doesn't matter one bit whether everyone tested doesn't have the condition, everyone has the condition, or there's some intermediate state. 99% of the results are accurate, and 1% are not.
The strength of conclusions drawn from the results will depend on the population - but that's not what we're talking about. The power of the test is not the same as its accuracy.
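A small sketch of the point in dispute, assuming "accuracy" means the share of all results that are correct. With equal sensitivity and specificity the figure does not move with prevalence; with unequal values it does. The 98%/100% pair is borrowed from Rolfe's example elsewhere in the thread, and the function name is illustrative only.

```python
def overall_accuracy(sensitivity, specificity, prevalence):
    """Share of all test results that are correct at a given prevalence."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


for prevalence in (0.0, 0.001, 0.5, 1.0):
    equal_rates = overall_accuracy(0.99, 0.99, prevalence)      # Wrath's reading
    unequal_rates = overall_accuracy(0.98, 1.00, prevalence)    # Rolfe's counter-example
    print(f"prevalence={prevalence:<6} equal={equal_rates:.4f} unequal={unequal_rates:.4f}")
```

Both posters are describing the same formula; they differ on whether the two error rates can be assumed equal.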
Thank the beneficent powers you don't deal with people.
Wrath of the Swarm
27th April 2004, 01:13 PM
Originally posted by Mercutio
It seems to me you have been asked for your sources for this more than a couple of times in this thread.
Yes, I know.
The basic problem is a classic one. I'm trying to find the sources in which I read about the implications for screening tests several years ago.
If I recall correctly, doctors get the right answer more frequently than the general population, but they still tended to reach grossly wrong conclusions about whether a particular patient had a disease. I believe they overestimated the power of the tests significantly.
If I find some good sources on the subject, I'll get back to you.
Wrath of the Swarm
27th April 2004, 01:19 PM
Originally posted by drkitten
It's simply sloppy writing to cite one number and to assume that it applies equally to both the alpha and beta error rates. Another equally legitimate interpretation is that the test has a 99% accuracy rate in practice, but that figures aren't available to support breaking them out into false-positive and false-negative rates.
Overall accuracy includes both forms of error. If it's not stated that the probabilities can be further broken down, then there's no reason to presume that they do.
It's not sloppy. I avoided unnecessary complexities (which people are now trying to hide behind, I see.)
Wrath of the Swarm
27th April 2004, 01:22 PM
Well, I found this (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=11036111) at PubMed. I've found several other references to the research finding that doctors frequently have problems with Bayesian inferences, but not the research itself.
It's common knowledge within the profession, though. Let me keep looking.
geni
27th April 2004, 01:23 PM
Originally posted by Wrath of the Swarm
Overall accuracy includes both forms of error. If it's not stated that the probabilities can be further broken down, then there's no reason to presume that they do.
Context is important. In the JREF forums, making that kind of assumption with this kind of question probably means you are getting the wrong answer (check the puzzles section if you don't believe me).
drkitten
27th April 2004, 01:26 PM
Originally posted by Wrath of the Swarm
Yes, I know.
The basic problem is a classic one. I'm trying to find the sources in which I read about the implications for screening tests several years ago.
If I recall correctly, doctors get the right answer more frequently than the general population, but they still tended to reach grossly wrong conclusions about whether a particular patient had a disease. I believe they overestimated the power of the tests significantly.
If I find some good sources on the subject, I'll get back to you.
Daniel Kahneman won the Nobel Prize for this some years ago. The classic work on cognitive errors in the general public is Judgement Under Uncertainty, but I assume you have something more specific for medical professionals?
Wrath of the Swarm
27th April 2004, 01:28 PM
9. For the base-rate neglect question, the important finding from these studies (see also Hogarth and Einhorn, 1992, and Robinson and Hastie, 1985) is that the order in which people get the information makes a difference. Although it shouldn't make any difference what order they get information in, subjects usually put greater weight on the most recently received information (Adelman, Tolcott, and Bresnick, 1993, with military intelligence experts dealing with realistic military intelligence problems; Tubbs, Gaeth, Levin, and Van Osdol, 1993, with college students on everyday problems such as troubleshooting a stereo; Chapman, Bergus, Gjerde, and Elstein, 1993, with medical doctors on a realistic diagnosis problem). In more ambiguous situations the first impression had a lasting effect (Tolcott, Marvin, and Lehner, 1989).
11. Does it matter that people cannot accurately revise numerical probabilities (Christensen-Szalanski, 1986)? The deeper study of what people actually do, as called for by Koehler, can provide perspective. What do doctors do, for example, when ideally they should be forming hypotheses and revising hypothesis probabilities as they gather evidence?
12. It is not that they do a numerical integration more complex than Bayes' Theorem to revise probabilities (Gregson, 1993), as Hamm's (1987) explorations show. Doctors thinking aloud about cases don't even speak explicitly of probabilities (Kuipers, Moskowitz, and Kassirer, 1988), though when they are induced to do so it improves their decisions (Pozen, D'Agostino, Selker, Sytkowski, and Hood, 1984; Carter, Butler, Rogers, and Holloway, 1993).
13. Nor do doctors rely exclusively on learning probabilities from experience, like rats learning the contingencies on a lever (Spellman, 1993). While some of their knowledge is based on this kind of experience (Christensen-Szalanski and Beach, 1982; Christensen-Szalanski and Bushyhead, 1981), doctors have to know what to do with both the common diagnoses (8 out of 10) and the rare ones (1 in 10,000). Though in some situations, where people experience an event repeatedly, they can implicitly learn a base rate, in other situations, where people do not experience an event repeatedly but rather learn about it abstractly, they may also be able to take account of a base rate - but if they cannot, the consequences may be important.
14. How, then, do doctors usually handle diagnostic problems? Experts generally organize their extensive knowledge into mental scripts (Schmidt, Norman, and Boshuizen, 1990), complex rules that function with the speed of recognition to provide responses for familiar and unfamiliar situations. Explicit calculation of Bayesian probabilities is not a strength of this type of rule (cf. Hamm, 1993). Instead, experts' accuracy may be a function of the recognition processes, which can bring ideas to mind optimally (Anderson and Milson, 1989). Or accuracy may be due to well-tuned judgment processes governing response choice (Chapter 8 of Abernathy and Hamm, 1994).
15. If doctors' scripts are used accurately, producing results similar to those that wise use of Bayes' theorem would produce, this is due not only to the feedback of experience but also to reflection and to others' criticism (Chapter 11 of Abernathy and Hamm, 1994). Any form of argument can be applied toward justifying a change in a script, including arguments based on probabilistic analysis.
16. For example, when the screening tests for HIV first came out, Meyer and Pauker (1987) warned against ignoring the base rate, i.e., against assuming that someone with no risk factors has AIDS if their screen is positive for AIDS. Guided by such explicit discussion of the probabilities, and by individual cases of people devastated by false positive HIV screens, doctors' shared scripts were adjusted until now they don't recommend that patients be screened unless there are risk factors. The "1993 script" produces behavior that is, for the most part, consistent with a Bayesian analysis. Individual doctors using the script need neither think about probabilities nor understand the Bayesian principles. They just think of the rules, or of cases in which the script is implicit (Riesbeck and Schank, 1989). Note, of course, that this scenario depends on there being someone who understands the probabilistic principles and can shape the script that everyone else will use.
From this site (http://dieoff.org/page19.htm)
Paul C. Anagnostopoulos
27th April 2004, 01:29 PM
Wrath said:
Thank the beneficent powers you don't deal with people.
Sheesh.
~~ Paul
pgwenthold
27th April 2004, 01:38 PM
Originally posted by Rolfe
pgwenthold, I think I'm in love with you.
"First you have to move that damn cat."
Oh, sorry.
hey, I have a high affinity for vets (my wife begins fourth year rotations in 2 weeks). However, I'm not a cat person. We wouldn't get along. Besides, the aforementioned wife wouldn't go for it.
You know, I have to explain this concept to two groups of people. Those who haven't originally heard it Wrath's way, for whom the light bulb comes on almost at once, and those who have heard the "predictive value" spiel without really thinking about what representative of the population in question actually means. They have a great deal of trouble, usually.
Actually, I am one of the latter, and am very familiar with the John Allan Paulos take on the matter. However, you made a good point about testing populations. Since I don't know much about the tests for feline leukemia, I figured I'd put it in terms that most people would recognize.
Wrath of the Swarm
27th April 2004, 01:53 PM
This site (http://yudkowsky.net/bayes/bayes.html) has a nice discussion of the issue in simple terms. More importantly, it references research studies and asserts that the problem has been replicated many times.
Okay, so it's not a stellar reference... but I think it proves my point. My problem is that medical resources don't discuss the issue much - you'll find a lot more if you do a general Google on "do doctors have problems with Bayesian reasoning?"
Rolfe
27th April 2004, 01:56 PM
Originally posted by Rolfe
Please be more clear, Wrath.
When you say that the test is 99% accurate, do you mean that 99% is the arithmetical mean of the sensitivity and the specificity, or do you mean that 99% of the results you get when you actually do the test are correct?
If the former, then I submit that 99% accurate could describe a test with 98% sensitivity and 100% specificity. In which case the doctor would be right anyway.
If the latter, then it (gets considerably more complicated....)
This seems to have been missed. Please address. (Unless you did while I was writing this post, sorry, carried away again.)
I seem to have posted once assuming Wrath meant the former, then a second time assuming he meant the latter, then I see from yet another post that maybe he means the former after all.
Not hiding behind anything, Wrath.
Reference to simplistic form of the explanation that Wrath is peddling, at least the one that has caused me the most grief over the years.
JACOBSON, R. H. (1989) How well do serodiagnostic tests predict the disease status of cats? J. Am. Vet. Med. Ass. 199 (10), 1343-1347.
My pet hate quote from this pile of misinformation:
A negative test result .... is reliable in predicting that a cat does not have the infection/disease.
.... negative test results are good prognosticators of non-infected cats even if the sensitivity .... of the test is not good.
The example he used was that a sensitivity of 90% was just peachy, because in his scenario (only 1% of the "population" infected), 99.9% of the negative results would still be correct. He even remarked that even if the sensitivity was only 20% (!), this was OK because >99% of the negative results would still be correct.
It was at this point I was driven to grasp him metaphorically by the throat and point out that 90% sensitivity was still missing 10% of all infected cats, and I really didn't want that. 20% sensitivity is missing an incredible 80% of the infected cats, and there's no way this can be acceptable except by his crazy logic (which Wrath never extrapolated to, but it's where it goes if you don't rein it in).
Of course, this is where my "NEG-TEST(TM)" was born. The reductio ad absurdum of his premise is that if the sensitivity is zero, nevertheless, 99% of the negative results are still correct. The NegTest has a sensitivity of zero. I just tweaked it a little by reducing the hypothetical incidence of infection to 0.5% (not unreasonable in a healthy population, and in fact if you are talking closed and tested pedigree breeding establishments even 0.5% is a gross libel).
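A rough sketch of the reductio, for anyone who wants to see the numbers. It computes the share of negative results that are correct (the negative predictive value) at the sensitivities discussed above. The specificity of 100% is an assumption made purely to keep the example simple; the paper's actual figure is not given in this thread, and the function name is illustrative.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """Share of negative results that come from genuinely uninfected animals."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


SPECIFICITY = 1.0  # assumption for illustration only

for prevalence in (0.01, 0.005):
    for sensitivity in (0.90, 0.20, 0.0):
        npv = negative_predictive_value(sensitivity, SPECIFICITY, prevalence)
        missed = 1 - sensitivity  # share of infected cats the test fails to flag
        print(
            f"prevalence={prevalence:.3f} sensitivity={sensitivity:.2f} "
            f"negatives correct={npv:.4f} infected cats missed={missed:.0%}"
        )
```

The share of correct negatives barely moves as sensitivity collapses, because at low prevalence almost every animal tested is uninfected anyway; the share of infected cats missed is what changes.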
I don't know why the reductio ad absurdum wasn't spotted when the paper was published, but it wasn't.
This is the reason I have this explanation honed - it seems to imply that sensitivity doesn't matter, and so long as specificity is good (not too many false positives), you're laughing.
Of course, as I said above, the opposite is the case. For a viable screening test, you must be able to trust your negatives, not just trust to luck that the cats you're testing are in fact negative! If you can trust the negatives, you only need to get the (relatively few) positives double-checked. No problem. If you know that the bloody test is missing 10% or more of the infected cats, why do it at all?
In fact Jacobson did say something perfectly sensible in his paper.
When evaluating a serodiagnostic test result, the veterinarian should first consider whether the cat is at high risk (from a high prevalence group) or low risk (from a low prevalence group) for the condition under consideration.
The problem is that he didn't understand that "population" doesn't mean "the village where the cat lives", it means "cats like this one". Including the clinical presentation.
He spent the entire five pages only looking at the left-hand side of the graph, because he couldn't imagine a (geographical) population with more than about 10% incidence of infection. Of course, he isn't a veterinarian. He simply didn't think about the selection-to-test based on clinical presenting signs and the "population" you will be testing if you choose (as many vets do) to test only cats presenting with clinical signs suggestive of infection by the virus.
Once you think about that scenario, you realise you are way up to the right-hand-side of the graph, and positive results become relatively reliable while negative results are untrustworthy as hell.
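A minimal sketch of the kind of curve being described, not a reproduction of the graph from the book. It tabulates how reliable positive and negative results are as the prior probability of infection moves from the left-hand side to the right-hand side. The 99% sensitivity and specificity are assumed values taken from the thread's hypothetical test, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (share of positives that are correct, share of negatives that are correct)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Assumed test characteristics: 99% sensitivity and 99% specificity.
for prevalence in (0.001, 0.01, 0.1, 0.5, 0.9, 0.99):
    ppv, npv = predictive_values(0.99, 0.99, prevalence)
    print(f"prior={prevalence:>6.3f}  positives correct={ppv:.3f}  negatives correct={npv:.3f}")
```

With these assumed figures the positives are the weak point at low prior probability and the negatives are the weak point at high prior probability, which is the left-versus-right contrast in the post.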
And of course a "population" in this sense can be one cat, with all its features which put it closer to one side or the other of the graph. Indeed, the bottom line is you don't think about what other cats you tested or didn't choose to test that day, or week, or year, you assess that cat as an individual with probability/risk of infection of x.
I know it's not easy to get your head round, especially if you've got the sloppy version strongly pre-conceived. But it would be nice if Wrath at least read my posts.
Rolfe.
Rolfe
27th April 2004, 02:02 PM
Sorry, this is unworthy, but I'm getting a bit narked. (I only just saw the word I suspect led to the reporting of the thread, and yes, I'm not terribly flattered.)
Is it relevant that Wrath had to search PubMed to find something to back him up, after he'd been called on it? Whereas I reached for journals already in my bookcase, and was able to illustrate my point with a graph copied from a book of which I am in fact the author?
Rolfe.
Wrath of the Swarm
27th April 2004, 02:02 PM
I have read them. The question is whether you ever do - or if you think about what you read.
The quote you've presented from this person you argued with is utterly correct. "A negative test result .... is reliable in predicting that a cat does not have the infection/disease. .... negative test results are good prognosticators of non-infected cats even if the sensitivity .... of the test is not good."
That's the simple truth. It may not be a desirable test in a wider sense because it misses infected cats... but that has nothing to do with the points you quoted.
Obviously, the population tested will affect the results, in the sense that if everyone has the condition, there's no such thing as a false positive (or if everyone lacks it, there's no false negative and so forth). What's important is the accuracy of the test (whether in general or in distinguishing between alpha and beta error), because that is an objective, universal property of the test that doesn't change across populations.
Your hysterical rantings don't echo, Rolfe.
Wrath of the Swarm
27th April 2004, 02:07 PM
Why perform a test that misses 10% of the infected patients?
Maybe because not performing the test misses 100% of the infected patients by definition.
Because tests with low false negative rates usually also have high false positive rates? Because overall accuracy in diagnostic testing is extremely difficult to accomplish?
No, obviously he's just performing the test to make money off of his victimized clients, and you're swooping in to save the day! ;)
ceptimus
27th April 2004, 02:27 PM
If we have a test for a disease that doesn't exist in the population (say we have a test for smallpox - and smallpox has been eradicated) then no matter how 'good' the test is, any positives that it reports will be wrong.
I think when Wrath said the test was 99% accurate, and gave no further information, then you have to assume that out of every 100 people tested, whether they have the disease or not, 1 person will be told the incorrect result.
I realize this is not a likely scenario in the real world - I was just treating it as a puzzle. I like puzzles.
Rolfe
27th April 2004, 02:34 PM
Originally posted by Wrath of the Swarm
Your hysterical rantings don't echo, Rolfe.
Anyone else care to say if I'm ranting hysterically?
Wrath, you made two classic errors of presentation when you posted that question. With, I note, the not-very-well-hidden agenda of showing how clever you are and how stupid medical professionals are.
The first was the one which was obvious to everyone, where you quoted an "accuracy" figure which was meaningless as it stood, without stating that you were implying that sensitivity and specificity were both 99%.
I assumed this was just a sloppy way of saying that specificity was 99%, because sensitivity wasn't relevant to the question anyway. You have however dug yourself a deeper and deeper hole by declaring that this "99% accuracy" is some sort of combined sensitivity and specificity figure. This is a meaningless concept. You can't simply take an arithmetical mean of the sensitivity and specificity and call it "accuracy", and to assume (and to assume that we would assume) that they were equal is ludicrous.
100% sensitivity, 50% specificity. 75% accuracy?
100% specificity, 50% sensitivity. 75% accuracy?
75% sensitivity, 75% specificity. 75% accuracy?
These are three very different products, and nobody in their right mind would consider them all under the same banner, as "75% accuracy". By the way, if these were all that was available, which would you choose to stock, and why?
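To make the difference between the three products concrete, here is a rough sketch that counts expected outcomes per 100,000 people tested. The prevalence of 1 in 1,000 is an assumption carried over from the opening post; the product labels and the function name are hypothetical.

```python
def outcome_counts(sensitivity, specificity, prevalence, n=100_000):
    """Expected numbers of each outcome when n people are tested."""
    affected = round(n * prevalence)
    unaffected = n - affected
    true_pos = round(affected * sensitivity)
    false_pos = round(unaffected * (1 - specificity))
    return {
        "true_pos": true_pos,
        "false_neg": affected - true_pos,
        "false_pos": false_pos,
        "true_neg": unaffected - false_pos,
    }


PREVALENCE = 0.001  # assumption: 1 in 1,000, as in the opening post

# (sensitivity, specificity) for the three hypothetical products above.
products = {
    "product 1": (1.00, 0.50),
    "product 2": (0.50, 1.00),
    "product 3": (0.75, 0.75),
}

for label, (sens, spec) in products.items():
    mean = (sens + spec) / 2  # the "75% accuracy" obtained by averaging
    print(label, f"mean={mean:.2f}", outcome_counts(sens, spec, PREVALENCE))
```

All three average to 75%, yet one floods the clinic with false positives, one misses half the cases, and one does a fair amount of both.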
We can go round the houses on this relatively minor point all night.
However, the more important point is, why did the doctor decide to do the test? He knows that, and he will take it into account in deciding whether to believe a positive result or not.
Scenario 1. Wrath goes to the doctor for an insurance medical, feeling fine. Doctor checks him over carefully, and finds nothing wrong. But the insurance form requires that he tick the box to test for Galloping Varicella, as a routine.
In that situation, a 9.02% probability that the test Wrath described is correct is actually an overestimate unless all people with Galloping Varicella are clinically normal. What you need is the incidence of Galloping Varicella in the clinically normal population - which will undoubtedly be less than the incidence in the population as a whole, which of course includes those who are in the last stages of terminal disease from the condition.
Scenario 2. Wrath goes to the doctor for an insurance medical, feeling fine. Doctor checks him over carefully, and notes a couple of worrying things. He has a mole in the middle of his back, where he can't see it, and there is a faint but just perceptible cast to his left eye. The doctor knows that these two features, found together, are very suggestive indeed of Galloping Varicella, in fact he is about 80% sure Wrath has the condition. Although nothing is said about it on the insurance form, he decides to perform the test. Of course, knowing that there is still a 20% chance he is wrong, he just tells Wrath that the test is "routine", to save possibly unnecessary worry.
In that situation, there is a 99.75% probability that a positive result is right.
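The two figures can be checked with a short calculation. This is a sketch using the odds form of Bayes' theorem, assuming (as the thread's hypothetical does) that sensitivity and specificity are both 99%; the function name is illustrative.

```python
def probability_after_positive(prior, sensitivity=0.99, specificity=0.99):
    """Update a prior probability of disease after one positive result."""
    likelihood_ratio = sensitivity / (1 - specificity)  # how much a positive shifts the odds
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


scenario_1 = probability_after_positive(1 / 1000)  # no clinical suspicion, background rate
scenario_2 = probability_after_positive(0.80)      # doctor is already 80% sure

print(f"Scenario 1: {scenario_1:.2%}")
print(f"Scenario 2: {scenario_2:.2%}")
```

The test is identical in both scenarios; only the prior probability that the doctor brings to it differs.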
Do we criticise the doctor if he decides to break the bad news at that point?
This is a classic error. Look at a problem from outside, ignoring the inbuilt assumptions with regard to way of working that people build up over the years. Assume one scenario, and only one, because you don't have the experience to imagine any other. Then ambush some professionals with your assumed scenario, and completely fail to realise that they may be (consciously or unconsciously) answering the question from the point of view of a different scenario, the scenario they are familiar with.
The fact is, there is always a reason for testing, and that reason is part of the interpretation. Insurance requirement, although no suspicion? Strong clinical suspicion? Wrath believes that doctors do the test without thinking about this. They don't. But it's often so instinctive that you can make them look unreasonably stupid by pulling a Wee Kirkcudbright Centipede (http://sniff.numachi.com/~rickheit/dtrad/pages/tiKIRKCPED.html) on them. (Note, that text is all that is available, but it is the result of a bad transcription from a sung version by an American who didn't understand the lingo. For a start, the dance is the "Palais Glide", not the "parlor glide". What were they thinking of!)
And just to note this again, five out of five posters got the "right" answer before I posted a syllable, so Wrath's stated purpose of showing how bad the people on this forum are at this sort of reasoning was doomed from the start.
But we're having a much more entertaining discussion now, aren't we? :D
Rolfe.
Late edits only for spelling typos.
Dragon
27th April 2004, 02:37 PM
Originally posted by Wrath of the Swarm
This site (http://yudkowsky.net/bayes/bayes.html) has a nice discussion of the issue in simple terms. More importantly, it references research studies and asserts that the problem has been replicated many times.
Okay, so it's not a stellar reference... but I think it proves my point. My problem is that medical resources don't discuss the issue much - you'll find a lot more if you do a general Google on "do doctors have problems with Bayesian reasoning?"
Good link, Wrath - pity you didn't read it before your OP.
From the link (using breast cancer as an example) -
Figuring out the final answer always requires all three pieces of information - the percentage of women with breast cancer, the percentage of women without breast cancer who receive false positives, and the percentage of women with breast cancer who receive (correct) positives.
Oh dear - how many pieces of information in your OP?
Dragon
27th April 2004, 02:48 PM
On the subject of Bayes' Theorem - this thread (http://www.randi.org/vbulletin/showthread.php?s=&threadid=39384) that ca3799 has just started in GS&P got me thinking - surely it applies to polygraph testing? Can't find anything on http://antipolygraph.org/ about it yet - hmmm....
pgwenthold
27th April 2004, 03:02 PM
Originally posted by Rolfe
And just to note this again, five out of five posters got the "right" answer before I posted a syllable, so Wrath's stated purpose of showing how bad the people on this forum are at this sort of reasoning was doomed from the start.
But we're having a much more entertaining discussion now, aren't we? :D
Oh, but it is fun. The clown came in as pompous as could be intent on showing us how much smarter he was than everyone else, and then after fooling no one, got schooled big time.
After having it blow up in his face, he then runs to the literature (where, as demonstrated by Dragon, he blows it again), apparently abandoning his attempted exercise.
Entertaining it has been, hmmmm?
Wrath of the Swarm
27th April 2004, 03:11 PM
Originally posted by Dragon
From the link (using breast cancer as an example) - Oh dear - how many pieces of information in your OP?
All three. You were told the base rate of the disease in the population, and the accuracy of the test, which includes both the alpha and beta rates.
It's nice to see that most of the people who bothered responding did indeed choose the correct answer, although now we'll never know how many forumites would have chosen differently. A shame.
Dragon
27th April 2004, 03:17 PM
Originally posted by Wrath of the Swarm
All three. You were told the base rate of the disease in the population, and the accuracy of the test, which includes both the alpha and beta rates.
...
Nope - we had to assume what you meant by "accuracy". Rolfe has explained this to you already.
Rolfe
27th April 2004, 03:23 PM
Just a comment.
The first (permanent) part of my sig line wasn't chosen by accident.
Rolfe.
Wrath of the Swarm
27th April 2004, 03:25 PM
Originally posted by Rolfe
Wrath, you made two classic errors of presentation when you posted that question. With, I note, the not-very-well-hidden agenda of showing how clever you are and how stupid medical professionals are.
Pointless character assassination. The problem certainly doesn't make me look any smarter - I failed it the first time I saw it, many years ago.
It does make you look dumber, but since you're not a medical professional that doesn't quite count, does it?
The first was the one which was obvious to everyone, where you quoted an "accuracy" figure which was meaningless as it stood, without stating that you were implying that sensitivity and specificity were both 99%.
I wasn't "implying" it - it's a consequence of what I said. Sloppy interpretation.
You have however dug yourself a deeper and deeper hole by declaring that this "99% accuracy" is some sort of combined sensitivity and specificity figure. This is a meaningless concept. You can't simply take an arithmetical mean of the sensitivity and specificity and call it "accuracy", and to assume (and to assume that we would assume) that they were equal is ludicrous.
No one's claimed anything about an arithmetic mean - except you.
And it's certainly not ludicrous for the alpha and beta rates to be equal. It's somewhat improbable in the same sense that it's improbable for any two values to be the same, but there's no reason it can't happen.
That's two 'misinterpretations' made by you.
These are three very different products, and nobody in their right mind would consider them all under the same banner, as "75% accuracy".
Except anyone who uses English in the standard manner. Of course, we weren't considering such a situation - the accuracy of the test was established without reference to alpha and beta rates.
We can go round the houses on this relatively minor point all night.
But that would involve discussing how wrong your statements are and how pointless your objections have been, and we don't want that.
In that situation, a 9.02% probability that the test Wrath described is correct is actually an overestimate unless all people with Galloping Varicella are clinically normal. What you need is the incidence of Galloping Varicella in the clinically normal population - which will undoubtedly be less than the incidence in the population as a whole, which of course includes those who are in the last stages of terminal disease from the condition.
That's two tests. The first test involves an examination of the obvious symptoms - it has an accuracy all its own. Considering the results of test B in the light of test A is perfectly reasonable medical practice - but it's not an effective way to determine the accuracy of test B.
You have continually ignored this point, I suspect because you know you've made a major error and want desperately to direct attention away from it.
In the question, there were no other tests specified other than the one I gave you all information on. The potential ability to perform other tests has no bearing on the question I asked.
In that situation, there is a 99.75% probability that a positive result is right.
True. When we discuss that situation, we'll drop you a line.
Wrath believes that doctors do the test without thinking about this.
Liar. You have no idea what I believe, so you make up a position for me that you know you can successfully attack.
The point is not that this is how medical testing is performed. The point is that doctors fail to answer the question correctly. Being the self-appointed forum apologist for the field of medicine in general, you leap to explain how doctors base their judgments on additional clinical data, blah blah blah... ignoring the point that THEY CAN'T CARRY OUT A SIMPLE MATH PROBLEM.
Wrath of the Swarm
27th April 2004, 03:29 PM
Originally posted by Dragon
Nope - we had to assume what you meant by "accuracy". Rolfe has explained this to you already. If I say that I can identify a randomly-chosen card while blindfolded with 80% accuracy, would you have problems understanding that as well? Would you demand I offer accuracy ratings for each type of card?
The test has 99% accuracy; without further specification, that means that any response it gives has a 99% chance of being correct and a 1% chance of being wrong. There's your alpha and beta rates right there.
No further categorization is given; none is needed. You had all the information needed to answer the provided question.
geni
27th April 2004, 03:41 PM
I didn't. I don't know the standard equations, so I was trying to work it out from scratch and ran into an equation with two unknowns, which is of course unsolverble. Your card analogy is false, since the way you have stated it the question being asked is different from the one in this thread.
Wrath of the Swarm
27th April 2004, 03:45 PM
Two unknowns? Then you certainly weren't working the problem properly.
It is perfectly reasonable to talk about a test that has an accuracy rating. Not all tests have different rates of false positives and false negatives - and even if they did, we don't always care.
When given an accuracy and the population prevalence, you had sufficient information to solve the problem. I could have made it more complex and somewhat more realistically probable, but that not only wasn't necessary, it would have invalidated my point that doctors were unable to answer the question correctly. If I changed the question, why would I bring up those studies?
Admit it - your objection is groundless.
Rolfe
27th April 2004, 03:51 PM
Sob.
There are an infinite number of answers to this question, based on the information Wrath didn't clarify. However, Wrath chooses to declare only "his" answer to be correct.
Once again, Wrath, the scenario you post cannot exist in the abstract. You didn't tell us why the doctor did the test.
If it was for no other reason than because it was a box that had to be ticked (for example in an insurance medical), your 10% probability of the positive result being correct is in fact an overestimate, because you didn't tell us the incidence of the condition in the population with no suspicious clinical signs (less than 0.1% obviously, but we don't know how much less).
If it was because you came in with clear clinical symptoms suggestive of the condition, then the probability of the positive result being correct is pretty high (depending on a number of clinical factors).
Forgive me if I'm inclined to assume that if the doctor decided the result was correct, it might have been because he knew he was in the latter scenario.
You cannot put forward a hypothetical situation like this, then get miffed when people point out that your "correct" answer is only correct if a number of details which you haven't specified are exactly the way you have tacitly assumed them to be.
You did imply that the doctor's appointment was for a routine checkup, without any particular presenting signs. That's fine. But you didn't say why the doctor wanted to check for the condition. Now you impose more conditions than you originally stated, that he was just doing it for fun, or the greater enrichment of the laboratory, or (more likely) the test was a condition of an insurance policy or an employment contract. It could easily have been because the doctor's clinical acumen smelled a very aromatous rat.
You, however, want to assume the scenario that makes the doctor look a fool.
Now, for God's sake put me out of my misery and tell me what the bloody blue blazes that "99% accuracy" figure is supposed to mean. Stop assuming that sensitivity and specificity are equal, I know and you know and the entire medical laboratory profession knows that only hypothetical tests come like that.
So, for a real-life test, like the ones I deal with every day, which have unequal sensitivity and specificity, how are you calculating what you call "accuracy"?
the accuracy of the test was established without reference to alpha and beta rates
I assume that by "alpha and beta rates" you mean sensitivity and specificity - that's OK, we obviously come from areas with a different vocabulary. But I'd like to get this clear. So, if not like that, how in all that's holy was it established?
(Note, this part of the argument is not of my making. I originally assumed that Wrath meant 99% specificity, since specificity was the only figure relevant to the sum he had set. It's Wrath himself who keeps saying now that the 99% somehow incorporates both sensitivity and specificity. Not my problem if he can't then explain how.)
Wrath. Who can't carry out a simple maths problem? People here got it right. And if you still think I used a crib, I'll repeat that I wrote the spreadsheet I used myself, years ago, and only mentioned it to explain why I could do multiple scenarios of the problem relatively quickly.
We made the assumptions you wanted us to make. We got the "right" answer by your lights. However, we also realised where you were mistaken, which was in assuming that the reason for carrying out the test was irrelevant to the doctor's decision as to whether to go with the result or not.
Deal with it.
Rolfe.
yersinia29
27th April 2004, 03:56 PM
Wrath is a troll, guys. Don't waste your time. He won't address your points, and will just continue to spill his bile. He gets a philosophical thrill out of trying to obfuscate the arguments involved.
Wrath, I'm still waiting for your Bayesian analysis of MRI and x-rays.
Wrath of the Swarm
27th April 2004, 03:58 PM
Why the doctor performed the test has no bearing on the correct answer! Does it matter why Farmer Brown took away three apples from the box that held seven? No!
And there aren't an infinite number of answers. Without specifying different values for alpha and beta, we consider only error. Alpha and beta values follow from the overall accuracy.
Again: the hypothetical test was 99% accurate, so there was a 99% chance that any result it came up with would be correct. That tells you what the alpha and beta rates are - they're equal in this particular case.
Even you aren't stupid enough not to recognize this, so I'm forced to conclude you're being intentionally deceptive in order to support your 'point'.
yersinia29
27th April 2004, 04:10 PM
Originally posted by Wrath of the Swarm
Why the doctor performed the test has no bearing on the correct answer!
Of course it does you fool.
The incidence of lupus in the overall population of women is x%
The incidence of lupus in women with a butterfly rash, photosensitivity, and Raynaud's phenomenon is x+y%
Doctors don't just run lupus tests on random women. Therefore, the incidence that's used in calculating false positives and other parameters depends on x+y, NOT x.
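A minimal sketch of the point about x versus x+y, assuming a test whose sensitivity and specificity are both 99% (the reading of "99% accurate" used elsewhere in the thread). The function name and the list of pre-test probabilities are illustrative choices, not figures anyone in the thread supplied.

```python
def positive_predictive_value(prior, sensitivity, specificity):
    """P(disease | positive result) from Bayes' theorem."""
    true_pos = prior * sensitivity
    false_pos = (1.0 - prior) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)

# Hypothetical pre-test probabilities: background population first,
# then patients with increasingly suggestive clinical signs.
for prior in (0.001, 0.01, 0.10, 0.50):
    ppv = positive_predictive_value(prior, 0.99, 0.99)
    print(f"pre-test probability {prior:6.1%} -> positive result correct {ppv:6.1%}")
```

The same test gives very different answers depending on which population the patient is taken to belong to, which is the whole disagreement about why the doctor ordered it.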
ceptimus
27th April 2004, 04:13 PM
Wrath's original question was quite clear, and had a definite answer. If you wish to make up your own questions, it is quite likely that they will have different answers.
geni
27th April 2004, 04:20 PM
Originally posted by Wrath of the Swarm
Two unknowns? Then you certainly weren't working the problem properly.
We need to know what 99% accuracy means. It means that some of the tests are giving false positives or negatives. Therefore the inaccuracies are due to either false positives or negatives. What is the ratio of these inaccuracies? Ah, can't work that one out on the data given: problem is unsolverble.
It is perfectly reasonable to talk about a test that has an accuracy rating. Not all tests have different rates of false positives and false negatives - and even if they did, we don't always care.
But in this case we do care because it can have a big effect on the answer.
Admit it - your objection is groundless.
You used to work for edexcel didn't you? The problem as stated is unsolverble.
geni
27th April 2004, 04:22 PM
Originally posted by ceptimus
Wrath's original question was quite clear, and had a definite answer. If you wish to make up your own questions, it is quite likely that they will have different answers.
Nope, try plugging in the figures for what happens when the inaccuracy is entirely due to false negatives.
Rolfe
27th April 2004, 04:25 PM
Originally posted by ceptimus
Wrath's original question was quite clear, and had a definite answer. If you wish to make up your own questions, it is quite likely that they will have different answers.
If you thought that was clear, you don't understand the question.
What Wrath intended to be assumed was reasonably clear, because we know how his mind works, and by assuming that, the desired result was obtained.
However, he was dishonest because his unstated assumption was that the scenario was such that the doctor was wrong in assuming the positive result to be correct. It is equally if not more likely, in real life, that the scenario was not that assumed by Wrath, and that the doctor had a perfectly valid reason for assuming the result to be right.
Rolfe.
Wrath of the Swarm
27th April 2004, 04:32 PM
Point 1: The question, as I presented it, is the same question that was used in research with doctors.
Point 2: Even if you're so obsessed with proving me wrong that you're willing to claim I had phrased the question inappropriately, you must also claim that the hordes of psychology researchers and statisticians who wrote the question also screwed up... which I think goes just a bit farther.
Point 3: The question, as it stands, is perfectly comprehensible.
Point 4: It doesn't matter why the doctor ordered the test. There are plenty of tests that are used as screens. Furthermore, even in the ones that aren't, the error rates of the test are not dependent on the makeup of the tested population.
Point 5: If you want to link together multiple tests, fine. The analysis of the results becomes much, much more complicated. We have to determine the error rating(s) of the first test, the degree to which the first and second tests are independent, determine whether the initial tests are actually uniform (doctors can plausibly use many different symptoms to develop suspicions, and the probabilities for each might not be the same) and so forth.
Point 6: You're only making yourself look more like a fool the more you continue this, Rolfe. Admit you were wrong and get it over with.
Wrath of the Swarm
27th April 2004, 04:35 PM
Oh, and by the way: Rolfe is an excellent example of why the medical practitioners generally failed to answer the question properly:
They assumed facts not in evidence, and had excessive confidence in the ability of doctors to make accurate judgements.
Of course, Rolfe is not a qualified medical professional. So her inability to interpret a simple question properly means little.
ceptimus
27th April 2004, 04:37 PM
I disagree (with Rolfe and geni)
If someone states that a test is 99% accurate, and gives no other information, then you must assume that one test out of every 100 will give the wrong result, regardless of whether the persons being tested are diseased, healthy, or any mixture of the two.
It follows from this assumption (which is the only sensible one to make, given how the original question was phrased) that the error rates for both false positives and false negatives are 1%
I think your familiarity with the subject is making you try to read things into Wrath's question that simply were not there.
Geni - if you look back through the thread, you will see I gave a simple worked out solution in my first post. Wrath gave quite sufficient information to allow the question to be answered fully. As I already said, if you wish to ask different questions (or choose to believe that Wrath did) then they will likely have different answers.
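For reference, the calculation under this reading can be sketched in a few lines. It assumes, as stated above, that "99% accurate" means a 1% false positive rate and a 1% false negative rate; the variable names are illustrative. Exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction

prevalence = Fraction(1, 1000)   # one person in a thousand has the disease
accuracy = Fraction(99, 100)     # assumed: sensitivity = specificity = 99%

# Joint probabilities of a positive result with and without the disease.
p_positive_and_diseased = prevalence * accuracy
p_positive_and_healthy = (1 - prevalence) * (1 - accuracy)

# Bayes' theorem: P(disease | positive).
posterior = p_positive_and_diseased / (p_positive_and_diseased + p_positive_and_healthy)
print(posterior, f"= {float(posterior):.2%}")
```

Under these assumptions the posterior reduces to 99/1098, which is the roughly 9% figure quoted throughout the thread.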
geni
27th April 2004, 04:40 PM
Originally posted by Wrath of the Swarm
Point 1: The question, as I presented it, is the same question that was used in research with doctors.
Appeal to authority logical fallacy
Point 2: Even if you're so obsessed with proving me wrong that you're willing to claim I had phrased the question inappropriately, you must also claim that the hordes of psychology researchers and statisticians who wrote the question also screwed up... which I think goes just a bit farther.
Next it's going to be 100,000 European doctors, isn't it? I can just tell.
Context is everything.
Point 3: The question, as it stands, is perfectly comprehensible.
If by that you mean that I can guess what you mean then yes. However, without making this guess the problem is unsolverble
Wrath of the Swarm
27th April 2004, 04:45 PM
Originally posted by geni
Appeal to authority logical fallacy
No, you fool! The next bit is the appeal to authority! That's just the "appeal to keeping experimental modalities the same".
If by that you mean that I can guess what you mean then yes. However, without making this guess the problem is unsolverble
I quite agree. The problem is completely unsolverble. No one can solverb it!
Flan flan flan flan...
geni
27th April 2004, 04:47 PM
Originally posted by ceptimus
I disagree (with Rolfe and geni)
If someone states that a test is 99% accurate, and gives no other information, then you must assume that one test out of every 100 will give the wrong result, regardless of whether the persons being tested are diseased, healthy, or any mixture of the two.
It follows from this assumption (which is the only sensible one to make, given how the original question was phrased) that the error rates for both false positives and false negatives are 1%
(my Italics) I see no reason to assume. The error would be enough to get the question thrown out of an exam paper. The way Wrath of the Swarm presented the question made it clear that it was meant to throw you. In such cases it is vital that the question is sound and makes sure that the person trying to solve it does not have to make any assumptions. In this case an assumption had to be made for which I saw no reason to believe such an assumption should be reliable. As such the question was unsolverble.
ceptimus
27th April 2004, 04:51 PM
Originally posted by geni
(my Italics) I see no reason to assume. The error would be enough to get the question thrown out of an exam paper. The way Wrath of the Swarm presented the question made it clear that it was meant to throw you. In such cases it is vital that the question is sound and makes sure that the person trying to solve it does not have to make any assumptions. In this case an assumption had to be made for which I saw no reason to believe such an assumption should be reliable. As such the question was unsolverble.
I think you are being unfair. If I told you a remote viewer was asked to view whether someone was in a town or the country, and they were right 99% of the time, what would you assume then?
geni
27th April 2004, 04:52 PM
Originally posted by Wrath of the Swarm
No, you fool! The next bit is the appeal to authority! That's just the "appeal to keeping experimental modalities the same".
They both are appeals to authority; it's just that the second one contains an appeal to popularity as well. You didn't keep the experimental modalities the same since you changed the context.
Rolfe
27th April 2004, 04:54 PM
Originally posted by Wrath of the Swarm
Why the doctor performed the test has no bearing on the correct answer! Does it matter why Farmer Brown took away three apples from the box that held seven? No!
And there aren't an infinite number of answers. Without specifying different values for alpha and beta, we consider only error. Alpha and beta values follow from the overall accuracy.
Again: the hypothetical test was 99% accurate, so there was a 99% chance that any result it came up with would be correct. That tells you what the alpha and beta rates are - they're equal in this particular case.
This is confusing and conflating my two separate assumptions about what Wrath meant by "accurate" so badly that I'm barely capable of disentangling them. It's now clear that Wrath had even less idea about what he was talking about than I realised. Breathtaking.
there was a 99% chance that any result it came up with would be correct
Do you realise that you've just soundly contradicted yourself? The entire thrust of this thread was to demonstrate (correctly, for the conditions you assumed but did not state) that the chance the result in question was correct was in fact less than 10%.
Make up your mind.
There are only two ways I can see to get this "accuracy" figure.
An arithmetical mean of the sensitivity and specificity. If they were equal, then that would be right enough. But you've explicitly denied that this is how you calculate the figure.
Or the percentage of tests carried out in practice which are correct (positive or negative). This would seem more likely for a figure you now relabel as "error", but to calculate this you need all of the sensitivity, the specificity and the incidence of the condition in the population being tested.
Dream of a thousand cats (with apologies to Neil Gaiman).
1000 cats. Incidence of FeLV infection 10% (for whatever reason).
FeLV test, sensitivity 98%, specificity 95%.
We have 100 infected cats, and 900 uninfected cats.
Of the 100 infected, 98 are true-positive and 2 are false-negative.
Of the 900 uninfected, 855 are true-negative and 45 are false-positive.
Total results:
143 positive, of which 68.5% are correct.
857 negative, of which 99.8% are correct.
1000 results, of which 47 are wrong. Therefore 95.3% of the results on this population are correct. With the positives much more likely to be wrong than the negatives, as is quite often the case, special circumstances pertaining to individuals with very pathognomonic clinical presentations notwithstanding.
And you can see that if you plug in different values for the three original variables, you can get a wide variety of different answers.
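The thousand-cats arithmetic above can be re-run as a short sketch. The function name `tally` is a hypothetical helper; the inputs are the figures given in the post (1000 cats, 10% incidence, 98% sensitivity, 95% specificity).

```python
def tally(population, incidence, sensitivity, specificity):
    """Return (true_pos, false_neg, true_neg, false_pos) as whole animals."""
    infected = round(population * incidence)
    uninfected = population - infected
    true_pos = round(infected * sensitivity)
    false_neg = infected - true_pos
    true_neg = round(uninfected * specificity)
    false_pos = uninfected - true_neg
    return true_pos, false_neg, true_neg, false_pos

tp, fn, tn, fp = tally(1000, 0.10, 0.98, 0.95)
positives, negatives = tp + fp, tn + fn

ppv = tp / positives                            # share of positive results that are correct
npv = tn / negatives                            # share of negative results that are correct
overall = (tp + tn) / (positives + negatives)   # share of all results that are correct

print(f"{positives} positive, {ppv:.1%} correct")
print(f"{negatives} negative, {npv:.1%} correct")
print(f"{fp + fn} wrong overall, {overall:.1%} correct")
```

Changing any of the three inputs to `tally` changes all three percentages, which is the sense in which the "percentage of results that are correct" depends on the population tested.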
OK Wrath. These are two ways of calculating "accuracy" to definitions I can comprehend. Now would you please do me the maths for your derivation of 99%?
And it's quite ridiculous to assert that because you gave only one figure, we should assume the same figure applies to sensitivity and specificity. This pretty much never happens in the real world. To say that since only specificity was relevant to the question, you therefore meant to say "specificity", is reasonable and it's what I originally assumed.
But if 99% is some calculated figure from sensitivity and specificity, I at least want to know how you are going to calculate it when the two values are not equal.
Rolfe.
geni
27th April 2004, 04:56 PM
Originally posted by ceptimus
I think you are being unfair. If I told you a remote viewer was asked to view whether someone was in a town or the country, and they were right 99% of the time, what would you assume then?
If 999 people in your sample were in the town and 1 in the country I know exactly what I would assume. The assumption can totally mess up the results and as such is serious.
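One way to read this objection, sketched under the assumption that the hypothetical remote viewer simply answers "town" every time. The sample of 999 town cases and 1 country case comes from the post; the always-"town" strategy is an illustration, not something the post spells out.

```python
# 999 targets in the town, 1 in the country, as in the post.
locations = ["town"] * 999 + ["country"]

# Assumed strategy: a "viewer" with no ability who always answers "town".
guesses = ["town"] * len(locations)

correct = sum(guess == actual for guess, actual in zip(guesses, locations))
overall_accuracy = correct / len(locations)

# Accuracy broken down by where the target really was.
town_accuracy = sum(
    g == a for g, a in zip(guesses, locations) if a == "town"
) / locations.count("town")
country_accuracy = sum(
    g == a for g, a in zip(guesses, locations) if a == "country"
) / locations.count("country")

print(f"overall {overall_accuracy:.1%}, town {town_accuracy:.0%}, country {country_accuracy:.0%}")
```

A single headline accuracy figure can therefore hide error rates that differ completely between the two kinds of case.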
Wrath of the Swarm
27th April 2004, 04:57 PM
Originally posted by Rolfe
Breathtaking. Do you realise that you've just soundly contradicted yourself? The entire thrust of this thread was to demonstrate (correctly, for the conditions you assumed but did not state) that the chance the result in question was correct was in fact less than 10%.
Um, no.
The point of the thread was that, for a particular individual who had been given a positive result, there was only about a 10% chance they actually had the disease.
The chance that the test would give out the correct result was still 99%. But the disease was sufficiently uncommon that the chance the test would wrongly give a positive was much greater than the chance of a true positive.
Do you understand that the set of people given the test and the set of people who tested positive are not the same?
Would it help if I typed more slowly?
Doo yoou unnnderstaaaannnnd?
ceptimus
27th April 2004, 04:59 PM
Originally posted by Rolfe
The entire thrust of this thread was to demonstrate (correctly, for the conditions you assumed but did not state) that the chance the result in question was correct was in fact less than 10%.
No. Out of every 100 tests, 1 gave the wrong answer. You are misreading what Wrath said.
ceptimus
27th April 2004, 05:02 PM
This 'sensitivity' and 'specificity' is what is confusing you Rolfe. Wrath made no mention of those.
On average, out of every 100 tests carried out, 99 will give the correct answer, and 1 will give the wrong answer. That is all you need to know, and it is perfectly clear.
Rolfe
27th April 2004, 05:12 PM
Originally posted by Wrath of the Swarm
Point 1: The question, as I presented it, is the same question that was used in research with doctors.
Point 2: Even if you're so obsessed with proving me wrong that you're willing to claim I had phrased the question inappropriately, you must also claim that the hordes of psychology researchers and statisticians who wrote the question also screwed up... which I think goes just a bit farther.
Point 3: The question, as it stands, is perfectly comprehensible.
Point 4: It doesn't matter why the doctor ordered the test. There are plenty of tests that are used as screens. Furthermore, even in the ones that aren't, the error rates of the test are not dependent on the makeup of the tested population.
Point 5: If you want to link together multiple tests, fine. The analysis of the results becomes much, much more complicated. We have to determine the error rating(s) of the first test, the degree to which the first and second tests are independent, determine whether the initial tests are actually uniform (doctors can plausibly use many different symptoms to develop suspicions, and the probabilities for each might not be the same) and so forth.
Point 6: You're only making yourself look more like a fool the more you continue this, Rolfe. Admit you were wrong and get it over with.
Trawling through the ad-homs to get to the argument, such as it is....
Point 1. I don't care whether the same flawed question was used to ambush doctors. Appeal to authority. Geni spotted the flaw too while I was typing my initial post, so it wasn't exactly subtle.
Point 2. Same as point 1, appeal to authority.
Point 3. Only if you make the effort to figure out the unstated assumptions.
Point 4.
There are plenty of tests that are used as screens. Furthermore, even in the ones that aren't, the error rates of the test are not dependent on the makeup of the tested population.
(a) Yes, there are plenty of tests that are used as screens. But whether or not that is the case in this particular instance is something you didn't see fit to tell us.
(b) Kindly define "error rates of the test". Show me the maths. I showed you mine. Specificity and sensitivity are independent of the composition of the population being tested. That is why they are the figures to look for when assessing a product. You can then plug these in to different "populations" to get positive and negative predictive value, which are dependent on it. Your arguments seem to be slewing wildly between one definition and the other, which your lack of defining what you mean by either "accuracy" or "error rate" simply obfuscates completely.
Point 5. Combining sensitivities, specificities and clinical probability of infection to get an estimated predictive value for an individual test on an individual patient is more complex, I agree. Which is why I presented it as a graph (see page 1). You are making it needlessly complicated dragging in differential probabilities for individual clinical signs. While this might be a further refinement, to say "the clinical probability that this patient is affected is x%, to my most educated guess" is a perfectly workable way to go about it, and much superior to "people in general have a y% incidence of this condition" when you are dealing with a specified individual. It's not that hard.
Point 6.
You're only making yourself look more like a fool the more you continue this, Wrath. Admit you were wrong and get it over with.
I agree.
Rolfe.
geni
27th April 2004, 05:19 PM
Originally posted by ceptimus
This 'sensitivity' and 'specificity' is what is confusing you Rolfe. Wrath made no mention of those.
On average, out of every 100 tests carried out, 99 will give the correct answer, and 1 will give the wrong answer. That is all you need to know, and it is perfectly clear.
Problem is that the sensitivity and specificity are the two things that make up the accuracy. The result is I end up with an equation looking something like this:
P1 + P2 = X. Now I know X, but I don't know either of the other two values, so I'm slightly stuck.
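A sketch of the "one equation, two unknowns" situation, assuming "accuracy" is read as the share of all results that are correct in the tested population, so that accuracy = prevalence × sensitivity + (1 − prevalence) × specificity. That weighted form is an assumption made here for illustration; the post only gives the rough shape of the equation. The function names are hypothetical.

```python
def specificity_for(accuracy, prevalence, sensitivity):
    """Solve accuracy = prevalence*sens + (1 - prevalence)*spec for spec."""
    return (accuracy - prevalence * sensitivity) / (1.0 - prevalence)

def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive result) from Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)

prevalence, accuracy = 0.001, 0.99

# Every sensitivity below is paired with the specificity that keeps the
# overall accuracy at exactly 99% for this prevalence.
for sensitivity in (0.0, 0.5, 0.99, 1.0):
    specificity = specificity_for(accuracy, prevalence, sensitivity)
    ppv = positive_predictive_value(prevalence, sensitivity, specificity)
    print(f"sens {sensitivity:.2f}  spec {specificity:.5f}  PPV {ppv:.2%}")
```

On this reading every row is "99% accurate", yet the chance that a positive result is right varies from row to row, so a second figure or an explicit assumption (such as equal error rates) is needed to pin down one answer.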
Rolfe
27th April 2004, 05:21 PM
Originally posted by ceptimus
This 'sensitivity' and 'specificity' is what is confusing you Rolfe. Wrath made no mention of those.
On average, out of every 100 tests carried out, 99 will give the correct answer, and 1 will give the wrong answer. That is all you need to know, and it is perfectly clear.
Wrath made no mention, but he should have.
In fact, only the specificity is required for the calculation Wrath posed. Therefore I initially assumed that the sloppily-used "accuracy" figure was intended to be specificity. And stated this assumption clearly. It is Wrath himself who is denying this is what he meant.
Now, please think long and hard about the different things your second paragraph might mean, and the ways in which it is not "perfectly clear". Have another look at the "Dream of a thousand cats".
You are assuming that by 99% accurate, we can assume that 1% of unaffected individuals will test false positive, and 1% of affected individuals will test false negative.
That is simply an invalid assumption. Real tests in the real world have different values for these two figures, and they have to be quoted separately. You might say as a sweeping generalisation that a test was "highly accurate" if both figures were very good, but there's no meaningful way to combine them to a single "error rate" unless you do the entire thousand cats dance.
I'm not confused. I do this for a living. I have published a chapter in a book about it. And got very good book reviews from eminent professors, by the way.
I know what Wrath assumed, and I know what he wanted us to assume. That was clear from the first post. What is being discussed is the way this was set up without making these assumptions clear, and the fact that if you make other, equally valid assumptions, you get a completely different answer to the one Wrath wanted us to get.
Rolfe.
Rolfe
27th April 2004, 05:33 PM
Originally posted by Wrath of the Swarm
The point of the thread was that, for a particular individual who had been given a positive result, there was only about a 10% chance they actually had the disease.
The chance that the test would give out the correct result was still 99%. But the disease was sufficiently uncommon that the chance the test would wrongly give a positive was much greater than the chance of a true positive.
Quit with the ad-homs, it just gives me eyestrain.
For a particular individual who gets the test, there are all sorts of different probabilities that he is actually affected. Depending on the assumptions you make.
Now, could you do me the maths again (oh sorry I mean for the first time) to demonstrate how you arrive at "the chance that the test will give out the correct result is 99%". Accuracy, error rate, I don't care what you call it, just TELL ME HOW YOU WORK IT OUT.
Now tell me how, if you are treating every individual the same, that is as members of this "population" with 0.1% incidence, you can still simultaneously declare that the chance the test has given out the correct result is only 9.02%.
Where are these tests being done that have the 99% probability of being right, and what's it about this particular patient that gives him only a 9.02% chance of getting a correct result?
(Hint: You are going to have to consider the people getting the negative results here. I want to see the maths. And I want the bottom line to come out at 99% exactly, using the parameters you yourself have set.)
Rolfe.
Wrath of the Swarm
27th April 2004, 05:38 PM
Real tests in the real world do not necessarily have different values for the two numbers. They do frequently.
Your inability to comprehend his point makes the rest of your claims even more suspicious than they already are.
For my argument to be valid, I would have to use the same question as was used in the studies. That's a basic point of experimental design - which Rolfe clearly knows nothing about. You can't test the validity of an experiment without recreating its structure.
The second point *is* appeal to authority... just as Rolfe's claims about having written a book are appeals to authority. Who is more credible - psychological researchers, medical doctors, and statisticians, or Rolfe?
There are no unstated assumptions. The only person with assumptions is Rolfe, who can only think by rote and can't comprehend that alpha and beta values do not have to be specified, nor do they even have to be different.
Your attention-whoring is not going unnoticed.
Wrath of the Swarm
27th April 2004, 05:41 PM
Originally posted by Rolfe
Now tell me how, if you are treating every individual the same, that is as members of this "population" with 0.1% incidence, you can still simultaneously declare that the chance the test has given out the correct result is only 9.02%.
Strawman. The conclusion from a positive result that a given person has the disease has the 9.02% chance of being correct.
The chance of the test being wrong for any person is 1%. The chance of the test being wrong for a particular subset of people isn't necessarily the same.
Don't you understand any statistics at all?!
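The distinction between the error rate over everyone tested and the error rate within a subset can be tallied directly. This sketch assumes a population of 100,000 (an arbitrary size chosen so the counts come out whole) and equal 1% false positive and false negative rates, as in the thread's reading of the original question.

```python
population = 100_000
diseased = population // 1000          # prevalence of 1 in 1000
healthy = population - diseased

# Assumed: the test is right for 99% of diseased and 99% of healthy people.
true_pos = diseased * 99 // 100
false_neg = diseased - true_pos
true_neg = healthy * 99 // 100
false_pos = healthy - true_neg

# Share of wrong results among everyone, among positives, among negatives.
error_everyone = (false_pos + false_neg) / population
error_among_positives = false_pos / (true_pos + false_pos)
error_among_negatives = false_neg / (true_neg + false_neg)

print(f"wrong among all tested:   {error_everyone:.2%}")
print(f"wrong among positives:    {error_among_positives:.2%}")
print(f"wrong among negatives:    {error_among_negatives:.4%}")
```

Under these assumptions the overall error rate is 1% while about 91% of positive results are wrong, so both the 99% and the 9.02% figures hold at once; they describe different sets of people.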
Paul C. Anagnostopoulos
27th April 2004, 05:47 PM
Mentioning one's own book is an appeal to authority?
~~ Paul
Rolfe
27th April 2004, 05:49 PM
Originally posted by Paul C. Anagnostopoulos
Mentioning one's own book is an appeal to authority?
Probably. One has to counter the ad-homs somehow.
Rolfe.
geni
27th April 2004, 05:51 PM
Originally posted by Wrath of the Swarm
For my argument to be valid, I would have to use the same question as was used in the studies. That's a basic point of experimental design - which Rolfe clearly knows nothing about. You can't test the validity of an experiment without recreating its structure.
The test is already invalid as a repeat due to the change in context. Thinking that you have equivalence suggests that you haven't thought this through properly.
There are no unstated assumptions
Even your supporter disagrees with you here. Your assumption is that p1 = p2. You can see this in your own calculations back on page one.
Wrath of the Swarm
27th April 2004, 05:59 PM
Originally posted by Paul C. Anagnostopoulos
Mentioning one's own book is an appeal to authority?
The implication is that she's an expert on the subject, as she had cause to write a book.
Since she's demonstrating a complete lack of comprehension in this thread, I'd hate to read her book.
geni: Nothing about the context was changed. May I ask what you think changed between my question and the one presented in the studies?
steve74
27th April 2004, 06:02 PM
Originally posted by Wrath of the Swarm
Point 1: The question, as I presented it, is the same question that was used in research with doctors.
Point 2: Even if you're so obsessed with proving me wrong that you're willing to claim I had phrased the question inappropriately, you must also claim that the hordes of psychology researchers and statisticians who wrote the question also screwed up... which I think goes just a bit farther.
Point 3: The question, as it stands, is perfectly comprehensible.
Point 4: It doesn't matter why the doctor ordered the test. There are plenty of tests that are used as screens. Furthermore, even in the ones that aren't, the error rates of the test are not dependent on the makeup of the tested population.
Point 5: If you want to link together multiple tests, fine. The analysis of the results becomes much, much more complicated. We have to determine the error rating(s) of the first test, the degree to which the first and second tests are independent, determine whether the initial tests are actually uniform (doctors can plausibly use many different symptoms to develop suspicions, and the probabilities for each might not be the same) and so forth.
Point 6: You're only making yourself look more like a fool the more you continue this, Rolfe. Admit you were wrong and get it over with.
With regard to point 1: I'd like to know how you know the question you asked was the same question asked in the research, when you fully admit that you can't remember which research you read?
I'd guess you are referring to Casscells et al. (1978) (certainly the most famous example of a base rate neglect study in med students) where a similar example was given to a group of faculty, staff and fourth-year students at Harvard Medical School. Only 18% got anywhere near the correct answer. The question, in their study, was phrased rather more exactly than in your question, specifically:
“If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming that you know nothing about the person’s symptoms or signs? ____%”
As you can see this study was quite specific in mentioning false positives rather than your vague talk of 'accuracy'. Your question is underspecified as has been pointed out to you many times.
Casscells, W., Schoenberger, A. and Graboys, T. (1978). Interpretation by physicians of clinical laboratory results. New England Journal of Medicine, 299, 999-1000.
Of course if you were referring to another study I'm sure you'll cite it.
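For anyone who wants to check the arithmetic behind the Casscells question, here is a minimal Python sketch. The function name `positive_predictive_value` is made up for illustration, and the 100% sensitivity is an assumption: the question as quoted gives only the false positive rate, so the chance of a missed case has to be filled in somehow.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Bayes' theorem: P(disease | positive test)."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Figures from the quoted question: prevalence 1/1000, false positive rate 5%.
# Sensitivity is NOT stated in the question; 100% is assumed here.
casscells = positive_predictive_value(1 / 1000, 1.0, 0.05)
print(f"{casscells:.1%}")  # prints 2.0%
```

Under that assumption the answer comes out at roughly 2%, which is the figure usually quoted for this study.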
Interesting Ian
27th April 2004, 06:04 PM
Originally posted by slimshady2357
Too easy, but what's the chances of you having the disease if you get three positives in a row?
Adam
Just looked at it now and I thought ... ummm ... surely it's very obviously 10%. But when I voted I scarcely thought that 13 out of the previous 14 would have voted the same as me! :eek:
I'm agreeing with everyone. This is seriously worrying :( ;)
Rolfe
27th April 2004, 06:10 PM
Originally posted by Wrath of the Swarm
Strawman. The conclusion from a positive result that a given person has the disease has the 9.02% chance of being correct.
The chance of the test being wrong for any person is 1%. The chance of the test being wrong for a particular subset of people isn't necessarily the same.
Don't you understand any statistics at all?!
Wrath, it seems I understand them better than you.
You are simply continuing to assert that under the very restricted conditions you impose, this figure is correct. What I am trying to get through to you is that in the real world, these conditions do not apply.
Just tell me one actual diagnostic test which has been through proper sensitivity and specificity testing and has come out with identical results for both. And I don't want vague manufacturers' claims, I want your actual studies, with actual patients and actual reference testing for comparison. I can find you plenty that aren't identical. (Unfortunately they're in a box in my office, not on the Net, because the Veterinary Record has only dragged itself into the IT age this year.)
[Digression. If you go through the entire thousand cats for 0.1% incidence, 99% sensitivity and 99% specificity, you do indeed get 99% of results correct for that particular combination. Though I doubt if you knew that. I'd still like to see you do the working.]
No, it's easier than that. JUST TELL ME THE CALCULATION YOU USE TO DERIVE THAT 99% ACCURACY/ERROR RATE FIGURE. Can you do it? Given that even you can't possibly assert that EVERY test has equal sensitivity and specificity.
Or simply admit that you meant to say "99% specificity" in the first place (because that was all we needed to know), but were a bit loose in your terminology.
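One way to see what is at stake in the "99% accurate" figure is to write the overall accuracy out as a weighted sum. The sketch below is only an illustration: `overall_accuracy` is a made-up name, and the 90% sensitivity in the second pair of calls is a hypothetical figure chosen to show what happens when the two rates differ.

```python
def overall_accuracy(prevalence, sensitivity, specificity):
    """Fraction of all results (positive and negative) that are correct."""
    correct_positives = prevalence * sensitivity
    correct_negatives = (1.0 - prevalence) * specificity
    return correct_positives + correct_negatives


# Equal sensitivity and specificity (the thread's 99%/99% case):
# the overall accuracy stays at 99% whatever the prevalence.
print(round(overall_accuracy(0.001, 0.99, 0.99), 4))
print(round(overall_accuracy(0.5, 0.99, 0.99), 4))

# Hypothetical unequal rates (90% sensitivity, 99% specificity):
# the single "accuracy" number now shifts with the prevalence.
print(round(overall_accuracy(0.001, 0.90, 0.99), 4))
print(round(overall_accuracy(0.5, 0.90, 0.99), 4))
```

When the two rates are equal, one number can stand in for both; when they are not, a single "accuracy" figure only makes sense for a stated prevalence.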
Now, back to the more interesting question.
If you test everybody in the world, and pool their results, then your figure is correct. 9.02% of the positive results are correct. (And 99.99898% of the negative results are correct, to save you a job.)
Guess what. We don't care. We don't test everybody in the world, and even if we did, we wouldn't be testing them as unidentified zombies, but as individuals with their own characteristics.
If the individual in question is in a group less likely than the whole to be affected (that is, clinically normal), then we reduce the probability of the positive result being correct accordingly, by plugging in the correct incidence in the group to which this patient belongs, because the prevalence to be considered has to be the prevalence in the population to which the patient belongs.
But if he is in a group more likely to be affected, we increase the probability of the positive result being correct.
The bare probabilities applicable to the population of "everybody in the world" may be of interest to statisticians, but they are only a starting point when making a clinical decision.
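The dependence on the patient's group can be sketched in a few lines of Python. This is an illustration only: `predictive_values` is a made-up name, 99% is taken for both sensitivity and specificity as in the thread, and every prior other than 0.1% is a hypothetical stand-in for a lower-risk or higher-risk group.

```python
def predictive_values(prior, sensitivity, specificity):
    """Return (PPV, NPV) for a given prior probability of disease."""
    true_pos = prior * sensitivity
    false_neg = prior * (1.0 - sensitivity)
    false_pos = (1.0 - prior) * (1.0 - specificity)
    true_neg = (1.0 - prior) * specificity
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative)
    return ppv, npv


# 0.1% is the population figure from the thread; the other priors are
# hypothetical, standing in for clinically normal or symptomatic groups.
for prior in (0.0001, 0.001, 0.01, 0.1, 0.5):
    ppv, npv = predictive_values(prior, 0.99, 0.99)
    print(f"prior {prior:.2%}  PPV {ppv:.2%}  NPV {npv:.5%}")
```

At the 0.1% prior the positive predictive value is about 9.02%, matching the pooled figure above; it falls for lower priors and rises steeply as the prior increases.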
If you had worded the question in a totally abstract way, asking for the percentage of positive results which would be right assuming that the incidence in the population being tested was 0.1% (and we'd managed to agree that it was the specificity which was 99%), then fine.
But you can't ask a question about an individual patient, with vital information which you simply leave out, and continue to assert that this validity still holds.
Rolfe.
Paul C. Anagnostopoulos
27th April 2004, 06:13 PM
Ian said:
I'm agreeing with everyone. This is seriously worrying
That is because this thread is causing a rampant, free-floating vortex in the space/time continuum. Consider the players. Consider the opinions. Consider the personalities. Nothing like this has ever occurred in this universe before.
~~ Paul
Wrath of the Swarm
27th April 2004, 06:33 PM
There is no vital information being left out. You've given everything you need to know about the conclusion and the factors affecting the test.
Symptoms are irrelevant. They matter only to presorting, which is a form of test. Combining two tests makes everything much more complex, and it's not the situation asked about in the research (I mentioned several posts after I complained about not being able to find it that I was looking in the wrong place for references).
What if the disease we were discussing was HIV, or a similar infection with few (if any) obvious symptoms?
The test has objective and universal error rates that are sometimes the same for alpha and beta and sometimes different. In this example, they are the same. We do not want to consider extraneous details that would make answering the question harder - you can barely manage this one as it is.
I am pointing out that, when presented with a simple question involving the use of a diagnostic test with known accuracy in a particular generic situation, the vast majority got it wrong.
I suspect that if you had asked them, most of them would have predicted they'd get it right.
This is the problem. Rolfe, as the resident mindless-defender-of-the-medical-status-quo, denies that there is a problem and attacks that which makes the existence of the problem clear. When dealing with people whose positions are grossly incorrect, her mindless rancor actually aids her. But she can't tell the difference - she'll attack anything and everything.
Rolfe
27th April 2004, 06:34 PM
Originally posted by steve74
The question, in their study, was phrased rather more exactly than in your question, specifically:
“If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person found to have a positive result actually has the disease, assuming that you know nothing about the person’s symptoms or signs? ____%”
Now that is valid. State clearly that the figure being given is the "false positive rate", which is simply 100 - specificity (or to be absolutely exact, the other way around, that is specificity is defined as 100 - the false positive rate). So, specificit