
 JREF Forum The Rationality Behind Ockham's Razor


 9th January 2008, 01:15 PM #1 rocketdodger Philosopher Join Date: Jun 2005 Location: Hyperion Posts: 6,668 The Rationality Behind Ockham's Razor

So I was reading through Artificial Intelligence: A Modern Approach and I came across a proof of the rationality of the razor. If you are like me, and you have wondered why the simplest solution is usually the best choice, you might be interested in this:

Suppose we have a set of hypotheses, all of which match the data perfectly. In fact this set is infinite, but that is not important. Furthermore, suppose there is some metric that can be used to measure the "complexity" of the hypotheses, which loosely defined would be something like the number of terms in a polynomial for a data-fitting problem, or the number of steps in an algorithm, etc. Thus, there is some discretization that can be applied and "complexity" can be reported as an integer. For some number n, there is a hypothesis of complexity n that is the simplest hypothesis, i.e. for all hypotheses in the set their complexity is greater than or equal to n.

Mathematically, it can be shown that as complexity increases, the number of possible hypotheses does not decrease: |{hypotheses with complexity m}| <= |{hypotheses with complexity m+1}|.

Now, out of an infinite set of hypotheses, which should we choose? Because as complexity increases the number of "apparently" correct hypotheses (those that match the data) increases, the chances of choosing the actually correct hypothesis (the one modeling the real process used to generate the data) at a given complexity level decrease. In mathematical terms, the prior probability of a hypothesis being true decreases as complexity increases. Because one should choose a hypothesis to maximize the combination of prior and posterior probability, and all hypotheses that match the data have a posterior probability of 1, one should base the choice on prior probability alone.
In plain language, one should choose the hypothesis with the lowest probability of being wrong. Mathematically, this is always the least complex hypothesis that matches the data.
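The counting argument above can be made concrete with a small enumeration. This is a minimal sketch under illustrative assumptions of my own, not the textbook's construction: hypotheses are polynomials with integer coefficients in {-2, ..., 2}, "complexity" is the number of coefficients, and two data points come from y = x + 1.

```python
from itertools import product

# Toy hypothesis space: polynomials with small integer coefficients.
# "Complexity" = number of coefficients. Data generated by y = x + 1.
COEFFS = range(-2, 3)
data = [(0, 1), (1, 2)]

def consistent(num_coeffs):
    """All coefficient tuples (c_0, ..., c_{n-1}) matching every data point exactly."""
    return [c for c in product(COEFFS, repeat=num_coeffs)
            if all(sum(ci * x**i for i, ci in enumerate(c)) == y
                   for x, y in data)]

for k in (2, 3, 4):  # line, quadratic, cubic
    matches = consistent(k)
    # Picking blindly among the data-matching hypotheses at this level,
    # the chance of landing on the data-generating one is 1/len(matches).
    print(k, len(matches))
```

In this toy space the number of data-matching hypotheses grows from 1 to 4 to 18 as the complexity level rises, so the chance of blindly picking the true one at a given level shrinks accordingly, which is the shape of the argument in the OP.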
 9th January 2008, 01:17 PM #2 Loss Leader Opinionated JerkModerator     Join Date: Jul 2006 Location: New York Posts: 11,882 Thanks. Very informative. __________________ Follow me on Twitter! @LossLeader This force is receiving all the right to vote through the use of magic. - Miernik Wieslaw VOTE FOR ME JUST BECAUSE
 9th January 2008, 01:30 PM #3 fls Penultimate Amazing Join Date: Jan 2005 Posts: 10,236 Originally Posted by rocketdodger Because one should choose a hypothesis to maximize the combination of prior and posterior probability, and all hypotheses that match the data have a posterior probability of 1, one should base the choice on prior probability alone. I don't understand this part.
How can all hypotheses have a posterior probability of 1 if only one is actually correct? ETA: Have I got the question right? "What is the probability this hypothesis is the actually correct hypothesis given 'this' level of complexity?" ETA2: I think that what is meant is that the likelihood ratio is 1? Linda __________________ God: a capricious creative or controlling force said to be the subject of a religion. Evidence is anything that tends to make a proposition more or less true. -Loss Leader SCAM will now be referred to as DIM (Demonstrably Ineffective Medicine) Look how nicely I'm not reminding you you're dumb. -Happy Bunny When I give an example, do not assume I am excluding every other possible example. Thank you. Last edited by fls; 9th January 2008 at 01:39 PM.
 9th January 2008, 01:32 PM #4 Trakar Philosopher Join Date: Oct 2007 Posts: 5,920 Originally Posted by rocketdodger In plain language, one should choose the hypothesis with the lowest probability of being wrong. Mathematically, this is always the least complex hypothesis that matches the data. Shouldn't that be the least complex hypothesis that properly accounts for all relevant data/factors? And the obvious problem with mistaking this assessing tool for a divining rod of reality is that we are almost never able to completely and accurately know all the relevant data/factors. So an Ockham-based judgement should always be conditional on the known information.
 9th January 2008, 01:36 PM #5 MWare Muse     Join Date: Dec 2005 Location: Brooklyn Posts: 666 I thought Occam's Razor postulated that the best explanation for phenomena is the one that requires the least amount of assumptions, not necessarily the simplest explanation. Am I wrong about that? __________________ “The plural of anecdote is not evidence." --George Stigler "I am all in favor of a dialogue between science and religion, but not a constructive dialogue. One of the great achievements of science has been, if not to make it impossible for intelligent people to be religious, then at least to make it possible for them not to be religious. We should not retreat from this accomplishment." --Steven Weinberg Last edited by MWare; 9th January 2008 at 01:37 PM.
 9th January 2008, 02:14 PM #6 rocketdodger Philosopher Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by fls I don't understand this part. How can all hypotheses have a posterior probability of 1 if only one is actually correct? ETA: Have I got the question right? "What is the probability this hypothesis is the actually correct hypothesis given 'this' level of complexity?" ETA2: I think that what is meant is that the likelihood ratio is 1? Linda I probably have the terms wrong, sorry (I always f--- up these two terms..). Here is how I use them: prior probability = the mathematical probability that a hypothesis is correct to the exclusion of all the others at its level of complexity, which all else being equal is simply 1/ posterior probability = the probability that the hypothesis, if true, will lead to the observed data. Last edited by rocketdodger; 9th January 2008 at 02:27 PM.
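In the standard terminology, the bookkeeping being described looks like this: posterior ∝ prior × likelihood, and when every candidate fits the data exactly, every likelihood is 1, so the posterior ranking collapses onto the prior ranking. A sketch with made-up level sizes (the numbers are illustrative, not from the thread):

```python
# Illustrative level sizes: how many hypotheses exist at each
# complexity level. These numbers are invented for the sketch.
level_sizes = {1: 2, 2: 10, 3: 50}

def prior(level):
    # All-else-equal prior for one hypothesis at this level:
    # equal mass per level, shared equally within the level.
    return (1 / len(level_sizes)) / level_sizes[level]

likelihood = 1.0  # every candidate here matches the data exactly

# posterior ∝ prior × likelihood; with likelihood 1 everywhere,
# the ranking is decided by the prior alone.
posterior = {lvl: prior(lvl) * likelihood for lvl in level_sizes}
best = max(posterior, key=posterior.get)
print(best)
```

Under these assumptions `best` is level 1: the least complex level wins purely on prior probability, which is the conclusion the OP draws.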
 9th January 2008, 02:16 PM #7 rocketdodger Philosopher Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by TShaitanaku Shouldn't that be the least complex hypothesis that properly accounts for all relevant data/factors? And the obvious problem with mistaking this assessing tool for a divining rod of reality is that we are almost never able to completely and accurately know all the relevant data/factors. So an Ockham-based judgement should always be conditional on the known information. That's why I stipulated the assumption that all the hypotheses match the data. If only some of them match, or none of them match but some are close, then obviously the decision is more complex.
 9th January 2008, 02:22 PM #8 rocketdodger Philosopher Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by MWare I thought Occam's Razor postulated that the best explanation for phenomena is the one that requires the least amount of assumptions, not necessarily the simplest explanation. Am I wrong about that? It seems they are equivalent. By definition, an assumption doesn't influence the hypothesis in a way that can be tested; otherwise it is not an assumption. Thus assumptions simply increase the complexity of a hypothesis. Thank you for bringing up the question though, it made me think for a minute or two!
 9th January 2008, 04:20 PM #9 athon Kowalski Join Date: Aug 2001 Location: gone Posts: 9,286 Originally Posted by MWare I thought Occam's Razor postulated that the best explanation for phenomena is the one that requires the least amount of assumptions, not necessarily the simplest explanation. Am I wrong about that? I was thinking the same thing. The way I've always known it (and taught it) was that Ockham's Razor gives you an efficient starting point for evaluating hypotheses to gain confidence in their being productive. Given two hypotheses, the one which can account for the most observations with the fewest assumptions has a higher chance of not encountering an observation which will falsify it. An assumption has a higher chance of being untrue than an observation, which means those hypotheses accounting for the most observations without needing assumptions to make them true will be more likely to be useful. Then again, I don't think this 'proof' is incompatible with this view, ultimately. I just think it complicates it. Athon Last edited by athon; 9th January 2008 at 04:21 PM.
 10th January 2008, 02:58 AM #10 Egg Graduate Poster     Join Date: Nov 2007 Posts: 1,241 Originally Posted by rocketdodger prior probability = mathematical probability that a hypothesis is correct in exclusion to all the others in its' level of complexity, which all else being equal is simply 1/ Doesn't this suggest that all hypotheses of the same level of complexity, however ridiculous they may be, are equally likely? Is that a reasonable assumption to make? Also this probability wouldn't include all of the hypotheses we haven't thought of.
 10th January 2008, 06:17 AM #11 fls Penultimate Amazing Join Date: Jan 2005 Posts: 10,236 Originally Posted by Egg Doesn't this suggest that all hypotheses of the same level of complexity, however ridiculous they may be, are equally likely? Is that a reasonable assumption to make? Also this probability wouldn't include all of the hypotheses we haven't thought of. There would be some number of possible hypotheses that fit the data equally well (i.e. they are all equally ridiculous or equally non-ridiculous) at each level of complexity. This would include hypotheses we have thought of and those we haven't. Each level of complexity contains at least the same number of hypotheses as the one below it. I haven't thought this through thoroughly, but it seems that, without any prior constraints on the level of complexity the 'true' hypothesis will have, it will be easier to find the 'true' hypothesis at the lowest level of complexity only if the 'true' hypothesis actually sits at the lowest level. Without prior constraints, it is more likely that one of the higher levels of complexity contains the 'true' hypothesis, just because the higher levels contain the greatest proportion of all hypotheses. I can't tell from the description in the OP how these two competing influences are reconciled to come up with the answer they give. Fortunately, I think the point is moot, though. Hypotheses seem to compete on the issue of explanatory power, not number of assumptions. Linda
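The point about where the 'true' hypothesis is likely to live can be put in numbers. A sketch under an assumption stated above (no prior constraint on the truth's complexity, so it is equally likely to be any hypothesis); the level sizes are illustrative:

```python
# Illustrative level sizes; higher complexity levels hold more hypotheses.
sizes = {1: 2, 2: 10, 3: 50}
total = sum(sizes.values())

# If the true hypothesis is equally likely to be ANY hypothesis,
# the chance it lives at a given level is proportional to that
# level's share of all hypotheses.
truth_lives_at = {lvl: n / total for lvl, n in sizes.items()}
print(truth_lives_at)
```

With these sizes, level 3 holds about 81% of the probability of containing the truth, even though a single pick is easiest to get right at level 1 (1 chance in 2 candidates versus 1 in 50), which is exactly the tension described.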
 10th January 2008, 06:57 AM #12 Beerina Sarcastic Conqueror of Notions     Join Date: Mar 2004 Location: A floating island above the clouds Posts: 23,835 Originally Posted by athon An assumption has a higher chance of being untrue than an observation, which means those hypotheses accounting for the most observations without needing assumptions to make it true will be more likely to be useful. I like that. I.e. the more complicated, the more that can go wrong. __________________ "Great innovations should not be forced [by way of] slender majorities." - Thomas Jefferson The government should nationalize it! Socialized, single-payer video game development and sales now! More, cheaper, better games, right? Right?
 10th January 2008, 09:33 AM #13 CaptainManacles Muse Join Date: Apr 2005 Posts: 818 Originally Posted by rocketdodger In plain language, one should choose the hypothesis with the lowest probability of being wrong. Mathematically, this is always the least complex hypothesis that matches the data. This is the way I've always thought about it.
 10th January 2008, 10:21 AM #14 rocketdodger Philosopher Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by fls However, without prior constraints, it is more likely that one of the higher levels of complexity contains the 'true' hypothesis, just because the higher levels contain the greatest proportion of all hypotheses. I think the structure of the hypothesis space puts some prior constraints into effect that negate this. For instance, it seems to me that each complexity level is a subset of all the ones above it. As an example, a line is simply a 2nd-degree polynomial with the leading coefficient set to zero, so an equivalent hypothesis could be put in both the 1st-degree and 2nd-degree complexity levels.
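The nesting can be checked directly. A small sketch (my own example, using y = x + 1): the same function written as a line and as a zero-padded cubic is one hypothesis appearing at two complexity levels.

```python
# y = x + 1 as a line and as a "cubic" with zero high-order terms:
# the same hypothesis, so each level contains copies of the levels below.
line  = (1, 1)           # c0 + c1*x
cubic = (1, 1, 0, 0)     # identical function, padded with zero terms

def evaluate(coeffs, x):
    """Evaluate a polynomial given as a tuple of coefficients (low degree first)."""
    return sum(c * x**i for i, c in enumerate(coeffs))

# The two representations agree at every test point.
assert all(evaluate(line, x) == evaluate(cubic, x) for x in range(-5, 6))
```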
 10th January 2008, 10:21 AM #15 becomingagodo Banned Join Date: Mar 2007 Posts: 698 This is nonsense. Quote: If you are like me, and you have wondered why the simplest solution is usually the best choice, you might be interested in this: Yeah, when I open the textbooks on quantum mechanics it is simple. Quote: Suppose we have a set of hypotheses, all of which match the data perfectly. In fact this set is infinite, but that is not important. Okay. Even if that was possible, which it is not. Do we have the ability to make an infinite amount of hypotheses? Well, no. So technically, this argument only works if we knew all the data and if we had an infinite amount of hypotheses. Genius! I am totally convinced. Since I can count to infinity I will now find the last digit of pi. It is 7. Also, can you define the concepts simple and complex? It is impossible to argue something when you use ill-defined concepts. Also, if all the hypotheses matched the data perfectly, wouldn't they be the same? There would be no difference; there would only be superficial differences. Last edited by becomingagodo; 10th January 2008 at 10:29 AM.
 10th January 2008, 10:25 AM #16 blutoski Philosopher Join Date: Jan 2006 Location: Vancouver BC Canada Posts: 5,966 Originally Posted by athon Then again, I don't think this 'proof' is incompatible with this view, ultimately. I just think it complicates it. I agree. I think it's just reiterating the same thing in new language. The point of Ockham's Razor is to allocate resources, not answer actual questions. Ockham used the expression 'entities,' as in "entities should not be multiplied unnecessarily." Calling it 'more complex' instead of 'more entities' doesn't really add anything. It's really just a rule of thumb that enforces the somewhat conservative nature of science as a social phenomenon. It's a generalization of uniformitarianism and an application of induction. __________________ "Sometimes it's better to light a flamethrower than curse the darkness." - Terry Pratchett Last edited by blutoski; 10th January 2008 at 10:30 AM.
 10th January 2008, 11:13 AM #17 CaptainManacles Muse Join Date: Apr 2005 Posts: 818 Originally Posted by becomingagodo This is nonsense. You should be more careful about what you say before you start flinging around insults. Your post was so stupid it made me fear for humanity. Quote: Yeah, when I open the textbooks on quantum mechanics it is simple. It is simple considering the data that it explains. And it is a matter of the simplest hypothesis that explains the data. Quote: Okay. Even if that was possible, which it is not. Do we have the ability to make an infinite amount of hypotheses? Well, no. So technically, this argument only works if we knew all the data and if we had an infinite amount of hypotheses. Genius! It requires neither of these things. The statement was that the number of possible hypotheses is infinite, not that we make an infinite number of them. There are an infinite number of possible hypotheses. We, by definition, have all the data that we have. We don't have to have all possible data; this is for building hypotheses based on current data that will maximize future predictive success. If we had all the data possible, the whole process would be pointless. Quote: Also, can you define the concepts simple and complex? It is impossible to argue something when you use ill-defined concepts. They did define these concepts. Quote: Also, if all the hypotheses matched the data perfectly, wouldn't they be the same? There would be no difference; there would only be superficial differences. No. Clearly not. Why would that be the case? You clearly missed the point of the entire OP; you don't seem to understand basic algebra or the scientific method well enough to be participating in this discussion. Try drawing a few points on a piece of paper. Consider how many different pictures one could draw that would include these dots.
Consider whether these pictures would be exactly the same with only superficial differences, or whether you could draw things completely different from one another. This is what we do in science. We have a bunch of dots and we try to paint a picture of what lies in between them. There are an infinite number of ways to do this, but we have rules to pick the best. One of the best rules is Ockham's Razor; it basically says that if we have 100 dots in a neat line, then we draw a line. We don't draw the Battle of Waterloo, even if the Battle of Waterloo would include those dots. Similarly, we currently say, based on the data, that the universe started by rapidly expanding and cooling. We don't say "A magical figure who loves us and speaks English and reads our minds and lives in some fairy realm beyond our understanding wished the universe into creation for his own unfathomable reasons. And the creator hates certain people," and we don't say that for the same reason we don't say a giant purple duck quacked the universe into existence. Such a theory might even fit the data, but it makes unnecessary assumptions; it's not as simple as the alternative. Consider heliocentric theories versus geocentric theories. Both actually describe the data. One is far simpler. That is why we accept one and not the other. Then relativity is even simpler than that. It may seem complex to say speed and gravity warp time, but it is simpler than stating "the planets revolve around the sun in accordance with the laws of gravity, except Mercury, which decides for no reason to dance around every now and again." A problem that the bending of light and relativistic forces solved.
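The dots-and-pictures point can be sketched in code. The construction below is my own illustration, not from the post: take the simple fit and add any multiple of a polynomial that vanishes at every observed x, and you get another "picture" through the same dots that disagrees everywhere else.

```python
# The "dots": three points generated by y = x + 1.
xs = [0, 1, 2]
ys = [1, 2, 3]

def vanisher(x):
    # (x - 0)(x - 1)(x - 2): zero at every observed x,
    # so adding a multiple of it never disturbs the fit.
    out = 1
    for xi in xs:
        out *= (x - xi)
    return out

def hypothesis(k):
    """The k-th member of an infinite family, all matching the dots exactly."""
    return lambda x: (x + 1) + k * vanisher(x)

for k in range(3):
    h = hypothesis(k)
    # Every member reproduces the observed dots...
    assert all(h(x) == y for x, y in zip(xs, ys))
    # ...but they disagree off the data, e.g. at x = 3.
    print(k, h(3))
```

At x = 3 the three hypotheses predict 4, 10, and 16: they agree on every dot and diverge immediately beyond them, which is why "matching the data perfectly" does not make hypotheses the same.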
 10th January 2008, 12:03 PM #18 patnray Graduate Poster Join Date: Mar 2002 Location: San Jose, CA Posts: 1,008 Originally Posted by becomingagodo This is nonsense. No, it isn't. What you really mean is "This doesn't make any sense to me." Just because you don't understand it does not make it nonsense. Infinity is a powerful and useful concept. Mathematics is a very precise tool for helping us understand things, including the concept of infinity. Please review this post again after you have passed a course in analytical mathematics and you understand the following:
- the proof that the repeating decimal 0.999... equals 1,
- that one infinite set can be larger than another infinite set, and
- that the sum of an infinite number of terms can have a finite value.
__________________ Infidel by Ayaan Hirsi Ali A powerful and moving story of a strong and courageous woman’s struggle to free herself from a culture that treats women as property. Despite repeated death threats from religious zealots, she campaigns tirelessly for the rights of Muslim women. A tearful, chilling, yet inspiring, tale of personal triumph and dedication to free expression.
 10th January 2008, 12:37 PM #19 becomingagodo Banned Join Date: Mar 2007 Posts: 698 Quote: It is simple considering the data that it explains. And it is a matter of the simplest hypothesis that explains the data. Yeah, but then it's relative. Also, it's an assumption that there is not a simpler explanation; the point is that "complex" and "simple" are poorly defined, as they are subjective. Since I don't know the definition of simple, I can only assume it is the same. Quote: The statement was that the number of possible hypotheses is infinite, not that we make an infinite number of them. There are an infinite number of possible hypotheses. If we don't have all the possibilities, how do you pick the simplest one? Quote: We, by definition, have all the data that we have. We don't have to have all possible data, I'm confused. You either have all the data or all the possible data. What do you mean by possible data? Do the hypotheses have some randomness? Quote: They did define these concepts. Can somebody post the definition? Quote: Try drawing a few points on a piece of paper. Consider how many different pictures one could draw that would include these dots. Consider whether these pictures would be exactly the same with only superficial differences, or whether you could draw things completely different from one another. Do you know what superficial means? It means on the surface. Again, if you connect the dots in every possible way, it would be superficial. Yes, you would have different patterns, but the patterns will have the same structure, which is the dots. Again, structure gives something substance, not the outside, even if it has lots of lines drawn on its surface. Quote: We have a bunch of dots and we try to paint a picture of what lies in between them. There are an infinite number of ways to do this, but we have rules to pick the best. Not really. You can't look at evidence for evolution and then say, "Here, this proves creationism." It just wouldn't make any sense.
Quote: One of the best rules is Ockham's Razor; it basically says that if we have 100 dots in a neat line, then we draw a line. We don't draw the Battle of Waterloo, even if the Battle of Waterloo would include those dots. Lines are superficial, i.e. the surface. Only the structure, i.e. the dots, matters. Science is not connecting the dots. Quote: Similarly, we currently say, based on the data, that the universe started by rapidly expanding and cooling. This has nothing to do with the best or simplest theory. Is a finite universe simpler than an infinite universe? Again, the big bang made a prediction, which was correct. Quote: Such a theory might even fit the data, but it makes unnecessary assumptions; it's not as simple as the alternative. First of all, you can't disprove it. Second, you can't make predictions. So I don't see how this is a scientific theory. "God did it" is not a scientific theory. Quote: One is far simpler. No, it's not. Again, it's a subjective thing. Quote: That is why we accept one and not the other. Wow, and I thought it was because of the evidence. You know, Galileo. Quote: Then relativity is even simpler than that. It may seem complex to say speed and gravity warp time, but it is simpler than stating "the planets revolve around the sun in accordance with the laws of gravity, except Mercury, which decides for no reason to dance around every now and again." A problem that the bending of light and relativistic forces solved. So this is going on the assumption that gravity is simple. Can you please give me the theory of everything, as I want to know it. Again, you can make something sound simple by sneaking past complex ideas. As for relativity, isn't relativity really QFT? Can you explain that in simple terms and explain the mathematics behind it? I really want to understand the mathematics of relativity. It must be simple, by Ockham's Razor. Quote: No, it isn't. What you really mean is "This doesn't make any sense to me."
Just because you don't understand it does not make it nonsense. No, I don't. The infinity argument was my weakest argument. However, I was just saying it is impractical to say something is correct in real life using an argument that involves infinity, as infinity does not occur in nature. Last edited by becomingagodo; 10th January 2008 at 12:48 PM.
 10th January 2008, 12:57 PM #20 patnray Graduate Poster Join Date: Mar 2002 Location: San Jose, CA Posts: 1,008 Originally Posted by becomingagodo No, I don't. The infinity argument was my weakest argument. However, I was just saying it is impractical to say something is correct in real life using an argument that involves infinity, as infinity does not occur in nature. Nonsense. Numbers do not occur in nature. But numbers are very useful things. The square root of negative one does not occur in nature, but it is very convenient for modeling some physical processes (ask any electrical engineer). Infinity is a very powerful and useful concept for understanding "real life".
 10th January 2008, 01:03 PM #21 rocketdodger Philosopher Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by becomingagodo Yeah, but then it's relative. Also, it's an assumption that there is not a simpler explanation; the point is that "complex" and "simple" are poorly defined, as they are subjective. Since I don't know the definition of simple, I can only assume it is the same. Is a straight line simpler than a cubic curve? Originally Posted by becomingagodo If we don't have all the possibilities, how do you pick the simplest one? You don't; you pick the simplest one you know of, which is kind of the whole point. Originally Posted by becomingagodo I'm confused. You either have all the data or all the possible data. What do you mean by possible data? A straight line can generate an infinite amount of data, which is the set of "all possible data." A scientist might only have a few data points out of that set, which is "the data." Originally Posted by becomingagodo Do the hypotheses have some randomness? They can -- the concept is not affected. It is much easier to understand with purely deterministic hypotheses, however. Originally Posted by becomingagodo Can somebody post the definition? It is different depending on the problem. Clearly, if the hypothesis space is the set of all polynomials that satisfy the data, then complexity is related to the degree of the polynomial. You can extrapolate this to all kinds of hypothesis spaces; for example, if they are all Bayesian networks, then the number of edges and nodes in the network is the primary factor. Originally Posted by becomingagodo Again, if you connect the dots in every possible way, it would be superficial. Yes, you would have different patterns, but the patterns will have the same structure, which is the dots. Not if the dots are only a subset of all possible data. Originally Posted by becomingagodo Not really. You can't look at evidence for evolution and then say, "Here, this proves creationism."
It just wouldn't make any sense. Sure it does -- all of evolution is nothing more than a mechanism put in place and guided constantly by God. Prove me wrong. Originally Posted by becomingagodo Lines are superficial, i.e. the surface. Only the structure, i.e. the dots, matters. Science is not connecting the dots. To non-thinking entities, yes, what you say is true. To humans, however, who want to make predictions about the future, the difference between three dots generated by a straight line and three identical dots generated by a cubic curve is very important. Seriously, do you have any clue as to what you are babbling about? Originally Posted by becomingagodo No, I don't. The infinity argument was my weakest argument. However, I was just saying it is impractical to say something is correct in real life using an argument that involves infinity, as infinity does not occur in nature. That is probably why I said the fact that we can generate infinitely many hypotheses doesn't matter -- as in, it is not an essential part of the argument.
 10th January 2008, 01:54 PM #22 Lucky Graduate Poster     Join Date: Apr 2004 Location: Yorkshire Posts: 1,196 rocketdodger, this 'proof' is very interesting, and I'd say there's something in it, but as described by you it doesn't work - the handling of probability is wrong. Originally Posted by rocketdodger Because one should choose a hypothesis to maximize the combination of prior and posterior probability, and all hypotheses that match the data have a posterior probability of 1, one should base the choice on prior probability alone. Scrub this paragraph – it's meaningless. There is no prior vs posterior probability, as we are not performing any test or operation that changes the probability. The prior probability (i.e. prior to some hypothetical test that could actually distinguish the single correct hypothesis) is what we want to determine. Originally Posted by rocketdodger Now, out of an infinite set of hypotheses, which should we choose? Because as complexity increases the number of "apparently" correct hypotheses (those that match the data) increases, the chances of choosing the actually correct hypothesis (the one modeling the real process used to generate the data) at a given complexity level decrease. This is the core of the argument, and it is missing some crucial points. Let's simplify. We divide all possible hypotheses (i.e. ones that fit the data) into groups - the rationale doesn't matter, but they must have differing numbers of members. The proof boils down to saying that we should choose from the smallest group. But why? The interesting thing is to look at the hidden assumptions. Now, if we simply assume each hypothesis is equally probable then the proof fails, because the categorisation becomes irrelevant (obviously, we can't assume that lower-complexity hypotheses are a priori more probable by Ockham's razor, as that's what we're trying to prove). 
My initial thought was that the proof would work, though, if for some reason it gets harder to create each hypothesis as group size increases (i.e. each potential one has a greater chance of being missed). You state that the larger groups are higher complexity than the smaller ones, so there does seem some reason to expect that these will also be the groups where hypotheses are missed. But on second thoughts, my argument seems wrong. It would still be the case that each discovered hypothesis had an equal probability. Was there anything to suggest the authors had this explanation in mind? On the other hand, we could assume that the probability is the same for each group. In this case, hypotheses in smaller groups obviously have a higher probability of being the correct one. But is it a reasonable assumption? At the very least, complexity would have to be a meaningful grouping parameter for independent reasons. Also, hypotheses at the same level of complexity would somehow have to 'pool' their probability. It occurs to me that this could have to do with information content and redundancy - perhaps they are more likely to contain the same incorrect information as each other. Again, these requirements seem not implausible, but would need to be proved. Note that the two possible explanations are not the same. The proof should make clear which one it's using. Hmm, needs more thinking about. __________________ I believe that economic advances merely provide the opportunity for a step forward which, as yet, hasn't happened. All we have done is to advance to a point at which we could make a real improvement in human life, but we shan't do it without the recognition that common decency is necessary. George Orwell
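Lucky's two candidate assumptions can be contrasted concretely. The sketch below uses made-up group sizes (nothing here comes from the book or the thread), but it shows why the choice of assumption decides whether the proof goes through:

```python
# Hypothetical group sizes: group k ("complexity" k) has 2**k member
# hypotheses, so larger complexity means a larger group.
group_sizes = {1: 2, 2: 4, 3: 8}
total = sum(group_sizes.values())

# Assumption A: every individual hypothesis is equally probable.
# Then the grouping is irrelevant to any single hypothesis's chance.
p_single_A = {k: 1 / total for k in group_sizes}

# Assumption B: each *group* carries equal total probability (1/3 here).
# Then a hypothesis in a smaller group is individually more probable.
p_single_B = {k: (1 / len(group_sizes)) / n for k, n in group_sizes.items()}

# Under B (and only under B), a member of the smallest group is the
# best single pick -- which is what the 'proof' needs.
best = max(p_single_B, key=p_single_B.get)
```

Under assumption A every hypothesis comes out at 1/14 and the razor gains nothing; under assumption B the smallest group's members are the most probable individually, which is exactly the step the proof would have to justify.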
 10th January 2008, 02:02 PM #23 becomingagodo Banned   Join Date: Mar 2007 Posts: 698 Quote: Is a straight line more simple than a cubic curve? No Quote: You don't, you pick the simplest one you know of, which is kind of the whole point. Simplest is subjective. Quote: Clearly, if the hypothesis space is the set of all polynomials that satisfy the data, then complexity is related to the degree of the polynomial. You can extrapolate this to all kinds of hypothesis spaces, for example if they are all Bayesian networks then the number of edges and nodes in the network are the primary factor. So you're saying size equals complexity. Poor argument. Polynomials basically follow the same structure. The point is you can make a generalization of the polynomial, and then use it like crazy. Quote: Not if the dots are only a subset of all possible data. I'm assuming the dots are important structurally. If they're not then the lines will still be superficial. The point I'm trying to make is that drawing lines on a surface is trivial. Quote: Sure it does -- all of evolution is nothing more than a mechanism put in place and guided constantly by God. Prove me wrong. The point is I can't prove you wrong, therefore it is pointless. Science works by the scientific method. Or do we have to abandon the scientific method so your argument holds water? Quote: To non-thinking entities, yes what you say is true. To humans, however, who want to make predictions about the future, the difference between three dots generated by a straight line and three identical dots generated by a cubic curve is very important. That wasn't the question he posed. He said that there were lots of different patterns and that we had to pick the simplest. That's different from only having four dots and a cubic equation that fits them perfectly. Quote: That is probably why I said the fact that we can generate infinitely many hypotheses doesn't matter -- as in, it is not an essential part of the argument. 
Not really, there would still be an infinite number of simple or even simpler hypotheses. Since we can't get at them, the argument falls apart. Last edited by becomingagodo; 10th January 2008 at 02:03 PM.
 10th January 2008, 02:07 PM #24 CaptainManacles Muse     Join Date: Apr 2005 Posts: 818 Originally Posted by becomingagodo Yeah, but then it's relative. The truth of a hypothesis can be greater or lesser, relative to other hypotheses, yes. Quote: Also, it's an assumption that there is not a simpler explanation No, it's not. *eyeroll* Quote: the point is that complex and simple are poorly defined No, the OP provided an exact definition. Quote: If we don't have all the possibilities how do you pick the simplest one? You pick the simplest out of the ones available. Ockham's Razor doesn't give absolute truth, it just gives the best truth available. Quote: I'm confused. You either do have all the data or all the possible data. What do you mean by possible data? All the data that it is potentially possible to ever collect on the subject. We don't need that, all we need is the data that we have collected thus far. Quote: Do the hypotheses have some randomness? What? Quote: Can somebody post the definition? They did, in the OP. Quote: Do you know what superficial means? It means on the surface. Again, if you connect the dots in every possible way, it would be superficial. Yes you would have different patterns, but the patterns will have the same structure, which is the dots. The dots are no more the "structure" than any other aspect of the picture. I'm trying to help you understand this issue but you're obviously not interested in doing anything but trying to talk down to people who are clearly smarter than you. Quote: Again, structure gives something substance, not the outside. Even if it has lots of lines drawn on its surface. Both the lines and the dots are equally on "the surface" Quote: Not really, you can't look at evidence for evolution and then say here this proves creationism. It just wouldn't make any sense. How does that disprove what I said? Quote: Lines are superficial i.e. the surface. The structure i.e. the dots only matter. Science is not connecting the dots. 
No, science is connecting the dots; the picture is what matters, the dots are just dots. Quote: First of all you can't disprove it. Are you trolling? How does this matter? Quote: No it's not. Again, it's a subjective thing. Wrong. Quote: Wow, and I thought it was because of the evidence. You know, Galileo. Both theories can explain the evidence; one does so in a more simple way. You can still predict the movements of the planets within geocentrism. Quote: So this is going on the assumption that gravity is simple. Can you please give me the theory of everything as I want to know it. What? Quote: Again, you can say something simple by sneaking past complex ideas. What? Quote: Relativity, well isn't relativity really QFT What? Quote: Can you explain that in simple terms and explain the mathematics behind it. As I really want to understand the mathematics of relativity. It must be simple by Ockham's Razor. This has already been explained to you.
 10th January 2008, 02:35 PM #25 Wowbagger The Infinitely Prolonged     Join Date: Feb 2006 Location: Westchester County, NY (when not in space) Posts: 13,499 Strictly speaking, Occam's Razor has nothing to do with simplicity. As applied to science, Occam's Razor is an economy of assumptions. It is NOT that the simplest answer is most likely right, it is which answer makes the fewest and most trivial of a priori assumptions, given the data we have available thus far. Originally Posted by becomingagodo Yeah, when I open the textbooks on Quantum mechanics it is simple. When godo opens up his Quantum Mechanics book, it is not simple. But, the information it contains fits what we have been able to determine about QM so far, without extraneous assumptions (that is, assuming it is a properly scientific book, and not one of those woo-woo ones). __________________ WARNING: Phrases in this post may sound meaner than they were intended to be. SkeptiCamp NYC: http://www.skepticampnyc.org/ An open conference on science and skepticism, where you could be a presenter! By the way, my first name is NOT Bowerick!!!! Last edited by Wowbagger; 10th January 2008 at 02:38 PM.
 10th January 2008, 03:55 PM #26 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 Originally Posted by rocketdodger So I was reading through Artificial Intelligence: A Modern Approach Good book. Written, IIRC, by my AI prof at Berkeley. One of the better classes I took. Quote: Mathematically, |{hypothesis with complexity m}| <= |{hypothesis with complexity m+1}|. Almost; depends what you mean by 'hypothesis'. If you mean 'factual claim about the world' (eg "I have a brown-eyed wife"), then this is correct. If you mean 'hypothesis about how things work' (eg theism, gravity, etc), then it's inapplicable. Quote: In plain language, one should choose the hypothesis with the lowest probability of being wrong. Mathematically, this is always the least complex hypothesis that matches the data. No. Because it's <=, not <, and you don't get to escape that by bringing in another (IMO, invalid) argument about posterior probability. In plain language, this just means that less complex claims MIGHT be more likely. It's going to be easier to test, surely - fewer variables = fewer things to control for, fewer trials needed, etc - but not more likely to be true. P.S. If you've read the entirety of the book, I presume* you realize that 'complexity' is only loosely definable, and mostly only in rather limited domains. As applied to theology, defining 'complexity' objectively is rather difficult, if possible at all. * Not certain that this discussion is actually in the book rather than limited to the extra material he presented in class... P.P.S. You do have the terms significantly wrong. Prior probability = P(X) Posterior probability = P(X|Y). |Y means "given that Y is true". E.g. the Monty Hall problem. Doors A,B,C; one has a prize behind it. Prior probability: P(A)=P(B)=P(C)=1/3. We choose A. Host opens door C, showing it has no prize. Let's call this X for simplicity*. Now we have posterior probability: P(A|X)=1/3, P(B|X)=2/3, P(C|X)=0. 
(Because the host necessarily would not have opened door A, X only is informative about B and C). *For a more elaborate explanation that doesn't collapse X, see http://en.wikipedia.org/wiki/Monty_H...yes.27_theorem. P.P.P.S. Which is more complex: gravity or intelligent falling? __________________ Friendly advice: if in an argument with me, don't make tons of fallacies or you'll just embarass yourself when I call you on 'em. Kthx! See my Youtube videos for more good argument, as well as bits about my various other interests, like ASL, cogsci, neurosci, meditation, cooking, etc. Last edited by saizai; 10th January 2008 at 05:05 PM.
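The posteriors quoted above are easy to check by simulation. A minimal sketch (the door labels, seed, and trial count are arbitrary choices, not anything from the post):

```python
import random

def monty_trial(rng):
    """One game: the prize is placed uniformly at random; we always pick
    door 0; the host then opens a prize-free door from doors 1 and 2."""
    prize = rng.randrange(3)
    openable = [d for d in (1, 2) if d != prize]
    rng.choice(openable)  # which door the host opens never reveals door 0
    return prize

rng = random.Random(42)
trials = 100_000
stay_wins = sum(monty_trial(rng) == 0 for _ in range(trials))

p_stay = stay_wins / trials   # estimates P(A|X), about 1/3
p_switch = 1 - p_stay         # switching wins whenever we didn't, about 2/3
```

With 100,000 trials the frequencies land within a fraction of a percent of the exact posteriors 1/3 and 2/3.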
 10th January 2008, 03:57 PM #27 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 dp Last edited by saizai; 10th January 2008 at 04:01 PM.
 10th January 2008, 03:59 PM #28 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 dp Last edited by saizai; 10th January 2008 at 04:00 PM.
 10th January 2008, 04:05 PM #29 fls Penultimate Amazing     Join Date: Jan 2005 Posts: 10,236 Originally Posted by saizai dp Don't you mean "tp"? Linda __________________ God:a capricious creative or controlling force said to be the subject of a religion. Evidence is anything that tends to make a proposition more or less true.-Loss Leader SCAM will now be referred to as DIM (Demonstrably Ineffective Medicine) Look how nicely I'm not reminding you you're dumb.-Happy Bunny When I give an example, do not assume I am excluding every other possible example. Thank you.
 10th January 2008, 06:10 PM #30 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 Originally Posted by fls Don't you mean "tp"? I noticed the bottom one first and thought it was just a double. Didn't feel like going back to edit for accuracy.
 11th January 2008, 07:54 AM #31 rocketdodger Philosopher     Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Thanks for all the replies; I see the many errors I made when I wrote the O.P. To avoid any misunderstanding, it turns out the book was talking about actual A.I. and that it mentioned something like my argument "naturally leading into" Ockham's razor -- or in other words, a good way to emulate it in the A.I. world. I turned that into the mess you see (which isn't bad for a first try, I tell myself). However, it does give some insight into the matter, or at least reinforces the question "why." I am convinced the real reason why the razor is rational must involve something similar. Using a better definition now, given by (among others) wowbagger: Quote: it is which answer makes the fewest and most trivial of a priori assumptions, given the data we have available thus far. ...why is that? Why is making fewer and weaker assumptions better? Clearly, assumptions have a higher chance of being wrong than observations. However, making no assumptions is equivalent to simply zeroing out all possible assumptions. So one has to show that zeroing out assumptions is better than giving them some value. Superficially it seems like it wouldn't be, because after all choosing zero has just as much probability of being the wrong choice as choosing anything else. Thus at the heart of the matter I still think it has something to do with complexity levels and the probability of a hypothesis being correct. I will think about this today. Last edited by rocketdodger; 11th January 2008 at 07:55 AM.
 11th January 2008, 08:33 AM #32 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 Ockham's Razor is a rational measure of the utility of a theory. Simpler theories are easier to investigate, make stronger predictions (because fewer variables are unknown compared to previous data), and are simpler to calculate and model. However, they are not more likely. This is one of the reasons, IMO, that complexity is so hard to define: it simply has no truly valid definition on the theoretical end (which, if it existed, might lead to a proof like you tried, of whether more complex things are a priori more likely to be false). But it DOES have easily and rigorously definable measures when we're talking about utility - e.g. how many lines of code; how many hours to program; how long to explain through interpretive dance; how many new words needed to explain it. These are all perfectly valid as utility functions on their domain, which is exactly what OR is good for. "Fewest assumptions" is, unfortunately, not a sufficiently sound definition. For almost anything you might talk about (outside of fundamental set theory, perhaps), if it's a real-world question, you make a countably infinite number of assumptions. And not only that, it's hardly possible to delineate what constitutes "one" assumption. I could make a reductio ad absurdum of this using intelligent falling vs. gravity, so that IF had fewer assumptions than gravity. Hell, IF only postulates one new 'entity' (sentient objects), whereas gravity postulates lots (atoms, the weak nuclear force, quarks, etc etc etc). I challenge that you CANNOT in fact define complexity in a way that is objective, applicable to real world problems, completely sound and determined, and not merely a covert measure of utility. 
Without being able to do so, all of the rest of your argument falls apart, because you will not be able to show which proposition involves A+B and which just A, so you won't know which side of the <= relation anything is on. I suggest that you try looking at OR again as a measure of pragmatic scientific utility, rather than of truth value. I think you'll find it far more defensible and useful that way.
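One way to make the utility-style measures above concrete is to treat "complexity" as the description length of a hypothesis as written. The compressed-size proxy below is purely illustrative (my own choice, not anything from the thread) and, as argued above, it measures convenience rather than truth:

```python
import zlib

def description_length(hypothesis_text: str) -> int:
    # Crude 'complexity' proxy: bytes needed to store the compressed
    # written description of a hypothesis. A utility measure only.
    return len(zlib.compress(hypothesis_text.encode("utf-8")))

line = "y = 2*x + 1"
high_degree = "y = 0.003*x**7 - 0.04*x**6 + 1.2*x**5 - 3*x**4 + x**3 - 2*x**2 + 2*x + 1"

# The line takes fewer bytes to describe than the degree-7 polynomial.
shorter = description_length(line) < description_length(high_degree)
```

Like "lines of code" or "hours to program", this ranking depends entirely on the chosen encoding, which is exactly saizai's point: it is well-defined as a utility function, not as a truth measure.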
 11th January 2008, 09:02 AM #33 Beerina Sarcastic Conqueror of Notions     Join Date: Mar 2004 Location: A floating island above the clouds Posts: 23,835 IF's "sentient objects" sounds like one thing, but what is that thing itself? It, in turn, postulates atoms, energies, etc., (or something even more exotic, some kind of "soul stuff") that make up the "sentient object". Perhaps the point would be a little more obvious if, instead of "IF" for a "sentient object", you instead postulated "gigantic, invisible robotic Rube Goldberg thingie"-falling. A giant, mechanical device makes the complexity more obvious. __________________ "Great innovations should not be forced [by way of] slender majorities." - Thomas Jefferson The government should nationalize it! Socialized, single-payer video game development and sales now! More, cheaper, better games, right? Right?
 11th January 2008, 09:27 AM #34 rocketdodger Philosopher     Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by saizai I suggest that you try looking at OR again as a measure of pragmatic scientific utility, rather than of truth value. I think you'll find it far more defensible and useful that way. Well that is what I was getting at; I hope I didn't give someone the wrong idea that I am arguing that the simplest hypothesis is the true hypothesis. Still, aside from the pragmatic concerns of working with a less complex hypothesis potentially being easier and faster, isn't there still something to be said for simplicity? Let's concentrate on the simplest example, to see if I can make headway. Suppose we have a bunch of points and we are trying to fit an equation to model their distribution. Furthermore, suppose that in reality the points were sampled from a straight line with no noise. Now, the question is, why should we use the simple straight-line hypothesis, rather than any of the high-degree polynomial hypotheses that also fit the points perfectly? It seems to me that the answer must lie in the probabilistic relationship between possible hypotheses on different complexity levels (yes, I know I shouldn't be using "complexity," but I can't help it!). In this case, though I have yet to prove it mathematically, it seems intuitive that the chances of many samples from a non-linear curve ending up completely collinear are very slim, while the chances of many samples from a line ending up collinear are very high -- thus we should go with the line as our model. However, this idea seems to rely on the presupposition that the line was the correct choice all along (because we feel we are intuitively looking at linearity), and I don't yet know how to resolve that problem. Last edited by rocketdodger; 11th January 2008 at 09:28 AM.
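The points-on-a-line example can be sketched numerically. Everything below is made up for illustration (six points on y = 3x + 1, and one hand-built degree-6 alternative): both hypotheses agree exactly on the observed points yet diverge badly off them:

```python
import numpy as np

x = np.linspace(0, 1, 6)   # six observed points
y = 3 * x + 1              # really generated by a line, with no noise

def line(t):
    return 3 * t + 1

def wiggly(t):
    # A degree-6 polynomial that also passes through all six points:
    # the added term vanishes exactly at each observed x.
    return 3 * t + 1 + 50 * np.prod([t - xi for xi in x], axis=0)

on_data_match = np.allclose(wiggly(x), y)         # indistinguishable on the data
gap = abs(float(wiggly(2.0)) - float(line(2.0)))  # but far apart at x = 2
```

The two hypotheses are equally "correct" on the sample; only their predictions off the sample, and their complexity, separate them.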
 11th January 2008, 10:51 AM #35 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 rocket - I didn't think you were saying it is the true one, just that it's more likely. And even that, I think, is not true. Simplicity is certainly a valuable thing - on utilitarian grounds (which include aesthetics). Don't get me wrong, I <3 simplicity and beauty in theories. Simple theories tend to be more powerful, give rise to more complex things, etc. Think e.g. Conway's Life or E = mc^2. They're wonderful, and I think that they actually are literally better for us humans to work with, for neurological reasons. I don't, honestly, understand your last paragraph. However, goodness of fit and the problem of overfitting vs. precision of prediction is certainly an important one. Again, though, I would say that this is utilitarian - and that you are conflating things a bit. Are you familiar, perhaps, with Bayesian networks? In addition to my AI class using AIMA (http://aima.cs.berkeley.edu/ btw), I took one about neural theory of language (http://www.icsi.berkeley.edu/NTL/), which I very highly recommend. It helped me think about things like this, and covered Bayes nets fairly well - including quite specifically this topic of goodness of fit. The ubersummarized version is that:
a) you need to make a fit based on one sample and test it on another
b) the more nodes & levels in a Bayes net, the better the predictive ability, but the harder it is for us to understand wtf it's doing
c) there is a diminishing return on node addition
d) tweaking the constants involved is a black art, and not to be attempted by the faint of heart (or possibly at all, because it's possible to set up a meta-Bayes net to find out what the best constants are)
e) there can be a problem with overfitting, which is solved by doing (a) repeatedly for multiple test sets until you get the best average fit... but even so it's a very hard problem to do perfectly. Fortunately, "good enough" isn't as hard. 
Point (b) above directly contradicts your hypothesis that more complexity = worse predictiveness. However, it does support my position: that complexity is simply worse for us poor humans' ability to understand and test things. Simplicity is beautiful, grokkable, easy to maintain. Complex approaches (like backpropagating Bayes nets) are not... but they do work disturbingly well anyway, and there is a certain holistic beauty in that too. One other related thing is what in cogsci is called 'chunking'. I'd suggest you look it up (e.g. "the magic number 7±2"), as it's fairly relevant.
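Step (a) of the summary (fit on one sample, score on a fresh one) can be sketched as follows. The noisy-line generator, candidate degrees, and seed are hypothetical choices of mine, not anything from the AIMA or NTL courses:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    # Hypothetical true process: a line plus a little noise.
    x = rng.uniform(0, 1, n)
    return x, 3 * x + 1 + rng.normal(0, 0.1, n)

x_train, y_train = sample(20)
x_test, y_test = sample(20)     # the fresh sample, used only for scoring

test_error = {}
for deg in (1, 5, 12):          # three candidate "complexity" levels
    fit = np.polynomial.Polynomial.fit(x_train, y_train, deg)
    test_error[deg] = float(np.mean((fit(x_test) - y_test) ** 2))
```

The held-out errors, not the training errors, are what step (e) would average over multiple test sets to fight overfitting.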
 11th January 2008, 11:34 AM #36 rocketdodger Philosopher     Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by saizai I don't, honestly, understand your last paragraph. I am saying, given a set of data points that are collinear, why should we decide to go with the hypothesis that they were sampled from a line rather than some extremely complex curve? The answer, it seems to me, is because the probability of sampling collinear points from a line is high -- 1.0, in fact -- whereas the probability of sampling collinear points from a complex curve is much lower. This, in turn, suggests that the probability of the points actually having been generated by a linear function is much higher than the probability of them having been generated by a more complex non-linear one. I think this is just another way of measuring predictive power, right? If a hypothesis has a higher chance of generating new samples that will match new observations, then it has more predictive power, and we should choose it over those with less. In the case where we can't make new observations, and must decide on the basis of existing data, some kind of analysis like the above should work. With respect to Bayesian networks, I don't think it contradicts what I said (or rather, meant) about simplicity -- maybe it just makes it a moot point? Before I thought about it, I meant to say "among hypotheses of equal predictive power..." but now I realize that the mathematics I have been using to argue with actually concern predictive power. If two hypotheses have equal predictive power, then they should be equivalent as far as we can tell, and thus utilitarian concerns should be the only factor in deciding between them. Last edited by rocketdodger; 11th January 2008 at 11:36 AM.
 11th January 2008, 05:18 PM #37 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 I think that your collinearity probability is improper, because there are an infinite number of possible generative functions for any given set of points. Surely you're aware of that? Claiming it is "linear" is purely a matter of our own perception/analysis, unless you're specifying an infinite number of points - which you're not. Thus there are an infinite number of both linear and nonlinear functions to match any set of "linear" points... so determining which generated it is extremely difficult, if possible at all. This is even worse for something that's non-linear. As a related thought puzzle: Suppose you have a black box with a button on it. You press the button one million times and carefully observe that nothing happens. How do you know whether: a) the box does nothing; b) the box has been doing something that you cannot detect; or c) the box does something only on the million-and-tenth press? The answer is simply, you don't have any way to know. But for the sake of *utility*, we assume that it is (a). Predictive power, while sidestepping the question of whether you're right about the probability there, is a utility function again - not one of the truth value of the hypothesis. I think your move to predictive power has completely abandoned your original point, namely that complexity correlates to probability. If that's your intent, then you're now just making a utilitarian argument, and I think we've come to agreement about what OR is for. Last edited by saizai; 11th January 2008 at 05:22 PM.
 11th January 2008, 10:16 PM #38 rocketdodger Philosopher     Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Originally Posted by saizai I think that your collinearity probability is improper, because there are an infinite number of possible generative functions for any given set of points. Surely you're aware of that? Claiming it is "linear" is purely a matter of our own perception/analysis, unless you're specifying an infinite number of points - which you're not. Yes, I know there is an infinite number of generative functions, and I know the linearity is a matter of my own perception (that's what I was hinting at in a previous post regarding the problem of presupposing a hypothesis). Still, I think my probability argument might hold some water, but I still need to formulate it properly -- I fully understand all of the problems you and everyone else have brought up. I will work on it this weekend and try to polish it up. Originally Posted by saizai I think your move to predictive power has completely abandoned your original point, namely that complexity correlates to probability. If that's your intent, then you're now just making a utilitarian argument, and I think we've come to agreement about what OR is for. Well, I think I am now trying to argue that complexity correlates to probability, which correlates to predictive power. We do agree about OR, though (I think).
 13th January 2008, 08:54 PM #39 rocketdodger Philosopher     Join Date: Jun 2005 Location: Hyperion Posts: 6,668 Ok, I think I have it! Ignore the fact that "complexity" is extremely difficult to define -- just think of it for the purposes of this argument as a "measure of assumption level." One can simply ask "what are the chances that a hypothesis on a given level generated the data that can also be modeled by hypotheses on other complexity levels?" Mathematically, the chances of a lower-complexity hypothesis generating data that can be modeled by higher-complexity hypotheses are very high. In contrast, the reverse is typically very low. For example, take a bunch of collinear points. There is both a linear hypothesis and a high-degree polynomial one. The chances of the high-degree hypothesis generating points that can be fit by any linear function are slim. On the other hand, the chances of a linear hypothesis generating points that can be fit by some high-degree polynomial are very high. Furthermore, we know that hypotheses on many complexity levels model the data -- that's why we are trying to choose among them to begin with. Thus, we should choose the hypothesis that has the highest probability of generating such data. For the reason above, it is always the lowest-complexity hypothesis. P.S. I haven't figured out if this has anything to do with the fact that higher complexity levels have more possible hypotheses, I.E. the argument in the OP (which I now understand to be incorrect). Last edited by rocketdodger; 13th January 2008 at 08:55 PM.
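The asymmetry claimed here can be checked by brute force for the three-point case. The particular line, cubic, and tolerance below are arbitrary illustrative choices:

```python
import random

rng = random.Random(0)

def collinear(pts, tol=1e-9):
    (x0, y0), (x1, y1), (x2, y2) = pts
    # Zero cross-product area <=> the three points lie on one line.
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)) < tol

def sample_three(f):
    xs = [rng.uniform(-1, 1) for _ in range(3)]
    return [(x, f(x)) for x in xs]

trials = 10_000
from_line = sum(collinear(sample_three(lambda x: 2 * x + 1)) for _ in range(trials))
from_cubic = sum(collinear(sample_three(lambda x: x ** 3)) for _ in range(trials))

# Line-generated triples are always collinear (and trivially fit by some
# cubic too); cubic-generated triples essentially never fit a line.
```

This is only the three-point toy case, but it matches the claim: low-complexity generators routinely produce data that higher-complexity hypotheses can also model, while the reverse almost never happens.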
 14th January 2008, 11:37 AM #40 saizai Graduate Poster     Join Date: Jul 2005 Posts: 1,374 I think you're making some pretty complicated errors there. I'd prefer not to get into actual measure theory, since it's a bit too complex for my taste. However, in plainer English there are these issues:
1. You assume that the measure of 'complexity' is some sort of polynomial equation over a set of real-number datapoints. Neither of these is the case for theories in general, such as what this is all about: theisms.
2. You confuse prior and posterior probability, i.e. the probability that a) given a certain data set, the theories that produce it will turn out to be of a given "complexity"; and b) given a certain theory, the data will turn out as they happen to actually do.
3. You confuse what I would call "claims" (i.e. statements of fact about *what* is true) and "theories" (i.e. explanatory frameworks that try to say *why* and *how* things work). (I believe you can reasonably discuss the probability of a claim, but not the probability of a theory - only how well a theory matches available data.)
4. You continue to assume, despite your opening paragraph trying to brush it away, that there is some single, objective measure of "complexity" and that it is well-ordered - i.e. that for some item A and some other item B, necessarily either A>B, A=B, or A<B.
