View Full Version : What is a random variable...

mijopaalmc
2nd July 2008, 05:59 PM
and how is it "random"? Or does the phrase "random variable" mean something different than a combination of "random" and "variable" (i.e., a variable that is random, can take on random values)?

Note: I do have some knowledge of probability theory beyond that which is taught in primary and secondary school (e.g., I know what a "probability space" and a "measurable function" are), but I am having trouble explaining what a "random variable" is without using the fancy-schmancy language of probability theory to accurately capture the concepts in detail.

Can anyone help me come up with non-jargon-ridden answers to the questions above?

Reality Check
2nd July 2008, 09:38 PM
The Wikipedia article (http://en.wikipedia.org/wiki/Random_variable) is fairly clear in its description.

A discrete random variable is a variable that takes on specific values according to an associated probability distribution.
For example the random variable X for tossing a coin has values X = 1 (if heads) or X = 0 (if tails) and the associated probability distribution is p(x) = 0.5 (if x = 1), 0.5 (if x = 0) and 0 otherwise.

A continuous random variable is a variable that takes on any of the values that are associated with a probability density function.
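A minimal Python sketch of the coin example above, assuming a fair coin simulated with the standard `random` module. The names `pmf` and `toss` are made up for illustration and do not come from the Wikipedia article.

```python
import random

# Probability mass function from the coin example:
# p(1) = p(0) = 0.5, and 0 for every other value.
def pmf(x):
    return 0.5 if x in (0, 1) else 0.0

# One realization of the random variable X: 1 for heads, 0 for tails.
def toss(rng):
    return 1 if rng.random() < 0.5 else 0

rng = random.Random(2008)  # fixed seed only so the run is repeatable
samples = [toss(rng) for _ in range(100_000)]
fraction_heads = sum(samples) / len(samples)
print(fraction_heads)  # should come out near pmf(1) = 0.5
```

The distribution says how often each value should turn up in the long run; any single toss is still just a 0 or a 1.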

2nd July 2008, 09:51 PM
Simply speaking, a random variable is a variable that can take on a random value - no great surprise. The values are, of course, controlled by the statistical parameters of the collection they are drawn from - a Gaussian distribution with mean zero and standard deviation of 1 will have values clustered within a short distance of 0 usually, with the occasional value farther away from zero.
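As a rough illustration of the Gaussian remark above, here is a short Python sketch that draws values from a distribution with mean 0 and standard deviation 1 and checks how tightly they cluster around 0. The seed and sample size are arbitrary choices, not anything from the post.

```python
import random

rng = random.Random(0)  # seed chosen arbitrarily for repeatability
draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

# Fraction of draws within 1 and within 3 standard deviations of the mean.
within_1 = sum(abs(x) <= 1.0 for x in draws) / len(draws)
within_3 = sum(abs(x) <= 3.0 for x in draws) / len(draws)
print(within_1, within_3)  # roughly 0.68 and 0.997 for a standard normal
```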

Walter Wayne
2nd July 2008, 10:09 PM
A random variable is a numerical quantity whose value depends on chance.

69dodge
2nd July 2008, 11:28 PM
I do have some knowledge of probability theory beyond that which is taught in primary and secondary school (e.g., I know what a "probability space" and a "measurable function" are), but I am having trouble explaining what a "random variable" is without using the fancy-schmancy language of probability theory to accurately capture the concepts in detail.

More context would be useful: To whom are you trying to explain the concept, and how much do they know about probability?

But, basically, a random variable is the sort of thing about which one can meaningfully ask, for any number x, what the probability is that it is less than x. That is, if X is a random variable, the statement "X < 5", for example, is not something definitely true or definitely false, but rather is something that has some probability of being true. (That probability might be 1 or 0, so I don't mean to exclude the possibility that the statement is definitely true or definitely false; I just mean to include other possibilities.)
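To make the "what is the probability that X < x" idea concrete, here is a small Python sketch. It assumes, purely as an example, that X is the sum of two fair dice; the function name `prob_less_than` is invented for this sketch.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice (an assumed example).
outcomes = list(product(range(1, 7), repeat=2))

def prob_less_than(x):
    """Exact probability that the sum of the two dice is less than x."""
    favourable = sum(1 for a, b in outcomes if a + b < x)
    return Fraction(favourable, len(outcomes))

print(prob_less_than(5))   # 1/6: neither certainly true nor certainly false
print(prob_less_than(2))   # 0: "X < 2" is definitely false
print(prob_less_than(13))  # 1: "X < 13" is definitely true
```

The last two lines show the edge cases mentioned above, where the probability happens to be 0 or 1.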

PingOfPong
3rd July 2008, 02:24 AM
The easiest way is to narrow it down a little bit.

Just think about a 6-sided die. If you roll it 6 million times then you should expect approximately one million "1's", one million "2's" and so on. You would also expect that there would be no discernible pattern in the rolls because it's realistically unpredictable.

So, you could define a set of random numbers as any sequence of numbers which are drawn from a set in such a way that all outcomes (1 to 6 in the case of a die) are equally likely and the sequence lacks any discernible pattern. You could, of course, throw in a caveat that not all outcomes have to be equally likely (as if the die had a weighted side). This is just a quick and dirty explanation that I think gets to the meat of the matter. I'm not an expert in random numbers though.
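The die example above can be tried out with a short Python simulation. This is only a sketch: the roll count is scaled down from 6 million to keep it quick, and the weights for the loaded die are hypothetical.

```python
import random
from collections import Counter

rng = random.Random(42)  # arbitrary seed, only for repeatability
n_rolls = 600_000        # scaled down from the 6 million in the post

# Fair die: every face equally likely.
fair = Counter(rng.randint(1, 6) for _ in range(n_rolls))

# Weighted die (hypothetical weights): face 6 comes up half the time.
faces = [1, 2, 3, 4, 5, 6]
weights = [1, 1, 1, 1, 1, 5]
loaded = Counter(rng.choices(faces, weights=weights, k=n_rolls))

# Relative frequency of each face for the fair and the loaded die.
for face in faces:
    print(face, fair[face] / n_rolls, loaded[face] / n_rolls)
```

Each fair face should land near 1/6 of the rolls, while the loaded die is just as unpredictable roll by roll but favours 6.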

sphenisc
3rd July 2008, 03:24 AM
Depending on who you're aiming at, a bottom-up approach might be more accessible. Start with a concrete scenario: "I lost my cat this morning. Sometimes she goes to the park, and sometimes to the fish shop. What should I do?" Discuss what the variable in the scenario is, what the values are and how probability fits in. This can lead on to discussing outcomes with zero probability (are they really outcomes?), outcomes with probability = 1 (is such a variable really random?), etc. The concept can be compared with constants and deterministic variables: "She always goes to the park", "She goes to the park during the day, and heads to the fish shop at 5pm when they throw out the old fish."

The example I've given may not be very appropriate for your purposes - but having something concrete to "abstract from" can help those of us who are less mathematically inclined.

Meridian
3rd July 2008, 03:42 AM
Well, mathematically a "random variable" is just a (measurable) function on a probability space, normally whichever probability space you happen to be talking about at the time, and including trivial cases such as constant functions. As to why:

the way probability is formalized in maths is to put all the randomness into the probability space. For example, if you are discussing questions about sequences of dice rolls, your probability space would consist of all possible (infinite, say) sequences such as 1435235333462... There's also a "measure" on the space that says what the probabilities of certain subsets are. (Ignoring technicalities, you can think of any subset as having a probability.) A random variable is then something like "the 2nd roll" (here 4), or "the sum of the 2nd and 4th rolls", which is a function taking a sequence (element of the probability space) and returning a number.

To analyze the behaviour of the random variables you just need to understand the probability space itself. For example, the probability that the sum of the 2nd and 4th rolls is 7 is the probability of a certain subset of the probability space, namely the one where the sum of the 2nd and 4th terms in the sequence is 7.
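The construction above can be sketched in Python. One assumption is made to keep it finite: the probability space is cut down to sequences of four fair rolls, which is enough for "the 2nd roll" and "the sum of the 2nd and 4th rolls". The helper names (`omega`, `prob`, and so on) are made up for this sketch.

```python
from fractions import Fraction
from itertools import product

# Sample space: every sequence of four fair die rolls, all equally likely.
# (Truncating to four rolls is an assumption; the post uses infinite sequences.)
omega = list(product(range(1, 7), repeat=4))

# Random variables are plain functions from a sequence to a number.
def second_roll(seq):
    return seq[1]

def sum_2nd_and_4th(seq):
    return seq[1] + seq[3]

def prob(event):
    """Probability of the subset of omega on which event(seq) is true."""
    return Fraction(sum(1 for seq in omega if event(seq)), len(omega))

print(second_roll((1, 4, 3, 5)))                    # 4, as in the 1435... example
print(prob(lambda seq: sum_2nd_and_4th(seq) == 7))  # 1/6
```

Nothing in `second_roll` or `sum_2nd_and_4th` is random by itself; all the chance lives in which element of `omega` gets picked.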

As to why it's formalized like this: well, it seems to work!

mijopaalmc
3rd July 2008, 07:53 AM
Here is the comment that spurred this thread:

Technically the term you defined was 'random variable' not 'random'.

You inferred that 'random' would have the same sense when it was found in other terms. Generally, people rejected your inference for good reason. While it is okay to separate an adjective from a phrase with a given sense and apply it with that sense to other phrases in common language, this is not acceptable with technical terms. Each technical term has a specific definition that may not follow from the senses of its constituent parts of speech.

Thus, since you used a technical definition of 'random variable', your inference is invalid. The observation that this inference makes all systems random is an example of the odd sorts of conclusions you reach when you make false inferences.

My contention in the thread in which the above comment occurs is that evolution is mathematically random because not every individual of a given phenotype produces reproductively viable offspring by virtue of their possessing that specific phenotype. As I understand it, the above observable fact jibe really well with the concept of a probability measure designating how often an event will occur with respect to all existing events in the sigma-algebra of a probability space.

Now it is also possible that I am misunderstanding evolutionary biology or the application of probability theory to evolutionary biology, but I am primarily interested in checking my factual knowledge of probability theory in and of itself.

sol invictus
3rd July 2008, 10:02 AM
I am primarily interested in checking my factual knowledge of probability theory in and of itself.

In several references on probability and statistics I once checked, the term "random" is not defined and is almost never used by itself (e.g. outside phrases like "random variable", which are defined). There is a good reason for that.

As Meridian says above, one defines a "random variable" by putting all the randomness into the probability space. Which more or less begs the question - it gives you a concrete framework to work with, but it does not explain what the connection is to the physical world, or precisely what "random" means, or where the probability space came from.

Those are (in my opinion) very deep questions, to which no one knows the answer. As I have said before, I think the best definition of "random" is something that is fundamentally unpredictable.

Let me try to formulate that more precisely: A random event is an event whose outcome cannot be predicted with a confidence that tends to 1 when the errors in your knowledge of the initial data tend to 0.

sphenisc
3rd July 2008, 10:08 AM
Does your "cannot" mean logically impossible, mathematically impossible, physically impossible, currently impractical or something else?

sol invictus
3rd July 2008, 10:15 AM
Does your "cannot" mean logically impossible, mathematically impossible, physically impossible, currently impractical or something else?

That's a good question. As a physicist I like to define things operationally, so I would go with "physically impossible under any circumstance". You're allowed to collect as much data as you like, have as powerful a computer as you like, and make measurements as precise as possible, so long as none of those things violate the laws of physics.

The canonical example (of random by my definition) is predicting the results of a measurement of the z-axis spin of an electron prepared in a state of spin up along the x-axis. A slightly less canonical one is determining (from outside) whether an unstable particle which passed into a black hole horizon decays before it hits the singularity.

sol invictus
3rd July 2008, 10:26 AM
Incidentally, one problem with my proposed definition is that it probably either makes everything or nothing in the physical world random (depending on what you think about quantum mechanics).

However it can be modified to address that: one could say that an event is almost non-random (or maybe epsilon-predictable is better) if your confidence in your prediction tends to a number larger than 1-epsilon when the precision goes to infinity (where epsilon is to be specified depending on the application).

cyborg
3rd July 2008, 02:33 PM
A random variable produces an infinite sequence not producible by a finitely describable mechanism.

mijopaalmc
3rd July 2008, 03:20 PM
A random variable produces an infinite sequence not producible by a finitely describable mechanism.

So any finite sequence is not a product of a random variable?

Walter Wayne
3rd July 2008, 06:36 PM
Said sequence (infinite or otherwise) would not be a product of a random variable. It would be a sequence of random variables.

mijopaalmc
3rd July 2008, 07:07 PM
Said sequence (infinite or otherwise) would not be a product of a random variable. It would be a sequence of random variables.

So are you saying that cyborg hasn't really defined a random variable? Or that there is just something wrong with how I interpreted what he said?

Walter Wayne
3rd July 2008, 07:12 PM
I'm saying he hasn't defined "random variable" correctly.

cyborg
4th July 2008, 01:18 AM
So any finite sequence is not a product of a random variable?

Undecidable. See Godel.

69dodge
4th July 2008, 09:24 AM
Undecidable. See Godel.

Too brief. Explain.

69dodge
4th July 2008, 10:03 AM
Said sequence (infinite or otherwise) would not be a product of a random variable. It would be a sequence of random variables.

I think he intends each element of his sequence to be a definite value. Considered individually, there's nothing random about it. Only the sequence as a whole is considered random, because the various values "have nothing to do with each other".

I agree with you that this is not the usual definition of "random variable". I'd call such a sequence "an uncomputable sequence". (Every finite-length sequence is computable, which perhaps answers mijopaalmc's question.)

If I were planning to roll a die once and then destroy it, so that there is no possibility of it generating an infinite sequence, I'd still be perfectly happy to consider the number that will result from the single roll to be a random variable.

(I don't mean that after I roll, say, a 4, I consider "4" to be a random variable. I mean that before I roll anything, I consider "the number that I will roll" to be a random variable.)

cyborg
4th July 2008, 07:17 PM
The simple problem I am trying to get across is that it is undecidable in general as to whether or not some variable is random when one attempts to decide that by induction across its output.

Any "random" finite sequence has a "non-random" representation and you can't make a decision about an infinite sequence.

Randomness is defined by what it is not, not what it is.

5th July 2008, 12:11 AM
Here is the comment that spurred this thread:

My contention in the thread in which the above comment occurs is that evolution is mathematically random because not every individual of a given phenotype produces reproductively viable offspring by virtue of their possessing that specific phenotype. As I understand it, the above observable fact jibe really well with the concept of a probability measure designating how often an event will occur with respect to all existing events in the sigma-algebra of a probability space.

Now it is also possible that I am misunderstanding evolutionary biology or the application of probability theory to evolutionary biology, but I am primarily interested in checking my factual knowledge of probability theory in and of itself.

I may be off the mark, but it seems you are picking semantic nits here. You say “evolution is mathematically random” and it “jibe[s] really well with the concept of a probability measure”. The “nit” is the difference between being random and appearing random (or being able to be described as random). As sol invictus said above, this difference goes deep into questions about what “random” is, which could be everything or nothing or something approaching everything or nothing or some other quantum equation. Those questions seem to be independent of the meaning of random that you use. A “mathematically random” or “probability measure” is a theoretical explanation, evaluation, or prediction, which is what the theory of evolution is.

So if you want to get technical about the term, you have to define what you mean by “random”. It could mean anything from a general sense of uncertainty to a theoretical concept of probability to an unknown quantum equation. It seems you mean the term in a general sense, and it looks like someone is setting you up to say you mean a “theoretical concept of probability” in order to knock down evolutionary theory as something that can only occur in theory and not in the actual physical world. That is throwing the evolution baby out with the bath water by picking nits over the meaning of the term “random”.

technoextreme
7th July 2008, 06:29 PM
Simply speaking, a random variable is a variable that can take on a random value - no great surprise. The values are, of course, controlled by the statistical parameters of the collection they are drawn from - a Gaussian distribution with mean zero and standard deviation of 1 will have values clustered within a short distance of 0 usually, with the occasional value farther away from zero.