I think Bayes' theorem is one of the most beautiful things I have come across - it's so simple and insanely efficient. In this post, I am going to present an idea about how Bayes' theorem can be used for predicting the future ... Yes! You heard it right - for predicting the future.
The idea came to me while attending a talk by Peter Norvig. If you haven't already, you must, and I insist, take a look at the spell checker written by him, which uses Bayes' theorem. (Search Google!)
Before the idea, what is Bayes' theorem? Well, it relates two events A and B in the following way:
Pr(A | B) . Pr(B) = Pr(B | A) . Pr(A)
How, you may ask? I must explain that intuitively, upholding the spirit of this blog.
So here it is:
Pr(A | B) means the probability of A happening given that event B has already taken place. So imagine two intersecting circles which represent A and B. Stretch your imagination and think that both the circles are invisible for now. What happens next?
B.
Event B happens next.
At this point of time, you should imagine a circle representing event B emerge out. What happens next?
A.
Event A happens next.
At this point of time, you should be careful. The whole of the circle representing event A should not emerge out. Since B has happened, we are restricted to play around only inside B. So what should emerge out is the intersecting area between A and B - which is already a part of B.
Now at this point of time, either you are confused enough and about to leave or have in your imagination - a circle representing B with a shaded area that represents A and B happening together. Voila! What have we got?
Pr (A | B) = Pr ( A intersecting B) / Pr(B) -- see the circle in your head - this is simple maths.
Swap A and B and what have you got:
Pr (B | A) = Pr(B intersecting A) / Pr(A)
Well, Pr(A intersecting B) = Pr(B intersecting A), so if you solve both equations for the intersection and equate them, you get the form which I originally mentioned, with Pr(B) moved to the other side:
Pr( A | B) = Pr(B | A). Pr(A) / Pr(B)
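If you would rather count than imagine circles, here is a minimal sketch that checks the identity by brute force. The sample space (two dice) and the events A and B are my own hypothetical choices, not anything special; any finite set of equally likely outcomes would do.

```python
from fractions import Fraction

# Hypothetical sample space: the 36 equally likely outcomes of two dice.
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]

def pr(event):
    """Probability of an event (a predicate over outcomes) by counting."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def pr_given(event, given):
    """Pr(event | given) = Pr(event intersecting given) / Pr(given)."""
    return pr(lambda o: event(o) and given(o)) / pr(given)

A = lambda o: o[0] + o[1] == 8   # event A: the dice sum to 8
B = lambda o: o[0] % 2 == 0      # event B: the first die is even

lhs = pr_given(A, B)                    # Pr(A | B)
rhs = pr_given(B, A) * pr(A) / pr(B)    # Pr(B | A) . Pr(A) / Pr(B)
print(lhs, rhs)  # 1/6 1/6
```

Both sides come out the same because each is just the shaded intersection divided by the circle for B.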
Predicting the future - how?
This is the interesting part. Let B represent your present (or past) and A represent one possible state of your future.
What you ought to do is:
Pr( Your future state | (Given) Your present state) =
Pr ( Your present state| (Given) Your future state). Pr(Your future state) / Pr(Your present state)
Now note that given your present state (pathetic as it is - reading this blog), your future could be any one of many states. So don't forget to apply a max function and take the future state with the maximum probability.
Also note that Pr(Your present state) is the common denominator across all your future states - so it can safely be ignored when comparing them, since it does not change which future state comes out on top. (If you are a stickler for pure maths - can you still ignore it? Humble request.)
But the million $ question would be: WTH does this term on the RHS mean:
Pr ( Your present state| (Given) Your future state).
What it means is what it actually means! Given your future state, what's the probability of your present state? Does that make any sense? If you knew your future state, you wouldn't be reading this blog in the first place.
True.
Now the answer is - resolving this term depends on the context of the problem and how we can model it in the context of Bayes' theorem. However, since I won't let you go starved of intuition, here's a way: why don't you observe your past and evaluate this term? Basically, you spread out your past as a set of discrete events that have already happened, and you divide it into the states that are similar to your present state and the states which you eventually went into from them. Counting how often each pairing occurred gives you a fair estimate of the term that invoked a WTH from you a while ago.
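The counting recipe above can be sketched in a few lines. This is only an illustration under my own assumptions: the past is a list of discrete state labels, each consecutive pair is treated as (present, future), and the function name `predict_future` and the sample history are made up for the example.

```python
from collections import Counter

def predict_future(history, present):
    """Return the future state maximising Pr(present | future) * Pr(future)."""
    # Treat each consecutive pair in the history as (present, future).
    pairs = list(zip(history, history[1:]))
    future_counts = Counter(future for _, future in pairs)
    pair_counts = Counter(pairs)
    total = len(pairs)

    best_state, best_score = None, 0.0
    for future, n_future in future_counts.items():
        likelihood = pair_counts[(present, future)] / n_future  # Pr(present | future)
        prior = n_future / total                                # Pr(future)
        score = likelihood * prior  # Pr(present) is dropped: same for every future
        if score > best_score:
            best_state, best_score = future, score
    return best_state  # None if this present state never occurred in the past

# Hypothetical past, spread out as discrete states.
history = ["work", "tired", "sleep", "work", "tired", "blog",
           "work", "tired", "sleep"]
print(predict_future(history, "tired"))  # sleep
```

Notice that the score collapses to "how often did this (present, future) pairing occur", which is exactly the shaded intersection from the circles earlier. With so little history the estimate is crude, so take the prediction with a pinch of salt.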
See, you can model your future given your present - isn't Bayes theorem insanely beautiful? :)
Actually, I also wrote (what I consider) a nice article on Bayes' Theorem. Let me drop my pretense and leave it for the readers to decide whether it's nice or mice (that's quite the lamest joke in a while, but bear with me).
First let us recall that Bayes' Theorem is also called the theorem of inverse probability. The diatribe below will justify this name if it's not already clear from what Nikhil had to share.
Bayes' theorem is used in problems where you are supposed to calculate 'inverse probability' - that's a well known fact.
For example, a problem may ask "Given 2 coins - one fair and the other biased to land on heads 9/10 of the time. You toss both and one lands heads and the other tails. What is the probability that the one which lands tails is the biased coin?"
Problems like this illuminate the name 'inverse probability' and you can use Bayes' Theorem to chop off their little legs (I mean to solve them).
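For the curious, here is a minimal sketch of the coin problem worked out, assuming the two tosses are independent (the problem does not say so explicitly). The variable names are only illustrative.

```python
from fractions import Fraction

# Probability of heads for each coin, as stated in the problem.
p_heads = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}

# Hypothesis 1: the biased coin shows tails and the fair coin shows heads.
p_h1 = (1 - p_heads["biased"]) * p_heads["fair"]
# Hypothesis 2: the fair coin shows tails and the biased coin shows heads.
p_h2 = (1 - p_heads["fair"]) * p_heads["biased"]

# Evidence: one head and one tail, which happens under either hypothesis.
p_evidence = p_h1 + p_h2

# Inverse probability: Pr(biased coin is the tails | one head, one tail).
posterior = p_h1 / p_evidence
print(posterior)  # 1/10
```

So under that assumption the tails is the biased coin only 1 time in 10, which matches the intuition that a coin loaded towards heads rarely shows tails.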
But what is important about Bayes Theorem is not what I said above but what I am about to say now.
"Bayes Theorem gives you a way to adjust your assumptions about an experiment which you made before the experiment".
In other words, it measures how your degree of confidence in an event changes after some event occurs.
I will give an example shortly - but first be clear on what it implies.
You might know of the argument that says science at its very heart is a circular process wherein you invoke a 'model of nature' to study nature and make predictions about it, which might in turn alter your model itself.
Bayes theorem, if you will, is the mathematical formalisation of this statement.
For example consider the following.
"It is known that on a table there are 10 balls all of which are either red or green (you cannot see any ball, they are covered by bowls). A reasonable assumption would be to assume that there are 5 reds and 5 greens. You push a button. 3 balls fall down all of which are green. What is the probability that a ball picked from the 7 remaining at random will be green?"
Here you have a 'nature's model' (5-red-5-green-model). You do an experiment (push-the-button). And you refine the model.
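The answer depends on how firmly you hold the model, which the problem leaves open. As a rough sketch, here are two assumptions of my own: a rigid belief that there are exactly 5 greens, and an open-minded uniform prior over 0 to 10 greens. In both cases I assume the button drops 3 balls chosen at random; the function name `predictive_green` is hypothetical.

```python
from fractions import Fraction
from math import comb

TOTAL, DRAWN = 10, 3  # 10 balls on the table, 3 fell and all were green

def predictive_green(prior):
    """Pr(next ball is green | the first DRAWN balls were all green).

    `prior` maps a possible number of green balls to its prior probability.
    """
    # Prior times likelihood of drawing DRAWN greens when there are g greens.
    weights = {g: p * Fraction(comb(g, DRAWN), comb(TOTAL, DRAWN))
               for g, p in prior.items()}
    evidence = sum(weights.values())
    # Average the chance of another green over the refined (posterior) model.
    return sum(w / evidence * Fraction(g - DRAWN, TOTAL - DRAWN)
               for g, w in weights.items() if w)

rigid = {5: Fraction(1)}                              # exactly 5 greens, no doubt
uniform = {g: Fraction(1, 11) for g in range(11)}     # any count equally likely

print(predictive_green(rigid))    # 2/7
print(predictive_green(uniform))  # 4/5
```

With the rigid model nothing gets refined and you simply have 2 greens left among 7. With the uniform prior the experiment shifts your belief strongly towards green, which is the "adjust your assumptions" idea in action.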
What I liked about it is the fact that Bayes' Theorem captures in a very real way the spirit of science. I had never thought about it this way. Just one word - it's fascinating. At least for me it was.
But the concept needs to be more explicit