Monday, August 8, 2011

Bayes' Theorem and Predicting the Future

I think Bayes' theorem is one of the most beautiful things I have come across - it's so simple and insanely efficient. In this post, I am going to present an idea about how Bayes' theorem can be used for predicting the future ... Yes! You heard it right - for predicting the future.

The idea came to me while attending a talk by Peter Norvig. If you haven't already, you must (and I insist) take a look at the spell checker he wrote, which uses Bayes' theorem. (Search Google!)

Before the idea, what is Bayes' theorem? Well, it relates two events A and B in the following way:
Pr(A | B) . Pr(B) = Pr(B | A) . Pr(A)

How, you may ask? I must explain that intuitively, upholding the spirit of this blog.
So here it is:
Pr(A | B) means the probability of A happening given that event B has already taken place. So imagine two intersecting circles that represent A and B. Stretch your imagination and think of both circles as invisible for now. What happens next?
B.
Event B happens next.
At this point, you should imagine the circle representing event B emerging. What happens next?
A.
Event A happens next.
At this point, you should be careful. The whole of the circle representing event A should not emerge. Since B has happened, we are restricted to playing around only inside B. So what should emerge is the area where A and B intersect - which is already a part of B.

Now, at this point, either you are confused enough to be about to leave, or you have in your imagination a circle representing B with a shaded area that represents A and B happening together. Voila! What have we got?
Pr(A | B) = Pr(A intersecting B) / Pr(B)  -- see the circle in your head: it is the shaded area as a fraction of the whole circle B. This is simple maths.

Swap A and B and what have you got:
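Pr(B | A) = Pr(B intersecting A) / Pr(A)

Well, Pr(A intersecting B) = Pr(B intersecting A), so both equations describe the same shaded area: Pr(A | B) . Pr(B) = Pr(B | A) . Pr(A). That is the form I originally mentioned. Divide both sides by Pr(B) and you get the form we will use from here on:

Pr(A | B) = Pr(B | A) . Pr(A) / Pr(B)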
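
If the circles are hard to hold in your head, here is a minimal Python sketch that checks the formula by plain counting. The die roll and the two events are my own made-up example (nothing special about them), and pr and pr_given are hypothetical helper names.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3


def pr(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))


def pr_given(x, y):
    """Pr(x | y): the share of circle y that is also covered by x."""
    return pr(x & y) / pr(y)


lhs = pr_given(A, B)                   # Pr(A | B)
rhs = pr_given(B, A) * pr(A) / pr(B)   # Pr(B | A) . Pr(A) / Pr(B)
print(lhs, rhs)  # prints "2/3 2/3"
```

Both sides come out the same because they are two ways of measuring the same shaded area {4, 6}.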



Predicting the future - how?
This is the interesting part. Let B represent your present (or past) and A represent one possible state of your future.

What you ought to do is:

Pr(Your future state | (Given) Your present state) =
Pr(Your present state | (Given) Your future state) . Pr(Your future state) / Pr(Your present state)

Now note that given your present state (pathetic as it is - reading this blog), there could be many possible future states. So don't forget to evaluate the formula for each of them, apply a max function, and take the future state with the maximum probability.

Also note that Pr(Your present state) is the common denominator across all your future states. Dividing every candidate by the same number does not change which one is the largest, so it can safely be ignored. And if you are a stickler for pure maths, can you still ignore it (humble request)?
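
Here is a small Python sketch of that max step. All the numbers and state names are made up by me purely for illustration, and I am assuming the future states are mutually exclusive and cover every possibility, so that Pr(Your present state) is just the sum of the numerators.

```python
# Hypothetical numbers, made up purely for illustration.
# prior[f]      = Pr(future state f)
# likelihood[f] = Pr(present state | future state f)
prior = {"rich": 0.2, "content": 0.5, "broke": 0.3}
likelihood = {"rich": 0.1, "content": 0.4, "broke": 0.6}

# Numerator of Bayes' theorem for every candidate future state.
score = {f: likelihood[f] * prior[f] for f in prior}

# Assumption: the future states are exclusive and exhaustive, so
# Pr(present state) is the sum of the numerators. It is the same for every f.
pr_present = sum(score.values())
posterior = {f: s / pr_present for f, s in score.items()}

best_without_denominator = max(score, key=score.get)
best_with_denominator = max(posterior, key=posterior.get)
print(best_without_denominator, best_with_denominator)  # prints "content content"
```

The winner is the same either way; the denominator only rescales the scores so that they add up to 1.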

But the million $ question would be: WTH does this term on the RHS mean:

Pr(Your present state | (Given) Your future state)

What it means is what it actually means! Given your future state, what's the probability of your present state? Does that make any sense? If you already knew your future state, you wouldn't be reading this blog, would you?

True.

Now the answer is: resolving this term depends on the context of the problem and how we model it in the setting of Bayes' theorem. However, since I won't leave you starved of intuition, here's a way: why don't you observe your past and evaluate this term from it? Basically, you spread out your past as a set of discrete events that have already happened, pick out the states that were similar to your present state, and note the states you eventually went into from each of them. Counting how often each eventual state was preceded by a state like your present one gives you a fair estimate of the term that invoked a WTH from you a while ago.
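
A rough Python sketch of this counting idea follows. The history below is entirely invented, predict is a hypothetical helper name, and treating the past as a list of (state I was in, state I eventually went into) pairs is just one simple way to model it.

```python
from collections import Counter

# Hypothetical history: (state I was in, state I eventually went into).
past = [
    ("reading blogs", "sleepy"),
    ("reading blogs", "inspired"),
    ("reading blogs", "inspired"),
    ("writing code", "inspired"),
    ("writing code", "sleepy"),
    ("watching tv", "sleepy"),
]

future_counts = Counter(future for _, future in past)  # how often each future occurred
pair_counts = Counter(past)                            # how often each (present, future) pair occurred


def predict(present):
    """Return the future state f that maximises Pr(present | f) . Pr(f)."""
    scores = {}
    for future, n_future in future_counts.items():
        likelihood = pair_counts[(present, future)] / n_future  # Pr(present | future)
        prior = n_future / len(past)                            # Pr(future)
        scores[future] = likelihood * prior
    return max(scores, key=scores.get)


print(predict("reading blogs"))  # prints "inspired"
```

With plain counts like these, the product of the two terms collapses to how often the pair occurred in your past, which is a nice sanity check that the WTH term is nothing mysterious.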

See, you can model your future given your present - isn't Bayes' theorem insanely beautiful? :)