Poisson Distribution


A Poisson random variable models the number of events that occur in a fixed interval of time (or space), and its distribution gives the probability of each possible count. It makes the Poisson assumption that events occur with a known constant mean rate and independently of the time since the last event.

Poisson Random Variable

Notation: $X \sim \text{Poi}(\lambda)$
Description: Number of events in a fixed time frame if (a) the events occur with a constant mean rate and (b) they occur independently of time since last event.
Parameters: $\lambda \in \mathbb{R}^{+}$, the constant average rate.
Support: $x \in \{0, 1, \dots\}$
PMF equation: $P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!}$
Expectation: $E[X] = \lambda$
Variance: $\text{Var}(X) = \lambda$
PMF graph: [interactive plot of the PMF for an adjustable parameter $\lambda$]
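As a quick sanity check of the PMF, expectation, and variance listed above, here is a minimal Python sketch. The helper name `poisson_pmf`, the choice $\lambda = 5$, and the truncation of the infinite support at 100 are assumptions made for illustration only.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poi(lam), computed as lam^x * e^(-lam) / x!."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 5.0  # illustrative rate (assumption); any positive value works

# The support is infinite, so we truncate it at 100 (assumption).
# For lam = 5 the probability mass beyond that point is negligible.
support = range(100)

total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)

print(f"sum of PMF = {total:.6f}")
print(f"mean       = {mean:.6f}")
print(f"variance   = {variance:.6f}")
```

Under these assumptions the PMF should sum to roughly 1, and both the mean and the variance should come out close to $\lambda$.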

Poisson Intuition

In this section we show the intuition behind the Poisson derivation. It is both a great way to deeply understand the Poisson and good practice with Binomial distributions.

Let's work on the problem of predicting the chance of a given number of events occurring in a fixed time interval: the next minute. For example, imagine you are working on a ride sharing application and you care about the probability of each possible number of requests you get from a particular area. From historical data, you know that the average number of requests per minute is $\lambda = 5$. What is the probability of getting 1, 2, 3, etc. requests in a minute?

We could approximate a solution to this problem by using a binomial distribution! Let's say we split our minute into 60 seconds, and make each second an indicator Bernoulli variable: you either get a request or you don't. If you get a request in a second, the indicator is 1. Otherwise it is 0. Here is a visualization of our 60 binary indicators. In this example, imagine we have requests at 2.75 and 7.12 seconds. The corresponding indicator variables are blue filled-in boxes:

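[Figure: a 1 minute timeline split into 60 one-second boxes, with the boxes containing 2.75 s and 7.12 s filled in blue]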

The total number of requests received over the minute can be approximated as the sum of the sixty indicator variables, which conveniently matches the description of a binomial: a sum of Bernoullis. Specifically, define $X$ to be the number of requests in a minute. $X$ is a binomial with $n = 60$ trials. What is the probability, $p$, of a success on a single trial? To make the expectation of $X$ equal the observed historical average $\lambda = 5$, we should choose $p$ so that $\lambda = E[X]$.

$$
\begin{aligned}
\lambda &= E[X] && \text{Expectation matches historical average} \\
\lambda &= n \cdot p && \text{Expectation of a Binomial is } n \cdot p \\
p &= \frac{\lambda}{n} && \text{Solving for } p
\end{aligned}
$$

In this case, since $\lambda = 5$ and $n = 60$, we should choose $p = 5/60$ and state that $X \sim \text{Bin}(n = 60, p = 5/60)$. Now that we have a form for $X$ we can answer probability questions about the number of requests by using the Binomial PMF:

$$
P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}
$$

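So for example:

$$
\begin{aligned}
P(X = 1) &= \binom{60}{1} (5/60)^{1} (55/60)^{60 - 1} \approx 0.0295 \\
P(X = 2) &= \binom{60}{2} (5/60)^{2} (55/60)^{60 - 2} \approx 0.0790 \\
P(X = 3) &= \binom{60}{3} (5/60)^{3} (55/60)^{60 - 3} \approx 0.1389
\end{aligned}
$$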
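The binomial calculation with 60 one-second buckets can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `binomial_pmf` is an assumption introduced for illustration.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

lam = 5      # historical average requests per minute
n = 60       # one Bernoulli indicator per second
p = lam / n  # chosen so that E[X] = n * p = lam

for x in (1, 2, 3):
    print(f"P(X = {x}) is approximately {binomial_pmf(x, n, p):.4f}")
```

The printed values should agree with the three approximations computed by hand for $n = 60$.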

Great! But don't forget that this was an approximation. We didn't account for the fact that there can be more than one event in a single second. One way to mitigate this issue is to divide our minute into more fine-grained intervals (the choice to split it into 60 seconds was rather arbitrary). Instead, let's divide our minute into 600 deciseconds, again with requests at 2.75 and 7.12 seconds:
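[Figure: a 1 minute timeline split into 600 decisecond boxes, with the boxes containing 2.75 s and 7.12 s filled in]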

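Now $n = 600$, $p = 5/600$ and $X \sim \text{Bin}(n = 600, p = 5/600)$. We can repeat our example calculations using this better approximation:

$$
\begin{aligned}
P(X = 1) &= \binom{600}{1} (5/600)^{1} (595/600)^{600 - 1} \approx 0.0333 \\
P(X = 2) &= \binom{600}{2} (5/600)^{2} (595/600)^{600 - 2} \approx 0.0837 \\
P(X = 3) &= \binom{600}{3} (5/600)^{3} (595/600)^{600 - 3} \approx 0.1402
\end{aligned}
$$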


The larger n is, the more accurate the approximation. So what happens as n goes to infinity? The binomial becomes a Poisson!
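To see the trend as the number of buckets grows, here is a minimal sketch that recomputes the binomial probabilities for several values of n. The helper `binomial_pmf`, the dictionary `results`, and the particular values of n are assumptions chosen for illustration.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

lam = 5  # historical average requests per minute

# Map each bucket count n to the approximations of P(X=1), P(X=2), P(X=3).
results = {}
for n in (60, 600, 6000, 60000):
    p = lam / n  # keep the expectation n * p fixed at lam
    results[n] = [binomial_pmf(x, n, p) for x in (1, 2, 3)]
    formatted = ", ".join(f"{q:.5f}" for q in results[n])
    print(f"n = {n:>6}: {formatted}")
```

Each row changes less than the one before it, which suggests the probabilities settle toward fixed limiting values as n increases.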

Poisson, a Binomial in the limit

If we really care about making sure that we don't get two events in the same bucket, we can divide our minute into infinitely small buckets:

[Figure: a 1 minute timeline split into infinitely small buckets]

Proof: Derivation of the Poisson

What does the PMF of $X$ look like now that we have infinite divisions of our minute? We can write the equation and think about it as $n$ goes to infinity. Recall that $p$ still equals $\lambda / n$:

$$
P(X = x) = \lim_{n \to \infty} \binom{n}{x} (\lambda/n)^{x} (1 - \lambda/n)^{n - x}
$$

While it may look intimidating, this expression simplifies nicely. This proof uses a few special limit rules that we haven't introduced in this book:

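$$
\begin{aligned}
P(X = x) &= \lim_{n \to \infty} \binom{n}{x} (\lambda/n)^{x} (1 - \lambda/n)^{n - x} && \text{Start: binomial in the limit} \\
&= \lim_{n \to \infty} \binom{n}{x} \cdot \frac{\lambda^x}{n^x} \cdot \frac{(1 - \lambda/n)^{n}}{(1 - \lambda/n)^{x}} && \text{Expanding the power terms} \\
&= \lim_{n \to \infty} \frac{n!}{(n - x)!\, x!} \cdot \frac{\lambda^x}{n^x} \cdot \frac{(1 - \lambda/n)^{n}}{(1 - \lambda/n)^{x}} && \text{Expanding the binomial term} \\
&= \lim_{n \to \infty} \frac{n!}{(n - x)!\, x!} \cdot \frac{\lambda^x}{n^x} \cdot \frac{e^{-\lambda}}{(1 - \lambda/n)^{x}} && \text{Rule } \lim_{n \to \infty} (1 - \lambda/n)^{n} = e^{-\lambda} \\
&= \lim_{n \to \infty} \frac{n!}{(n - x)!\, x!} \cdot \frac{\lambda^x}{n^x} \cdot \frac{e^{-\lambda}}{1} && \text{Rule } \lim_{n \to \infty} \lambda/n = 0 \\
&= \lim_{n \to \infty} \frac{n!}{(n - x)!} \cdot \frac{1}{x!} \cdot \frac{\lambda^x}{n^x} \cdot \frac{e^{-\lambda}}{1} && \text{Splitting first term} \\
&= \lim_{n \to \infty} \frac{n^x}{1} \cdot \frac{1}{x!} \cdot \frac{\lambda^x}{n^x} \cdot \frac{e^{-\lambda}}{1} && \lim_{n \to \infty} \frac{n!}{(n - x)!} = n^x \\
&= \lim_{n \to \infty} \frac{\lambda^x}{x!} \cdot \frac{e^{-\lambda}}{1} && \text{Cancel } n^x \\
&= \frac{\lambda^x \cdot e^{-\lambda}}{x!} && \text{Simplify}
\end{aligned}
$$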
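The limit rules used in the derivation can be checked numerically. The sketch below is only an illustration, not part of the proof; the values $\lambda = 5$ and $x = 3$, the sample values of n, and the list name `rows` are assumptions.

```python
import math

lam, x = 5.0, 3  # illustrative values (assumption)

rows = []
for n in (10**2, 10**4, 10**6):
    exp_term = (1 - lam / n) ** n            # rule: approaches e^(-lam)
    one_term = (1 - lam / n) ** x            # rule: approaches 1 since lam/n -> 0
    falling = math.perm(n, x) / n ** x       # n!/(n-x)! divided by n^x, approaches 1
    rows.append((n, exp_term, one_term, falling))
    print(f"n = {n:>7}: (1 - lam/n)^n = {exp_term:.6f}, "
          f"(1 - lam/n)^x = {one_term:.6f}, "
          f"n!/((n-x)! n^x) = {falling:.6f}")

print(f"e^(-lam) = {math.exp(-lam):.6f}")
```

Here `math.perm(n, x)` computes $n!/(n-x)!$ exactly, so the last column shows how closely that falling factorial tracks $n^x$ as n grows.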

That is a beautiful expression! Now we can calculate the real probability of each number of requests in a minute, if the historical average is $\lambda = 5$:

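$$
\begin{aligned}
P(X = 1) &= \frac{5^{1} \cdot e^{-5}}{1!} = 0.03369 \\
P(X = 2) &= \frac{5^{2} \cdot e^{-5}}{2!} = 0.08422 \\
P(X = 3) &= \frac{5^{3} \cdot e^{-5}}{3!} = 0.14037
\end{aligned}
$$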
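These Poisson values, and how far the two binomial approximations sit from them, can be computed with a short sketch. The helper names `poisson_pmf` and `binomial_pmf` and the dictionary `gaps` are assumptions introduced for illustration.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poi(lam)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

lam = 5  # historical average requests per minute

# For each x, store (gap to the n=60 binomial, gap to the n=600 binomial).
gaps = {}
for x in (1, 2, 3):
    poisson = poisson_pmf(x, lam)
    gap_60 = abs(poisson - binomial_pmf(x, 60, lam / 60))
    gap_600 = abs(poisson - binomial_pmf(x, 600, lam / 600))
    gaps[x] = (gap_60, gap_600)
    print(f"P(X = {x}) = {poisson:.5f}  "
          f"(gap to n=60: {gap_60:.5f}, gap to n=600: {gap_600:.5f})")
```

The gap to the 600-bucket binomial should be noticeably smaller than the gap to the 60-bucket one, consistent with the binomial approaching the Poisson as n grows.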

This is both more accurate and much easier to compute!

Changing time frames

Say you are given a rate over one unit of time, but you want to know the rate over another unit of time. For example, you may be given the rate of hits to a website per minute, but you want to know the probability of a given number of hits over a 20 minute period. You would just need to multiply the per-minute rate by 20 to go from the "per 1 minute of time" rate to the "per 20 minutes of time" rate.
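As a minimal sketch of this rescaling, suppose (purely as an assumption for illustration) that the website receives 2 hits per minute on average. The variable names and the two example questions below are likewise hypothetical.

```python
import math

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poi(lam)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

rate_per_minute = 2.0  # hypothetical average hits per minute (assumption)
minutes = 20

# Rescale the rate to the new time frame: hits per 20 minutes.
lam_20 = rate_per_minute * minutes

# Example questions about the 20 minute window, with X ~ Poi(lam_20).
p_exactly_40 = poisson_pmf(40, lam_20)
p_at_most_30 = sum(poisson_pmf(x, lam_20) for x in range(31))

print(f"rate per 20 minutes = {lam_20}")
print(f"P(exactly 40 hits)  = {p_exactly_40:.4f}")
print(f"P(at most 30 hits)  = {p_at_most_30:.4f}")
```

The only change from the one-minute case is the value of $\lambda$; the PMF itself stays the same.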