In the previous article we set up a really simple binomial tree model in Excel and observed that we could artificially change how the random trajectories drifted by altering the probability of moving up and down \{p_u, p_d\} from \{0.5, 0.5\} to \{0.6, 0.4\}.

For convenience, here’s what happened.

With the original probability measure, assigning the values of \mathbb{Q} = \{p_u, p_d\} = \{0.5, 0.5\} we saw these trajectories:

Fig 1 – Two hundred trajectories of a simple binomial tree model with probability of moving up and down given by fifty percent each. The up and down step sizes are +1 and -1 respectively.

Then we changed the probability measure to \mathbb{P}, which assigned the values \{p_u, p_d\} = \{0.6, 0.4\}, and saw that the measure change introduced an artificial upward drift in our trajectories:

Fig 2 – Now we have the same model but we have changed our probability measure. The probability of moving up has increased from 0.5 to 0.6. The trajectories now look like they are drifting upward! This is the P measure.
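To make the effect concrete, here is a minimal Python sketch of the same coin-flip experiment. The function names, the seed and the choice of 100 steps per path are assumptions made for illustration (the spreadsheet's path length is not stated here). It compares the simulated mean end point with the theoretical value n(2p_u - 1), which follows from each ±1 step having mean p_u - p_d.

```python
import random

def simulate_walk(p_up, n_steps, rng):
    """Return the end point of a +1/-1 walk with up-probability p_up."""
    position = 0
    for _ in range(n_steps):
        position += 1 if rng.random() < p_up else -1
    return position

def mean_endpoint(p_up, n_steps=100, n_paths=200, seed=0):
    """Average end point over n_paths simulated trajectories."""
    rng = random.Random(seed)
    endpoints = [simulate_walk(p_up, n_steps, rng) for _ in range(n_paths)]
    return sum(endpoints) / n_paths

def expected_endpoint(p_up, n_steps=100):
    """Theoretical mean: each step has mean (+1)*p_up + (-1)*(1 - p_up)."""
    return n_steps * (2 * p_up - 1)

if __name__ == "__main__":
    for p_up in (0.5, 0.6):
        print(p_up, mean_endpoint(p_up), expected_endpoint(p_up))
```

With p_up = 0.5 the theoretical mean is zero, while p_up = 0.6 pushes it upward even though the step sizes have not changed.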

Then we conjectured that the inverse of this exercise must also be true: for a naturally (upward) drifting process, we can cancel out the drift by changing the probability measure!

We accomplished this by going to the cells in our worksheet which held the Wiener increments and added in some additive noise. This resulted in a stochastic process (the coin flip stochastic process) with drift \mu.

Fig 3 – Back to the original Q probability measure, but now the underlying stochastic process has a non-zero drift term in the equation.

Then, we played around with the probability measure until it looked like all the drift was gone…

Fig 4 – The trajectories of a coin flip process with drift, but now with a primitive “risk-neutral” probability measure, Q. The underlying coin-flip stochastic process had a constant drift of 0.123, and the probability measure which cancelled out the drift was up = 38% and down = 62%. All screenshots show 200 trajectories.

To help find the driftless probability measure, we adjusted the probabilities until the histogram of the trajectories fitted the theoretical distribution well:

Fig 5 – Fitting the empirical distribution (orange) to the theoretical normal distribution (blue). This is obviously easier the more paths/samples we use.

Now we shall show an easier, more systematic way of doing this.

Girsanov’s Theorem

The Cameron-Martin-Girsanov theorem (1960), a.k.a. Girsanov’s theorem, is a somewhat technical theorem that is used a lot in risk-neutral derivatives pricing. If you want to, you can read about it on Wikipedia, but for us, here, we don’t need the technical result. Plus, we have already seen Girsanov’s theorem in action! All we need to know is: Girsanov’s theorem tells us that a probability measure which cancels out the drift simply exists.

But why? Why do we care about driftless stochastic processes?

Cancelling out the drift is actually extremely useful. At the risk of swapping one useless term for another: driftless processes are martingales (well, technically local martingales! But that’s too high-brow for us right now). In this blog we will use “driftless (stochastic) process” and “martingale” interchangeably.

We have already seen what a driftless process looks like. Above, in Fig 1 – the equal-probability coin flip experiment is a martingale.

These martingales are really useful because they have a great property: their next (conditional) expected value is equal to the current value – conditioned on all information up until now. In symbols, this means:

\displaystyle\mathbb{E}_t[x_{t+1}|\mathbf{x}_t] = x_t

But of course this is true, just look at a driftless process. The mean value of the histogram is always centered at the current value!

Fig 6 – The great property of martingales: Their expected value (as shown by the mean of the histogram) is (almost) always equal to the current value, which in this case is 0.

It’s constantly resetting itself. In fact, that histogram is really important, because at any point in time, the mean value of the histogram of a driftless process sits in the same spot.
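For the coin-flip process this property can be checked by hand, and the short Python sketch below does exactly that. The function name and its default step sizes of +1 and -1 are illustrative assumptions matching the spreadsheet model described above.

```python
def conditional_expectation(current, p_up, up=1.0, down=-1.0):
    """One-step conditional expectation E[x_{t+1} | x_t = current]
    for a walk that moves by `up` with probability p_up, else by `down`."""
    return p_up * (current + up) + (1.0 - p_up) * (current + down)

if __name__ == "__main__":
    # Fair coin: the expected next value equals the current value.
    print(conditional_expectation(3.0, 0.5))
    # Biased coin: the expected next value is shifted by p_up - p_down.
    print(conditional_expectation(3.0, 0.6))
```

Under the fair coin the expected next value is the current value wherever the path happens to be, which is the "constantly resetting" behaviour; under the 0.6/0.4 coin it is shifted by 0.2 per step, so that process is not a martingale.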

And now we have it!

If the mean value of the empirical histogram of a driftless stochastic process does not change with time, and if that histogram looks like a Gaussian (which it does thanks to the Central Limit Theorem), then for all time, the expectation can be interchanged with a cumulative normal distribution function!

In other words:

\displaystyle\underbrace{\mathbb{E}_t^{\mathbb{P}}[x_{t+1}|\mathbf{x}_t]}_{\textup{Difficult}} \longrightarrow \underbrace{\Phi(x_t)}_{\textup{Easy}}

where \Phi(x_t) is the standard normal cumulative distribution function.

So, basically, expectations become cumulative normals for martingales!
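As a rough numerical illustration of the Central Limit Theorem claim, the Python sketch below compares the empirical probability that a fair coin-flip walk ends at or below a threshold with the cumulative normal \Phi evaluated at the threshold divided by \sqrt{n} (the standard deviation of the walk after n steps of ±1). The function names, path counts, seed and the threshold of 11 are assumptions for illustration; 11 is chosen because it sits between two attainable end points (10 and 12), which acts as a simple continuity correction.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative distribution function, via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def empirical_cdf_at(threshold, n_steps=100, n_paths=5000, seed=1):
    """Fraction of fair +1/-1 walks whose end point is <= threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        end = sum(1 if rng.random() < 0.5 else -1 for _ in range(n_steps))
        if end <= threshold:
            hits += 1
    return hits / n_paths

if __name__ == "__main__":
    n_steps = 100
    threshold = 11
    print(empirical_cdf_at(threshold, n_steps))
    print(normal_cdf(threshold / math.sqrt(n_steps)))
```

The two printed numbers should be close, and the agreement generally improves as the number of paths grows.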


Now for an application.

Let us assume that interest rates are deterministic and that a bank account M(t) exists for all times t which may be used to invest or borrow cash at a continuously compounded interest rate r(t) with initial account value M(0) = 1 unit of some (domestic) currency. Since interest rates are assumed deterministic, the account value M(t) evolves over time as a drifting (so, definitely not a martingale) stochastic process:

\displaystyle\textup{d}M_t = r_t M_t \textup{d}t

Integrating both sides with respect to time t we get:

\displaystyle M(t) = 1 \cdot \exp\left(\int_0^t r_s \textup{d}s\right)

The 1 is there at the front to remind you that this is the bank account’s initial boundary condition, and that we have set it to one.
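A quick way to convince yourself of this solution is to integrate the differential equation numerically and compare. The sketch below uses a plain forward-Euler scheme; the function names, step count and the example rates are assumptions for illustration only.

```python
import math

def bank_account_euler(rate, t_end, n_steps=10000):
    """Integrate dM = r(t) M dt with M(0) = 1 by forward Euler.
    `rate` is a function returning the deterministic rate r(t)."""
    dt = t_end / n_steps
    m = 1.0  # initial condition M(0) = 1
    for i in range(n_steps):
        m += rate(i * dt) * m * dt
    return m

def bank_account_constant_rate(r, t_end):
    """Closed form M(t) = exp(r t) for a constant rate r."""
    return math.exp(r * t_end)

if __name__ == "__main__":
    print(bank_account_euler(lambda t: 0.05, 2.0))
    print(bank_account_constant_rate(0.05, 2.0))
```

For a time-dependent rate the same Euler loop applies; only the closed form changes, to the exponential of the integrated rate.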

Now we assume that some risky (i.e. stochastic) process S(t) follows a geometric Brownian motion:

\displaystyle\textup{d}S_t = \mu_t S_t \textup{d}t + \sigma_t S_t \textup{d}W_t^{\mathbb{P}}

with constant \sigma_t > 0 for all times t, and where W_t^{\mathbb{P}} signifies a Wiener process, and we are obtaining its infinitesimal increments \textup{d}W as little random numbers according to the real-world probability measure \mathbb{P}. Recall that by real-world we mean the natural probability measure as obtained by observing the process evolve naturally in real-life, i.e. before any sort of measure change.
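To see the real-world drift in numbers, here is a minimal Python sketch that samples S_T under \mathbb{P}. It assumes constant \mu and \sigma, in which case geometric Brownian motion has the well-known exact solution S_T = S_0 \exp((\mu - \sigma^2/2)T + \sigma W_T). The function name, parameter values and seed are illustrative assumptions.

```python
import math
import random

def gbm_terminal_values(s0, mu, sigma, t_end, n_paths, seed=2):
    """Sample S_T under P for constant mu and sigma using the exact solution."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_paths):
        w_t = rng.gauss(0.0, math.sqrt(t_end))  # W_T ~ N(0, T)
        values.append(s0 * math.exp((mu - 0.5 * sigma ** 2) * t_end + sigma * w_t))
    return values

if __name__ == "__main__":
    s0, mu, sigma, t_end = 100.0, 0.08, 0.2, 1.0
    samples = gbm_terminal_values(s0, mu, sigma, t_end, n_paths=50000)
    print(sum(samples) / len(samples))  # sample mean of S_T
    print(s0 * math.exp(mu * t_end))    # theoretical mean S_0 * exp(mu * T)
```

The sample mean grows at the rate \mu, which is the drift we will later want to remove.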

We can dispense with all of the above nonsense and just say that we are operating under the Black-Scholes model, but crucially, these are also all the assumptions we require to quote Girsanov’s theorem (without actually really using it!).

Now introduce a financial instrument, the European call option, that pays the following amount at some future time T:

\displaystyle V(T) = \max(S(T) - K, 0)

where K \in \mathbb{R} is the strike value. This is known as a terminal boundary condition. Note that V(T) depends on S(T), which has drift.

We wish to calculate the present value of this instrument at time t = 0, in other words: what is V(0)?

Well, it is common knowledge that the present value will be equal to the expected value of all the discounted future cashflows (conditioned on the sigma-algebra of information received up until pricing time \mathfrak{F}_0). I.e.,

\displaystyle V(0) = \mathbb{E}_0^{\mathbb{P}}\left[V(T)|\mathfrak{F}_0\right]

But the future cashflow V(T) inside the expectation is a drifting risky asset! Which means we don’t have the nice approximation:

\displaystyle\mathbb{E}_t^{\mathbb{P}}[x_{t+1}|\mathbf{x}_t] \longrightarrow \Phi(x_t)

However, the universal pricing theorem says that for any numeraire N(t) we have a useful factorisation

\displaystyle V(0) = N(0)\times\mathbb{E}_0^{\mathbb{Q}^N}\left[\frac{V(T)}{N(T)}\bigg|\mathfrak{F}_0\right]

…and now this is a martingale under \mathbb{Q}^N, which is an equivalent probability measure to \mathbb{P}. This means that the present value of any payoff function (without intermediate payments) can always be factored like this.

But how does this help us? Because we still have a nasty looking expectation there, perhaps even nastier!

Well, it may look nastier but our theorem comes with a clause that says that the process inside the expectation is now a martingale, in other words, the process defined by

\displaystyle Z_t := \frac{V_t}{N_t}

is always a martingale under the \mathbb{Q}^N probability measure.

So we can use the theorem to recast our present value equation into one about martingales, then use the martingale property to write it as:

\displaystyle V(0) = N(0)\times\mathbb{E}_0^{\mathbb{Q}^N}\left[\frac{V(T)}{N(T)}\bigg|\mathfrak{F}_0\right]

and then claim that this expectation can be evaluated using the approximation

\displaystyle\mathbb{E}_t^{\mathbb{Q}^N}[x_{t+1}|\mathbf{x}_t] \longrightarrow \Phi(x_t)

But wait! How do we know what numeraire to use? Where did \mathbb{Q}^N come from? And how do we find it? Surely we don’t test each possible one until we are satisfied that the drift has disappeared like we did in the last article!?

How do you Find the Risk-Neutral Probability Measure?

The above universal pricing theorem gets us half-way there.

It got us from the definition of the present value of a future payoff, to an equation involving an expectation of a martingale, with the caveat that the expectation is taken under some equivalent, yet mysteriously unknown probability measure \mathbb{Q}^N that renders the payoff process driftless.

But it doesn’t tell you how to find the measure! It just says it exists.

So how do we find it?

The short answer is that we don’t need to!

Girsanov’s theorem tells us that a change of measure is synonymous with a change of drift! Right?

Specifically, Girsanov’s theorem intervenes right at the moment when we think all is lost and we need to go and find some mysterious probability measure \mathbb{Q}^N; and then it tells us that actually, all we need to do, is find some algebraic drift amount \widetilde{\mu}!

Girsanov’s theorem implies that a change of measure is a change of drift!

Okay, but then aren’t we exchanging the problem of: trying to find the right probability measure, with: trying to find the right stochastic differential equation (SDE) with the right drift term \widetilde{\mu}?

Yes. But the latter problem is easier.

To recap: we don’t brute-force find the \mathbb{Q}^N probability measure that cancels out the drift of the V_t/N_t-process like we did in the spreadsheet. Instead, we invoke Girsanov’s theorem and say that we simply look for the right drift term \widetilde{\mu} (as the two exercises are equivalent).

Then, this “right” drift term is obviously the zero drift because we want martingales. So we simply constrain the drift term to zero algebraically, and shunt any compensation to the Wiener process W_t, effectively inducing a transformation:

\displaystyle W_t \longrightarrow W_t + \textup{Stuff} =: \widetilde{W}_t

…where \textup{Stuff} is the result of forcing the drift to be zero.

Our SDE, now running with these shunted Wiener increments, \textup{d} \widetilde{W}_t, and nice zero drift \widetilde{\mu} = 0, is a \mathbb{Q}^N-martingale.

In other words: instead of trying to find the right measure, we just find the (algebraic) conditions under which the SDE has zero drift, by playing around with the Wiener increments.

If there is one thing this blog is trying to get you to see, it is this: there is more than one way to visualise a change of measure. You can change the probabilities directly: \mathbb{P} \rightarrow \mathbb{Q}, or you can change the drift: \mu \rightarrow \widetilde{\mu}, but any change here impacts the Wiener increments, so you also need to do W_t \rightarrow \widetilde{W}_t.

Choosing a Numeraire for an Option Payoff

Right. Now that we know what to do, let’s do it.

First, choose a numeraire N(t). Let us choose the bank account M(t). We choose this numeraire because it is just an exponential of an integral of a deterministic interest rate. Hence, when it inevitably appears inside the expectation, we can just move it outside, as it never takes a random value. This is what is called a natural choice for a numeraire, i.e. it is the numeraire which we know, ahead of time, can be factored out of any expectation operation.

Suppose we also have a stochastic asset price S_t that follows geometric Brownian motion:

\displaystyle \textup{d}S_t := \mu_t S_t \textup{d}t + \sigma_t S_t \textup{d}W_t

Now, “choosing a numeraire” M_t is synonymous with forming the quotient stochastic process via the quotient rule of Ito’s Lemma:

\displaystyle\textup{d}\left(\frac{S_t}{M_t}\right) = \frac{S_t}{M_t}\left(\left(\mu_t - r_t\right)\textup{d}t + \sigma_t\textup{d}W_{t}\right)

where \mu is the drift associated with the SDE under the real-world probability measure \mathbb{P}.

Now look at that SDE.

The universal pricing theorem promises us that the left-hand side is a martingale under \mathbb{Q}^M (as all stochastic processes quotiented by a numeraire are). But the right-hand side has drift! But look, the drift has this form:

\displaystyle\textup{Drift}_t = \mu_t - r_t

Which means we have an available degree of freedom to set this to zero, i.e.

\displaystyle \mu_t - r_t = 0

In fact, it is the left-hand side (being promised to be a martingale) that implies that the drift on the right-hand side must be zero, i.e.

\displaystyle \mu_t = r_t

So the drift under the new measure is just the risk-free interest rate!
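The role of the \mu_t - r_t term can be checked numerically. The Python sketch below applies a basic Euler-Maruyama discretisation to the quotient SDE for S_t/M_t with constant coefficients and reports the mean of the discounted value at time T. The function name, the constant parameters, the step and path counts and the seed are all assumptions for illustration.

```python
import math
import random

def discounted_mean_euler(x0, mu, r, sigma, t_end,
                          n_steps=50, n_paths=10000, seed=5):
    """Euler-Maruyama for dX = X((mu - r) dt + sigma dW); returns mean of X_T."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = x0
        for _ in range(n_steps):
            dw = rng.gauss(0.0, sqrt_dt)  # Wiener increment over one step
            x += x * ((mu - r) * dt + sigma * dw)
        total += x
    return total / n_paths

if __name__ == "__main__":
    # Drift mu differs from r: the discounted process still drifts.
    print(discounted_mean_euler(100.0, 0.08, 0.03, 0.2, 1.0))
    # Drift equal to r: the discounted mean stays near its starting value.
    print(discounted_mean_euler(100.0, 0.03, 0.03, 0.2, 1.0))
```

When \mu = r the mean of the discounted process stays near its starting value, which is what a martingale should do; with \mu \neq r it grows roughly like \exp((\mu - r)T).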

With this intuition in mind, let’s go back to the original SDE and, for ease of notation, let us now define

\displaystyle X_t := \frac{S_t}{M_t} = \frac{S_t}{\exp\left(\int_0^t r_u\textup{d}u\right)} = e^{-\int_0^t r_u\textup{d}u} S_t

which gives the SDE as

\displaystyle\textup{d}X_t = X_t\left(\left(\mu_t - r_t\right)\textup{d}t + \sigma_t\textup{d}W_{t}\right)

Let us now factor out \sigma_t X_t which gives:

\displaystyle\textup{d}X_t = \sigma_tX_t\left(\left(\frac{\mu_t - r_t}{\sigma_t}\right)\textup{d}t + \textup{d}W_{t}\right)

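As said before, we now algebraically force this SDE to have zero drift by shunting the excess to the Wiener process. In other words, we define \lambda_t := (\mu_t - r_t)/\sigma_t and substitute

\displaystyle \widetilde{W}_t := W_t + \int_0^t \frac{\mu_u - r_u}{\sigma_u}\textup{d}u = W_t + \int_0^t \lambda_u\textup{d}u

so that \textup{d}W_t = \textup{d}\widetilde{W}_t - \lambda_t\textup{d}t, which turns the SDE into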

\displaystyle\textup{d}X_t = \sigma_tX_t\left(\left(\frac{\mu_t - r_t}{\sigma_t}\right)\textup{d}t + \textup{d}\widetilde{W}_t - \frac{\mu_t - r_t}{\sigma_t}\textup{d}t\right)

which cancels the drift, and we are left with

\displaystyle\textup{d}X_t = \sigma_tX_t\textup{d}\widetilde{W}_t

We now invoke Girsanov’s theorem by stating that there must exist an equivalent (martingale) measure \mathbb{Q} on the filtration \mathfrak{F}_s,\,0\leq s\leq t defined by a Radon-Nikodym derivative (which we don’t need to write down).
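Although we don’t need the Radon-Nikodym derivative for the argument, it can be instructive to see it do its job. For constant \lambda = (\mu - r)/\sigma it takes the standard exponential form \exp(-\lambda W_T - \lambda^2 T/2), and the Python sketch below uses it as a weight on samples drawn under \mathbb{P}. The function name, the constant parameters and the seed are assumptions for illustration.

```python
import math
import random

def reweighted_discounted_mean(s0, mu, r, sigma, t_end,
                               n_paths=50000, seed=3):
    """Return (plain P-mean, Girsanov-weighted mean) of S_T / M_T,
    using samples drawn under the real-world measure P only."""
    lam = (mu - r) / sigma  # lambda from the text, assumed constant here
    rng = random.Random(seed)
    plain_sum = 0.0
    weighted_sum = 0.0
    for _ in range(n_paths):
        w_t = rng.gauss(0.0, math.sqrt(t_end))  # W_T under P
        s_t = s0 * math.exp((mu - 0.5 * sigma ** 2) * t_end + sigma * w_t)
        x_t = s_t * math.exp(-r * t_end)        # discounted value S_T / M_T
        z_t = math.exp(-lam * w_t - 0.5 * lam ** 2 * t_end)  # dQ/dP weight
        plain_sum += x_t
        weighted_sum += z_t * x_t
    return plain_sum / n_paths, weighted_sum / n_paths

if __name__ == "__main__":
    plain, weighted = reweighted_discounted_mean(100.0, 0.08, 0.03, 0.2, 1.0)
    print(plain)     # drifts away from S_0 under P
    print(weighted)  # stays near S_0 once the weights are applied
```

The unweighted mean shows the leftover drift \mu - r, while the weighted mean sits near S_0: changing the probabilities and changing the drift are two views of the same operation.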

So, now we have a driftless stochastic process \textup{d}X_t defined on a martingale measure \mathbb{Q} (so we already have the adjusted probabilities). But we want an SDE in terms of S_t, not X_t.

To get back to equations in S_t we simply differentiate

\displaystyle \widetilde{W}_t = W_t + \int_0^t \lambda_u\textup{d}u
\displaystyle \Rightarrow \textup{d}\widetilde{W}_t = \textup{d}W_t + \lambda_t\textup{d}t
\displaystyle \Rightarrow \textup{d}W_t = \textup{d}\widetilde{W}_t - \lambda_t\textup{d}t

and substitute it into our original S_t SDE giving

\displaystyle \textup{d}S_t = \mu_t S_t \textup{d}t + \sigma_t S_t \textup{d}W_t
\displaystyle \Rightarrow\textup{d}S_t = \mu_t S_t \textup{d}t + \sigma_t S_t \left(\textup{d}\widetilde{W}_t - \lambda_t\textup{d}t\right)
\displaystyle \Rightarrow\textup{d}S_t = \left(\mu_t S_t - \sigma_t S_t\lambda_t\right)\textup{d}t + \sigma_t S_t \textup{d}\widetilde{W}_t
\displaystyle \Rightarrow\textup{d}S_t = \left(\mu_t S_t - \sigma_t S_t\frac{\mu_t - r_t}{\sigma_t}\right)\textup{d}t + \sigma_t S_t \textup{d}\widetilde{W}_t
\displaystyle \Rightarrow\textup{d}S_t = r_t S_t\textup{d}t + \sigma_t S_t \textup{d}\widetilde{W}_t

Plotting trajectories of this SDE, discounted by the bank account M_t, shows zero drift; S_t itself still drifts, but now at the risk-free rate r_t.

We have not changed anything about the underlying asset S_t. But we have changed from using the asset’s real-world drift \mu_t to using the risk-free interest rate r_t, and we now have two representations of the infinitesimal rate of change of S_t: one under the real-world probability measure \mathbb{P} and one under the risk-neutral (equivalent martingale) measure \mathbb{Q}.

However, despite these changes, we still have a single, unifying terminal condition:

\displaystyle S_T^{\mathbb{P}} = S_T^{\mathbb{Q}} = S_T


Now that we have an SDE for the underlying asset under a probability measure that renders its discounted value a martingale, we can compute expectations (and hence fair value) in a really easy way.
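As a closing illustration, the Python sketch below prices the European call from earlier in two ways under constant r and \sigma: by Monte Carlo, simulating S_T with drift r under \mathbb{Q} and discounting the average payoff with the bank account, and by the standard Black-Scholes closed-form expression in terms of cumulative normals. The function names, parameter values, path count and seed are assumptions for illustration.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_scholes_call(s0, k, r, sigma, t_end):
    """Black-Scholes call price for constant r and sigma."""
    vol = sigma * math.sqrt(t_end)
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t_end) / vol
    d2 = d1 - vol
    return s0 * normal_cdf(d1) - k * math.exp(-r * t_end) * normal_cdf(d2)

def monte_carlo_call(s0, k, r, sigma, t_end, n_paths=100000, seed=4):
    """Discounted average payoff, with S_T simulated under Q (drift r)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        w_t = rng.gauss(0.0, math.sqrt(t_end))  # Wiener value under Q
        s_t = s0 * math.exp((r - 0.5 * sigma ** 2) * t_end + sigma * w_t)
        total += max(s_t - k, 0.0)              # call payoff V(T)
    return math.exp(-r * t_end) * total / n_paths  # divide by M(T)

if __name__ == "__main__":
    args = (100.0, 100.0, 0.03, 0.2, 1.0)  # s0, k, r, sigma, T
    print(monte_carlo_call(*args))
    print(black_scholes_call(*args))
```

Note that the real-world drift \mu appears nowhere in either function: once we work under \mathbb{Q}, only r and \sigma matter.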