Over the next few blog posts we are going to get into the quantitative mathematics behind the foreign exchange (FX) swap market.

But, as usual in quantitative finance, before we can even get off the ground we need to make a bunch of assumptions.

The first assumption is about the FX rate itself. Since FX rates can’t be negative, but at the same time some aspect of them should be Gaussian (normally distributed), we assume that FX rates are distributed Lognormally.


Next, as in (Heath 1992), we assume that the domestic and foreign forward interest rates are distributed Normally.

Thirdly, we assume that the FX Rate follows a geometric Brownian motion (gBm) over time, with constant drift \mu and constant volatility \sigma – remember, this is a basic swap model. Some of these assumptions can be relaxed in more sophisticated swap models.

All this means we can write down the equation of evolution of the FX rate as

\displaystyle dS_t = \mu S_t dt + \sigma S_t dW_t^{(1)}

The Wiener process W_t^{(1)} has a 1 as a superscript because there will be more than one Wiener process to consider in this model. This one is for the FX rate dynamics. There will also be one for the forward domestic rate and the forward foreign rate.

Integrating gives,

\displaystyle S_t = S_0 \exp\left[\left( \mu - \frac{1}{2}\sigma^2 \right)t + \sigma W_t^{(1)} \right]
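To make the closed-form solution concrete, here is a minimal sketch that samples S_t directly from it and compares the sample mean with S_0 e^{\mu t}, the standard expectation of a gBm. The parameter values and the function name `simulate_gbm_terminal` are illustrative assumptions, not something taken from a real FX pair.

```python
import math
import random

def simulate_gbm_terminal(s0, mu, sigma, t, n_paths, seed=0):
    """Draw n_paths samples of S_t from the closed-form gBm solution."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_paths):
        w_t = rng.gauss(0.0, math.sqrt(t))  # W_t ~ N(0, t)
        samples.append(s0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * w_t))
    return samples

# Illustrative (hypothetical) parameters.
s0, mu, sigma, t = 1.25, 0.05, 0.10, 1.0
samples = simulate_gbm_terminal(s0, mu, sigma, t, n_paths=200_000)
sample_mean = sum(samples) / len(samples)

print(sample_mean, s0 * math.exp(mu * t))  # the two numbers should be close
print(min(samples) > 0.0)                  # lognormal samples never go negative
```

The sample mean drifts away from S_0 at rate \mu, which is exactly the behaviour a martingale is not allowed to have.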

…and this is about as far as we can go without introducing more assumptions.

First of all, it should be noted that as it is written, the FX Rate price S_t (as a geometric Brownian motion) is a Markov Process. This explains why the price of an option (as written on a security that follows a gBm) is only a function of the current price of the security (and not its price history, as one would expect).

Unfortunately however, the FX Rate, as it is written, is not yet a martingale…and it won’t be, unless \mu = 0, and then it becomes a so-called driftless gBm (which is a martingale). But this is both unrealistic and wrong. Even though it will be a martingale, it will be the wrong one!

In what follows we will discuss how to obtain the right martingale, and why this quest is a worthwhile one.

The Quest for the Holy Martingale

We seek a martingale.

Martingales are, by definition, driftless.

So, is there some logical way to transform the drift out of the equation? In a similar fashion to Lebesgue integration (see my article here), we can change the probability measure so that the drift term vanishes all by itself, and we don’t have to set it equal to zero.

We are going to change the probability measure so that the drift term vanishes all by itself!

In other words, we want a naturally driftless process looking like this:

\displaystyle d\widetilde{S}_t = \sigma\widetilde{S}_td\widetilde{W}_t

The tildes are there to indicate that these symbols are related to, but slightly different from, the originals. Namely, the probability measure \mathbb{P} that maps the random variable events to a number between 0 and 1 will be re-weighted to somewhat different values. This is a so-called change of measure approach and can be employed to obtain our driftless gBm, and hence a martingale.

Changing the Measure

Here is an example of how we can change the probability measure of a simple stochastic system: the 6-sided die.

A 6-sided die has 6 possible outcomes, 6 faces which could show up. These faces are labels and we list them out here. Faces and Labels have no mathematical substance, so we must map a face to a number. In this case it is obvious which map to use because the face/label indicates precisely which integer to use, i.e. 1 dot maps to the number 1, 2 dots map to 2, and so on. This is what a random variable does: it maps faces/labels/events to numbers.

Now that we have numbers to play with we utilise the probability measure to further map the integer to a probability. Thus, in our simple model, we have done the following mapping:

\displaystyle\mathcal{F} \ni \sigma \mapsto \mathbb{Z} \mapsto [0,1] \subseteq \mathbb{R}

We have to do this every time we employ a probability model.

OK, now we can draw some things. In what follows we have a so-called fair die which has an equal probability of turning up any of the six faces. Since probabilities must sum to 100%, each face is assigned a probability of 1/6, or about 16.67%.

Figure 2 – The probability distribution of a fair die. The probabilities of the individual events sum to 100%. The assignment of 16.67% to each event is provided by the probability measure.

This probability distribution is a result of the following probability measure:

Figure 3 – The Probability Measure of the fair die.

Now, let us see what happens when we change the measure:

Figure 4 – The Probability Measure associated with an unfair, or rigged, die.

This is a rigged or unfair die as this die rolls a 6 much more often than any other result. And now, the experiment looks different:

Figure 5 – The results of rolling the unfair die.

The point I want to make is that in probability theory we are free to change our probability measure whenever our experiments are not working. In this example we were trying to model an unfair die and clearly the first probability measure was not matching what we were observing in reality. Exchanging that measure for the second one allowed us to match reality much more accurately.
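The die example can be sketched in a few lines. The rigged probabilities below are hypothetical (the figures above do not give exact values); the point is only that the expectation under the rigged measure can be obtained by re-weighting the fair one, face by face.

```python
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]

# Fair measure: every face has probability 1/6.
fair = {face: Fraction(1, 6) for face in faces}

# Hypothetical rigged measure: a 6 turns up half of the time.
rigged = {face: Fraction(1, 10) for face in faces[:-1]}
rigged[6] = Fraction(1, 2)

def expectation(measure, f=lambda x: x):
    """Expected value of f(face) under the given probability measure."""
    return sum(f(face) * p for face, p in measure.items())

# Re-weighting factor that turns the fair measure into the rigged one.
weight = {face: rigged[face] / fair[face] for face in faces}

mean_fair = expectation(fair)
mean_rigged = expectation(rigged)
mean_reweighted = expectation(fair, lambda x: x * weight[x])

print(mean_fair, mean_rigged, mean_reweighted)
```

Both measures assign non-zero probability to the same six faces, which is what makes the re-weighting factor well defined.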

We do the same thing in quantitative finance. The probability measure that comes with the BSM model is the so-called real-world probability measure \mathbb{P}; but, in a world where all we care about is what things look like under continuous, risk-free discounting, experiments do not match what the real-world measure predicts. So, we exchange that measure for the so-called risk-neutral measure \mathbb{Q} (also referred to as the Equivalent Martingale Measure) and re-run the experiment. You could think of the risk-neutral measure as a rigged real-world probability measure that renders the price of an asset to be exactly equal to its risk-free, continuously discounted expectation.

But you see, we do not state such changes in probability measure without also stating what our reason for changing is. In the die example, we had an unfair die, so the re-weighting of the measure was due to that. For the risk-neutral measure \mathbb{Q}, similarly, we used the risk-free, continuously compounded interest rate (which is physically a Money Market account) as justification for the change.

We do not have to use the Money Market account either. We can have experiments in quantitative finance where some other interaction with some other financial asset dictates a change in measure. But only some assets work and make sense for this. We call these assets numéraires, and they are very useful for changing the probability measure to suit experimental evidence. But more on that later.

Why Do We Want Martingales?

Martingales are useful.

An important feature of martingales is Doob’s Optional Stopping Theorem which says that, under suitable conditions on the stopping time (for example, that it is bounded), the expectation of a martingale is constant in time, even if we randomly stop it.

An important example of a martingale is Brownian motion (geometric Brownian motion with non-zero drift is not!).

Martingales have their own convergence theorems, which are really only useful in proving subtle, technical details outside the scope of this article.

Another nice technical feature of martingales is that they are decomposable.

Technical qualities aside, they are also highly practical. Indeed, in quantitative finance, martingales are absolutely crucial for pricing. Here’s why.

Crucial for Pricing

The Fundamental Theorem of Asset Pricing (FTAP) is a mathematical theorem that provides the necessary and sufficient conditions for a financial market (an environment where things are traded at varying prices between a large number of people) to be arbitrage-free and complete (which is a fancy way of saying: “Hey! No risk-less profit, no transactional costs, perfect information, and there is a price for everything!”).

While this sounds like a lot of fluff, all these economic statements can and have been boiled down to a simple statement about the existence of a sacred, mysterious probability measure, called the risk-neutral measure, often denoted by \mathbb{Q}.

Could this be what we will transform \mathbb{P} into to achieve driftless behaviour? Actually, yes!

A corollary of the Fundamental Theorem of Asset Pricing then says that if one happens to have a complete, arbitrage-free market, then any derivative’s price is equal to the discounted expected value of future cashflows (payoffs) under the risk-neutral measure. The existence of this risk-neutral measure is therefore a direct result of the no-arbitrage claim.
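As a rough illustration of "price equals discounted expected payoff under the risk-neutral measure", the sketch below prices a European call by Monte Carlo and compares it with the Black-Scholes closed form. The call payoff, the parameter values and the function names are illustrative assumptions; the sampling also assumes that, in this basic model, working under \mathbb{Q} amounts to simulating the asset with drift r and discounting at the same single rate r.

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def risk_neutral_mc_call(s0, k, r, sigma, t, n_paths=200_000, seed=0):
    """Discounted average payoff, sampling S_t with drift r (assumed Q dynamics)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        w_t = rng.gauss(0.0, math.sqrt(t))
        s_t = s0 * math.exp((r - 0.5 * sigma**2) * t + sigma * w_t)
        total += max(s_t - k, 0.0)  # call payoff at expiry
    return math.exp(-r * t) * total / n_paths

# Illustrative (hypothetical) contract and market parameters.
closed_form = black_scholes_call(100.0, 100.0, 0.05, 0.20, 1.0)
mc_estimate = risk_neutral_mc_call(100.0, 100.0, 0.05, 0.20, 1.0)
print(closed_form, mc_estimate)  # the two should agree up to sampling noise
```

Note that the real-world drift \mu never appears in either function: only r and \sigma matter once we price under the risk-neutral measure.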

Another name of the risk-neutral measure \mathbb{Q} is the Equivalent Martingale Measure (or EMM).

If we believe in the FTAP then we believe that there is only ever one, unique EMM. This implies that there is only ever one, unique price for each asset in the market. And this is a very important belief, because we surely can’t have two different prices for the same thing!

We need the FTAP to believe that there is a unique price for each asset. But to realise the unique price, we need to re-weight all of our real-world probabilities into risk-neutral ones. Then (and only then) do our asset prices start behaving like driftless martingales.

If there is no EMM then this is equivalent to the existence of arbitrage opportunities.

And, as you can see, the solution was not as simple as setting \mu = 0 and hoping for the best. We actually had to deal with a lot of practical, economic consequences for our change. Unfortunately, as we will see in the next section, there is even more to be careful of when performing this change.

So How Do We Get a Martingale?

OK, so Martingales sound pretty awesome, and they are! But how do we get one?

Well, we’ve hinted that it has something to do with changing the probability measure. This change has the power to make the drift in the FX Rate stochastic process vanish. But, unfortunately, this change also affects the very properties of the Wiener process driving the thing. So we must proceed with caution!

The challenge is to not only find a new, equivalent probability measure but also a new random Wiener process to go with it. This new random process needs to be suitably random under the EMM as well; we can’t just use the old one.

So we need to transform \mathbb{P} to some EMM \mathbb{Q} (which we don’t know) and then to transform our Wiener process W into some other Wiener process \widetilde{W} that’s a martingale under \mathbb{Q}.

Ugh! Sounds tough.

Well, whatever the new random process \widetilde{W} is, it had better render the FX dynamics driftless. So let’s see if we can naively achieve this by directly substituting:

\displaystyle \widetilde{W}_t := W_t + \left( \frac{\mu - r}{\sigma} \right)t

If we substitute this in, we get

\displaystyle S_t = S_0 \exp\left[\left( \mu - \frac{1}{2}\sigma^2 \right)t + \sigma\left( \widetilde{W}_t - \left( \frac{\mu - r}{\sigma} \right)t \right) \right]

\displaystyle  \text{...} = S_0 \exp\left[\left( \mu - \frac{1}{2}\sigma^2 \right)t + \sigma \widetilde{W}_t - \left( \frac{\mu - r}{\sigma} \right)\sigma t \right]

\displaystyle  \text{...} = S_0 \exp\left[\left( \mu - \frac{1}{2}\sigma^2 \right)t + \sigma \widetilde{W}_t - \left( \mu - r \right) t \right]

\displaystyle  \text{...} = S_0 \exp\left[\left( r - \frac{1}{2}\sigma^2 \right)t + \sigma \widetilde{W}_t \right]

…and, darn! We still have a drift term. Further, we have only succeeded in replacing the \mu term with the r term. No good.

Let’s try something else.

Let us try to transform the FX Rate S_t to the FX Rate divided by e^{rt}.

In other words, let’s define \widetilde{S}_t := e^{-rt}S_t, so that wherever we see an S_t, we replace it with e^{rt}\widetilde{S}_t. This trick is called discounting at the risk-free rate. We will see many, many examples of this trick in other applications. Primarily, it is a trick used to prepare a risky asset with a risk-free asset before application of Itô’s lemma for derivatives pricing. But we’ll get to that in a later post.

Then we get:

\displaystyle S_t = S_0 \exp\left[\left( \mu - \frac{1}{2}\sigma^2 \right)t + \sigma\left( \widetilde{W}_t - \left( \frac{\mu - r}{\sigma} \right)t \right) \right]

\displaystyle e^{rt}\widetilde{S}_t = S_0 \exp\left[\left( \mu - \frac{1}{2}\sigma^2 \right)t + \sigma\left( \widetilde{W}_t - \left( \frac{\mu - r}{\sigma} \right)t \right) \right]

\displaystyle \widetilde{S}_t = S_0 \exp\left[\left( \mu - r - \frac{1}{2}\sigma^2 \right)t + \sigma\left( \widetilde{W}_t - \left( \frac{\mu - r}{\sigma} \right)t \right) \right]

But now we need to get back to the representation of this as d\widetilde{S}_t. For that we need Ito’s Lemma.

Define f:=e^{-rt}S_t. Then \partial_t f = \partial_t (e^{-rt}S_t) = -re^{-rt}S_t, \partial_S f = \partial_S (e^{-rt}S_t) = e^{-rt}, and \partial_{SS} f = \partial_{SS} (e^{-rt}S_t) = \partial_S e^{-rt} = 0. Then, Itô’s Formula states that

\displaystyle df = \partial_t f dt + \partial_S f dS_t + \frac{1}{2}\partial_{SS}f(dS_t)^2

and so we have:

\displaystyle df := d\widetilde{S}_t
\displaystyle df = d\left(e^{-rt}S_t\right)

Ito’s Lemma:

\displaystyle df = -re^{-rt}S_t dt + e^{-rt} dS_t + \frac{1}{2} \cdot 0 \cdot (dS_t)^2
\displaystyle df = -re^{-rt}S_t dt + e^{-rt} dS_t

Substituting in our known expression for dS_t gives:

\displaystyle df = -re^{-rt}S_t dt + e^{-rt} \left( \mu S_t dt + \sigma S_t dW_t \right)
\displaystyle df = \left( -re^{-rt}S_t + \mu e^{-rt} S_t\right) dt + e^{-rt} \sigma S_t dW_t
\displaystyle df = \left( \mu - r \right) e^{-rt}S_t dt + e^{-rt} \sigma S_t dW_t

…and we get:

\displaystyle d\widetilde{S}_t = \left( \mu - r \right) \widetilde{S}_t dt + \sigma \widetilde{S}_t dW_t

Oh no! There is still that pesky drift term in there!

So, two attempts at naïve transformation have not achieved the aim of removing the drift term.

However, it is clear from the last line above that the process \widetilde{S}_t has trend (\mu - r)\widetilde{S}_t. This trend causes \widetilde{S}_t not to be a \mathbb{P}-martingale.
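A quick numerical sanity check of this trend, as a sketch with illustrative (hypothetical) parameters: if we sample W_t under the real-world measure \mathbb{P}, the average of the discounted rate e^{-rt}S_t should grow like S_0 e^{(\mu - r)t} rather than stay at S_0.

```python
import math
import random

def discounted_mean_under_p(s0, mu, r, sigma, t, n_paths=100_000, seed=1):
    """Sample mean of e^{-rt} S_t with W_t drawn under the real-world measure P."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        w_t = rng.gauss(0.0, math.sqrt(t))
        s_t = s0 * math.exp((mu - 0.5 * sigma**2) * t + sigma * w_t)
        total += math.exp(-r * t) * s_t  # discount at the risk-free rate
    return total / n_paths

# Illustrative (hypothetical) parameters.
s0, mu, r, sigma = 1.25, 0.08, 0.03, 0.10
estimates = {}
for t in (0.5, 1.0, 2.0):
    estimates[t] = discounted_mean_under_p(s0, mu, r, sigma, t)
    print(t, estimates[t], s0 * math.exp((mu - r) * t))
```

The longer the horizon, the further the average moves away from S_0, so discounting alone does not give us a martingale under \mathbb{P}.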

So what if we try the above transformations together?

Starting with our last expression:

\displaystyle d\widetilde{S}_t = \left( \mu - r \right) \widetilde{S}_t dt + \sigma \widetilde{S}_t dW_t

\displaystyle \frac{d\widetilde{S}_t}{\widetilde{S}_t} = \left( \mu - r \right)dt + \sigma dW_t

\displaystyle \frac{d\widetilde{S}_t}{\widetilde{S}_t} = \left( \mu - r \right)dt + \sigma \left( d\widetilde{W}_t - \frac{\mu - r}{\sigma}dt \right)

\displaystyle \frac{d\widetilde{S}_t}{\widetilde{S}_t} = \left( \mu - r \right)dt + \sigma d\widetilde{W}_t - \left(\mu - r\right)dt

\displaystyle \frac{d\widetilde{S}_t}{\widetilde{S}_t} = \sigma d\widetilde{W}_t

Awesome! No drift term.

Our wild guess at

\displaystyle \widetilde{W}_t := W_t + \left( \frac{\mu - r}{\sigma} \right)t

worked perfectly for the process \widetilde{S}_t, but not for S_t.
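Here is a hedged numerical sketch of what "re-weighting the probabilities" does to the discounted process. It samples W_t under \mathbb{P} and then weights each sample by Z_t = \exp(-\theta W_t - \frac{1}{2}\theta^2 t) with \theta = (\mu - r)/\sigma. That weight is the standard Girsanov density for the shift used above; it is not derived in this post, so treat it here as an assumption. The parameter values and names are illustrative.

```python
import math
import random

def discounted_means(s0, mu, r, sigma, t, n_paths=200_000, seed=2):
    """Return (plain P-average, re-weighted average) of the discounted rate."""
    theta = (mu - r) / sigma  # size of the shift between W and W-tilde
    rng = random.Random(seed)
    plain = 0.0
    reweighted = 0.0
    for _ in range(n_paths):
        w_t = rng.gauss(0.0, math.sqrt(t))  # W_t sampled under P
        s_tilde = s0 * math.exp((mu - r - 0.5 * sigma**2) * t + sigma * w_t)
        z_t = math.exp(-theta * w_t - 0.5 * theta**2 * t)  # assumed density dQ/dP
        plain += s_tilde
        reweighted += z_t * s_tilde
    return plain / n_paths, reweighted / n_paths

# Illustrative (hypothetical) parameters.
s0, mu, r, sigma, t = 1.25, 0.08, 0.03, 0.10, 1.0
plain_mean, reweighted_mean = discounted_means(s0, mu, r, sigma, t)
print(plain_mean, s0 * math.exp((mu - r) * t))  # still drifting under P
print(reweighted_mean, s0)                      # back at S_0 after re-weighting
```

The same samples give a drifting average under the original weights and a constant one under the new weights, which is the die-rigging idea applied to the FX rate.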


The reason it worked for the tilde process \widetilde{S}_t has everything to do with the thing that we divided S_t by.

Recall that to get \widetilde{S}_t we took our original FX Rate process S_t and divided it by e^{rt}.

Well, it just so happens that e^{rt} is also the value of a Money Market account: one unit of currency, continuously compounded at the risk-free rate. (Its reciprocal, e^{-rt}, is the discounted value of a unit of currency, i.e. the price of a zero-coupon bond.) So what we did, to get the transformation to work, was to divide the FX Rate by the Money Market account. We essentially normalised all the FX Rates by the Money Market account. Then the re-weighting of probabilities by \widetilde{W}_t made the whole thing driftless.

But why did this work? What is so special about normalising asset prices with a Money Market account?


What we did above was not only wild guessing, but was essentially a transformation of the FX Rate prices into a new FX Rate price without introducing any new form of risk (i.e. randomness).

If your price process is strictly positive and completely deterministic (no randomness) then it can always be used as a numéraire, i.e. it can be used to normalise the main, risky price process.

Examples of numéraires are:

  1. Money Market Account,
  2. Currency Exchange Rates,
  3. The Forward Numeraire (zero-coupon bond),
  4. Annuities

The numéraire itself can be discounted

\displaystyle M_t := e^{-rt} N_t

and will also be a martingale under the transformed probability measure \mathbb{P}^*.

Guessing for Days…

Is there a better way of finding the transformation

\displaystyle \widetilde{W}_t := W_t + \left( \frac{\mu - r}{\sigma} \right)t

that works, rather than wild guessing?

Yes. It’s called Girsanov’s Theorem and we will cover it in the next blog post, where we will show how we finally get and use a martingale, why numéraires work for us, and then explain what we set out to do: pricing in the foreign exchange swap market.

But for now, let me wrap up with a quick summary of what we really do in practice (instead of guessing):

  1. Pick a suitable numéraire N_t,
  2. Make sure that the ratio \widetilde{S}_t := \frac{S_t}{N_t} takes a sufficiently simple form,
  3. Use Girsanov’s Theorem to determine the dynamics of \widetilde{S}_t under the transformed measure \mathbb{P}^*.

We will see exactly how to perform Step 3 in the next blog post.