What is a quantum oracle? And why do we need them?

In this blog we will attempt to de-mystify them by outlining the mathematics involved, and why certain assumptions are made.

At the highest level, a quantum oracle is two things. The first is fairly obvious when you think that most of quantum algorithms is linear algebra – and that is, that most of the calculations are matrix multiplications. We already know that matrix multiplication is a little weird. There’s this strange relationship between multiplying vectors by matrices and rotations in the Euclidean plane. So there’s that to contend with, but they’re also linear transformations – i.e. they are machines that take in vectors and output vectors, but visually this just corresponds to distorting the Euclidean plane so that its grid lines remain parallel and evenly spaced (and the origin remains fixed). To get a good intuitive feel for this, I recommend watching 3Blue1Brown’s video on this subject (see Reference 1 below), as this is how a lot of quantum algorithms work below the surface.

The second thing about oracles is that they can be thought of as quantum switch statements. That is, depending on an input (e.g. some prediction about a quantum state), the oracle will automatically run a particular operation depending on that input’s value is. For example, if we predict heads for a coin flip, the the oracle will automatically run, say, the Identity operation. Or if we preduct tails, then it will automatically run some \pi-radian y-axis rotation operation; or something akin to this, depending on the specific problem.

Just like with all switch statements, the programmer must code it! This means that whenever one uses an oracle in a quantum algorithm, one has already figured out what switch statement to use; usually by doing a bunch of linear algebra beforehand. Then, using oracles should come as no surprise because we have already figured out how to map the resultant 0 and 1 output states to the inputs (or predictions). E.g. we already know that a measurement outcome of 0 identifies with the input prediction ‘Heads‘, and a measurement outcome of 1 identifies with the input preduction ‘Tails‘.

The closest classical concept I could think of is the switch statement that under the Even prediction input, the switch statement runs 2 \times x, and under the Odd prediction input, the switch statement runs 2 \times x - 1. This is certainly not anything magical, and is in fact a useful subroutine! It always outputs an even number if the user makes the Even preduction, and the same for an Odd prediction. The tough part is coming up with the equations 2x and 2x+1, which, in general quantum algorithms, are much more tricky!

Designing oracles to work for particular quantum experiments is not as easy as the classical one just mentioned. There are several rules which oracles must obey. The most important one is that it must preserve probabilities. Just as in the classical Even/Odd example, the two equations we chose for our classical oracle covered all outcomes with a uniform probability. But we are not limited to the natural numbers, nor are we limited to uniform probability distributions when it comes to quantum algorithms!

Furthermore, a quantum oracle must be able to distinguish encoded input values without affecting any probabilities. We will now see how this is done.

One final thing to note: Since there is this notion of mapping of input ‘predictions’ to output basis states there is an inherent assumption that the question to be asked of oracles is to be a boolean one: like up/down, heads/tails, true/false, pass/fail, accept/reject, etc… which matches the dimension of the qubit, in this case 2.

The Oracle Protocol

Not every configuration of quantum gates will create a oracle. They have to be configured and tuned to solve the problem at hand. So let us introduce a problem and attempt to configure and tune an oracle to it.

The problem will be the one of predicting the outcome of flipping a (quantum) coin. I.e. we flip a q-coin in to the air, make a prediction (heads OR tails), then observe the outcome once the coin has landed and settled on an orientation. The coin is a quantum coin because while it is flipping it can be in a superposition of Heads and Tails (a classical coin never is, because given knowledge of all physical parameters of the state space it always persists as either Heads or Tails right the way through the flipping process until it lands).

Note immediately that this problem is almost already formatted for use in a quantum algorithm! I assure you, most problems are not this amenable!

First, the initial flipping of the coin is synonymous with applying a Hadamard gate H to a qubit in the |0\rangle state (a matrix-vector multiplication). Secondly, making the prediction whilst the coin is in a superposition of heads/tails states is equivalent to applying the actual oracle, whatever that means (discussed next!). Finally, having the coin settle in to either the heads or tails state is synonymous with collapsing the superposition, which is, of course, done by re-applying the Hadamard gate.

Further, the oracle acts as a conduit that transforms boolean inputs (prediction heads OR tails) in to measurable outputs, so we need to keep in mind that we will need to form a mapping between heads/tails and 0/1 – assuming, of course, our measurements will be done in the computational basis – i.e. the basis which provides real, useful numbers for the abstract states |0\rangle and |1\rangle just as any basis does!

Right, so we need a legitimate switch statement (a.k.a. oracle) that maps the 2-element set \left\{\textup{Heads},\textup{Tails}\right\} to the 2-element set \{0,1\}. And this must be done without altering probabilities (i.e. the squares of the amplitudes must remain identical).

Luckily, since it is the squares of the amplitudes that are to stay the same for each basis state, we are free to add or remove negative signs wherever we please! (because -n^2 = n^2 for all choices of n \in \mathbb{N}). This is our first trick. Our second trick manifests inside matrix multiplication as we shall now see.

Matrix Multiplication Tricks

With the goal of mapping 2-element sets to 2-element sets, without changing probabilities, and only using linear transformations, we seek two different matrix multiplications that preserves probabilities but whose results can be distinguished by the same operation.

First, let us flip the q-coin in to the air. This is equivalent to applying a Hadamard to a 0-qubit:

\displaystyle H |0\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}}\left(\begin{pmatrix}1 \\ 0\end{pmatrix} + \begin{pmatrix}0 \\ 1\end{pmatrix}\right) = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) =: |+\rangle

since

\displaystyle \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} =\begin{pmatrix}1 + 0 \\ 1 - 0\end{pmatrix} = \begin{pmatrix}1\\1\end{pmatrix} = \begin{pmatrix}1\\0\end{pmatrix} + \begin{pmatrix}0\\1\end{pmatrix}

These are all just matrix-vector multiplications, so where is the big trick?

To motivate the trick, notice the probability amplitudes in the |-\rangle state. The amplitude of the |0\rangle state is 1/\sqrt{2}, and it’s the same for the |1\rangle state. The two states are equally likely to be observed when measured, and so they are in an equal superposition.

The probability of either state being observed is the absolute-value-square of this number, i.e. 50\% apiece. Our oracle cannot change the probability, but it is allowed to change the amplitudes in ways that taking the absolute-value and squaring it, makes no difference.

So we can actually introduce negative signs in to the amplitude, and in to our equations because the negative signs cancel out when taking the absolute-value-squared.

This means that if we could multiply the superposition vector |-\rangle := (\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}})^{\top} by a matrix with a negative element, then we could theoretically introduce a negative sign. Let’s try one.

Here, I’m going to pick the following matrix:

\displaystyle O := \begin{bmatrix}0 & 1 \\ -1 & 0\end{bmatrix}

and neglect to mention, just for a moment, where it comes from, and see what happens when I multiply:

\displaystyle O|+\rangle = \begin{bmatrix}0 & 1 \\ -1 & 0\end{bmatrix}\begin{pmatrix}\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}}\end{pmatrix} = \begin{pmatrix}\frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}}\end{pmatrix} = \frac{1}{\sqrt{2}}\left(\begin{pmatrix} 1 \\ -1\end{pmatrix}\right) = \frac{1}{\sqrt{2}}\left(\begin{pmatrix}1\\0\end{pmatrix}-\begin{pmatrix}0\\1\end{pmatrix}\right) = \frac{1}{\sqrt{2}}\left(|0\rangle - |1\rangle\right) = |-\rangle

We have successfully introduced a negative sign in to the amplitude of the |1\rangle state without altering their absolute-value-squared values.

Note that I can think of yet another matrix which achieves the same property of preserving the probabilities: the identity matrix but without introducing a negative sign. Observe the following calculation

\displaystyle I|+\rangle = \begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix}\begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}}\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 + 0 \\ 0 + 1\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}1\\1\end{pmatrix} = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) = |+\rangle

…and the absolute-value-squared coefficients of each \{|0\rangle, |1\rangle\} are identical (even though the quantum states \{|+\rangle, |-\rangle\} are different. The point is that we have not altered the probabilities!

Thus, whilst the coin is flipping in mid air, measuring either O|+\rangle or I|+\rangle will yield a set of outcomes with the same probabilities. This is clearly on the right track but not what we need because at the moment our prediction will only match observed outcomes 50\% of the time!

Also, where did this matrix O come from?

Configuring and Tuning an Oracle

Ah, well it wasn’t exactly a pure guess. This is hard part of configuring and tuning an oracle. You see, there are only a certain type of matrix we could use. First, they need to be unitary w.r.t. some basis, and their dimension is linked to the number of qubits in play: I.e. the unitary matrix has to have dimension 2^n \times 2^n for n qubits. That’s a pretty big matrix for not many qubits!

Since we are only dealing with one qubit, the matrix I’m after to complete this operation is of size 2 \times 2. The identity matrix I is always allowed, and so are the Pauli matrices X, Y and Z. Controlled NOT matrices are also allowed (for 2 or more qubits, so not applicable here), and so are Phase Shift matrices. You can read up on all the allowed unitary matrices here.

For us though, in this particular problem, we used a Rotation Matrix, whose general form is,

\displaystyle \textsf{Rot}_y(\theta) := \begin{bmatrix}\cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2}\end{bmatrix}

but we simply chose \theta = -\pi. Choosing this particular value for \theta is called tuning. We have thus reduced the y-rotation matrix to be precisely the one we need to introduce that negative sign in just the right place:

\displaystyle \textsf{Rot}_y\left(-\pi\right) = \begin{bmatrix} 0 & 1 \\ -1 & 0\end{bmatrix}

So, for this specific quantum coin flip problem, a y-axis clockwise rotation by 90^{\circ} exactly complements the identity matrix to give a one-to-one mapping between predictions of Heads and Tails to quantum measurement outcomes of |0\rangle and |1\rangle.

Collapsing the State

We have just witnessed two different actions one can perform on a qubit in superposition that preserves probabilities but creates two fundamentally different, but physically identical, quantum states. Of course, to do this, we needed to form a concrete problem statement first (and for us, that was our quantum coin flip with a prediction step). This gave us a bifurcation in the quantum subroutine that was different enough (but not too different as to destroy probabilities) to be able to concretely map a boolean prediction type on to each one.

The task now is to extract this information.

So, to recap: we perform a coin toss (applying a Hadamard and putting the coin in to a superposition), then we make a prediction of Heads or Tails (applying our bespoke Oracle switch statement). But we still have a superposition while it is flipping in the air. Conventionally, to undo a superposition we apply the inverse Hadamard, which is just the Hadamard as it is its own inverse.

And now something truly quantum happens…

Remember how we said that the flipped coin and the oracle composed together is a superposition of Heads and Tails? And remember how we showed that the probability of coming up Heads or Tails is still 50\%? Because we did not alter the probabilities (only the amplitudes)? Well, classically, we would be doomed because classically there is no operation, or logic gate, that can discern any difference here. That negative sign is not something that any classical logic gate is sensitive to.

However, in the quantum regime, the Hadamard gate is sensitive to that negative sign!!

Just look at how applying a Hadamard to these two (classically identical) states results in two classically different outputs:

\displaystyle H(|+\rangle) = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 + 1 \\ 1 - 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = |0\rangle

\displaystyle H(|-\rangle) = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\cdot\frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 - 1 \\ 1 + 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = |1\rangle

The inverse-Hadamard (which represents the physical landing of the quantum coin either heads or tails up on the table) is sensitive enough to that negative phase, unlike any classical logic gate could be.

Our prediction of Heads or Tails can thus be mapped, via a quantum switch-like statement, to either one of these outputs. For example, if our quantum algorithm sees that we have made a Heads prediction, it simply switches to apply the Identity unitary operator. If it sees that we have made a Tails prediction, it simply switches to apply the 90^{\circ}, y-axis rotation unitary operator. When the quantum coin lands, either way, it will always land the Predicted side up!

Conclusion

To see how an Oracle is like a quantum switch statement, please take a look at Frank Zickert‘s The Quantum Oracle Demystified Medium post. I think his Python code clearly demonstrates this switch-like behaviour of a quantum oracle.

All the tricks mentioned above in this blog involve matrix multiplication and trigonometric relationships. The only quantum-like object is the Hadamard logic gate, which has no classical counterpart, and only works on a primitive object with at least two degrees of freedoms, which classical bits certainly do not have!

References

  1. 3Blue1Brown, Matrix multiplication as compositionhttps://www.youtube.com/watch?v=XkY2DOUCWMU.
  2. Frank Zickert‘s The Quantum Oracle Demystified Medium (2020).
  3. Amit Nikhade‘s Rotating the qubit – Medium (2021).
  4. Qubit Rotation: https://pennylane.ai/qml/demos/tutorial_qubit_rotation.html Pennylane (2021).
  5. Qiskit Single Qubit Gates: https://qiskit.org/textbook/ch-states/single-qubit-gates.html Qiskit.
  6. The \textsf{Rot}_y(\theta) gate – https://qiskit.org/documentation/stubs/qiskit.circuit.library.RYGate.html Qiskit.