Superconducting qubits – on islands, charge qubits and the transmon

In my previous post on superconducting qubits, we have seen how a flux qubit represents a qubits state as a superposition of currents in a superconducting loop. Even though flux qubits have been implemented and used successfully, most research groups today focus on different types of qubits using a charge qubit as an archetype.

Charge qubits – the basics

The basic idea of a superconducting charge qubit is to create a small superconducting area called the island which is connected to a circuit in such a way that we can control the number of charge carries that are located on the island. A typical way to do this is shown in the diagram below.


In this diagram, we see, in the upper left, a Josephson junction, indicated by the small cross. Recall that a Josephson junction consists of two superconducting electrodes separated by a thin insulator. Thus a Josephson junction has a capacity, which is indicated by the capacitor C in the diagram.

On the right of the Josephson junction is a second capacitor. Now charge carriers, i.e. Cooper pairs, can tunnel through the Josephson junction and reach the area between the second capacitor and the junction – this is our island. Conversely, Cooper pairs can cross the junction to leave the island again. The flow of Cooper pairs into the island and away from the island can be controlled by applying an external voltage Vg. Effectively, a certain number of Cooper pairs will be trapped on the island – this is why this circuit is sometimes called a Cooper pair box – but this number will be subject to quantum fluctuations due to tunneling through the junction. Roughly speaking, these fluctuations cause an oscillation which will give us energy levels, and we can use two of these energy levels to represent our qubit.

Let us try to understand how these energy levels look like. Again, I will try to keep things short and refer to my more detailed notes on GitHub for more details. The Hamiltonian for our system looks as follows.

H = E_C[N - N_g] - E_0 \cos \delta

Here, Eg and E0 are energies that are determined by the geometry of the junction and the value of the capacities in the circuit, N is the number of Cooper pairs on the island, Ng is a number depending on the external voltage and \delta is (proportional to) the flux through the junction. Thus N and \delta are our dynamic variables, representing the number of Cooper pairs on the island and its change over time, whereas the other quantities are parameters determined by the circuit and the external voltage. As Ng depends on the external voltage and can therefore be changed easily, this is the parameter we will be using to tweak our qubit.

Using a computer algebra program (or this Python notebook), it is not difficult to obtain a numerical approximation of the eigenvalues of this operator (we can, for instance, represent the Hamiltonian as a matrix subject to an eigenbasis for N and calculate its eigenvalues after cutting off at a finite dimension). The diagram below shows the results for E0 = 0.1 Ec.


We see that if Ng is an integer, there is a degeneracy between the first and second excited state. If, however, Ng is a half-integer, then this degeneracy is removed, and the first two energy levels are fairly well separated from the rest of the spectrum.

With this relation of the two energies EC and E0, the probability for a Cooper pair to tunnel through the Josephson junction is comparatively small. Thus, the fluctuations in the quantum number N are small, and the eigenstates of N are almost stationary and therefore almost energy eigenstates. As the energy levels are well separated, we can use the first two energy levels as a qubit. As the eigenstates of the operator representing the charge on the island are almost stationary states, this regime is called the charge regime and the resulting qubit is called the charge qubit.

The transmon

This is all nice, but in practice, there is still a problem. To understand this, let us take a look at the energy level diagram above again. The energy levels are not flat, meaning that a change in the value of Ng is changing the energy levels and therefore the stationary states. Unfortunately, the value of N g does not only depend on the external voltage Vg (which we can control), but also on charge noise, i.e. unwanted charge fluctuations that are hard to suppress.

Therefore, the charge qubit is quite sensitive to charge noise. The point Ng = 0.5, called the sweet spot, is typically chosen because at this point, at least the first derivative of the energy as a function of the charge vanishes, so that the qubit is only affected by charge noise to the second order. However, this dependency remains and limits the coherence time of the charge qubit.

One way to reduce the sensitivity to charge noise is to increase the ratio between E0 and EC . To understand what happens if we do this, take a look at the following diagram which displays the first few energy levels when this ratio is 5.0


We see that, compared to our first energy level diagram, the sensitivity to charge noise is reduced, as the energy of the first two energy levels is almost flat, with a minimal dependency on Ng . However, this comes at the cost of a more equidistant spacing of the energy levels, making the isolation between the first two energy levels and the rest of the spectrum hard, and the question arises whether we gain an advantage at the cost of another one.

Fortunately, it turns out that the sensitivity to charge noise decreases exponentially fast, but the anharmonicity of the energy levels decreases much slower. Thus there is a region for the ratio E0 / Ec in which this sensitivity is already comparatively low, but the energy levels are still sufficiently anharmonic to obtain a reasonable two-level system. A charge qubit operated in this regime is called a transmon qubit.

Technically, a larger value of E0 compared to EC is achieved by adding an additional capacitor parallel to the Josephson junction. If we use a high value for the additional capacity, we can make EC arbitrarily small and can achieve an arbitrary high ratio of E0 compared to EC.

To develop a physical intuition of why this happens, recall that the energy E0 of the Josephson junction measures the tunneling probability. If E0 is large compared to EC , it is comparatively easy for a Cooper pair to tunnel through the junction. Therefore the phase difference \delta across the junction will only fluctuate slightly around a stationary value, i.e. the wave function will be localized sharply in the \delta-space. Consequently, the charge N will no longer be a good quantum number and the charge eigenstates will no longer be approximate energy eigenstates. Instead, we will see significant quantum fluctuations in the charge, which makes the system more robust to external charge noise. In this configuration, you should therefore think of the qubit state not as a fixed number of Cooper pairs on the island, but more as a constant tunneling current flowing through the junction.

To control and read out a transmon qubit, it is common to use a parallel LC circuit which is coupled with the transmon via an additional capacitor. Using microwave pulses to create currents in that LC circuit, we can manipulate and measure the state of the qubit and couple different qubits. Physically, the LC circuit is realized as a transmission line resonator, in which – similar to an organ pipe – waves are reflected at both ends and create standing wave patterns (that transmission lines are used is the reason for the name transmon qubit).

At the time of writing, most major players (Google, IBM, Rigetti) are experimenting with transmon based qubit designs, as it appears that this type of qubit is most likely to be realizable at scale. In fact, Transmon qubits are the basic building blocks of Google’s Bristlecone architecture as well as for IBMs Q experience and Rigettis QPU.

To learn more, I recommend this and this review paper, both of which are freely available on the arXiv.

Superconducting qubits – the flux qubit

In the last post, we have discussed the basic idea of superconducting qubits – implement circuits in which a supercurrent flows that can be described by a quantum mechanical wave function, and use two energy levels of the resulting quantum system as a qubit. Today, we will look in some more detail into one possible realization of this idea – flux qubits.

In its simplest form, a flux qubit is a superconducting loop threaded by an external magnetic field and interrupted by a Josephson junction. This is visualized on the left hand side of the diagram below, while the right hand side shows an equivalent circuit, formed by an inductance L, the capacity CJ of the junction and the pure junction.


It is not too difficult to describe this system in the language of quantum mechanics. Essentially, its state at a given point in time can be described by two quantities – the charge stored in the capacitor formed by the two leads of the Josephson junction, and the magnetic flow (or flux) through the loop. If you carefully go through all the details and write down the resulting equation for the Hamiltonian, you will find that the classical equivalent of this system is a particle moving in a potential which, for an appropriate choice of the parameters of the circuit, looks like the one displayed below.


This potential has a form which physicists call a double-well potential. Let us discuss the behavior of a particle moving in such a potential qualitatively. Classically, the particle would eventually settle down at one the minima. As long as its energy is below the height of the potential separating these two minima, it would remain in that state. In the quantum world, however, we would expect tunneling to occur, i.e. we would expect that our system has two basic states, one corresponding to the particle being close the left minimum and one corresponding to the particle being right to the minimum, and that we see a certain non-zero probability for the particle to cross the potential wall and to flip from one state into the other state. This two-state system already looks like a good candidate for a qubit.

Being a one-dimensional system, the eigenstate wave functions can be approximated numerically using standard methods, for example the “particle-in-a-box” approach. Again, I refer to my detailed notes for the actual calculation. The result is displayed below.


The diagram shows the ground state (blue curve on the left hand side) and the first excited state (blue curve on the right hand side). In both diagrams, I have added the classical potential (orange line) for the purpose of comparison. So we actually obtain the picture that we expect. Up to normalization, the ground state – the eigenstate on the left – is a superposition of two states

|l \rangle - | r \rangle

where |l \rangle is a state localized around the left minimum of the potential and |r \rangle is a state centered around the right minimum, whereas the first excited state is a linear combination proportional to

|l \rangle + | r \rangle

In general, there will be a small energy difference between these two states, which leads to a non-zero probability for tunneling between them. Intuitively, the state |l \rangle corresponds to a supercurrent that flows through the loop in one direction and the state |r \rangle is a state in which the supercurrent flows in the opposite direction. Our ground state – which would be the state |0 \rangle in an interpretation as a qubit – and our first excited state |1 \rangle are superpositions of these two states.

A nice property of the flux qubit is that the energy gap between the first excited state and the second excited state is much higher than the energy gap between the ground state and the first excited state. In the example used as basis for the numerical simulations described here, the second gap is more than one order of magnitude higher than the first gap. This implies that the two level system spanned by |0 \rangle and |1 \rangle is a fairly well isolated system and thus serves as a very good approximation to a qubit. The Hamiltonian can be manipulated by changing the flux through the loop by applying an external magnetic field or an external microwave pulse can be used to stimulate a transition from the ground state to the first excited state. In this way, the qubit can be read-out and controlled.

So theoretically, this system is a good candidate for the implementation of a qubit. In fact, this has been used in practice – the D-Wave quantum annealer is based on interconnected flux qubits. However, it seems that the flux qubit has come a bit out of fashion, and research has focussed on a new generation of superconducting qubits like the transmon qubit that work slightly differently. We will study this type of qubit in the next post in this series.

Superconducting qubits – an introduction

In some of the last posts in my series on quantum computing, we have discussed how NMR (nuclear magnetic resonance) technology can be used to implement quantum computers. Over the last couple of years, however, a different technology has attracted significantly more interest and invest – superconducting qubits.

What are superconducting qubits? To start with something a bit more familiar, let us look at an analogue in classical electronics – an LC circuit. If you have ever built a transistor-based radio, you will have seen this. An LC circuit consists of an inductor and a capacitor, connected together as follows.


To understand what this circuit is doing, let us assume that at some point in time, the capacitor is fully charged. Then current will start to flow through the inductor. This will create a magnetic field in the inductor, so the electric energy stored in the capacitor is transformed into magnetic energy stored in the magnetic field of the inductor. Once the capacitor is fully decharged, the magnetic field will start to break down, again inducing a current which will then recharge the capacitor and so forth. This follows a pattern that every physicist knows by the name harmonic oscillator.

Now, in quantum mechanics, a harmonic oscillator has an infinite number of energy levels at equally spaced values. Can we use two of them, say the ground state and the first excited state, to build a qubit? Unfortunately there are a few obstacles.

The first problems is that an ordinary current is the coordinated movement of many electrons that constantly interact with the environment (dissipating heat due to the non-zero resistance) and therefore do not make a good closed quantum system. Even though each individual electron is of course a quantum mechanical system, the overall system exhibits classical behavior.

This changes if we use a superconducting LC circuit. As soon as we enter the superconducting regime, the flow of charge is no longer given by individual moving electrons, but by Cooper pairs that flow through the conductor. It is well known that in this state, the system can be described by a macroscopic wave function and is therefore behaving according to the laws of quantum mechanics, leading to phenomena like the quantization of the magnetic flux in a superconducting loop.

Does this mean that we can use a superconducting LC circuit, which then should behave as a quantum mechanical harmonic oscillator, as a qubit? Well, we are not quite there yet – as the energy levels of a harmonic oscillator are equally spaced, the first two levels are not well isolated against the remaining levels which makes it very hard to confine the system to these two levels.

Enter Josephson junctions. A Josephson junction is a superconducting element which consists of two superconducting solids that are separated by a thin insulator, like displayed schematically below.


Due to tunneling, Cooper pairs can cross the junction and thus a superconducting current can flow through the element. It turns out that a Josephson junction behaves like an inductor whose inductance is not constant (as for an ordinary inductor), but depends on the current. This will change the Hamiltonian of the system to the effect that the energy levels become distorted. If we get the parameters right, we are able to obtain a situation where the two lowest energy levels are separated fairly well from the rest of the energy spectrum and therefore the corresponding subsystem can approximately be treated as a two-level system, i.e. a qubit.

Superconducting qubits have gained a lot of attention over the last two years. They can be produced with techniques known from the production of integrated circuit, and they can be controlled with classical electronics. Their dimensions are within the realm of traditional IC technology, being in the micrometer range. The picture below (credits A. ter Haar, PhD thesis) shows a superconducting qubit (a so-called flux qubit) made of an inner loop with three Josephson junctions and an outer loop for readout and control.

picture taken from the PhD thesis of A. ter Haar

In addition, superconducting qubits can easily be layed out on a chip, which makes them good candidates for small, highly integrated quantum computing devices. Of course they need cooling as the need to operate below the critical temperature of the employed superconducting material, but this is standard technology by now. Today, most of the big players in the quantum world like Google, IBM, Rigetti and D-Wave are focusing their efforts on superconducting qubits.

To interact with the qubit (and to make two qubits interact), there are several options. We can apply an external voltage, an external current or couple the system with a microwave cavity acting as a resonator. We also have several choices to combine a Josephson junction with classical circuits – we can add an additional capacitor in series with the Josephson junction (which will give us a qubit called the charge qubit), combine this setup with an additional capacitor parallel to the Junction (transmon qubit) or use a Josephson junction in combination with a classical inductance formed by a superconducting loop (flux qubit). All these circuits have different characteristics and have been used to realize qubits. The flux qubit, for instance, is being used in the D-Wave quantum annealer, whereas Google uses transmon qubits for their devices. We will dive a little bit deeper into each of them in the next posts in this series.

Quantum teleportation

Quantum states are in many ways different from information stored in classical systems – quantum states cannot be cloned and quantum information cannot be erased. However, it turns out that quantum information can be transmitted and replicated by combining a quantum channel and a classical channel – a process known as quantum teleportation.

Bell states

Before we explain the teleportation algorithm, let us quickly recall some definitions. Suppose that we are given a system with two qubits, say A and B. Consider the unitary operator

U = C_{NOT} (H \otimes I)

i.e. the operator that applies a Hadamard operator to qubit A and then a CNOT gate with control qubit A and target qubit B. Let us calculate the action of this operator on the basis  state |0 \rangle |0 \rangle.

U |0 \rangle |0 \rangle = \frac{1}{\sqrt{2}} C_{NOT} (|0 \rangle |0 \rangle + |1 \rangle |0 \rangle) = \frac{1}{\sqrt{2}} (|00 \rangle + |11 \rangle)

It is not difficult to show that this state cannot be written as a product – it is an entangled state. Traditionally, this state is called a Bell state.

Now the operator U is unitary, and it therefore maps a Hilbert space basis to a Hilbert space basis. We can therefore obtain a basis of our two-qubit Hilbert space that consists of entangled states by applying the transformation U to the elements of the computational basis. The resulting states are easily calculated and are as follows (we use the notation from [1]).

Computational basis vector Bell basis vector
|00 \rangle |\beta_{00} \rangle = \frac{1}{\sqrt{2}} (|00 \rangle + |11 \rangle)
|01 \rangle |\beta_{01} \rangle = \frac{1}{\sqrt{2}} (|01 \rangle + |10 \rangle)
|10 \rangle |\beta_{10} \rangle = \frac{1}{\sqrt{2}} (|00 \rangle - |11 \rangle)
|11 \rangle |\beta_{11} \rangle = \frac{1}{\sqrt{2}} (|01 \rangle - |10 \rangle)

Of course we can also reverse this process and express the elements of the computational basis in terms of the Bell basis. For later reference, we note that we can write the computational basis as

|00\rangle = \frac{1}{\sqrt{2}} (\beta_{00} + \beta_{10})
|01\rangle = \frac{1}{\sqrt{2}} (\beta_{01} + \beta_{11})
|10\rangle = \frac{1}{\sqrt{2}} (\beta_{01} - \beta_{11})
|11\rangle = \frac{1}{\sqrt{2}} (\beta_{00} - \beta_{10})

Quantum teleportation

Let us now turn to the real topic of this post – quantum teleportation. Suppose that Alice is in possession of a qubit that captures some quantum state |\psi \rangle = a |0 \rangle + b|1 \rangle, stored in some sort of quantum device, which could be a superconducting qubit, a trapped ion, a polarized photon or any other physical implementation of a qubit. Bob is operating a similar device. The task is to create a quantum state in Bob’s device which is identical to the state stored in Alice device.

To be able to do this, Alice and Bob will need some communication channel. Let us suppose that there is a classical channel that Alice can use to transmit a finite number of bits to Bob. Is that sufficient to transmit the state of the qubit?

Obviously, the answer is no. Alice will not be able to measure both coefficients a and b, and even if she would find a way to do this, she would be faced with the challenge of transmitting two complex numbers with arbitrary precision using a finite string of bits.

At the other extreme, if Alice and Bob were able to fully entangle their quantum devices, they would probably be able to transmit the state. They could for instance implement a swap gate to move the state from Alice device onto Bob’s device.

So if Alice and Bob are able to perform arbitrary entangled quantum operations on the combined system consisting of Alice qubit and Bob’s qubit, a transmission is possible. But what happens if the devices are separated? It turns out that a transmission is still possible based on the classical channel provided that Alice and Bob had a chance to prepare a common state before Alice prepares |\psi \rangle. In fact, this common state does not depend at all on |\psi \rangle, and at no point during the process, any knowledge of the state |\psi \rangle is needed.

The mechanism first described in [2] works as follows. We need three qubits that we denote by A, B and C. The qubit C contains the unknown state |\psi \rangle_C (we use the lower index to denote the qubit in which the state lives). We also assume that Alice is able to perform arbitrary measurements and operations on A and C, and that B similarly controls qubit B.

The first step in the algorithm is to prepare an entangled state

|\beta_{00}^{AB} \rangle = \frac{1}{\sqrt{2}} \big[ |0\rangle _A |0 \rangle _B+ |1\rangle_A |1 \rangle_B) \big]

Here the upper index at \beta indicates the two qubits in which the Bell state vector lives. This is the common state mentioned above, and it can obviously be prepared upfront, without having the state |\psi \rangle_C. Bob and Alice could even prepare an entire repository of such states, ready for being used when the actual transmission should take place.

Now let us look at the state of the combined system consisting of all three qubits.

|\beta_{00}^{AB} \rangle |\psi\rangle_C = \frac{1}{\sqrt{2}} \big[ a |000 \rangle + a |110 \rangle + b |001\rangle + b |111 \rangle \big]

For a reason that will become clear in a second, we will now adapt the tensor product order and choose the new order A-C-B. Then our state is (simply swap the last two qubits)

\frac{1}{\sqrt{2}} \big[ a |000 \rangle + a |101 \rangle + b |010\rangle + b |111 \rangle \big]

Now instead of using the computational basis for our three-qubit system, we could as well use the basis that is given by the Bell basis of the qubits A and C and the computational basis of B, i.e. |\beta^{AC}_{ij} \rangle |k\rangle. Using the table above, we can express our state in this basis. This will give us four terms that contain a and four terms that contain b. The terms that contain a are

\frac{a}{2} \big[ |\beta^{AC}_{00}\rangle |0 \rangle + |\beta^{AC}_{10}\rangle |0 \rangle + |\beta^{AC}_{01}\rangle |1 \rangle - |\beta^{AC}_{11} \rangle |1 \rangle \big]

and the terms that contain b are

\frac{b}{2} \big[ |\beta^{AC}_{01}\rangle |0 \rangle + |\beta^{AC}_{11}\rangle |0 \rangle + |\beta^{AC}_{00}\rangle |1 \rangle - |\beta^{AC}_{10} \rangle |1 \rangle \big]

So far we have not done anything to our state, we have simply re-written the state as an expansion into a different basis. Now Alice conducts a measurement – but she measures A and C in the Bell basis. This will project the state onto one of the basis vectors |\beta_{ij}^{AC}\rangle. Thus there are four different possible outcomes of this measurement, which we can read off from the expansion in terms of the Bell basis above.

Measurement State of qubit B
\beta_{00} a|0\rangle + b|1\rangle = |\psi \rangle
\beta_{01} a|1\rangle + b|0\rangle = X|\psi \rangle
\beta_{10} a|0\rangle - b|1\rangle = Z|\psi \rangle
\beta_{11} b|0\rangle - a|1\rangle = -XZ|\psi \rangle

Now Alice sends the outcome of the measurement to Bob using the classical channel. The table above shows that the state in Bob’s qubit B is now the result of a unitary transformation which depends only on the measurement outcome. Thus if Bob receives the measurement outcomes (two bits), he can do a lookup on the table above and apply the inverse transformation on his qubit. If, for instance, he receives the measurement outcome 01, he can apply a Pauli X gate to his qubit and then knows that the state in this qubit will be identical to the original state |\psi \rangle. The teleportation process is complete.

Implementing teleportation as a circuit

Let us now build a circuit that models the teleportation procedure. The first part that we need is a circuit that creates a Bell state (or, more generally, transforms the elements of the computational basis into the Bell basis). This is achieved by the circuit below (which again uses the Qiskit convention that the most significant qubit q1 is the leftmost one in the tensor product).


It is obvious how this circuit can be reversed – as both individual gates used are there own inverse, we simply need to reverse their order to obtain a circuit that maps the Bell basis to the computational basis.

Let us now see how the overall teleportation circuit looks like. Here is the circuit (drawn with Qiskit).


Let us see how this relates to the above description. First, the role of the qubits is, from the top to the bottom

C = q[2]
A = q[1]
B = q[0]

The first two gates (the Hadamard and the CNOT gate) act on qubits A and B and create the Bell basis vector \beta_{00} from the initial state. This state is the shared entangled state that Alice and Bob prepare upfront.

The next section of the circuit operates on the qubits A and C and realizes the measurement in the Bell basis. We first use a combination of another CNOT gate and another Hadamard gate to map the Bell basis to the computational basis and then perform a measurement – these steps are equivalent to doing a measurement in the Bell basis directly.

Finally, we have a controlled X gate and a controlled Z gate. As described above, these gates apply the conditional corrections to Bob’s state in qubit B that depend on the outcome of the measurement. As a result, the state of qubit B will be the same as the initial state of qubit C.

How can we test this circuit? One approach could be to add a pre-processing part and a post-processing part to our teleportation circuit. The pre-processing gate acts with one or more gates on qubit C to put this qubit into a defined state. We then apply the teleportation circuit. Finally, we apply a post-processing circuit, namely the inverse sequence of gates to the “target” of the teleportation, i.e. to qubit B. If the teleportation works, the pre- and post-steps act on the same logical state and therefore cancel each other. Thus the final result in qubit B will be the initial state of qubit C, i.e. |0 \rangle. Therefore, a measurement of this qubit will always yield zero. Thus our final test circuit looks as follows.


When we run this on a simulator, the result will be as expected – the value in the classical register c2 after the measurement will always be zero. I have not been able to test this on real hardware as the IBM device does not (yet?) support classical conditional gates, but the result is predictable – we would probably get the same output with some noise. If you want to play with the circuits yourself, you can, as always, find the code in my GitHub repository.

So this is quantum teleportation – not quite as mysterious as the name suggests. In particular, we do not magically teleport particles and actual matter, but simply quantum information, based on a clever split of the information contained in an unknown state into a classical part, transmitted in two classical bit, and a quantum part that we can prepare upfront. We also do not violate special relativity – to complete the process, we depend on the measurement outcomes that are transmitted via a classical channel and therefore not faster than the speed of light. It is also important to note that quantum teleportation does also not violate the “no-cloning”-principle – the measurements let the original state collapse and therefore we do not end up with two copies of the same state.

Quantum teleportation has already been demonstrated in reality several times. The first experimental verification was published in [3] in 1997. Since then, the distance between the involved parties “Alice” and “Bob” has been gradually increased. In 2017, a chinese team reported a teleportation of a single qubit between a ground observatory to a low-Earth-orbit satellite (see [4]).

The term quantum teleportation is also used in the context of quantum error correction for a circuit that uses a measurement on an entangled state to bring one of the involved qubits into a specific state. When using a surface code, for instance, this technique – also called gate teleportation – is used to implement the T gate on the level of logical qubits. We refer to [5] for a more detailed discussion of this pattern.


1. M. Nielsen, I. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, Cambridge 2010
2. C.H. Bennett, G. Brassard, C. Crepeau, R. Jozsa, A Peres, W.K. Wootter, Teleporting an Unknown Quantum State via Dual Classical and EPR Channels, Phys. Rev. Lett. 70, March 1993
3. D. Bouwmeester, Jian-Wei Pan, K. Mattle, M. Eibl, H. Weinfurter, A. Zeilinger, Experimental quantum teleportation, Nature Vol. 390, Dec. 1997
4. Ji-Gang Ren et. al., Ground-to-satellite quantum teleportation, Nature Vol. 549, September 2017
5. D. Gottesman, I. L. Chuang, Quantum Teleportation is a Universal Computational Primitive, arXiv:quant-ph/9908010

NMR based quantum computing: gates and state preparation

In my last post on NMR based quantum computing, we have seen how an individual qubit can be implemented based on NMR technology. However, just having a single qubit is of course not really helpful – what we are still missing is the ability to initialize several qubits and to realize interacting quantum gates. These are the topics that we will discuss today.

The physics and mathematics behind these topics is a bit more involved, and I will try to keep things simple. In case you would like to dive more deeply into some of the details, you might want to take a look at my notes on GitHub as well as the references cited there.

Pseudo-pure states

In quantum computing, we are typically manipulating individual qubits which are initially in a pure state. In an NMR probe, however, we usually have a state which is very far from being a pure state. Instead, we typically deal with highly mixed states, and it appears to be impossible to prepare all spins in an NMR probe in the same, well defined pure state. So we first need to understand how the usual formalism of quantum computing – expressed in terms of qubits being in pure states which are subject to unitary operations – relates to the description of an NMR experiment in terms of density matrices and mixed states.

The formalism that we will now describe is known as the formalism of pseudo-pure states. These are states that are described by a density matrix of the form

\rho = \frac{1}{2^N}(1-\epsilon) + \epsilon |\psi \rangle \langle \psi |

with a pure state |\psi \rangle. Here, N is the number of spins, and the factor 1/2N has been inserted to normalize the state. This density matrix describes an ensemble for which almost all molecules are in purely statistically distributed states, but a small fraction – measured by \epsilon – of them are in a pure state. The second term in this expression is often called the deviation and denoted by \rho_\Delta.

Why are these states useful? To see this, let us calculate the time evolution of this state under a unitary matrix U. In the density matrix formalism, the time evolution is given by conjugation, and a short calculation shows that

U(t) \rho U(t)^{-1} = \frac{1}{2^N} (1-\epsilon) + \epsilon |U(t) \psi \rangle \langle U(t) \psi |

Thus the result is again a pseudo-pure state. In fact, it is the pseudo-pure state that corresponds to the pure state U|\psi \rangle. Similarly, one can show that if A is an observable, described by a hermitian matrix (which, for technical reasons, we assume to be traceless), then the expectation value of A evaluated on |\psi \rangle – which is the result a measurement would yield in the pure state formalism – is, up to a constant, the same as the result of a measurement of A on an ensemble prepared in the pure state corresponding to |\psi \rangle!

Using these relations, we can now translate the typical process of a quantum computation as expressed in the standard, pure state formalism, into NMR terminology as follows.

  • The process of initializing a quantum computer into an initial state |\psi \rangle corresponds to putting the NMR probe into the corresponding pseudo-pure state with \rho_{\Delta} \sim |\psi \rangle \langle \psi|
  • If the quantum algorithm is described as a unitary operator U (typically presented as a sequence of gates Ui), we apply the same unitary operator to the density matrix, i.e. we realize the gates as an NMR experiment
  • We then measure the macroscopic quantity A which will give us – up to a factor – the result of the calculation

This is nice, but how do we prepare these states? Different researchers have come up with different techniques to do this. One of the ideas that is commonly applied is averaging. This is based on the observation that the state at thermal equilibrium is a pseudo-pure state up to an error term. This error term can be removed by averaging over many instances of the same experiment (while somehow shuffling the initial states around using a clever manipulation). So we first let the probe settle down, i.e. prepare it in a thermal state (which can even be at room temperature). We then run our quantum algorithm and measure. Next, we re-initialize the system, apply a certain transformation and re-run the algorithm. We repeat this several times, with a prescribed set of unitary transformations that we apply in each run to the thermal equilibrium state before running the quantum algorithm. At the end, we add up all our results and take the average. It can be shown that these initial “shuffling transformations” can be chosen such that the difference between the thermal state and the pseudo-pure state cancels out.

Quantum gates

Having this result in place, we now need to understand how we can actually realize quantum gates. In the last post, we have already seen that we can use RF-pulses to rotate the state of an NMR qubit on the Bloch sphere, which can be used to realize single qubit gates. But how do we realize two-qubit gates like the CNOT gate? For that purpose, we need some type of interaction between two qubits.

Now, in reality, any two molecules in an NMR probe do of course interact – by their electric and magnetic fields, by direct collision and so forth. It turns out that in most circumstances, all but one type of coupling called the J-coupling can be neglected. This coupling is an indirect coupling – the magnetic moment created by a nuclear spin interacts with the electric field of an electron, which in turn interacts with the magnetic moment of a different nuclear spin. In the Hamiltonian – in a rotating frame – the J-coupling contributes a term like

\frac{2 \pi}{\hbar} J_{12} I^1_z I^2_z

where J12 is a coupling constant. This term introduces a correlation between the two nuclear spins, similar to an additional magnetic field depending on the state of the second qubit. The diagram below is the result of a simulation of an initial state to which J-coupling is applied and demonstrates that the J-coupling manifests itself in a splitting of peaks in an NRM diagram.


An NMR diagram is the result of a Fourier transform, so an additional peak corresponds to a slow, additional rotation of the two spin polarizations around each other.

Let us take a closer look at this. If we apply a free evolution under the influence of J-coupling over some time t, it can be shown – again I refer to my notes on GitHub for the full math – that the time evolution is given by the operator

U = \cos (\frac{\pi J_{12}}{2} t) -i \sigma_z^1 \sigma_z^2 \sin (\frac{\pi J_{12}}{2}  t)

If we choose t such that

\frac{\pi J_{12}}{2} t = \frac{\pi}{4}

and combine this time evolution with rotations around the z-axis of both qubits, we obtain the following transformation, expressed as a matrix

R_{z^2}(-\frac{\pi}{2}) R_{z^1}(-\frac{\pi}{2}) U =  \frac{1+i}{\sqrt{2}} \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & -1  \end{pmatrix}

Technically, this transformation is the result of letting the system evolve under the influence of J-coupling for the time t and then applying two sharp RF pulses, one at resonance with the first qubit and one at resonance with the second qubit, timed such that they correspond to a rotation on the Bloch sphere around the z-axis with angle \pi / 2. The matrix on the right hand side of this equation represents a two-qubit transformation known as the controlled phase gate. This gate corresponds to applying a phase gate to the second qubit, conditional on the state of the first qubit.

This already looks very similar to a CNOT gate, and in fact it is – one can easily show that the phase gate is equivalent to the CNOT gate up to single qubit operations, more precisely these two gates are related by

C_{NOT} = (I \otimes H) C_{PHASE} (I \otimes H)

where I is the identity and H is the Hadamard gate. As single qubit operations can be realized by RF pulses, this shows that a CNOT gate can be realized by a free evolution under the J-coupling, framed by sequences of RF pulses. This, together with single qubit gates, gives us a universal gate set.

This completes our short deep dive into NMR based quantum computing. Historically, NMR based quantum computers were among the first fully functional implementations of (non error-corrected) universal quantum computers, but have come a bit out of fashion in favor of other technologies. At the time of writing, most of the large technology players focus on a different approach – superconducting qubits – which I will cover in the next couple of posts on quantum computing.

Single qubit NMR based quantum computation

In the previous post, we have sketched the basic ideas behind NMR based quantum computation. In this post, we will discuss single qubits and single qubit operations in more depth.

The rotating frame of reference

In NMR based quantum computing, quantum gates are realized by applying oscillating magnetic fields to our probe. As an oscillating field is time dependent, the Hamiltonian will be time dependent as well, making some calculations a bit more difficult. To avoid this, it is useful to pass to a different frame of reference, called the rotating frame of reference.

To explain this, let us first study a more general setting. Assume that we are looking at a quantum system with Hamiltonian H and a state vector |\psi \rangle. Suppose further that we are given a unitary group, i.e. a time-dependent unitary operator

T(t) = e^{-iAt}

with a hermitian matrix A. We can then consider the transformed vector

|\tilde{\psi}\rangle = T(t) |\psi \rangle

Using the product rule and the Schrödinger equation, we can easily calculate the time derivative of this vector and obtain

i \hbar \frac{d}{dt} |\tilde{\psi}\rangle = \hbar A |\tilde{\psi} \rangle + T(t) H |\psi\rangle = \tilde{H} |\tilde{\psi}\rangle


\tilde{H} = THT^* + \hbar A

In other words, the transformed vector again evolves over time according to an equation which is formally a Schrödinger equation if we replace the original Hamiltonian by the transformed Hamiltonian \tilde{H}.

Let us now apply this to the system describing a single nuclear spin with spin 1/2 in a constant magnetic field B along the z-axis of the laboratory system. In the laboratory frame, the Hamiltonian is then given by

H = \omega I_z

with the Larmor frequency \omega and the spin operator

I_z = \frac{\hbar}{2} \sigma_z

We now pass into a new frame of reference by applying the transformation

T(t) = \exp ( \frac{i\omega_{ref}}{\hbar} t I_z )

with an arbitratily chosen reference frequency \omega_{ref}. Geometrically, this is a rotation around the z-axis by the angle \omega_{ref}t. Using the formula above and the fact that T commutes with the original Hamiltonian, we find that the transformed Hamiltonian is

\tilde{H} = (\omega - \omega_{ref}) I_z

This is of the same form as the original Hamiltonian, with a corrected Larmor frequency

\Omega = \omega - \omega_{ref}

In particular, the Hamiltonian is trivial if the reference frequency is equal to the Larmor frequency.

Intuitively, this is easy to understand. We know that the time evolution in the laboratory frame is described by a precession with the frequency \omega. When we choose \omega_{ref} = \omega, we place ourselves in a frame of reference rotating with the same frequency around the z-axis. In this rotating frame, the state vector will be constant, corresponding to the fact that the new Hamiltonian vanishes. If we choose a reference frequency different from the Larmor frequency, we will observe a precession with the frequency \Omega.


Let us now repeat this for a different Hamiltonian – the Hamiltonian that governs the time evolution in the presence of an oscillating magnetic field. More precisely, we will look at the Hamiltonian of a rotating magnetic field (which is a good approximation for an oscillating magnetic field by an argument known as rotating wave approximation, see my notes for more details on this). In the presence of such a field, the Hamiltonian in the laboratory frame is

H = \omega I_z + \omega_{nut} [I_x \cos (\omega_{ref}t + \Phi_p) + I_y \sin (\omega_{ref}t + \Phi_p) ]

To calculate the Hamiltonian in the rotating frame, we have – according to the above formula – to apply the conjugation with T to each of the terms appearing in this Hamiltonian and add a correction term.

Now the transformation T is a rotation around the z-axis, and the result of applying a rotation around the z-axis to the operators Ix and Iy is well known and in fact easy to calculate using the commutation relations between the Pauli matrices. The correction term cancels the first term of the Hamiltonian as above. The transformed Hamiltonian is then given by

\tilde{H} = \omega_{nut} [I_x \cos \Phi_p + I_y \sin \Phi_p ]

In other words, the time dependence has disappeared and only the phase term remains. Again, this is not really surprising – if we look at a rotating magnetic field from a frame of reference that is rotating around the same axis with the same frequency, the result is a constant magnetic field.

The density matrix of a single qubit

We are now ready to formally describe a single qubit, given by all nuclei at a specific position in a system consisting of a large number of molecules. According to the formalism of statistical quantum mechanics, this ensemble is described by a 2×2 density matrix \rho. The time evolution of this density matrix is governed by the Liouville-von Neumann equation

i\hbar \frac{d}{dt} \rho = [H,\rho]

The density matrix is a hermitian matrix with trace 1. Therefore the matrix \rho - \frac{1}{2} is a traceless hermitian matrix. Any such matrix can be expressed as a linear combination of the Pauli matrices with real coefficients. Consequently, we can write

\rho(t) = \frac{1}{2} + f(t) \cdot I

where f is a three-vector with real coefficients and the dot product is a shorthand notation for

f \cdot I = f_x I_x + f_y I_y + f_z I_z

Similarly, the most general time-independent Hamiltonian can be written as

H = \frac{1}{2} tr(H) + a \cdot I

We can remove the trace by adding a constant, which does not change the physics and corresponds to a shift of the energy scale. Further, we can express the vector a as the product of a unit vector and a scalar. Thus the most general Hamiltonian we need to consider is

H = \omega_{eff} n \cdot I

with a real number \omega_{eff} (the reason for this notation will become apparent in a second) and a unit vector n.

Let us now plug this into the Liouville equation. Applying the very useful general identity (which is easily proved by a direct calculation)

[a\cdot I, b \cdot I] = i \hbar (a \times b) \cdot I

for any two vectors a and b, we find that

\dot{f} = - \omega_{eff} [f \times n]

This equation is often called the Bloch equation. By splitting f into a component perpendicular to n and a component parallel to n, one can easily see that the solution is a rotation around the axis n with frequency \omega_{eff}.

What is the physical interpretation of this result and the physical meaning of f? To see this, let us calculate the expectation value of the magnetic moment induced by the spin of our system. The x-component of the magnetic moment, for instance, corresponds to the observable \gamma I_x. Therefore, according to the density matrix formalism, the expectation value of the x-component of the magnetic moment \mu is

\langle \mu_x \rangle = \gamma tr (\rho I_x)

If we compute the matrix product \rho I_x and use the fact that the trace of a product of two different Pauli matrices is zero, we find that the only term that contributes to the trace is the term fx, i.e.

\langle \mu_x \rangle = \frac{\gamma \hbar^2}{4} f_x

Similar calculations work for the other components and we find that, up to a constant, the vector f is the net magnetic moment of the probe.

A typical NMR experiment

After all these preparations, we now have all tools at our disposal to model the course of events during a typical NMR experiment in terms of the density matrix.

Let us first try to understand the initial state of the system. In a real world experiment, none of the qubits is fully isolated. In addition, the qubits interact, and they interact with the surroundings. We can model these interactions by treating the qubits as being in contact with a heat bath of constant temperature T. According to the rules of quantum statistical mechanics, the equilibrium state, i.e. the state into which the system settles down after some time, is given by the Boltzmann distribution, i.e.

\rho(t=0) = \frac{1}{Z} \exp (- \frac{H}{kT})

In the absence of an additional rotating field, the Hamiltonian in the laboratory frame is given by

H = \omega I_z


\rho(t=0) = \frac{1}{Z} \exp ( \frac{-\omega}{kT} I_z)

Using the relations

I_z = \frac{\hbar}{2} \sigma_z


\omega = -\gamma B \sigma_z

with the gyromagnetic moment \gamma, we can write this as

\rho(t=0) = \frac{1}{Z} \exp ( \frac{1}{2} {{\hbar \gamma B}\over{kT}} \sigma_z)

Let us now introduce the energy ratio

\beta = \frac{\hbar \gamma B}{kT}

The energy in the numerator is the energy scale associated with the Larmor frequency. For a proton in a magnetic field of a few Tesla, for example, this will be in the order of 10-25 Joule. The energy in the denominator is the thermal energy. If the experiment is conducted at room temperature, say T=300K, then this energy is in the order of 10-21 Joule (see this notebook for some calculations). This yields a value of roughly 10-4 for \beta. If we calculate the exponential in the Boltzmann distribution by expanding into a power series, we can therefore neglect all terms except the constant term and the term linear in \beta. This gives the approximation

\rho(t=0) = \frac{1}{Z} (1 + \frac{1}{2} \beta \sigma_z)

called the high temperature approximation. We can determine the value of Z by calculating the trace and find that Z = 2, so that

\rho(t=0) = \frac{1}{2} + \frac{1}{4} \beta \sigma_z = \begin{pmatrix} \frac{1}{2} + \frac{\beta}{4} & 0 \\ 0 & \frac{1}{2} - \frac{\beta}{4} \end{pmatrix} = \frac{1}{2} + \frac{1}{2} \frac{\beta}{\hbar} I_z

If we compare this to the general form of the density matrix discussed above, we find that the thermal state has a net magnetization in the direction of the z-axis (for positive \beta). This is what we expect – with our sign conventions, the energy is lowest if the spin axis is in the direction of the z-axis, so that slightly more nuclei will have their spins oriented in this direction, leading to a net magnetic moment.

To calculate how this state changes over time, we again pass to the rotating frame of reference. As the initial density matrix clearly commutes with the rotation around the z-axis, the density matrix in the rotating frame is the same. If we choose the reference frequency to be exactly the Larmor frequency, the Hamiltonian given by the static magnetic field along the z-axis vanishes, and the density matrix does not change over time.

When, however, we apply an additional pulse, i.e. an additional rotating magnetic field, for some time \tau, this changes. We have already seen that in the rotating frame, this pulse adds an additional term

\omega_{nut} (I_x \cos \Phi_p + I_y \sin \Phi_p)

to the Hamiltonian. This has the form discussed above – a scalar times a dot product of a unit vector with the vector I = (Ix, Iy, Iz). Therefore, we find that the time evolution induced by this Hamiltonian is a rotation around the vector

(\cos \Phi_p, \sin \Phi_p, 0)

with the frequency \omega_{nut}. If, for instance, we choose \Phi_p = 0, the vector f – and thus the magnetic moment – will slowly rotate around the x-axis. If we turn off the pulse after a time \tau such that

\omega_{nut} \tau = \frac{\pi}{2}

the net magnetization will be parallel to the y-axis. After the pulse has been turned off, the density matrix in the rotating frame is again constant, so the magnetic moment stays along the y-axis. In the laboratory frame, however, this corresponds to a magnetic moment which rotates with the Larmor frequency around the z-axis. This magnetic moment will induce a voltage in a coil placed in the x-y-plane which can be measured. The result will be an oscillating current, with frequency equal to the Larmor frequency. Over time, the state will slowly return into the thermal equilibrium, resulting in a decay of the oscillation.

This is a good point in time to visualize what is happening. Given the explicit formulas for the density matrix derived above, it is not difficult to numerically simulate the state changes and NMR signals during an experiment as the one that we have just described (if you want to take a look at the required code, you can find a Python notebook here)

The diagram below shows the result of such a simulation. Here, we have simulated a carbon nucleus in a TCE (Trichloroethylene) molecule. This molecule – pictured below (source Wikipedia) – has two central carbon nuclei.


A small percentage of all TCE molecules in a probe will have two 13C nuclei instead of the more common 12C nuclei, which have spin 1/2 and therefore should be visible in the NMR spectrum. At 11.74 Tesla, an isolated 13C carbon nucleus has a Larmor precession frequency of 125 MHz. However, when the nuclei are part of a larger molecule as in our case, each nucleus is shielded from an external magnetic field by the surrounding cloud of electrons. As the electron configuration for both nuclei is different, the observed Larmor frequencies differ by a small amount known as the chemical shift.

At the start of the simulation, the system was put into a thermal state. Then, an RF pulse was applied to flip the magnetization in the direction of the x-axis, and then a sample was taken over 0.1 seconds, resulting in the following signal.


The signal that we see looks at the first glance as expected. We see an oscillating signal with an amplitude that is slowly decaying. However, you might notice that the frequency of the oscillation is clearly not 125 MHz. Instead, the period is roughly 0.001 seconds, corresponding to a frequency of 1200 Hz. The reason for this is that in an NMR spectrometer, the circuit processing the received signal will typically apply a combination of a mixer and a low pass filter to effectively shift the frequency by an adjustable reference frequency. In our case, the reference frequency was adjusted to be 1200 Hz above the Larmor frequency of the carbon nucleus, so the signal will oscillate with a frequency of 1200 Hz. In practice, the reference frequency determines a window in the frequency space in which we can detect signals, and all frequencies outside this window will be suppressed by the low pass filter.

Now let us take a look at a more complicated signal. We again place the system in the thermal equilibrium state first, but then apply RF pulses to flip the spin of both carbon nuclei into the x-axis (in a simulation, this is easy, in a real experiment, this requires some thought, as the Larmor frequencies of these two carbon nuclei differ only by a small chemical shift of 900 Hz). We then again take a sample and plot the signal. The result is shown below.


This time, we see a superposition of two oscillations. The first oscillations is what we have already seen – an oscillation with 1200 Hz, which is the difference of the chosen reference frequency and the Larmor frequency of the first carbon. The second oscillation corresponds to a frequency of roughly 300 Hz. This is the signal caused by the Larmor precession of the second spin. As we again measure the difference between the real frequency and the reference frequency, we can conclude that this frequency differs from the Larmor frequency of the first spin by 900 Hz.

In reality, the signal that we observe is the superposition of many different oscillations and is not easy to interpret – even with a few oscillations, it soon becomes impossible to extract the frequencies by a graphical analysis as we have done it so far. Instead, one usually digitizes the signal and applies a Fourier transform (or, more precisely, a discrete Fourier transform). The following diagram shows the result of applying such a DFT to the signal above.


Here, we have shifted the x-axis by the difference between the Larmor frequency \omega_0 of the first nucleus and the reference frequency, so that the value zero corresponds to \omega_0. We clearly see two peaks. The first peak at zero, i.e. at Larmor frequency \omega_0, is the signal emitted by the first nucleus. The second peak is shifted by 900 Hz and is emitted by the second nucleus. In general, each nucleus will result in one peak (ignoring couplings that we will study in a later post) and the differences between the peaks belonging to nuclei of the same isotope are the chemical shifts.

Let us quickly summarize what we have learned. An ensemble of spin systems is described by a density which in turn is given by the net magnetization vector f. The result of applying a pulse to this state is a rotation around an axis given by the phase of the pulse (in fact, the phase can be adjusted to rotate around any axis in the x-y-plane, and as any rotation can be written as a decomposition of such rotations, we can generate an arbitrary rotation). The net magnetization can be measured by placing a coil close to the probe and measuring the induced voltage.

This does already look like we are able to produce a reasonable single qubit. The vector f appears to correspond – after some suitable normalization – to points on the Bloch sphere, and as we can realize rotations, we should be able to realize arbitrary single qubit quantum gates. But what about multiple qubits? Of course a molecule typically has more than one nucleus, and we could try to use additional nuclei to create additional qubits, but there is a problem – in order to realize multi-qubit gates, these qubits have to interact. In addition, we need to be able to prepare our NMR system in a useful initial state and, at the end of the computation, we need to measure the outcome. These main ingredients of NMR based quantum computing will be the subject of the next post.

Bulk quantum computing with nuclear spin systems

The theoretical foundations of universal quantum computing were essentially developed in the nineties of the last century, when the first native quantum algorithms and quantum error correction were discovered. Since then, physicists and computer scientists have been working on physical implementations of quantum computing. One of the first options that moved into the focus was to use a technology known as nuclear magnetic resonance (NMR) to develop universal quantum computers. In this post, we give a brief overview of NMR based quantum computing and prepare to dive into some advanced topics in the next few posts.

Introduction to NMR

Nuclear magnetic resonance is not a new technology. It was already developed to a high degree when quantum computation was initially discussed, and was therefore a natural choice for first implementations.

So what is NMR? Recall that all matter (as we know it) consists of molecules which, in turn, are compounds of atoms. Each atom consists of a nucleus and a cloud of electrons.

A nucleus is itself not an elementary particle, but is a composite system, formed by protons and neutrons (which again are not elementary but consist of quarks). As every elementary particle, each constituent of a nucleus has a spin degree of freedom. These spins add up to a total spin. Of course, the spin depends on the quantum mechanical state of the nucleus, but typically, the energy gap between the ground state and the first excited state is so high that at room temperature, it is a good approximation to assume that the spin of the nucleus is the ground state spin. The value of this spin of course depends on the structure of the nucleus. Some nuclei, like deuterium, have spin one, some – like 12C have spin zero, some like 13C have spin 1/2, and some have higher spins.

Of course the nuclear spin interacts with an external magnetic field. The strength of this interaction is determined by the coupling between the magnetic field and the spin which is expressed by the gyromagnetic moment of the nucleus.

Now suppose we are given a probe of a substance with nuclear spin 1/2 that is subject to an external field along some axis, say the z-axis. Let us simplify the discussion a bit by looking at a single nucleus only and ignoring all spatial degrees of freedom. The nuclear spin is then described by a two-level system, i.e. a qubit. The Hamiltonian of this system is

H = - \frac{\hbar}{2} \gamma B \sigma_z

where \gamma is the gyromagnetic moment, B is the magnetic field in the z-direction and \sigma_z is the usual Pauli Z-operator, i.e. the Hamiltonian is simply a multiple of the Pauli Z operator. Consequently, the time evolution operator is

U(t) = \exp \left( - \frac{i}{\hbar} H t \right) = \exp \left( - \omega i \frac{\hbar}{2} \sigma_z \right)

with \omega = - \gamma B. Thus, on the Bloch sphere, the time evolution is a rotation around the z-axis with frequency \omega. This rotation is called the Larmor precession and the frequency is called the Larmor frequency. For typical nuclei and fields, the Larmor frequency is in the order of a few hundred MHz.

Let us now suppose that the spin is originally in the ground state, i.e. pointing to the north pole of the Bloch sphere.


What happens if we now apply an external additional magnetic pulse? The answer will obviously depend on the frequency and direction of the pulse. A detailed analysis that we will carry out in a later post shows that if the frequency of the pulse is equal to the Larmor frequency, the effective Hamiltonian during the pulse (in a rotating frame of reference) is proportional to the Pauli X matrix. Therefore the time evolution operator during the pulse corresponds to a rotation around the X-axis. If we time the pulse correctly so that the angle of this rotation is \frac{\pi}{2}, then the nuclear spin will flip into a direction perpendicular to the z-axis.


After the pulse has completed, the time evolution is now again given by the rotation around the z-axis with the Larmor frequency. Consequently, the spin axis will start to precess around the z-axis. This spin precession represents a rotating magnetic moment and creates an oscillating magnetic field which in turn can be detected by placing a coil close the probe so that the rotating field will induce a voltage in the coil. Of course, the frequency of the observed signal will be that of the rotation, i.e. the Larmor frequency of the nucleus in question. We therefore find that we obtain a resonance peak at the Larmor frequency if we slowly vary the frequency of the external pulse.

In a probe that contains several different types of nuclei, we will observe more than one resonance peak – one at the Larmor frequency of each of the involved nuclei. The location and relative strength of these peaks can be used to determine the nuclei that make up the probe, and can even yield insights into the interactions between the nuclei and therefore the structure of the molecule. This technique is called nuclear magnetic resonance (NMR).

How NMR can be used for quantum computation

This description of the NMR process is on a very heuristic level, but it can be made precise (and we will look at a few more details in the next few posts). But let us take this for granted for today and let us discuss how this technology can be applied to implement a universal quantum computer.

At the first glance, this looks like a perfect match. If we pick a nucleus like 13C which has spin 1/2, the spin degree of freedom is described by a two-level system which makes for a perfect qubit. We have already indicated above that rotations on the Bloch sphere, i.e. one-qubit operations, can be realized by applying correctly timed magnetic fields. Given that the NMR technology is already highly developed, it appears that NMR should be a good candidate for an implementation.

However, there are few subtleties that we have ignored. First, we have not yet discussed multi-qubit systems. Of course, in a typical molecule, there is more than one nucleus. Not all nuclei have spin 1/2, but for some molecules, we will find several spin 1/2 nuclei which could serve as qubits. In one of the first applications of NMR to quantum computing (see [3]) for instance, a custom designed molecule was chosen to represent five qubits. Of course, this method is difficult to scale if we want to build systems with more than 20 or 50 qubits, but for a low number of qubits, it has been demonstrated to work.

Another problem which we have largely ignored so far is that of course, the probe that we use consists of a very large number of molecules. When we place this probe in our external magnetic field, the spin axis of the individual nuclei will not point all in the same direction, but instead we expect a random pattern of spin orientations. How can we then prepare the system in a well defined initial state? And how can we avoid that the individual signals created by each qubit cancel each other completely because of the random orientation of the spin axis? Of course we could try to cool down the system to a very low temperature to force all spin nuclei into the ground state, but this would make dealing with the system difficult and take very long. A solution to this problem was presented in [4] and [5] – it turns out that even though it appears almost impossible to prepare the system in a pure state, one can prepare it in a state which has properties similar to a pure state and is sufficient to do quantum computation.

In fact, the NMR technology quickly turned into the technology of choice for the first experimental verifications of quantum algorithms due to the availability of high-resolution NMR spectrometers and the many years of experience in the field. The first-ever factorization of an integer on a quantum computer using Shor’s algorithm (albeit a toy version which only works because the result was known upfront, see also my previous post on realizing Shor’s algorithm with Qiskit) was done on an NMR computer (see [6]). In recent years, however, other technologies like superconducting qubits and ion traps have moved more into the focus of the various research groups. Nevertheless, NMR based quantum computing is worth being studied as it illustrates many of the basic principles of the field.

This completes our short overview of NMR based quantum computation. In the next few posts, we will dig deeper into some of the questions raised above, starting with a more precise description of the physics underlying a single qubit NMR based quantum computation.


1. M.H. Levitt, Spin dynamics – basics of nuclear magnetic resonance, John Wiley & Sons, 2008
2. D.P. DiVincenzo, The Physical Implementation of Quantum Computation, arXiv:quant-ph/0002077
3. L.M.K. Vandersypen et. al., Experimental realization of order-finding with a quantum computer, arXiv:quant-ph/0002077
4. D.G. Cory, A.F. Fahmy, T.F. Havel, Ensemble Quantum Computing by NMR Spectroscopy, Proceedings of the National Academy of Sciences of the
United States of America, Vol. 94, No. 5 (Mar. 4, 1997), pp. 1634–1639
5. N.A. Gershenfeld, I.L Chuang, Bulk Spin-Resonance Quantum Computation, Science Vol. 275, Jan. 1997
6. L.M.K. Vandersypen et. al., Experimental realization of Shor’s quantum factoring algorithm using nuclear magnetic resonance, Nature Vol. 414, Dec. 2001

Quantum error correction: the surface code

In my previous post on quantum error correction, we have looked at the toric code which is designed for a rather theoretical case – a grid of qubits on a torus. In reality, qubits are more likely to be arranged in a planar geometry. Luckily, a version of the toric codes that works well in a planar geometry exists – the surface code.

Planar codes and their stabilizers

Recall that the starting point for the toric code was a lattice L with periodic boundary conditions so that the geometry of the lattice becomes toroidal, which gave us a four-dimensional code space.

If we want to use this code in planar setting, it is clear that we somehow have to modify the boundary conditions. What happens if we simply drop it? A short calculation using a bit of algebraic topology shows that in this case, the code space collapses to a one-dimensional code space which is not enough to hold a logical qubit. So we need some a more sophisticated boundary condition. The solution found in [1] is to use two types of boundaries.


Look at the diagram above. The solid lines form a lattice L with two types of boundary – a “smooth” boundary at the top and the bottom and a “rough boundary” left and right (ignore the dashed lines for a moment, we will explain their meaning in a second). We again place qubits on the edges of the lattice, but for the boundary, we only place qubits (again indicated by black dots) on the edges which are part of the smooth boundary.

As for the case of a toric code, we can again create a second lattice called the dual lattice whose vertices are the center points of the faces of L and whose edges are perpendicular to the edges of L. This dual lattice is indicated by the dashed lines in our diagram. Again, some edges of the dual lattice carry qubits and others do not, so we have a smooth and a rough boundary as well (note that the smooth boundary of L gives raise to the rough boundary of the dual lattice and vice versa).

Again a path along the edges, i.e. a set of edges, is an element of a group, the group of one chains. If we only allow the edges that are not part of the rough boundary, i.e. the edges that carry qubits, we obtain a subgroup which is known to topologists as the group of one chains relative to the rough boundary – again I will strive to keep this post free from too much algebraic topology and refer the reader to my notes for more details and a more precise description.

As for the toric code, every such relative one-chain c gives raise to a operators Xc and Zc, obtained by letting X respectively Z act on any qubit crossed by the chain. Similarly, a co-chain is a path in the dual lattice, i.e. a path perpendicular to the edges of the lattice L.

In the literature, the lattice and its dual are typically visualized differently, namely with all edges that do not carry qubits removed, as in the following diagram.


Here we have removed the rough boundary at the left and right of the lattice and the same for the dual lattice. The red line is a relative one chain. Note that this one-chain is “open-ended”. Its boundary would be in the (removed) rough boundary. In the relative version of the chain complex, its boundary is zero, hence this is the equivalent to a closed loop in the toric geometry. Similarly, the blue line represents a co-chain, i.e. a chain in the dual lattice, which has zero boundary in the dual lattice.

We can now proceed as for the toric code. For each vertex v, there is a vertex operator Xv, which is the product of the Pauli X operators acting on the qubits sitting on the edges touching the vertex (which are four qubits in the interior of the lattice and three qubits if the vertex is sitting on the smooth boundary). For each face f, we again have a face operator Zf, acting as a Pauli Z operator on the four qubits sitting on the boundary edges of the face – again, if the face is located at the rough boundary, there are only three operators. These operators again commute with each other and create an abelian subgroup S. The code space TS created by this group, i.e. the space of all states left invariant by all face operators and all vertex operators, now turns out to be two-dimensional and is thus encoding one logical qubit. The code obtained in this way is called the surface code.

Again, there are logical Pauli operators. In fact, these are the operators created by the chains marked in the diagram above – the Z operator Zc associated with the one chain c and the X operator Xc* associated with the co-chain c*. The fact that these operators commute with each operator in S is related to their boundaries being zero (in the sense described above, i.e. the are open-ended) and the fact that they anti-commute is expressed geometrically by the fact that they intersect exactly once.

Thus we obtain a logical qubit and logical Pauli operators that act on this qubit. What about measurements? As for the toric code, we can measure our (data) bits located at the edges by adding measurement qubits. To measure a face operator Zf, we can place an additional measurement qubit in the center of the face and use this qubit as an ancilla in our measurement circuit below.


Similarly, to measure the vertex operator Xv for a vertex v, we place an additional measurement qubit at the vertex and use a similar circuit to transfer the result of the measurement into this qubit, as we did it for the toric code. The only difference to the case of a toric code is again that if a vertex or face is adjacent to only three qubits as it is located close to the boundary, we only include those qubits in our measurement. We therefore obtain the following arrangement of qubits, were black dots are data qubits, white dots are ancilla qubits used for the measurements and a letter X or Z indicates the action of a Pauli operator (this is, up to a switch of black and white, the graphical representation of a surface code used in [2]).


For instance, a white dot surrounded by Z’s indicates a face and the corresponding face operator Zf, together with the black data qubits on which the face operator acts. At the same time, the white dot is the location of a measurement qubit used to measure Zf. Similarly, a white dot surrounded by X’s indicates both a vertex operator and a data qubit to measure the vertex operator.

Of qubits and holes

So far we have seen how the surface code can be used to encode a single logical qubit in a whole array of physical qubits. In practice, however, we need more than one qubit. We could of course try to arrange the planar surfaces in a stack, with qubits on each plane interacting with the qubits above and below, and then use some variant of quantum teleportation to establish connectivity between the logical qubits. However, there is a different way – we can realize more than one logical qubit on one array of physical qubits (this is only one option – as of today, there are many different approaches built upon similar ideas, like color codes or lattice surgery – see [5] and [6] for a discussion of some alternatives).

The idea is simple. We have obtained our code space as the intersection of the subspaces given by constraints, where each face operator and each vertex operator add one (linear) constraint. To obtain a larger code space, all we have to do is to drop a few of these constraints again!

To understand how this works, we first have to understand how the surface code is usually used in practice ([2]). We first put the system into an initial, potentially unknown state. We then perform a series of measurements of all stabilizers. This will force the state into a simultaneous eigenstate of all the stabilizer operators.

Now, the eigenvalues will typically not all be one. We therefore have to apply corrections to move our state into the code space (or we could as well keep track of the error syndrome and perform the corrections in software when we measure) – this is exactly the same process as if we had found errors during the later computation. In other words, we do a first round of error correction.

We now start our quantum algorithm. After each step of the algorithm, we perform another round of measurements of the stabilizers to detect any errors that might have occurred. Again, we could now correct the errors, or alternatively keep track of them as well and correct the final measurements if needed. In this way, a computation with a surface code is essentially a periodic measurement of all stabilizers in repeating cycles followed by an (actual or virtual) error correction, and logical operations executed between any two subsequent cycles.

Now let us see what happens when we modify our code by removing a face operator Zf from the set of stabilizers. Thus we perform some initial cycles to put the system into a ground state |g \rangle (here and in the sequel, we assume that errors are actually corrected and not just detected to simplify our calculations) on which all stabilizers act trivially, and starting with the next round of measurements, we exclude a Zf operator from the measurements.


Of course, this will enlarge the code space. Specifically, let us take a look at the upper left corner of the figure above. Here we have marked a face f and an operator Xc* coming from a string connecting this face with the rough boundary. Then the boundary of c* consists of one point in the dual lattice (the center of f), and hence every face operator commutes with Xc*, except Zf which anti-commutes. If we now define our code space to be the space fixed by all the vertex operators Xv and all face operators except Zf, then Xc* and Zf map that code space to itself. As c* intersects the boundary of f in one point, Zf and Xc* anti-commute and therefore define a set of logical Pauli operators, acting on the code space. Thus we have constructed a logical qubit!

Intuitively, removing the face operator Zf from the set of stabilizers corresponds to poking a “hole” into the surface, i.e. removing the face f and the data qubit in it. But of course we do not change the physical layout of our device at all – all we do is changing the code space and the set of logical Pauli operators. Our original code space, by the way, still exists, but we will tacitly ignore it and work with our new logical qubit.

This construction has a nice physical interpretation in terms of the particle picture that we have introduced in the post on the toric code. Suppose we create a pair of quasi-particles, supported on f and a second face close to the rough boundary. We have seen that we can move these particles by acting with edge operators Xe on them. Thus, we can move the second face on which the particle lives across the rough boundary and now obtain a particle supported on one face only. This is exactly the configuration which is obtained by acting with the logical Pauli X operator of our newly created logical qubit on the ground state, i.e. these logical qubits correspond to pairs of particles where one particle has been “pushed off” the surface!

As usual, this construction has a counterpart in the dual lattice, as shown on the right of of our diagram. The configuration marked in blue consists of a vertex operator Xv and a string c connecting that vertex to the smooth boundary. We can now remove Xv from the stabilizer, and obtain a pair of logical operators Xv and Zc. This type of logical qubit is called a single X-cut in [2], whereas the first type of qubit considered is a single Z-cut – other authors use the term defects for these logical qubits.

This technique allows us in theory to encode a large number of logical qubits in one surface. However, in practice, the restriction that we need to reach the boundary from each of the faces and vertices that we switch off is restricting our layout options a bit. Fortunately, it turns out that this is not even necessary.

In fact, coming back to the analogy of the quasi-particles, nobody forces us to move the second particle off the surface. To see this, consider the configuration marked in green in the lower part of the diagram above. Here we have removed two stabilizers from the code, the face operator Zf corresponding to the face on the left hand side, and the face operator Zf’ corresponding to the face on the right hand side. This will enlarge our code space by a space of dimension four. However, we now gracefully ignore half of that space and instead work with the space spanned by the Pauli operators Zf and Xc*.

A logical qubit obtained by this construction is called a double Z-cut. Of course, we can again repeat this construction in the dual lattice and obtain a double X-cut (these cuts are also for vegetarians…).

Let’s do the twist – moving and braiding qubits

Again, it is time to recap what we have done so far. We have seen that we can create logical qubits by removing face and vertex operators from the set of stabilizers. We have also seen that we can identify logical Pauli X and Z operators acting on these qubits. We can than, of course, also realize a logical Y as the product ZX. What about other gates? And specifically, what about multi-qubit gates?

It turns out that multi-qubit gates can be realized by moving logical qubits around the surface. To make this term a bit more precise, take a look at the following diagram.


Here we consider a logical qubit described by the Pauli operators Xc* and Zf, i.e the face operator Zf has been removed from the set of stabilizers. Adjacent to the face f, there is a face that we denote by \bar{f}. By removing Z_{\bar{f}} (and adding Zf again to the set of stabilizers), we would obtain a different code space – we can think of this different code space as a deformation of the code space given by removing Zf.

We want to describe a process that starts with a state in the code space given by Zf and ends with a state in the code space given by Z_{\bar{f}}. To do this, we manipulate the original state in several steps. First, we measure the Pauli X operator Xe for the physical qubit sitting on the edge e of the dual lattice that connects the centre of f with that of \bar{f} – this is the qubit on which both Z_{\bar{f}} and Zf act. Let us assume for a moment that the outcome of this measurement is one. Next, we do another measurement – this time we measure Zf. This will force the system into an eigenstate of Zf. Let us assume again that the outcome of this measurement is plus one, so that the system is now in the new code space obtained by removing Z_{\bar{f}} instead of Zf from the stabilizer.

This procedure gives us a mechanism to transform a state in the original code space into a state in the new code space. It is not difficult to see (again I refer to my notes for all the details) that this leaves the logical meaning of the state untouched, i.e. this transformation maps a logical one in the old codespace to a logical one in the new code space and the same for a logical zero (in fact, it turns out that we need to modify the Pauli operators a bit for this to work).

Of course we cannot directly compare the states before and after the transformation as they live in different code spaces. That, however, changes if we move a logical state around like this along a closed loop. Then we will obtain a new state which is living in the original code space and can compare it directly to the original state, as in the diagram below.


This diagram shows an example where we move a logical qubit generated by a Z-cut once around a logical qubit generated by an X-cut. If we hit upon the intersection between the loop and the X-cut, we pick up an additional transformation. This is a bit like a coordinate transformation – while we move along the loop, we constantly have to adjust our coordinate system, i.e. the logical Pauli operators, to preserve the logical meaning of the qubit. When we reach the initial position, we now have two coordinate system that we can compare, and we find that we have picked up a non-trivial transformation related to the fact that we cannot contract the loop without passing the hole in the surface creating our logical qubit. If you go through all the transformations carefully (or read my notes where I have tried to work out all the details – it is easy to get confused at this point), you will find that the transformation we pick up constitutes a CNOT gate between the two involved qubits! Thus a CNOT can be realized by moving one logical qubit around the second logical qubit once. This is often visualized by a two-strand braid, as illustrated below.


We have also seen that the transformation given by such a braiding operation is governed by terms like intersection numbers and contracting loop, i.e. the transformation is topological in nature – a slightly deformed braid or path gives the same result. Therefore this operation has a certain natural protection against faults, it is topologically protected. Unfortunately, the surface code does not allow a representation of a universal set of quantum gates by braids – doing this requires the use of a more sophisticated physical equivalent called non-abelian anyons. For the surface code, for instance, the T gate cannot be represented by topological operations and needs to be implemented using a technique called gate teleportation.

We will not go into details on this, but try to quickly describe the basic idea which is based on the observation that the circuit


can be used to realize the T gate.

How does this circuit work? The upper qubit is an ancilla qubit that is initialized with the specific state

|0\rangle + e^{i \frac{pi}{4}} |1\rangle

by a process called magic state distillation (see [2] for details). Now a short calculation shows that after applying the CNOT gate, the combined system of both qubits will be in the state

T|\psi \rangle \otimes |0 \rangle + (TX |\psi \rangle) \otimes |1 \rangle

Thus we have successfully transferred the state |\psi \rangle into the first qubit (this is why this and similar circuits are called gate teleportation circuits), but still have a superposition. To remove this superposition, we nown perform a measurement MZ on the second qubit. If the outcome of this measurement is one, the first qubit will be in the state T|\psi \rangle and we are done. If not, i.e. if the measurement results in minus one, then the above formula shows that the resulting state in the first qubit will be TX |\psi \rangle. Now one can easily verify that

TX = X T^*


T = S T^*

so that if we apply the sequence SX to the first qubit conditioned on the outcome of the measurement, as indicated in our circuit diagram, the result will in both cases be T|\psi \rangle.

By now, our discussion should have made it clear that the surface codes create a significant overhead. To illustrate this, the appendix of [2] contains an estimation of the number of physical qubits needed to implement Shor’s factoring algorithm for a 2048 bit RSA key. Still, most operations on a surface code are topologically protected, which makes surface codes rather robust. Therefore surface codes remain a promising technology to implement universal fault-tolerant quantum computation, and are one of the most active areas of current research in applied quantum computing.


1. S. Bravyi, A. Kitaev, Quantum codes on a lattice with boundary, arXiv:quant-ph/9811052
2. A.G. Fowler, M. Mariantoni, J.M. Martinis, A.N. Cleland, Surface codes: Towards practical large-scale quantum computation, arXiv:1208.0928v2
3. A.G. Fowler, A.M. Stephens, P. Groszkowski, High threshold universal quantum computation on the surface code, arXiv:0803.0272
4. E. Dennis, A. Kitaev, A. Landahl, J. Preskill, Topological quantum memory, arXiv:quant-ph/0110143
5. T.J. Yoder, I.H. Kim, The surface code with a twist, Quantum Vol. 1, 2017 (available online)
6. A. Javadi-Abhari et. al., Optimized Surface Code Communication in Superconducting Quantum Computers, arXiv:1708.09283


Quantum error correction: an introduction to toric codes

While playing with the IBM Q experience in some of my recent posts, we have seen that real qubits are subject to geometric restrictions – two-qubit gates cannot involve arbitrary qubits, but only qubits that are in some sense neighbors. This suggests that efficient error correction codes need to tie to the geometry of the quantum device. Toric codes are an example of these codes that create interesting connections between topology and quantum computing.

The stabilizer group of a toric code

In their original form introduced by A. Kitaev in 1997 (see e.g. [1]), toric codes are designed to operate on quantum circuits arranged on a torus. Specifically, we consider a 2-dimensional discrete lattice L with periodic boundary conditions. On each edge of this lattice, we place exactly one qubit.

In algebraic topology, one can associate with such a lattice an abelian group called the group of one-chains of the lattice and denoted by C_1(L). The elements of this group are simply binary linear combinations of edges, i.e. subsets of all edges. Addition in this group is given by joining the subsets, while dropping elements that appear twice. As there is one qubit sitting on every edge, every chain c, i.e. every subset of edges, will yield an operator that we call Xc and which is simply given by applying the Pauli X operator to each of the qubits sitting on an edge appearing in c. Similarly, one can define an operator Zc for evey chain c by combining the Pauli Z operators for the qubits touched by c.

This sounds a bit daunting and formal, so time to look at a picture and a few examples.


This diagram shows a graphical representation of our lattice L. The solid lines represent the edges of the lattice, their intersections are the vertices, see the labels at the lower right corner of the image. Cornered by four vertices and surrounded by four edges, there are the faces of the lattice. The black dots in the diagram indicate the qubits on the lattice.

Now let us look at the face marked in red and labeled by f at the top of the diagram. This face is surrounded by four edges colored red as well whose union is called the boundary of the face. This boundary is formalized as a one-chain denoted by \partial f \in C_1(L). As described above, we can associate an operator Z_{\partial f} to this one chain, which is simply given by letting a Pauli Z operator act on each of the qubits sitting on the red edges. This operator is called the face operator associated with the face f and denoted by Zf

On the right of the face, we have – in blue – marked a single vertex v. This vertex again defines a chain, namely the subset of all edges – marked in blue – that touch the vertex. Again, we can associate an operator to this chain, this time we choose the product of the Pauli X operators acting on the qubits on the blue edges. This operator is called the vertex operator for the vertex v and denoted by Xv.

Our lattice has many faces and many vertices. For each face, we have a face operator and for each vertex, we have a vertex operator. It is not difficult to check that all these operators do in fact commute (note that a vertex touches every boundary twice). Therefore the group S generated by all these operators is an abelian group.

Now recall from one of my previous posts that the stabilizer formalism can be used to associate to every abelian subgroup of the Pauli group a code. The code obtained from the group S generated by all face and vertex operators is called the toric code.

Let us try to better understand the code space TS. Suppose that our lattice has E edges, V vertices and F faces. Then there are E qubits, i.e. the Hilbert space of physical qubits has dimension 2E. The stabilizer group is generated by the V vertex operators and the F face operators, i.e. by V + F elements. However, these elements are not independent – there are relations imposed by the toric geometry. In fact, the product of all vertex operators is one and the product of all face operators is one (this is easiest seen with a bit of background in algebraic topology – for those readers who have that background, I refer to my notes on quantum error correction for the mathematical details behind this and some other statements in this post). Thus there are in fact V + F – 2 generators. As each generator introduces a constraint on the code space that cuts the Hilbert space dimension down by a factor two, the dimension of the code space is

2E – (V+F – 2) = 22 – (V + F – E)

Now the number V + F – E is known as the Euler number of the torus in algebraic topology, and it also known that the Euler number of the torus is zero. Therefore the dimension of the code space is 22 = 4, i.e. the code space encodes two logical qubits (this will change if we later move on to a planar geometry).

Let us now try to understand the logical gates acting on these two logical qubits. By the general theory of stabilizer codes, we know that the logical operators are those operators that commute will all elements in the stabilizer. What does this mean geometrically? To see this, consider the following diagram.


Consider the vertical red line in the diagram, labeled by c1. This line is a subset of edges, i.e. a one chain. To this chain, we can associate a Z operator Zc1, which is the product of all Pauli Z operators acting on the qubits touched by the red line. Being a Z operator, this operator commutes with all face operators. What about the vertex operators?

Close to the right border of the lattice, we have indicated the vertex operator Xv associated with the vertex v. This vertex operator acts on four qubits, indicated by the edges marked with the blue dashed line. Two of these edges also appear in the chain c1. Therefore, when we commute Xv and Zc1, we pick up two signs that cancel and the operators commute. Thus Zc1 commutes with all vertex and face operators and is in the normalizer. The same is true for the operator associated with the second red line. These two Z operators therefore act on the logical qubit. By convention, we can define the logical state |0 \rangle_L to be the state which is left invariant by these two operators.

It is interesting to note that the vertex operator Xv commutes with Zc1 because the chain c1 crosses the vertex v and therefore contains both edges touching v. If however, a chain c ends at v, then we will only pick up one sign when commuting the X past the Z operators. Geometrically, a vertex operator Xv anti-commutes with a Z-operator Zc associated with a chain c if and only if the vertex v is in the boundary of c, otherwise it commutes. The red line in our diagram is distinguished by the fact that its boundary is empty – it is a closed circle or a cycle in the language of algebraic topology that goes around the torus once. It therefore commutes with all vertex operators and thus is an element of the stabilizer.

Similarly, the dotted blue lines represent X operators, given by the product of the Pauli X operators acting on those qubits intersected by the blue line. A similar argument shows that, as these lines also go once around the torus, the X operators associated with them are in the normalizer of S as well. Thus we obtain two X operators that act as logical Pauli X operators on our code space. Given the logical X and Z operators, we can therefore implement all single qubit gates on our code space. It turns out that multi-qubit gates can be implemented by braid operations, but we will leave this to a later post when we study planar codes.

The particle interpretation of the toric code and anyons

Let us now turn to the particle interpretation of error correction in these codes, that will bring the famous anyons and ideas from topological quantum computing into play. For that purpose, consider the operator

H = - \sum_v X_v - \sum_f Z_f

which is the sum of all face and vertex operators. Clearly, this operator is hermitian, and can therefore be thought of as the operator of a physical system. Does this system have a particle interpretation?

To see this, let us try to determine the eigenstates of the Hamiltonian, i.e. the energy levels of our system. As the Xv and the Zf all commute, we can find a simultaneous basis of eigenstates that will diagonalize the Hamiltonian. Let us call this basis \{ |\psi_i \rangle \}. Each of these vectors is an eigenstate of all face and vertex operators, and we denote the corresponding eigenvalues by xvi and zfi. Clearly, |\psi_i \rangle is then an eigenvector of H with eigenvalue

- \sum_f z_{fi} - \sum_v x_{vi}

This is minimal if and only if all the xvi and zfi are equal to one, which is the case if |\psi_i \rangle is an element of the code space. Thus we have identified the elements of the code space as the ground states of our system!

Now let us introduce an error into our system. One way to do this is to pick a chain c, i.e. a path on the lattice, and apply the Zc operator to one of the ground states |g \rangle. Let us try to determine the error syndrome of this state, i.e. let us try to figure out for which elements of the stabilizer, this state is no longer in the +1 eigenspace.

As all Z operators commute, the state is clearly still in the +1 eigenspace of all the face operators Zf. If Xv is a vertex operator, the state will be in the -1 eigenspace if and only if Zc anti-commutes with Xv. But we have already seen that this happens if and only if v is in the boundary of c, i.e. v is one of the two vertices where the path c starts and ends! Thus the error syndrome is localized at those two vertices – if we imagine the vertex operator measurements as being signaled by green (outcome plus one) and red (outcome minus one) lights attached to the respective vertex, the error would be represented two red lights sitting at these vertices.

If we now expand the state Z|g \rangle into eigenstates of the Hamiltonian, the above discussion shows that the eigenvalue has increased by four. Thus errors correspond to excited states, and the operators Zc are creation operators that create excited states that are in a certain sense localized at the vertices of the lattice. It is tempting and common to therefore call these excitations quasi-particles, similar to the phonons in a crystal lattice.

So far we have associated an error operator Zc and therefore a particle with every chain, i.e. a path that runs along the edges. We can also consider a path that runs perpendicular to all the edges – in algebraic topology, this would be called a chain in the dual lattice or a co-chain. To every such path, we can associate an X operator, and this X operator acting on a ground state will create another excitation. The particles associated with a chain is called an electric charge e by Kitaev, and the particle created by a co-chain is called an magnetic vortex m.


As indicated in the diagram above, electric charges come in pairs, are located on vertices of the lattice and manifest themselves as -1 eigenvalues of the corresponding vertex operators, whereas magnetic particles live on pairs of faces of the vertex and appear as -1 eigenvalues of face operators. In this diagram, the solid red line creates a pair of electric particles, and the solid blue line creates a pair of magnetic particles.

How do these particles relate to the composition of chains? Suppose that we are given a chain c12 running from vertex 1 to vertex 2 and a second chain c23 running from vertex 2 to vertex 3. Then there are two ways to create the particle sitting at the vertices 1 and 3 – we can either first act with the Z operator associated with the chain c12 and then let the operator associated with the chain c23 act, or we can form the sum c13 of the chains which now runs from 1 to 3 and let the operator associated with this chain act. Of course, the result will be the same. Thus we can interpret the act of applying the operator given by the chain c23 to an excited state as moving one of the particles from vertex 2 to vertex 3. The same applies to magnetic particles.

Now let us again consider the image above. Look at the blue dashed loop. According to what we have just said, we can start with the particle pair created by the solid blue line and then move one of the particles around that closed loop. The result will be a particle pair living at the same location as the original pair. You might think that the new state is in fact identical to the old state, but this is not true – the state picks up a phase factor minus one. This is the reason why Kitaev calls our quasi-particles anyons, a term generally used in particle physics to characterize particles with non-trivial statistics (in our case, mutual statistics)

Exchanging particles by moving them around each other can be described as an action of the pure braid group, i.e. as an action of the group of braids that return each particle to its original position (this subgroup, the kernel of the natural homomorphism from the braid group to the permutation group, is sometimes called the group of colored braids). If we perturb a braid slightly, the action will remain unchanged, so the unitary operators which are given by that action are by construction rather stable in the presence of noise. In [1], Kitaev developed the idea to use braid group representations to implement quantum gates. Unfortunately, the set of operators that we can obtain in this way using the toric code is not sufficient to implement a universal quantum computer – we will get back to this point in a later post when we look at the actual procedures used to perform computations in a toric or surface code. Therefore Kitaev goes on to describe certain hypothetical particles called non-abelian anyons which, if they can be physically realized, provide sufficiently rich unitary operations. This approach is called topological quantum computation. We will not go into further details but refer to the original work [1].

Measurement and error correction for toric codes

Let us quickly summarize what we have found so far. The code space of a toric code can be thought of as the ground state of a physical system, describing a collection of two-state systems located on the edges of a lattice. Errors correspond to fundamental excitations, i.e. quasi-particles, that come in pairs and sit on vertices and faces of the lattice. These errors can be detected by measuring face and vertex operators.

How can we perform these measurements in practice? To understand this, let us first recall a general approach for measuring eigenvalues. Suppose that U is an operator that is hermitian and at the same time unitary. Consider the following circuit which is a combination of two Hadamard gates with a controlled-U operation.


Let us calculate the state after applying this circuit. The initial state is

|0 \rangle |\psi \rangle

After applying the Hadamard gate to the first qubit, we obtain

\frac{1}{\sqrt{2}} \big[ |0 \rangle |\psi \rangle  + |1 \rangle |\psi \rangle   \big]

which the controlled U operation turns into

\frac{1}{\sqrt{2}} \big[ |0 \rangle \otimes |\psi \rangle  +  |1 \rangle \otimes U |\psi \rangle  \big]

Now U is unitary, and therefore its eigenvalues are plus and minus one. So we can decompose our state |\psi \rangle as

\alpha |\psi_+ \rangle + \beta |\psi_- \rangle

where |\psi_+\rangle is an eigenstate with eigenvalue plus one and |\psi_-\rangle an eigenstate with eigenvalue minus one. Plugging this into the above equation and applying the final Hadamard operator shows that the final outcome of the circuit will be

\alpha |0 \rangle \otimes |\psi_+\rangle + \beta |1 \rangle \otimes |\psi_- \rangle

If we now apply a measurement to the control qubit, which serves as an ancilla here, the state is projected onto either |1 \rangle \otimes |\psi_- \rangle or |0 \rangle \otimes |\psi_+ \rangle, with probabilities |\beta|^2 and |\alpha|^2. The remaining qubits end up in the state |\psi_-\rangle or |\psi_+ \rangle, i.e. in an eigenstate. Thus applying this circuitry and measuring the ancilla has the same implications as measuring the operator U and yields the same information.

Let us now apply this idea to the problem of measuring one of the operators Xv. This operator is a tensor product of four \sigma_x operators, i.e. inversion operators. Therefore a controlled-Xv circuit is nothing but a sequence of controlled inversions, i.e. CNOT gates. Thus we can measure a vertex operator using the following circuit.


Here the four qubits at the bottom correspond to the four edges meeting the vertex v. So we need one ancilla qubit for doing the measurement, and we need to be able to combine it in gates with all the four qubits sitting on the edges meeting v. For the physical implementation, the easiest approach is to place this additional qubit called measurement qubit in the vertex itself, in contrast to the data qubits that we have already placed on the edges. So we add one data qubit to each vertex.

A circuit along the same lines can be constructed to measure Zf for a face f. Using that H \sigma_x H = \sigma_z, we can construct such a circuit as follows. In the first time step, apply a Hadamard to both the ancilla and all target qubits. Then apply CNOT operators to all target qubits, controlled by the ancilla, and then apply Hadamard operators again. Now, it is well known that conjugating the CNOT with Hadamard gates switches the roles of target qubits and control qubits. This gives us the circuit displayed below.


Again, for each face f, we need an ancilla qubit to measure Zf. This qubit will have to interact with all data qubits on the edges bounding this face, so it is easiest to position it in the center of that face.

Overall, we have now added F + V measurement qubits to our original E data qubits. Now the Euler characteristic of the torus is zero, and therefore F + V = E, showing that this procedure actually doubles the number of physical qubits.

We have now arrived at a description of the toric code (and its cousin the surface code) that one can often find as a definition in the literature but which somehow obscures the relation to the surface topology a bit.

  • We start with a lattice L with E edges and identify opposite borders
  • On each vertex, we place a data qubit
  • On each face, we place a measurement qubit called a Z-qubit
  • On each vertex, we place a measurement qubit called an X-qubit
  • To each X-qubit, we associate the operator that acts as a bit flip on the four data qubits next to the measurement qubit
  • To each Z-qubit, we associate the operator that acts a a phase flip on the four data qubits next to the measurement qubit

Thus every data qubit is acted on by four operators, the two Z-operators corresponding to the adjacent faces and the two X-operators corresponding to the adjacent vertices. The additional measurement qubits – indicated by white circles in contrast to the dark circles being the data qubits – are shown below.


For one face and one vertex, we have indicated the associated operators Xv and Zf by adding letters X and Z to show how these operators act on the nearby data qubits. This is the graphical representation typically used in the literature (like [3]). Note that some authors use different conventions and associate Z-operators with faces and X-operators with vertices, but this is of course completely equivalent due to the duality relations between a lattice and a dual lattice.

This completes our short introduction to toric codes. In the next post in my series on quantum error correction, we will pass on to a planar geometry which turns the toric code into a surface code.


1 A. Kitaev, Fault-tolerant quantum computation by anyons, arXiv:quant-ph/9707021
2. H. Bombin, An introduction to topological quantum codes, in Quantum error correction, edited by D.A. Lidar and T.A. Brun, Cambridge University Press, New York, 2013, available as arXiv:quant-ph/1311.0277
3. A.G. Fowler, M. Mariantoni, J.M. Martinis, A.N. Cleland, Surface codes: Towards practical large-scale quantum computation, arXiv:1208.0928v2

Factoring integers on a quantum computer with Qiskit

After all the work done in the previous posts, we are now ready to actually implement Shor’s factoring algorithm on a real quantum computer, using once more IBMs Q Experience and the Qiskit framework.

First, recall that Shor’s algorithm is designed to factor an integer M, with the restriction that M is supposed to be odd and not a prime power. Thus the smallest meaningful value of M is M=15. Of course this is a toy example, but we will see that even this toy example can be challenging on the available hardware. Actually, 15 is also the number that was factored in the first implementation of Shor’s algorithm on a real quantum computer which used an NMR based device and was done in 2001 – see [1].

The quantum part of the algorithm is designed to find the period r of a chosen number a modulo M. There are several choices of a that are possible, the only condition we need is that a and M are co-prime. In this post, we follow the choice in [1] and use a = 11 and a = 7. As in [1] (and all other similar papers I am aware of, see also the discussion in this paper by Smolin, Smith which later appeared in Nature) we will also rely on prior knowledge of the result to build up our circuits, so we are actually cheating – this is not really an application of Shor’s algorithm, but merely a confirmation of a known factoring.

Of course, it is easy to compute the period directly in our case. The period of a = 11 is two, and the period of a = 7 is four. A smaller period means a simpler circuit, and therefore we start with the case a = 11. Let us see whether we can confirm the period.

The easy case – a = 11

For our tests, we will use the description of Shor’s algorithm in terms of the quantum phase estimation procedure. Thus we have a primary register that we initialize in the state |1\rangle and to which we apply a sequence of controlled multiplications by powers of a, and a working register, in which we do a quantum Fourier transform in the second step. The primary register needs to hold M=15, so we need four qubits there. For the working register, we use three qubits, which is the smallest possible choice, to keep the number of gates small.

The first step of our algorithm is controlled multiplication by a. However, this is greatly simplified by the fact that we apply this to exactly one state that we know upfront – the initial state 1 \rangle. Thus we only need to implement a unitary transformation which conditionally maps the state |1 \rangle to the state |a = 11 \rangle. In binary notation, eleven is 1011, and we therefore simply need to conditionally flip qubits 3 and 1. We also need a Pauli X gate in the primary register to build the initial state |1 \rangle from the fiducial state |0\rangle and Hadamard gates in the working register to create the initial superposition. Thus the first part of our circuit looks as follows.


What about higher powers of a? This is easy – we know that a2 is equal to one modulo M, and therefore the multiplication by a2 modulo M is simply the identity transformation. The same is true for all even powers and so we are already done with the first part of our circuit – this is why we call the case a = 11 the easy case. The only thing that is left is to add the quantum Fourier transformation and a measurement to the working register. This gives us the following circuit (which can obviously be simplified, more on this later).


Now we can run this on a simulator. Before we do this, let us try to understand what we expect. We know that after measuring the working register, the value that we obtain is a multiple of 23 / r = 8 / 2 = 4. Thus we expect to see peaks at 0 and four (binary 100). And this is actually what we get – the following histogram shows the output of an execution on the local QASM simulator integrated into Qiskit.


This is nice, our circuit seems to work correctly. So let us proceed and try this on real hardware. Before we do so, however, let us simplify our circuit a bit to reduce the number of gates needed and thus the noise level. First, we skip the final swap gates in the QFT circuit – this will change our expected output from 100 to 001, but we can keep track of this manually when setting up the measurement. Next, we have two Hadamard gates on w[2] that cancel and that we can therefore remove. And the Pauli X gate on p[0] is never really used and can be dropped. After these changes, we obtain the following circuit.


We can run this on the IBMQ hardware. As we require seven qubits in total, we need to do this on the 14 qubit model ibmq_16_melbourne. Here are the results of a test run, again as a histogram.


We see the expected peaks at 000 and 100, but also see some significant noise that is of course not present in the simulation. However, the probabilities to measure one of the theoretically expected values is still roughly twice as high as the probability of any of the other values.

The hard case – a = 7

Having mastered the case a = 11, let us now turn to the case a = 7. The only change that we need to make is the circuit acting on the primary register. In the first part of the circuit, we can again make use of the fact that we only have to deal with one possible input, namely the initial state |1 \rangle. The conditional multiplication by a maps this to |7\rangle, and we can again implement this by two conditional bit flips, i.e. two CNOT gates.

The second part, the conditional multiplication by a2 = 4 modulo M, requires a bit more gates. On a bit level, multiplication by four is a bit shift by two bits to the left. We can realize this as a sequence of two conditional swap gates, swapping the bits zero and two and the bits one and three. A conditional swap gate can be implemented by three CNOT gates, or, in QASM syntax,

gate cswap a,b,c
  cx c,b;
  ccx a,b,c;
  cx c,b;

We can also make a few simplifications by replacing CNOT gates with fixed values of the control qubit by either an inversion or the identity. We arrive at the following circuit.


This is already considerably more complex than for the case a = 11. When we add the QFT circuit and the measurements, we end up with the following circuit.


Now the period is four. Thus we expect peaks at all multiples of two, i.e. at all even values. And this is actually what we get when we run this on a simulator.


Now let us again see how we can optimize this circuit before trying it on real hardware. Again, there are a few additional simplifications that we can make to reduce the noise level. On w[2], we have a double Hadamard gate that we can remove. We can again dispose of the final swap gates and move the swap operation into the measurement. The last CNOT between p[1] and p[3] can be dropped as it does not affect the outcome. The same is true for the last CNOT between p[0] and p[2]. Thus we finally arrive at the following circuit.


And here is the result of running this on the 16 qubit IBM device.


Surprisingly, the noise level is not significantly worse than for the version with a = 11, even though we have a few more gates. We can clearly see that the probabilities to measure even values are significantly higher than the probabilities to measure odd values, corresponding to the period four.

So what did we learn from all this? First, our example was of course a toy model – we did code the circuit based on knowledge on the period and hard-coded the value of a. However, our example demonstrates the basic techniques to build circuits for the more general case. In real application, we could, given a number M, first determine a suitable value for a. The conditional multiplication by a is then acting as a permutation on the computational basis and can therefore be implemented by a sequence of (conditional) swap gates. The same is true for higher powers of a. So in theory, we have all the ingredients that we need to implement more general versions of the algorithm.

In practice, however, we will quickly reach a point where the number of gates exceeds the limit that we can reasonable implement given the current level of noise. So real applications of the algorithm in number ranges that are not tractable for classical digital computers will require advanced error correction mechanisms and significantly reduced noise levels. Still, it is nice (and fun) to see a toy version of the famous algorithm in action on a real device.

If you want to run the examples yourself, you can use my notebooks for the case a=11 and a=7.


1. L. Vandersypen, M. Steffen, G. Breyta, C. Yannoni, M. Sherwood, I. Chuang, Experimental realization of Shor’s quantum factoring algorithm using nuclear magnetic resonance, Nature Vol. 414, (December 2001)
2. P. Shor, Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer, SIAM J.Sci.Statist.Comput. Vol. 26 Issue 5 (1997), pp 1484–1509, available as arXiv:quant-ph/9508027v2
3. R. Cleve, A. Ekert, C. Macchiavello, M. Mosca, Quantum Algorithms revisited, arXiv:9708016
4. A. Kitaev, Quantum measurements and the Abelian Stabilizer Problem, arXiv:quant-ph/9511026
5. J. Smolin, G. Smith, A. Vargo, Pretending to factor large numbers on a quantum computer, arXiv:1301.7007 [quant-ph]