## Quantum error correction: the surface code

In my previous post on quantum error correction, we have looked at the toric code which is designed for a rather theoretical case – a grid of qubits on a torus. In reality, qubits are more likely to be arranged in a planar geometry. Luckily, a version of the toric codes that works well in a planar geometry exists – the surface code.

## Planar codes and their stabilizers

Recall that the starting point for the toric code was a lattice L with periodic boundary conditions so that the geometry of the lattice becomes toroidal, which gave us a four-dimensional code space.

If we want to use this code in planar setting, it is clear that we somehow have to modify the boundary conditions. What happens if we simply drop it? A short calculation using a bit of algebraic topology shows that in this case, the code space collapses to a one-dimensional code space which is not enough to hold a logical qubit. So we need some a more sophisticated boundary condition. The solution found in [1] is to use two types of boundaries.

Look at the diagram above. The solid lines form a lattice L with two types of boundary – a “smooth” boundary at the top and the bottom and a “rough boundary” left and right (ignore the dashed lines for a moment, we will explain their meaning in a second). We again place qubits on the edges of the lattice, but for the boundary, we only place qubits (again indicated by black dots) on the edges which are part of the smooth boundary.

As for the case of a toric code, we can again create a second lattice called the dual lattice whose vertices are the center points of the faces of L and whose edges are perpendicular to the edges of L. This dual lattice is indicated by the dashed lines in our diagram. Again, some edges of the dual lattice carry qubits and others do not, so we have a smooth and a rough boundary as well (note that the smooth boundary of L gives raise to the rough boundary of the dual lattice and vice versa).

Again a path along the edges, i.e. a set of edges, is an element of a group, the group of one chains. If we only allow the edges that are not part of the rough boundary, i.e. the edges that carry qubits, we obtain a subgroup which is known to topologists as the group of one chains relative to the rough boundary – again I will strive to keep this post free from too much algebraic topology and refer the reader to my notes for more details and a more precise description.

As for the toric code, every such relative one-chain c gives raise to a operators Xc and Zc, obtained by letting X respectively Z act on any qubit crossed by the chain. Similarly, a co-chain is a path in the dual lattice, i.e. a path perpendicular to the edges of the lattice L.

In the literature, the lattice and its dual are typically visualized differently, namely with all edges that do not carry qubits removed, as in the following diagram.

Here we have removed the rough boundary at the left and right of the lattice and the same for the dual lattice. The red line is a relative one chain. Note that this one-chain is “open-ended”. Its boundary would be in the (removed) rough boundary. In the relative version of the chain complex, its boundary is zero, hence this is the equivalent to a closed loop in the toric geometry. Similarly, the blue line represents a co-chain, i.e. a chain in the dual lattice, which has zero boundary in the dual lattice.

We can now proceed as for the toric code. For each vertex v, there is a vertex operator Xv, which is the product of the Pauli X operators acting on the qubits sitting on the edges touching the vertex (which are four qubits in the interior of the lattice and three qubits if the vertex is sitting on the smooth boundary). For each face f, we again have a face operator Zf, acting as a Pauli Z operator on the four qubits sitting on the boundary edges of the face – again, if the face is located at the rough boundary, there are only three operators. These operators again commute with each other and create an abelian subgroup S. The code space TS created by this group, i.e. the space of all states left invariant by all face operators and all vertex operators, now turns out to be two-dimensional and is thus encoding one logical qubit. The code obtained in this way is called the surface code.

Again, there are logical Pauli operators. In fact, these are the operators created by the chains marked in the diagram above – the Z operator Zc associated with the one chain c and the X operator Xc* associated with the co-chain c*. The fact that these operators commute with each operator in S is related to their boundaries being zero (in the sense described above, i.e. the are open-ended) and the fact that they anti-commute is expressed geometrically by the fact that they intersect exactly once.

Thus we obtain a logical qubit and logical Pauli operators that act on this qubit. What about measurements? As for the toric code, we can measure our (data) bits located at the edges by adding measurement qubits. To measure a face operator Zf, we can place an additional measurement qubit in the center of the face and use this qubit as an ancilla in our measurement circuit below.

Similarly, to measure the vertex operator Xv for a vertex v, we place an additional measurement qubit at the vertex and use a similar circuit to transfer the result of the measurement into this qubit, as we did it for the toric code. The only difference to the case of a toric code is again that if a vertex or face is adjacent to only three qubits as it is located close to the boundary, we only include those qubits in our measurement. We therefore obtain the following arrangement of qubits, were black dots are data qubits, white dots are ancilla qubits used for the measurements and a letter X or Z indicates the action of a Pauli operator (this is, up to a switch of black and white, the graphical representation of a surface code used in [2]).

For instance, a white dot surrounded by Z’s indicates a face and the corresponding face operator Zf, together with the black data qubits on which the face operator acts. At the same time, the white dot is the location of a measurement qubit used to measure Zf. Similarly, a white dot surrounded by X’s indicates both a vertex operator and a data qubit to measure the vertex operator.

## Of qubits and holes

So far we have seen how the surface code can be used to encode a single logical qubit in a whole array of physical qubits. In practice, however, we need more than one qubit. We could of course try to arrange the planar surfaces in a stack, with qubits on each plane interacting with the qubits above and below, and then use some variant of quantum teleportation to establish connectivity between the logical qubits. However, there is a different way – we can realize more than one logical qubit on one array of physical qubits (this is only one option – as of today, there are many different approaches built upon similar ideas, like color codes or lattice surgery – see [5] and [6] for a discussion of some alternatives).

The idea is simple. We have obtained our code space as the intersection of the subspaces given by constraints, where each face operator and each vertex operator add one (linear) constraint. To obtain a larger code space, all we have to do is to drop a few of these constraints again!

To understand how this works, we first have to understand how the surface code is usually used in practice ([2]). We first put the system into an initial, potentially unknown state. We then perform a series of measurements of all stabilizers. This will force the state into a simultaneous eigenstate of all the stabilizer operators.

Now, the eigenvalues will typically not all be one. We therefore have to apply corrections to move our state into the code space (or we could as well keep track of the error syndrome and perform the corrections in software when we measure) – this is exactly the same process as if we had found errors during the later computation. In other words, we do a first round of error correction.

We now start our quantum algorithm. After each step of the algorithm, we perform another round of measurements of the stabilizers to detect any errors that might have occurred. Again, we could now correct the errors, or alternatively keep track of them as well and correct the final measurements if needed. In this way, a computation with a surface code is essentially a periodic measurement of all stabilizers in repeating cycles followed by an (actual or virtual) error correction, and logical operations executed between any two subsequent cycles.

Now let us see what happens when we modify our code by removing a face operator Zf from the set of stabilizers. Thus we perform some initial cycles to put the system into a ground state $|g \rangle$ (here and in the sequel, we assume that errors are actually corrected and not just detected to simplify our calculations) on which all stabilizers act trivially, and starting with the next round of measurements, we exclude a Zf operator from the measurements.

Of course, this will enlarge the code space. Specifically, let us take a look at the upper left corner of the figure above. Here we have marked a face f and an operator Xc* coming from a string connecting this face with the rough boundary. Then the boundary of c* consists of one point in the dual lattice (the center of f), and hence every face operator commutes with Xc*, except Zf which anti-commutes. If we now define our code space to be the space fixed by all the vertex operators Xv and all face operators except Zf, then Xc* and Zf map that code space to itself. As c* intersects the boundary of f in one point, Zf and Xc* anti-commute and therefore define a set of logical Pauli operators, acting on the code space. Thus we have constructed a logical qubit!

Intuitively, removing the face operator Zf from the set of stabilizers corresponds to poking a “hole” into the surface, i.e. removing the face f and the data qubit in it. But of course we do not change the physical layout of our device at all – all we do is changing the code space and the set of logical Pauli operators. Our original code space, by the way, still exists, but we will tacitly ignore it and work with our new logical qubit.

This construction has a nice physical interpretation in terms of the particle picture that we have introduced in the post on the toric code. Suppose we create a pair of quasi-particles, supported on f and a second face close to the rough boundary. We have seen that we can move these particles by acting with edge operators Xe on them. Thus, we can move the second face on which the particle lives across the rough boundary and now obtain a particle supported on one face only. This is exactly the configuration which is obtained by acting with the logical Pauli X operator of our newly created logical qubit on the ground state, i.e. these logical qubits correspond to pairs of particles where one particle has been “pushed off” the surface!

As usual, this construction has a counterpart in the dual lattice, as shown on the right of of our diagram. The configuration marked in blue consists of a vertex operator Xv and a string c connecting that vertex to the smooth boundary. We can now remove Xv from the stabilizer, and obtain a pair of logical operators Xv and Zc. This type of logical qubit is called a single X-cut in [2], whereas the first type of qubit considered is a single Z-cut – other authors use the term defects for these logical qubits.

This technique allows us in theory to encode a large number of logical qubits in one surface. However, in practice, the restriction that we need to reach the boundary from each of the faces and vertices that we switch off is restricting our layout options a bit. Fortunately, it turns out that this is not even necessary.

In fact, coming back to the analogy of the quasi-particles, nobody forces us to move the second particle off the surface. To see this, consider the configuration marked in green in the lower part of the diagram above. Here we have removed two stabilizers from the code, the face operator Zf corresponding to the face on the left hand side, and the face operator Zf’ corresponding to the face on the right hand side. This will enlarge our code space by a space of dimension four. However, we now gracefully ignore half of that space and instead work with the space spanned by the Pauli operators Zf and Xc*.

A logical qubit obtained by this construction is called a double Z-cut. Of course, we can again repeat this construction in the dual lattice and obtain a double X-cut (these cuts are also for vegetarians…).

## Let’s do the twist – moving and braiding qubits

Again, it is time to recap what we have done so far. We have seen that we can create logical qubits by removing face and vertex operators from the set of stabilizers. We have also seen that we can identify logical Pauli X and Z operators acting on these qubits. We can than, of course, also realize a logical Y as the product ZX. What about other gates? And specifically, what about multi-qubit gates?

It turns out that multi-qubit gates can be realized by moving logical qubits around the surface. To make this term a bit more precise, take a look at the following diagram.

Here we consider a logical qubit described by the Pauli operators Xc* and Zf, i.e the face operator Zf has been removed from the set of stabilizers. Adjacent to the face f, there is a face that we denote by $\bar{f}$. By removing $Z_{\bar{f}}$ (and adding Zf again to the set of stabilizers), we would obtain a different code space – we can think of this different code space as a deformation of the code space given by removing Zf.

We want to describe a process that starts with a state in the code space given by Zf and ends with a state in the code space given by $Z_{\bar{f}}$. To do this, we manipulate the original state in several steps. First, we measure the Pauli X operator Xe for the physical qubit sitting on the edge e of the dual lattice that connects the centre of f with that of $\bar{f}$ – this is the qubit on which both $Z_{\bar{f}}$ and Zf act. Let us assume for a moment that the outcome of this measurement is one. Next, we do another measurement – this time we measure Zf. This will force the system into an eigenstate of Zf. Let us assume again that the outcome of this measurement is plus one, so that the system is now in the new code space obtained by removing $Z_{\bar{f}}$ instead of Zf from the stabilizer.

This procedure gives us a mechanism to transform a state in the original code space into a state in the new code space. It is not difficult to see (again I refer to my notes for all the details) that this leaves the logical meaning of the state untouched, i.e. this transformation maps a logical one in the old codespace to a logical one in the new code space and the same for a logical zero (in fact, it turns out that we need to modify the Pauli operators a bit for this to work).

Of course we cannot directly compare the states before and after the transformation as they live in different code spaces. That, however, changes if we move a logical state around like this along a closed loop. Then we will obtain a new state which is living in the original code space and can compare it directly to the original state, as in the diagram below.

This diagram shows an example where we move a logical qubit generated by a Z-cut once around a logical qubit generated by an X-cut. If we hit upon the intersection between the loop and the X-cut, we pick up an additional transformation. This is a bit like a coordinate transformation – while we move along the loop, we constantly have to adjust our coordinate system, i.e. the logical Pauli operators, to preserve the logical meaning of the qubit. When we reach the initial position, we now have two coordinate system that we can compare, and we find that we have picked up a non-trivial transformation related to the fact that we cannot contract the loop without passing the hole in the surface creating our logical qubit. If you go through all the transformations carefully (or read my notes where I have tried to work out all the details – it is easy to get confused at this point), you will find that the transformation we pick up constitutes a CNOT gate between the two involved qubits! Thus a CNOT can be realized by moving one logical qubit around the second logical qubit once. This is often visualized by a two-strand braid, as illustrated below.

We have also seen that the transformation given by such a braiding operation is governed by terms like intersection numbers and contracting loop, i.e. the transformation is topological in nature – a slightly deformed braid or path gives the same result. Therefore this operation has a certain natural protection against faults, it is topologically protected. Unfortunately, the surface code does not allow a representation of a universal set of quantum gates by braids – doing this requires the use of a more sophisticated physical equivalent called non-abelian anyons. For the surface code, for instance, the T gate cannot be represented by topological operations and needs to be implemented using a technique called gate teleportation.

We will not go into details on this, but try to quickly describe the basic idea which is based on the observation that the circuit

can be used to realize the T gate.

How does this circuit work? The upper qubit is an ancilla qubit that is initialized with the specific state

$|0\rangle + e^{i \frac{pi}{4}} |1\rangle$

by a process called magic state distillation (see [2] for details). Now a short calculation shows that after applying the CNOT gate, the combined system of both qubits will be in the state

$T|\psi \rangle \otimes |0 \rangle + (TX |\psi \rangle) \otimes |1 \rangle$

Thus we have successfully transferred the state $|\psi \rangle$ into the first qubit (this is why this and similar circuits are called gate teleportation circuits), but still have a superposition. To remove this superposition, we nown perform a measurement MZ on the second qubit. If the outcome of this measurement is one, the first qubit will be in the state $T|\psi \rangle$ and we are done. If not, i.e. if the measurement results in minus one, then the above formula shows that the resulting state in the first qubit will be $TX |\psi \rangle$. Now one can easily verify that

$TX = X T^*$

and

$T = S T^*$

so that if we apply the sequence SX to the first qubit conditioned on the outcome of the measurement, as indicated in our circuit diagram, the result will in both cases be $T|\psi \rangle$.

By now, our discussion should have made it clear that the surface codes create a significant overhead. To illustrate this, the appendix of [2] contains an estimation of the number of physical qubits needed to implement Shor’s factoring algorithm for a 2048 bit RSA key. Still, most operations on a surface code are topologically protected, which makes surface codes rather robust. Therefore surface codes remain a promising technology to implement universal fault-tolerant quantum computation, and are one of the most active areas of current research in applied quantum computing.

## References

1. S. Bravyi, A. Kitaev, Quantum codes on a lattice with boundary, arXiv:quant-ph/9811052
2. A.G. Fowler, M. Mariantoni, J.M. Martinis, A.N. Cleland, Surface codes: Towards practical large-scale quantum computation, arXiv:1208.0928v2
3. A.G. Fowler, A.M. Stephens, P. Groszkowski, High threshold universal quantum computation on the surface code, arXiv:0803.0272
4. E. Dennis, A. Kitaev, A. Landahl, J. Preskill, Topological quantum memory, arXiv:quant-ph/0110143
5. T.J. Yoder, I.H. Kim, The surface code with a twist, Quantum Vol. 1, 2017 (available online)
6. A. Javadi-Abhari et. al., Optimized Surface Code Communication in Superconducting Quantum Computers, arXiv:1708.09283

## Quantum error correction: an introduction to toric codes

While playing with the IBM Q experience in some of my recent posts, we have seen that real qubits are subject to geometric restrictions – two-qubit gates cannot involve arbitrary qubits, but only qubits that are in some sense neighbors. This suggests that efficient error correction codes need to tie to the geometry of the quantum device. Toric codes are an example of these codes that create interesting connections between topology and quantum computing.

## The stabilizer group of a toric code

In their original form introduced by A. Kitaev in 1997 (see e.g. [1]), toric codes are designed to operate on quantum circuits arranged on a torus. Specifically, we consider a 2-dimensional discrete lattice L with periodic boundary conditions. On each edge of this lattice, we place exactly one qubit.

In algebraic topology, one can associate with such a lattice an abelian group called the group of one-chains of the lattice and denoted by $C_1(L)$. The elements of this group are simply binary linear combinations of edges, i.e. subsets of all edges. Addition in this group is given by joining the subsets, while dropping elements that appear twice. As there is one qubit sitting on every edge, every chain c, i.e. every subset of edges, will yield an operator that we call Xc and which is simply given by applying the Pauli X operator to each of the qubits sitting on an edge appearing in c. Similarly, one can define an operator Zc for evey chain c by combining the Pauli Z operators for the qubits touched by c.

This sounds a bit daunting and formal, so time to look at a picture and a few examples.

This diagram shows a graphical representation of our lattice L. The solid lines represent the edges of the lattice, their intersections are the vertices, see the labels at the lower right corner of the image. Cornered by four vertices and surrounded by four edges, there are the faces of the lattice. The black dots in the diagram indicate the qubits on the lattice.

Now let us look at the face marked in red and labeled by f at the top of the diagram. This face is surrounded by four edges colored red as well whose union is called the boundary of the face. This boundary is formalized as a one-chain denoted by $\partial f \in C_1(L)$. As described above, we can associate an operator $Z_{\partial f}$ to this one chain, which is simply given by letting a Pauli Z operator act on each of the qubits sitting on the red edges. This operator is called the face operator associated with the face f and denoted by Zf

On the right of the face, we have – in blue – marked a single vertex v. This vertex again defines a chain, namely the subset of all edges – marked in blue – that touch the vertex. Again, we can associate an operator to this chain, this time we choose the product of the Pauli X operators acting on the qubits on the blue edges. This operator is called the vertex operator for the vertex v and denoted by Xv.

Our lattice has many faces and many vertices. For each face, we have a face operator and for each vertex, we have a vertex operator. It is not difficult to check that all these operators do in fact commute (note that a vertex touches every boundary twice). Therefore the group S generated by all these operators is an abelian group.

Now recall from one of my previous posts that the stabilizer formalism can be used to associate to every abelian subgroup of the Pauli group a code. The code obtained from the group S generated by all face and vertex operators is called the toric code.

Let us try to better understand the code space TS. Suppose that our lattice has E edges, V vertices and F faces. Then there are E qubits, i.e. the Hilbert space of physical qubits has dimension 2E. The stabilizer group is generated by the V vertex operators and the F face operators, i.e. by V + F elements. However, these elements are not independent – there are relations imposed by the toric geometry. In fact, the product of all vertex operators is one and the product of all face operators is one (this is easiest seen with a bit of background in algebraic topology – for those readers who have that background, I refer to my notes on quantum error correction for the mathematical details behind this and some other statements in this post). Thus there are in fact V + F – 2 generators. As each generator introduces a constraint on the code space that cuts the Hilbert space dimension down by a factor two, the dimension of the code space is

2E – (V+F – 2) = 22 – (V + F – E)

Now the number V + F – E is known as the Euler number of the torus in algebraic topology, and it also known that the Euler number of the torus is zero. Therefore the dimension of the code space is 22 = 4, i.e. the code space encodes two logical qubits (this will change if we later move on to a planar geometry).

Let us now try to understand the logical gates acting on these two logical qubits. By the general theory of stabilizer codes, we know that the logical operators are those operators that commute will all elements in the stabilizer. What does this mean geometrically? To see this, consider the following diagram.

Consider the vertical red line in the diagram, labeled by c1. This line is a subset of edges, i.e. a one chain. To this chain, we can associate a Z operator Zc1, which is the product of all Pauli Z operators acting on the qubits touched by the red line. Being a Z operator, this operator commutes with all face operators. What about the vertex operators?

Close to the right border of the lattice, we have indicated the vertex operator Xv associated with the vertex v. This vertex operator acts on four qubits, indicated by the edges marked with the blue dashed line. Two of these edges also appear in the chain c1. Therefore, when we commute Xv and Zc1, we pick up two signs that cancel and the operators commute. Thus Zc1 commutes with all vertex and face operators and is in the normalizer. The same is true for the operator associated with the second red line. These two Z operators therefore act on the logical qubit. By convention, we can define the logical state $|0 \rangle_L$ to be the state which is left invariant by these two operators.

It is interesting to note that the vertex operator Xv commutes with Zc1 because the chain c1 crosses the vertex v and therefore contains both edges touching v. If however, a chain c ends at v, then we will only pick up one sign when commuting the X past the Z operators. Geometrically, a vertex operator Xv anti-commutes with a Z-operator Zc associated with a chain c if and only if the vertex v is in the boundary of c, otherwise it commutes. The red line in our diagram is distinguished by the fact that its boundary is empty – it is a closed circle or a cycle in the language of algebraic topology that goes around the torus once. It therefore commutes with all vertex operators and thus is an element of the stabilizer.

Similarly, the dotted blue lines represent X operators, given by the product of the Pauli X operators acting on those qubits intersected by the blue line. A similar argument shows that, as these lines also go once around the torus, the X operators associated with them are in the normalizer of S as well. Thus we obtain two X operators that act as logical Pauli X operators on our code space. Given the logical X and Z operators, we can therefore implement all single qubit gates on our code space. It turns out that multi-qubit gates can be implemented by braid operations, but we will leave this to a later post when we study planar codes.

## The particle interpretation of the toric code and anyons

Let us now turn to the particle interpretation of error correction in these codes, that will bring the famous anyons and ideas from topological quantum computing into play. For that purpose, consider the operator

$H = - \sum_v X_v - \sum_f Z_f$

which is the sum of all face and vertex operators. Clearly, this operator is hermitian, and can therefore be thought of as the operator of a physical system. Does this system have a particle interpretation?

To see this, let us try to determine the eigenstates of the Hamiltonian, i.e. the energy levels of our system. As the Xv and the Zf all commute, we can find a simultaneous basis of eigenstates that will diagonalize the Hamiltonian. Let us call this basis $\{ |\psi_i \rangle \}$. Each of these vectors is an eigenstate of all face and vertex operators, and we denote the corresponding eigenvalues by xvi and zfi. Clearly, $|\psi_i \rangle$ is then an eigenvector of H with eigenvalue

$- \sum_f z_{fi} - \sum_v x_{vi}$

This is minimal if and only if all the xvi and zfi are equal to one, which is the case if $|\psi_i \rangle$ is an element of the code space. Thus we have identified the elements of the code space as the ground states of our system!

Now let us introduce an error into our system. One way to do this is to pick a chain c, i.e. a path on the lattice, and apply the Zc operator to one of the ground states $|g \rangle$. Let us try to determine the error syndrome of this state, i.e. let us try to figure out for which elements of the stabilizer, this state is no longer in the +1 eigenspace.

As all Z operators commute, the state is clearly still in the +1 eigenspace of all the face operators Zf. If Xv is a vertex operator, the state will be in the -1 eigenspace if and only if Zc anti-commutes with Xv. But we have already seen that this happens if and only if v is in the boundary of c, i.e. v is one of the two vertices where the path c starts and ends! Thus the error syndrome is localized at those two vertices – if we imagine the vertex operator measurements as being signaled by green (outcome plus one) and red (outcome minus one) lights attached to the respective vertex, the error would be represented two red lights sitting at these vertices.

If we now expand the state $Z|g \rangle$ into eigenstates of the Hamiltonian, the above discussion shows that the eigenvalue has increased by four. Thus errors correspond to excited states, and the operators Zc are creation operators that create excited states that are in a certain sense localized at the vertices of the lattice. It is tempting and common to therefore call these excitations quasi-particles, similar to the phonons in a crystal lattice.

So far we have associated an error operator Zc and therefore a particle with every chain, i.e. a path that runs along the edges. We can also consider a path that runs perpendicular to all the edges – in algebraic topology, this would be called a chain in the dual lattice or a co-chain. To every such path, we can associate an X operator, and this X operator acting on a ground state will create another excitation. The particles associated with a chain is called an electric charge e by Kitaev, and the particle created by a co-chain is called an magnetic vortex m.

As indicated in the diagram above, electric charges come in pairs, are located on vertices of the lattice and manifest themselves as -1 eigenvalues of the corresponding vertex operators, whereas magnetic particles live on pairs of faces of the vertex and appear as -1 eigenvalues of face operators. In this diagram, the solid red line creates a pair of electric particles, and the solid blue line creates a pair of magnetic particles.

How do these particles relate to the composition of chains? Suppose that we are given a chain c12 running from vertex 1 to vertex 2 and a second chain c23 running from vertex 2 to vertex 3. Then there are two ways to create the particle sitting at the vertices 1 and 3 – we can either first act with the Z operator associated with the chain c12 and then let the operator associated with the chain c23 act, or we can form the sum c13 of the chains which now runs from 1 to 3 and let the operator associated with this chain act. Of course, the result will be the same. Thus we can interpret the act of applying the operator given by the chain c23 to an excited state as moving one of the particles from vertex 2 to vertex 3. The same applies to magnetic particles.

Now let us again consider the image above. Look at the blue dashed loop. According to what we have just said, we can start with the particle pair created by the solid blue line and then move one of the particles around that closed loop. The result will be a particle pair living at the same location as the original pair. You might think that the new state is in fact identical to the old state, but this is not true – the state picks up a phase factor minus one. This is the reason why Kitaev calls our quasi-particles anyons, a term generally used in particle physics to characterize particles with non-trivial statistics (in our case, mutual statistics)

Exchanging particles by moving them around each other can be described as an action of the pure braid group, i.e. as an action of the group of braids that return each particle to its original position (this subgroup, the kernel of the natural homomorphism from the braid group to the permutation group, is sometimes called the group of colored braids). If we perturb a braid slightly, the action will remain unchanged, so the unitary operators which are given by that action are by construction rather stable in the presence of noise. In [1], Kitaev developed the idea to use braid group representations to implement quantum gates. Unfortunately, the set of operators that we can obtain in this way using the toric code is not sufficient to implement a universal quantum computer – we will get back to this point in a later post when we look at the actual procedures used to perform computations in a toric or surface code. Therefore Kitaev goes on to describe certain hypothetical particles called non-abelian anyons which, if they can be physically realized, provide sufficiently rich unitary operations. This approach is called topological quantum computation. We will not go into further details but refer to the original work [1].

## Measurement and error correction for toric codes

Let us quickly summarize what we have found so far. The code space of a toric code can be thought of as the ground state of a physical system, describing a collection of two-state systems located on the edges of a lattice. Errors correspond to fundamental excitations, i.e. quasi-particles, that come in pairs and sit on vertices and faces of the lattice. These errors can be detected by measuring face and vertex operators.

How can we perform these measurements in practice? To understand this, let us first recall a general approach for measuring eigenvalues. Suppose that U is an operator that is hermitian and at the same time unitary. Consider the following circuit which is a combination of two Hadamard gates with a controlled-U operation.

Let us calculate the state after applying this circuit. The initial state is

$|0 \rangle |\psi \rangle$

After applying the Hadamard gate to the first qubit, we obtain

$\frac{1}{\sqrt{2}} \big[ |0 \rangle |\psi \rangle + |1 \rangle |\psi \rangle \big]$

which the controlled U operation turns into

$\frac{1}{\sqrt{2}} \big[ |0 \rangle \otimes |\psi \rangle + |1 \rangle \otimes U |\psi \rangle \big]$

Now U is unitary, and therefore its eigenvalues are plus and minus one. So we can decompose our state $|\psi \rangle$ as

$\alpha |\psi_+ \rangle + \beta |\psi_- \rangle$

where $|\psi_+\rangle$ is an eigenstate with eigenvalue plus one and $|\psi_-\rangle$ an eigenstate with eigenvalue minus one. Plugging this into the above equation and applying the final Hadamard operator shows that the final outcome of the circuit will be

$\alpha |0 \rangle \otimes |\psi_+\rangle + \beta |1 \rangle \otimes |\psi_- \rangle$

If we now apply a measurement to the control qubit, which serves as an ancilla here, the state is projected onto either $|1 \rangle \otimes |\psi_- \rangle$ or $|0 \rangle \otimes |\psi_+ \rangle$, with probabilities $|\beta|^2$ and $|\alpha|^2$. The remaining qubits end up in the state $|\psi_-\rangle$ or $|\psi_+ \rangle$, i.e. in an eigenstate. Thus applying this circuitry and measuring the ancilla has the same implications as measuring the operator U and yields the same information.

Let us now apply this idea to the problem of measuring one of the operators Xv. This operator is a tensor product of four $\sigma_x$ operators, i.e. inversion operators. Therefore a controlled-Xv circuit is nothing but a sequence of controlled inversions, i.e. CNOT gates. Thus we can measure a vertex operator using the following circuit.

Here the four qubits at the bottom correspond to the four edges meeting the vertex v. So we need one ancilla qubit for doing the measurement, and we need to be able to combine it in gates with all the four qubits sitting on the edges meeting v. For the physical implementation, the easiest approach is to place this additional qubit called measurement qubit in the vertex itself, in contrast to the data qubits that we have already placed on the edges. So we add one data qubit to each vertex.

A circuit along the same lines can be constructed to measure Zf for a face f. Using that $H \sigma_x H = \sigma_z$, we can construct such a circuit as follows. In the first time step, apply a Hadamard to both the ancilla and all target qubits. Then apply CNOT operators to all target qubits, controlled by the ancilla, and then apply Hadamard operators again. Now, it is well known that conjugating the CNOT with Hadamard gates switches the roles of target qubits and control qubits. This gives us the circuit displayed below.

Again, for each face f, we need an ancilla qubit to measure Zf. This qubit will have to interact with all data qubits on the edges bounding this face, so it is easiest to position it in the center of that face.

Overall, we have now added F + V measurement qubits to our original E data qubits. Now the Euler characteristic of the torus is zero, and therefore F + V = E, showing that this procedure actually doubles the number of physical qubits.

We have now arrived at a description of the toric code (and its cousin the surface code) that one can often find as a definition in the literature but which somehow obscures the relation to the surface topology a bit.

• We start with a lattice L with E edges and identify opposite borders
• On each vertex, we place a data qubit
• On each face, we place a measurement qubit called a Z-qubit
• On each vertex, we place a measurement qubit called an X-qubit
• To each X-qubit, we associate the operator that acts as a bit flip on the four data qubits next to the measurement qubit
• To each Z-qubit, we associate the operator that acts a a phase flip on the four data qubits next to the measurement qubit

Thus every data qubit is acted on by four operators, the two Z-operators corresponding to the adjacent faces and the two X-operators corresponding to the adjacent vertices. The additional measurement qubits – indicated by white circles in contrast to the dark circles being the data qubits – are shown below.

For one face and one vertex, we have indicated the associated operators Xv and Zf by adding letters X and Z to show how these operators act on the nearby data qubits. This is the graphical representation typically used in the literature (like [3]). Note that some authors use different conventions and associate Z-operators with faces and X-operators with vertices, but this is of course completely equivalent due to the duality relations between a lattice and a dual lattice.

This completes our short introduction to toric codes. In the next post in my series on quantum error correction, we will pass on to a planar geometry which turns the toric code into a surface code.

## References

1 A. Kitaev, Fault-tolerant quantum computation by anyons, arXiv:quant-ph/9707021
2. H. Bombin, An introduction to topological quantum codes, in Quantum error correction, edited by D.A. Lidar and T.A. Brun, Cambridge University Press, New York, 2013, available as arXiv:quant-ph/1311.0277
3. A.G. Fowler, M. Mariantoni, J.M. Martinis, A.N. Cleland, Surface codes: Towards practical large-scale quantum computation, arXiv:1208.0928v2

## Quantum error correction with stabilizer codes

In our previous discussion of quantum error correction, we have assumed that quantum gates can act on any two physical qubits. In reality, however, this is not true – only nearby qubits and interact, and our error correction needs to take the geometric arrangements of the qubits into account. The link between these geometric constraints and the theory of quantum error correction is the stabilizer formalism described first in [2].

This post will be a bit more formal, but is a necessary preparation to be able to define and understand the surface code and other topological codes. To introduce and motivate the formalism, let us for a moment come back to the simple example studied before – the three-bit code. In this code, the logical states were

$|0 \rangle_L = |000 \rangle$

and

$|1 \rangle_L = |111 \rangle$

These logical states span our two-dimensional code space and describe one logical qubit. A single bit flip error will flip one of the three involved qubits. Suppose, for instance, that a bit flip occurs on the first or second qubit. This error will modify our logical states as follows (the first two lines represent a bit flip on the first qubit, the next two lines a bit flip on the second qubit

$|0 \rangle_L \mapsto |100 \rangle$

$|1 \rangle_L \mapsto |011 \rangle$

$|0 \rangle_L \mapsto |010 \rangle$

$|1 \rangle_L \mapsto |101 \rangle$

Now consider the operator

$M_1 = \sigma_z^1 \sigma_z^2$

where the upper index on a Pauli matrix indicates the qubit on which the operator acts. The error-free logical states are both eigenstates of this operator with eigenvalue +1. The states that result out of a bit-flip error are also eigenstates of this operator, but with eigenvalue -1. Thus, a measurement of this operator will tell us whether a bit flip on the first or second qubit has occurred. Similarly, to also detect a bit flip on the third qubit, we need a measurement for a second operator, namely

$M_2 = \sigma_z^1 \sigma_z^3$

Let us now write down a few properties of these operators. First, they are hermitian and therefore correspond to measurements. Second, the logical states are eigenvectors of these operators with eigenvalues +1 and the code space is exactly the subspace of the three-qubit state space that is fixed by M1 and M2. We can think of these operators as defining linear constraints that together define the code space, as indicated below.

Moreover, if a bit flip error occurs, the resulting state will again be an eigenvalue, but for at least one of the operators the new eigenvalue will be -1. Thus, errors correspond to violated constraints. Finally, the two operators commute and can therefore be measured simultaneously.

These properties allow us to express the three bit code entirely in terms of the operators M1 and M2. The code space is the subspace which is left invariant by these operators. The syndrome measurement can be done by measuring both operators simultaneously, which is possible because they commute. Finally, all involved states – code space and states after bit flip errors – are eigenstates and therefore the measurement process does not collapse a superposition of these states and can therefore be executed without interfering with the actual quantum algorithm that we want to run.

This correspondence between sets of operators – the Mi in our case – and quantum error correction is the core of the stabilizer formalism. In a more general setting, we are looking at the Hilbert space of dimension 2n spanned by n physical qubits. Products of n Pauli matrices (where we include the identity matrix) are acting on this Hilbert space and form a group ${\mathcal G}_n$ called the Pauli group. Put differently, the Pauli group consists of those linear operators on the Hilbert space that can be written as a product

$g = \mu (\sigma_x^1)^{a_1} (\sigma_y^1)^{b_1} (\sigma_z^1)^{c_1}\cdots (\sigma_x^n)^{a_n}(\sigma_y^n)^{b_n} (\sigma_z^n)^{c_n}$

with coefficients $a_i, b_ic, c_i$ and an overall phase factor $\mu \in \{ \pm 1, \pm i \}$ (we can even assume that all the $b_i$ are equal to one as $\sigma_x \sigma_z$ is a multiple of $\sigma_y$). Any two elements of the Pauli group either commute or anti-commute. Within this group, we now consider a finite set $\{ M_i \}$ of commuting elements and the subgroup S of the Pauli group generated by this set. This group (which is abelian as it is generated by mutually commuting elements) is called the stabilizer group. To this group, we can associate the subspace that consists of all vectors that are fixed by all elements of the group, i.e. the space

$T_S = \{ |\psi \rangle \, \text{such that} \, s|\psi \rangle = |\psi \rangle \forall s \in S \}$

This space will be the code space of our code. If the group S is generated by l independent generators, the dimension of the code space can be seen to be 2n-l.

In the example of the three bit code, we had n=3 and l=2, which gives us a two-dimensional code space, corresponding to one logical qubit (if you try work out the details, you will see that in order for this to be true, we have to assume that $-1 \notin S$ – you might want to take a look at my notes or the chapters on quantum error correction in [1] for the mathematical details).

Which errors is our code able to detect? Suppose that E is an error operator that is itself in the Pauli group. We know that any two elements of the Pauli group either commute or anti-commute. Let us assume that E does in fact anti-commute with some element s of S. If $|\psi \rangle$ is a state in the code space, we can then calculate

$s E |\psi \rangle = - E s |\psi \rangle = - E |\psi \rangle$

Therefore the state $E |\psi \rangle$ is now in the -1 eigenspace of s. Thus if we measure all elements of S, the outcome -1 for one of the measurements will tell us that an error has occurred. A similar argument shows that if the product of any two errors anti-commutes with at least one element of S, then we can also correct the error based on only the measurement results of elements in S. Mathematically speaking, the set of all elements of the Pauli group that commute with all elements of S is the centralizer of S, which in this case turns out to be equal to the normalizer N(S) of S, and a set $\{E_\alpha \}$ of errors can be detected and corrected if and only if

$E_\alpha E_\beta \in S \cup (\mathcal{G} - N(S))$

for any two error operators $E_\alpha, E_\beta$. So the elements of the Pauli group that are neither in S nor in N(S) correspond to correctable errors.

What do the elements of the normalizer correspond to? Suppose that n is an element in the normalizer and therefore commutes with all elements of S. Then, for any element s in S and any element $|\psi \rangle$ of the code space, we have

$s n |\psi \rangle = n s |\psi \rangle = n |\psi \rangle$

Consequently, the element $n |\psi \rangle$ is again fixed by all elements of S and therefore in the code space TS. Thus the elements of N(S) generate automorphisms of the code space, i.e. logical operations. It is one of the strengths of the stabilizer formalism that once we have the stabilizer group S, we can not only derive the code space and the correctable errors, but can also use group theoretic considerations to describe logical operations – this will become clearer as we study surface codes in a later post.

For later reference, let us briefly summarize what we have found. Given l independent generators si of the stabilizer S, we can define a code space as the set of all vectors that are fixed by the si. This subspace will encode n – l logical qubits in n physical qubits. To detect an error, we measure all the si. Each measurement will give us minus one or plus one. Any occurrence of minus one indicates an error (this is why the combined result of all measurements is usually called the error syndrome) and will also tell us which operation we have to apply to correct the error. To implement logical quantum gates on our code space, we can apply elements in the normalizer N(S) that map the code space into itself. Thus we have expressed error detection, error correction and logical operations entirely in terms of the stabilizer group and the language of group theory.

But how do we find useful stabilizer codes? One approach is given by the check matrix formalism, which allows us to express stabilizer codes in terms matrices and metrics, i.e. in terms of linear algebra. This approach is generalizing the parity check matrix from the classical theory of error correction. A second source of stabilizers, however, comes from a more surprising direction – algebraic topology. In fact, given a surface described in terms of a cell complex, we can associate elements of the Pauli group with every cycle and every co-cycle. For certain choices of cycles and cocycles, this will give us abelian subgroups of the Pauli group which in turn create error correction codes. This is how geometrical constraints in the actual implementation enter the scene and leads to a class of codes known as surface codes and toric codes that we will start to study in my next post.

1. E. Rieffel, W. Polak, Quantum computing – a gentle introduction, MIT Press, Cambridge 2011
2. D. Gottesman, Stabilizer Codes and Quantum Error Correction, Caltech Ph.D. Thesis, available as arXiv:quant-ph/9705052
3. J. Kempe, Approaches to quantum error correction, S\’eminaire Poincar\’e 1 (2005), pp. 65–93, available as arXiv:quant-ph/0612185

## Fault tolerant quantum computing

In the previous post, we have looked at the basic ideas behind quantum error correction, namely the encoding of logical states so that we are able to detect and correct errors like bit flip and phase flip errors. Unfortunately, this is not yet good enough to implement quantum computing in a fault-tolerant way.

What is still missing? First, we have only discussed errors that impact saved quantum states, i.e. random bit flips or phase flips that invalidate a given state stored in a quantum register. We have not yet, however, understood how we can protect quantum states during processing, i.e. deal with errors that do not occur on a saved qubit but that occur while we manipulate these qubits by executing quantum gates.

Second, and maybe even worse, implementing error correction involves adding additional qubits and quantum gates to our circuit. Obviously, these additional elements will again be subject to errors and require protection. So we might need an additional correction step to account for these errors. This additional step will again introduce additional errors and so forth. It is not a priori clear that we do not introduce more errors than we can correct when following this path. Fortunately, the famous threshold theorem asserts that under certain conditions, we can win this race and correct errors faster than we create them.

## Error propagation and transversal quantum gates

But let us start with the first question – how can we implement quantum gates in a fault tolerant way? To illustrate the challenge, we start with a simple (though unrealistic) example. Suppose that we wanted to perform a CNOT gate on two states encoded with the three-bit code. The obvious approach – decode the two input states, apply an ordinary CNOT and encode again – does not appear to be a good choice, as uncontrolled errors might sneak in while the states are unencoded and therefore vulnerable. So let us try to implement a CNOT operation directly on the encoded states. This can be done with the following circuit.

Let us see why this circuit works. Suppose the control word is the encoded (logical) state

$|0 \rangle_L = |000 \rangle$

Then all three qubits in the control word are zero. Consequently, the three CNOT gates do nothing and the target word is not changed. If, on the other hand, the control word is

$|1 \rangle_L = |111 \rangle$

then all three qubits in the target word will be toggled, and thus the target word $|0 \rangle_L$ will be mapped to $|1\rangle_L$. Thus our circuit does in fact implement a CNOT gate on the logical level.

However, there is a problem with this circuit. To see this, suppose that there is a bit flip error in the first qubit of the control word. Then this error will make all three CNOT gates toggle their target state, and a logical $|0 \rangle_L$ on the target qubits will be flipped into a logical $|1 \rangle_L$. Thus even a single error in the input will render the result unusable. This is an example for uncontrolled error propagation: a single error in the control word input has propagated across all three qubits that make up the second input and our code – which is only able to protect against single qubit errors – fails.

Obviously, the propagation of errors can never be entirely avoided. However, what we can avoid is the uncontrolled propagation. To explain this, let us take a look at the following circuit.

This circuit also realizes a CNOT on the level of logical states. However, for this circuit, an error in one of the qubits of the block of three qubits that encode the logical control bit can only propagate into one qubit in the second block – the block of three qubits that encode the target bit. Therefore one error per input block will only propagate into one error in each output block. If we are able to fully recover from one qubit errors per block, we can detect and correct the errors even though they have propagated through the circuit (again as long as only one error occurs). Thus the trick is to somehow spread potential errors across multiple blocks so that no single block is fully corrupted. This connectivity property of a circuit is sometimes called transversality.

Transversal gates are fault-tolerant, as an error in one qubit can only propagate into one qubit of any other code block, a situation from which we can recover. For many codes, many basic gates can be implemented qubit-wise and therefore transversally on the level of logical qubits. Unfortunately, a full set of universal quantum gates cannot be constructed in such a way – typically one gate remains for which no transversal implementation can be found. Fortunately, there are other (considerably more complicated) methods available to construct fault-tolerant logical gates. In [1], Shor has described the first fully universal set of fault-tolerant logical gates for his nine qubit code. In many modern implementations, a class of codes called surface codes are used, that also allow us to implement a universal set of quantum gates in a fault tolerant way and that we will study in a later post. Similar approaches work for measurements that can have errors as well and that need to be organized in a way so that errors can be detected and corrected, and the same applies for the actual error correction circuits.

## Fault tolerant quantum computation and the threshold theorem

Even if we are able to implement quantum gates in a fault tolerant way, one challenge remains. So far, we have only discussed how to protect against single qubit errors. Of course, the rationale behind this was the assumption that errors in two or more qubits are much less probable than single qubit errors. Thus, we have reduced the error probability. For real applications, our new, reduced error probability might still not be sufficient. We therefore might have to add additional error correction circuits. Now the probability of failure for a circuit is not only a function of the reliability of every single component, but also a function of the number of components. The more error correction mechanisms we add, the higher the number of components and the higher the overall fault probability. In the worst case, the increase of the fault probability by adding the additional components needed for the error correction might eat up all the benefits of the error correction and lead to an overall higher probability of failure.

To be able to better grasp the problem, let us introduce some terminology. Suppose we are given a circuit C that describes a quantum computation. We can think of the overall computation as being organized in individual time steps. At each time step, every qubit in the circuit takes part in a basic set of possible operations (or, to mention a different term that is sometimes used, locations). This can be a gate, the preparation of a qubit in a initial state, a measurement or a “wait step”, i.e. the identity operation leaving the involved qubits unchanged. At each time step, a qubit is involved in exactly one of these operations.

Given such a circuit, we can now try to build a fault tolerant circuit performing the same computation as follows. We replace each qubit in the circuit C by a block of qubits encoding a logical qubit. Similarly, we replace each operation in C by the corresponding fault tolerant equivalent, i.e. we replace each gate by the corresponding logical gate, a measurement by a fault tolerant measurement, and a preparation by a fault tolerant state preparation. In addition, we place a (fault tolerant) error correction circuit in each block which is the output of a logical gate. The result is called a simulation of the original circuit. An example is displayed below, using (roughly) the graphical language described in [2].

The upper part of this diagram shows an original circuit, in which we prepare two qubits in a known state, then apply the unitary transformation U and finally measure both qubits. The result of the replacement procedure described above is displayed in the lower part of the picture, using again the (unrealistic) example of a three qubit code. Here we have replaced each qubit by a corresponding block of three qubits holding the code word. The operation U has been replaced by its logical implementation, acting on the encoded states, and both after the preparation and after applying U, we add a circuit performing error correction.

Let us now study how the fault probability changes when we pass to a simulation. In general, this is a long and complicated argument, and we refer the reader to [3], [4] and [5] for the details. However, there is a simple example that can be used to illustrate some of the key ideas.

In fact, let us go back to the circuit above that we have used for a fault-tolerant implementation of the CNOT gate. To analyze the probability that this circuit fails, let us assume that the only error that can occur is a bit flip error in the target qubit of each physical CNOT gate and that this error occurs with probability p. Thus we assume that error correction, state preparation and measurement work in an idealized, error free way.

Let us now look at the different combinations of faults that can occur during a specific run of the simulation circuit, i.e. a different fault paths. A first fault path is given by a single bit flip error in one of the target qubits after executing the CNOT operation. However, this affects only one qubit and can therefore be corrected by the error correction circuit. Thus, for this fault path, the simulation works correctly.

There is, however, a different fault path, given by a fault in target qubit one and target qubit two, whereas the CNOT acting on target qubit three operates correctly. Assuming that the individual faults are statistically independent, the probability for this fault path is

$p \cdot p \cdot (1 - p) = p^2 - p^3$

This would introduce an error from which we could not recover, as our code is only able to correct errors in one qubit. But this is not yet the overall probability of failure for our new circuit, as there is a second fault path that we need to consider – we could as well have a failure in the first and the third instead of the first and second qubit. Obviously, the probability for this fault path is the same. And there is a third fault path with two errors, given by an error in the second and third target qubit. These three different fault paths, all with two errors, correspond to the

${{3} \choose {2}} = 3$

possibilities to choose two error locations out of the three target qubits. And finally, it might happen that faults occur in all three locations, an event which clearly has probability p3. Thus the overall probability to have at least two faults is

$3 (p^2 - p^3) + p^3 = 3 p^2 - 2 p^3 \leq 3 p^2$

Thus the overall probability that the circuit does not operate correctly does not reduce from p – for the physical CNOT operation without error protection – to p2, but to 3 p2, where the factor 3 reflects the growth of the circuit when passing from the physical circuit to the simulation, i.e. the impact of the additional gates.

Of course this is a very simplified example, but the hard work done in the original papers in the references demonstrates that the result is true in general: if we pass from a circuit to its simulation, the change of the error rate is described by a rule of the form

$p \mapsto c p^{t+1}$

where c is a constant that reflects the growth of the circuit and t is the number of errors that the code that we use can still correct. Obviously, this does not help unless the right hand side is smaller than the left hand side, i.e. unless p is smaller than a threshold given by

$p < c^{-\frac{1}{t}} = p_{crit}$

If this is true, we can even iterate the procedure, and reduce the error rate below every prescribed level by adding additional layers of simulation. Thus given physical operations with error rate less than pcrit, we can implement each quantum circuit in a fault-tolerant way with arbitrary reliability.

Unfortunately, the exact value of the threshold is not known. The initial estimate provided in [3] was in the range of 10-6. Later, this was improved by several authors, and is mostly assumed to be in the range between 10-4 and 10-2 (see [6] for an overview of some results in this direction).

As c is in the order of pcrit-t, this implies that we need as many as a few hundred to a few thousand physical gates and qubits to simulate one logical qubit. Note that in general, the overhead depends on the desired accuracy and the level of simulations needed as well as on the factors entering the constant c, i.e. the encoding, the architecture of the logical gates, as well as how close the actual physical error rate is to the threshold. Improving the threshold and thereby reducing the overhead remains one of the most active areas of current research in quantum computing and is essential for paving the road towards a large-scale fully fault tolerant universal quantum computer.

At the time of writing, a class of codes known as surface codes appear to be good candidates for the implementation of fault-tolerant quantum computers, not only because these codes have good error correction properties, but also because they are well adapted to the geometry of the actual quantum computer. In the next few posts, we will work towards understanding surface codes and other topological codes, starting with the formalism of stabilizer codes that will be the focus of the next post.

1. P. Shor, Scheme for reducing decoherence in quantum computer memory, Phys. Rev. A Vol. 52 (1995) Issue 4, pp. 2493–2496
2. D. Gottesman, An Introduction to Quantum Error Correction and Fault-Tolerant Quantum Computation, arXiv:0904.2557v1
3. D. Arahonov, M. Ben-Or, Fault Tolerant Quantum Computation with Constant Error, arXiv:quant-ph/9611025
4. E. Knill, R. Laflamme, W.H. Zurek, Threshold accuracy for quantum computation, arXiv:quant-ph/9610011,
5. P. Aliferis, D. Gottesman, J. Preskill, Quantum accuracy threshold for concatenated distance-3 codes, arXiv:quant-ph/0504218v3
6. S. J. Devitt, W. J. Munro, K. Nemoto, Quantum Error Correction for Beginners, arXiv:0905.2794v4

## Basics of quantum error correction

Do usable universal quantum computers exist today? If you follow the recent press releases, you might believe that the answer is “yes”, with IBM announcing a 50 qubit quantum computer and Google promoting its Bristlecone architecture with up to 72 qubits. Unfortunately, the world is more complicated than this – time to demystify the hype a bit.

## The need for error correction

The important point is that it is not just the number of qubits that matters, but also their quality. When we study quantum algorithms like Shor’s algorithm, we are working with idealized qubits that behave exactly the way an isolated two-state quantum system is supposed to behave – these idealized qubits are often called logical qubits. However, in a real world implementation, there is no fully isolated two-state quantum system. Every system interacts to some extent with the environment, an interaction that we can try to reduce to a minimum, for instance by cooling our device down to very low temperatures, but never fully avoid. A trapped ion could, for instance, interact with radiation entering our device from the environment, and suddenly is part of a larger quantum system, consisting of the ion, the photons making up the radiation and maybe even the source of the photon. This will introduce errors into our system, i.e. deviations of the behavior of the system from the idealized theoretical model.

In addition to unwanted interactions with the environment, other errors could creep into our computation. When manipulating qubits to realize gates, we might make mistakes, for instance by directing a microwave pulse with a slightly incorrect frequency at our qubit, and we can make mistakes during each measurement.

Thus the real qubits in a quantum computer – called physical qubits – are prone to errors. These errors might be small for one qubit, but they tend to propagate through the circuit and add up to a significant error that will render the result of our quantum computation unusable. Thus we need error correction, i.e. the ability to detect and correct errors during our computation.

So how would you do this? Of course, errors can also occur in classical systems, and there are well developed methods to detect and correct them. Unfortunately, these approaches typically rely on the ability to copy and measure individual bits. This is easy for a classical bit, but more complicated for a qubit, as a measurement will collapse our system into an eigenstate of the observed operator and thus interfere with quantum algorithm.

The good news is that quantum error correction is still possible. In this post, I will try to explain the basics, before we then dive into more advanced topics in the next few posts.

## Encoding logical states

In order to understand the basic ideas and structures behind quantum error correction, it is useful to study a simplified example – the three qubit code. To introduce this code, let us suppose that we have access to a communication channel across which we can send individual qubits from one quantum device to another one. Suppose further that this transmission is not perfect, but is subject to a bit flip error with a probability p. Thus, with probability p, a one-qubit state $|\psi \rangle$ will be changed to $X |\psi \rangle$ during transmission, where $X$ is the usual bit flip operator, and with probability 1-p, the transmission does not change the state. The aim is to construct an encoding of a qubit such that these errors can be detected and corrected.

To achieve this, we encode every single qubit state in a three-qubit state before transmitting it. Thus we use the following encoding

$|0 \rangle \mapsto |000 \rangle$

$|1 \rangle \mapsto |111 \rangle$

It is not difficult to see that this encoding can in fact be realized by a unitary circuit – the circuit below will do the trick.

To transmit one qubit, we use this encoding to obtain a three-qubit message. We then send those three qubits through our communication channel. After the transmission, we apply a procedure known as syndrome measurement, using the following circuit.

Let us see what this circuit is doing. First, let us suppose that the original qubit was $|0 \rangle$, encoded as $|000 \rangle$, and no error did occur during the transmission. Then the three qubits at the top of the circuit will still be $|000 \rangle$. In this case, the CNOT gates act as the identity, and the overall state after passing the circuit is

$|000 \rangle |00 \rangle$

Similarly, if the original qubit is $|1 \rangle$ and no error occurred, all CNOT gates will act as inversion. Thus the ancilla qubits will be inverted twice, and we end up with the state

$|111 \rangle |00 \rangle$

The situation is a bit different if a bit flip error has affected one of the qubits during the transmission, say the first one. Suppose the original state was again $|0 \rangle$. After encoding and transmission with a bit flip on the first qubit, we will receive the state $|100 \rangle$. Therefore both ancilla qubits will be inverted, and we obtain the state

$|100 \rangle |11 \rangle$

Let us now generalize these considerations. Assume that we are encoding the state $a |0 \rangle + b |1 \rangle$, so that the encoded state will be $a |000 \rangle + b |111 \rangle$. When we go through the above exercise for all possible cases, we arrive at the following table that shows the transmission error and the resulting state after passing the syndrome measurement circuit.

Error Resulting state
No error $(a |000 \rangle + b |111 \rangle ) |00 \rangle$
Bit flip on first qubit $(a |100 \rangle + b |011 \rangle ) |11 \rangle$
Bit flip on second qubit $(a |010\rangle + b |101 \rangle ) |10 \rangle$
Bit flip on third qubit $(a |001\rangle + b |110 \rangle ) |01 \rangle$

Now let us see what happens if we measure the ancilla qubits. First, note that all the states are already eigenstates for the corresponding measurement operator. Thus measuring the ancilla qubits will not change the state of the first three qubits and it will not reveal any information on the encoded state. The second important observation is that the value of the ancilla qubits tells us the exact error that has occurred. Thus we have found a way to not only find out that an error has occurred without destroying our superposition, but also to figure out which qubit was flipped. Given that information, we can now apply a bit flip operator once more to the affected qubit to correct the error. Again, this will not reveal the values of a and b and not collapse our state, and we can therefore continue to work with the encoded quantum state, for instance by running it through the inverse of the encoding circuit to get our original state back.

## More general error models

So what we have found so far is that it is possible, also in the quantum world, to detect and protect against pure bit flip errors without destroying the superposition. But there is more we can learn from that example. In fact, let us revisit our original assumption that the only thing that can go wrong is that the operator X is applied to some of the qubits and allow a more general operator. Suppose, for instance, that our error is represented by applying to at most one of our qubits the operator

$E = (1-\epsilon) 1 + \epsilon X$

for a small $\epsilon$. In contrast to our earlier assumption that the error operator is discrete, i.e. is either applied or not applied, this operator is now continuous, depending on the parameter $\epsilon$, i.e. it looks as if we had to deal with a full continuous spectrum of errors. A short calculation shows that after transmitting and applying the syndrome measurement circuit above, the state of our quantum system will now be

$(1-\epsilon)(\alpha |000 \rangle + \beta |111 \rangle)|00\rangle + \epsilon (\alpha |100 \rangle + \beta |011\rangle) |11 \rangle$

Now let us again apply a measurement of the ancilla qubits. Then, according to the laws of quantum mechanics, the system will collapse onto an eigenstate, i.e. it will – up to normalization – end up in one of the states

$(\alpha |000 \rangle + \beta |111 \rangle)|00\rangle$

and

$(\alpha |100 \rangle + \beta |011\rangle) |11 \rangle$

But these are exactly the states in which we end up if no error occurs or a single bit flip error occurs. Thus, our measurement forces the system to somehow decide whether an error occurred or not and if yes, which error occurred – we are opening the box in which Schrödingers cat is hidden. This is a very important observation often referred to as the digitization of errors – it suffices to protect against discrete errors as the syndrome measurement will collapse any superposition of different errors states.

So far we have worked with a code which is able to protect against a bit flip error. But of course this is not the only type of error that can occur. At the first glance, it looks like there is a vast universe of potential errors that we have to account for, as in theory, the error could be any unitary operator. However, using arguments similar to the discussion in the last paragraph, one can show that it suffices to protect against two types of errors: the bit flip error discussed above and the phase flip error, represented by the matrix

$Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

Note that the phase flip error has no classical equivalent, other than the bit flip error which can classically be interpreted as a bit flipping from one to zero or vice versa randomly. The first error correction code that was able to handle both, bit flip errors and phase flip errors (and combinations thereof, ), and therefore any possible type of error (as long as the number of errors is limited) was described in 1995 by P. Shor. This code uses nine qubits to encode one logical qubit and is somehow a repeated application of the three bit code, based on the observation that the Hadamard transform turns a bit flip error into a phase flip error and vice versa. Later, different types of codes were discovered that are also universal in the sense that they protect against any potential one qubit error, i.e. against any error that only affects one qubit.

Let us now look at the structure that these codes have in common. First, the encoding can be described as identifying a 2-dimensional subspace C, called the code space, in the larger space spanned by all qubits. In the case of the three qubit code, the code space is spanned by the states $|000 \rangle$ and $|111 \rangle$, corresponding to a logical zero and a logical one. More generally, the space C can have dimension 2n and the full Hilbert space can have dimension 2k, in which case we can encode n qubits in a larger set of k qubits (in the nine qubit code example, k = 9 and n = 1).

Hence the code space is spanned by a set of 2n basis vectors $|\psi_i \rangle$ that we call code words. In the case n = 1, it is common practice to denote the codewords by $|0 \rangle_L$ and $|1 \rangle_L$ to indicate that they represent a logical qubit encoded using k physical qubits.

In addition, there is a set of error operators, i.e. a finite set of operators $\{ E_\alpha\}$ like the phase flip or bit flip operators acting on the larger Hilbert space. These operators represent the impact of discretized noise and will generally move the code words out of the code space, i.e. they will rotate the code space onto different subspaces of the entire Hilbert space.

In order for a code to be useful, we of course need a relation between code space and errors that tells us that the errors can be detected and corrected. What are these conditions? A first condition is that no matter which errors occur, we can still tell the code words apart. This is guaranteed if the various error operators map different code words onto mutually orthogonal subspaces, in other words if

$\langle \psi_i | E_\alpha^{+} E_\beta | \psi_j \rangle = 0$

whenever $i \neq j$ and for all $\alpha, \beta$. Thus, even in the presence of errors, the different code words will never overlap.

What about the case that both code words are the same? It is tempting to ask for the condition

$\langle \psi_i | E_\alpha^{+} E_\beta | \psi_i \rangle = \delta_{\alpha \beta}$

i.e. to require that different errors map the same codeword to orthogonal subspaces. This would make recovery very easy. We could perform the measurements that correspond to projections onto these subspaces to detect the error and correct them by applying the inverse of the error operator. However, this condition is too restrictive. In the case of the nine qubit code, for example, it might very well happen that two different errors map a code word to the same state. However, the same applies for the correction, i.e. we do not have to distinguish between these two errors as the act similarly on the code space. Therefore, a more general condition is usually used which captures this case:

$\langle \psi_i | E_\alpha^{+}E_\beta | \psi_j \rangle = C_{\alpha\beta} \delta_{ij}$

for a hermitian matrix $C_{\alpha \beta}$. If the matrix has full rank, the code is called non-degenerate, otherwise – as in the case of the nine qubit code – the code is called degenerate.

Of course this encoding generates some overhead. To represent one logical qubit, we need more than one physical qubit. For the Shor code, we have to use nine physical qubits to encode one logical qubit. It is natural to ask what the minimum overhead is that we need. In 1996, Steane discovered a code that requires only seven physical qubits. In the same year, a code that requires only five qubits was presented and it was shown that this is a lower bound, i.e. there is no error correction code that requires less than five qubits.

So there is an unavoidable overhead – and the situation is even worse. To implement error correction, you need again quantum gates, which can of course also experience errors. Thus you need additional circuitry to protect the error correction against errors, which can again introduce errors and so forth. That there is a way out of this vicious circle is not obvious and the content of the famous threshold theorem that we will study in a later post – but even this way out is very hard to implement and might require thousands of physical qubits to implement one single logical qubit.

So even with a 72 qubit device, we are still far away from implementing only one logical qubit – and having a few thousand logical qubits to use Short’s algorithm to break an RSA key with a realistic key length is yet another story. So it is probably a good idea to take claims about universal supremacy of quantum computing within a few years with a grain of salt.

In this post, we have looked at some of the essential ideas behind quantum error correction, i.e. the ability to detect and correct errors. However, this is not enough to build a reliable quantum computer – after all, adding an error correction circuit introduces additional qubits that can also create new errors. In addition, we need to be able to perform calculations on our encoded states. So there is more that we need for fault-tolerant quantum computing which is the topic of the next post.