You might think in this era of social networks that a Lie group is a bunch of people on Facebook, but no, it's not. Group theory is about describing symmetry, and symmetries come in two kinds: discrete and continuous. The continuous symmetries are described by Lie groups and Lie algebras. Although discrete symmetries are technically less demanding, they are just as important.

Symmetries are actually all around us. We associate symmetry with aesthetics and balance. Medieval castles typically exhibit left-right symmetry, and we consider symmetric faces more attractive than asymmetric ones. Even software patterns can be considered as a kind of invariance across multiple implementations. Mathematically speaking, the most basic continuous symmetry is the fact that when you rotate a circle (keeping the center fixed) it remains the same. Rotate it by any angle you like and it's indistinguishable from the original. This symmetry is called $U(1)$ and it's a continuous, one-dimensional symmetry group. It's called a group because you can combine rotations and everything remains well-organized. For example, if you compose rotations $R_\alpha, R_\beta$ by angles $\alpha$ and $\beta$ respectively, you get the rotation $R_{\alpha+\beta}$. This fact can be represented by defining $R_\alpha := e^{i \alpha}$, and this kind of mapping is called a representation of the group. One says that $U(1)$ has a representation in the complex numbers. Representations are not unique: you can use $R_\alpha := e^{i\,f(\alpha)}$ provided that $f(\alpha+\beta) = f(\alpha)+f(\beta)$. In fact, anything which obeys $R_\alpha \cdot R_\beta = R_{\alpha+\beta}$ will do. This multiplicity is in fact necessary to describe the symmetry in different contexts.
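As a quick numerical sanity check (a NumPy sketch added here, not part of the original text), the composition rule $R_\alpha \cdot R_\beta = R_{\alpha+\beta}$ can be verified directly for the representation $R_\alpha = e^{i\alpha}$:

```python
import numpy as np

# U(1) elements as unit complex numbers: a rotation by angle alpha is e^{i alpha}
def R(alpha):
    return np.exp(1j * alpha)

a, b = 0.7, 1.9
# composing two rotations adds the angles
assert np.isclose(R(a) * R(b), R(a + b))
# every element has modulus one, so rotations preserve lengths
assert np.isclose(abs(R(a)), 1.0)
```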

The $U(1)$ symmetry is the little kid in a large family of continuous groups. The general theory underpinning all this is called Lie group theory; it's a beautiful domain in its own right, linked to solving differential equations and tons of other things. For our purposes we'll only need a sibling called $SU(2)$. The 'U' in these names refers to 'unitary': unitary transformations preserve inner products and norms, which is precisely what keeps quantum states properly normalized.

The quantum gates $X, Y, Z$ we have seen in previous articles were defined as

$$1 =\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \\ X=\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\\ Y = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}\\ Z = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$$

These four matrices are crucial in most quantum algorithms, but they also have a significance way beyond this. In fact, we'll change them slightly, like this:

$$1 =\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \\ \sigma_x=\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\\ \sigma_y = \begin{pmatrix}0 & -i\\ i & 0\end{pmatrix}\\ \sigma_z = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}.$$

If you multiply these four matrices in any way you like, you will always find that the result is (a multiple of) one of them; the set is closed under multiplication. You can go beyond this and look at things like

$$a\,1 + i\left(b\,\sigma_x + c\,\sigma_y + d\,\sigma_z\right),\qquad a^2+b^2+c^2+d^2=1,$$

and with some effort you will find that these objects are also closed under multiplication. They form the $SU(2)$ group, the special unitary group of dimension two. The $\sigma$-matrices are called the Pauli matrices.
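The closure claim is easy to check by brute force. The following NumPy sketch (my addition, not in the original) multiplies all pairs and verifies that each product is a scalar multiple of one of the four matrices:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = [I2, sx, sy, sz]

def multiple_of_basis(M):
    # True if M is a scalar multiple of one of the four basis matrices
    for B in basis:
        idx = np.flatnonzero(B)[0]          # first nonzero entry of B
        c = M.flat[idx] / B.flat[idx]       # candidate scale factor
        if np.allclose(M, c * B):
            return True
    return False

# every pairwise product lands back in the set, up to a multiple
assert all(multiple_of_basis(A @ B) for A in basis for B in basis)
```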

Note along the way the similarity with $e^{i\alpha}=\cos\, \alpha + i\sin\,\alpha$. One can in fact create a similar formula with the Pauli matrices.

The general product formula for the Pauli matrices is:

$$\sigma_a\,\sigma_b = \delta_{ab}\,1 + i\epsilon_{ab}^c \sigma_c$$

where summation over repeated indices is implied and the symbol $\epsilon$ is called the completely antisymmetric tensor. Both ideas are explained in an appendix below. If you take the commutator of the matrices you get

$$[\sigma_a,\sigma_b] = 2\,i\epsilon_{ab}^c \sigma_c$$
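This relation can be verified directly (a NumPy sketch of my own, with a hand-coded Levi-Civita symbol and 0-based indices):

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

# Levi-Civita symbol eps[a, b, c], 0-based
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

for a in range(3):
    for b in range(3):
        lhs = sig[a] @ sig[b] - sig[b] @ sig[a]          # [sigma_a, sigma_b]
        rhs = 2j * sum(eps[a, b, c] * sig[c] for c in range(3))
        assert np.allclose(lhs, rhs)
```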

This simple formula is called the structure relation of the Lie algebra. The relation between Lie groups and Lie algebras can be made precise, but for our aims it's sufficient to understand that the $\sigma$'s form an object referred to as an algebra, and exponentiating this algebra gives a Lie group.

The $SU(2)$ group has deep roots in relativity and QFT: it's both the prototypical example of a non-commutative symmetry group and the group that underpins the concept of spin.

One of the crucial bits you need to grasp at this point is that the relation $[\sigma_a,\sigma_b] = 2\,i\epsilon_{ab}^c \sigma_c$ holds for the matrices given above, but it actually defines the $SU(2)$ group. The matrices are just one way of satisfying the relation; there are many more. This is the central idea of group representations.
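To make this concrete, here is one such alternative (a sketch I'm adding, not from the original): the $3\times 3$ matrices $(\Sigma_a)_{bc} = -2i\,\epsilon_{abc}$, built from the Levi-Civita symbol, obey exactly the same commutation relations as the $2\times 2$ Pauli matrices:

```python
import numpy as np

# Levi-Civita symbol, 0-based indices
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

# 3x3 matrices (Sig_a)_{bc} = -2i eps_{abc}: a different size,
# yet the same commutation relations as the Pauli matrices
Sig = [-2j * eps[a] for a in range(3)]

for a in range(3):
    for b in range(3):
        lhs = Sig[a] @ Sig[b] - Sig[b] @ Sig[a]
        rhs = 2j * sum(eps[a, b, c] * Sig[c] for c in range(3))
        assert np.allclose(lhs, rhs)
```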

Another crucial bit you need to understand is that the numbers $\alpha$ with which you exponentiate the Pauli matrices can be promoted to functions of the points of space (and spacetime). So, instead of having

$$U = e^{i\,\alpha^a \sigma_a}$$

you can have

$$U(x) = e^{i\,\alpha^a(x)\,\sigma_a}$$

with $\alpha$ a function of space. Much like the temperature in your room can be considered as a field returning a real number for every point in space, you can have a field returning a group element which differs across space. Ultimately this leads to the idea of gauge symmetry and gauge invariance.
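As a toy illustration (my own NumPy sketch, with a made-up profile for $\alpha(x)$), here is such a "field of group elements" in the simplest case of $U(1)$: one phase per grid point:

```python
import numpy as np

# at every grid point x we pick a different U(1) rotation e^{i alpha(x)};
# alpha is now a function of position, not a single number
x = np.linspace(0.0, 1.0, 101)
alpha = np.sin(2 * np.pi * x)        # hypothetical profile for alpha(x)
U = np.exp(1j * alpha)               # one group element per point

# every element still lies in U(1): |U(x)| = 1 everywhere
assert np.allclose(np.abs(U), 1.0)
```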

The Einstein convention

The Einstein convention or 'implied sum' is nothing but the agreement that we drop the summation sign when an index appears multiple times. That is, if you see something like

$$p_a\, \Omega^a$$

(whatever the meaning of the symbols $p$ and $\Omega$) you should read it as

$$\sum_a \, p_a\, \Omega^a.$$

Similarly, if you see

$$\sigma_k\, \epsilon^{kpm} \omega_{pm}$$

you should read

$$\sum_{k,p,m}\sigma_k\, \epsilon^{kpm} \omega_{pm}.$$

Note that all of this also implies that you can rename the indices in any way you like. Furthermore, in the context of QFT and relativity it's understood that the summation runs over 0, 1, 2, and 3.
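NumPy's `einsum` implements exactly this convention: repeated indices in the subscript string are summed over. A quick illustration (my addition, with made-up values):

```python
import numpy as np

p = np.array([1.0, 2.0, 3.0])
Omega = np.array([4.0, 5.0, 6.0])

# p_a Omega^a with the sum written out ...
explicit = sum(p[a] * Omega[a] for a in range(3))
# ... and with the sum implied by the repeated index 'a'
implied = np.einsum('a,a->', p, Omega)
assert np.isclose(explicit, implied)
```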


If you are a fan of TensorFlow (and deep learning in general) you will wonder whether the concept of a 'tensor' is the same here as it is in the context of machine learning. The answer is: well, a bit. In the context of machine learning (ML) one uses 'tensor' to denote vectors, tuples, matrices and higher-dimensional arrays; it's equivalent to arrays of any dimension. In the context of general relativity, tensors are key to describing four-dimensional spacetime and curvature (and by extension, gravitation). They are multi-dimensional arrays which satisfy constraints (on how they transform between observers and points in space). In the context of QFT, tensors are objects which transform under a symmetry group called the Lorentz group.

The tensor $\epsilon_{ab}^c$ mentioned above is a special tensor defined as $+1$ if the indices form an even permutation of $(1, 2, 3)$; for example, $(2,3,1)$ is an even permutation. It's $-1$ if the indices form an odd permutation (for example, $(1,3,2)$ is odd) and zero in all other cases.
While it seems a funny, simple tensor, it's a crucial ingredient when looking at spinors and special relativity.
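In code the definition reads as follows (a small sketch I'm adding, using 0-based indices $(0,1,2)$ instead of $(1,2,3)$):

```python
def levi_civita(i, j, k):
    """+1 for an even permutation of (0, 1, 2), -1 for an odd one, 0 otherwise."""
    if {i, j, k} != {0, 1, 2}:
        return 0  # a repeated index gives zero
    perm = (i, j, k)
    # the parity of a permutation is the parity of its inversion count
    inversions = sum(perm[a] > perm[b] for a in range(3) for b in range(a + 1, 3))
    return 1 if inversions % 2 == 0 else -1

assert levi_civita(1, 2, 0) == 1    # even permutation (like (2,3,1) in 1-based)
assert levi_civita(0, 2, 1) == -1   # odd permutation (like (1,3,2) in 1-based)
assert levi_civita(0, 0, 2) == 0    # repeated index
```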


The group $SU(2)$ is the group of $2\times 2$ unitary matrices with determinant 1 over the complex numbers. So, a generic element would be

$$U= \begin{pmatrix}a & b\\ c & d\end{pmatrix} $$

with $ad-bc = 1$. Unitarity means

$$U\,U^\dagger = 1$$

and if we write $U = e^{i\xi}$ with $\xi$ a complex 2×2 matrix, the two conditions become

$$\xi = \xi^\dagger,\quad \text{Tr}(\xi) = 0.$$

Any matrix with these constraints is a real linear combination of the Pauli matrices; hence any element of $SU(2)$ is of the form

$$U = \exp i\,\bar{\xi}\cdot\bar{\sigma}.$$

Often this is also written as

$$U = \exp i\theta\,(\bar{n}\cdot\bar{\sigma})$$

where $\bar{n}$ is a unit vector. The Pauli matrices together form the Lie algebra $su(2)$ (note the use of lower-case letters) and the exponentiated form is the Lie group $SU(2)$. This is the generic situation: a Lie group $G$ has a Lie algebra $g$ with generators $T_a$ which, when combined and exponentiated, give an element of the group. Sometimes one says that the tangent space of the group (at the identity) is the algebra.
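The exponential form is easy to check numerically. The sketch below (my addition) builds $U = \exp(i\,\bar{\xi}\cdot\bar{\sigma})$ via the closed form $\exp(i\theta\,\bar{n}\cdot\bar{\sigma}) = \cos\theta\,1 + i\sin\theta\,(\bar{n}\cdot\bar{\sigma})$ and confirms that the result is unitary with determinant one:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def su2(xi):
    """exp(i xi . sigma) for a real 3-vector xi, via the closed form
    exp(i theta n.sigma) = cos(theta) 1 + i sin(theta) n.sigma."""
    theta = np.linalg.norm(xi)
    if theta == 0:
        return np.eye(2, dtype=complex)
    n = xi / theta
    ns = n[0] * sx + n[1] * sy + n[2] * sz
    return np.cos(theta) * np.eye(2) + 1j * np.sin(theta) * ns

U = su2(np.array([0.3, -1.1, 0.5]))
assert np.allclose(U @ U.conj().T, np.eye(2))   # unitary
assert np.isclose(np.linalg.det(U), 1.0)        # determinant one
```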

The group admits matrix representations $F$ in any dimension $N$, and the generators, together with the identity, form a basis for the Hermitian $N\times N$ matrices. That is, for any Hermitian matrix $A$ one has

$$A_{ij} = c_0\,\delta_{ij} + c_a F^a_{ij}$$

with $c_0, c_a$ the components in the representation $F$. In addition, all the basis elements can be normalized to

$$\text{Tr}(F_aF_b) = \frac{1}{2}\delta_{ab}.$$
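For $SU(2)$ in the fundamental representation the normalized generators are $F_a = \sigma_a/2$; the trace condition can then be checked directly (a NumPy sketch of my own):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
F = [s / 2 for s in (sx, sy, sz)]   # normalized generators F_a = sigma_a / 2

for a in range(3):
    for b in range(3):
        t = np.trace(F[a] @ F[b])
        # Tr(F_a F_b) = (1/2) delta_ab
        assert np.isclose(t, 0.5 if a == b else 0.0)
```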

So, if you multiply the component expansion with $\delta_{ij}$ you get

$$\begin{aligned} A_{ij}\,\delta_{ij} &= c_0\,\delta_{ii}+c_a F^a_{ii}\\ &= c_0\,N+c_a\,\text{Tr}(F^a)\\ &= c_0\,N \end{aligned}$$

hence $c_0 = A_{ii}/N$ (the trace of a generator vanishes). Similarly, multiplication with $F^b_{ji}$ leads to

$$\begin{aligned} A_{ij}\,F^b_{ji} &= c_a\,F^a_{ij}F^b_{ji}\\ &= c_a\,\text{Tr}(F^aF^b)\\ &= \frac{1}{2}\,c_b \end{aligned}$$

and therefore

$$A_{ij} = \frac{1}{N}\,\delta_{ij}A_{ll} + 2A_{lm}F^b_{ml}F^b_{ij}.$$

Taking the common $A$ out of the terms then gives

$$A_{lm}\left(\delta_{li}\delta_{jm} - \frac{1}{N}\delta_{ij}\delta_{lm} - 2F^b_{ml}F^b_{ij}\right)=0$$

and is valid for any Hermitian matrix $A$ so the expression between brackets is necessarily zero:

$$F^b_{ml}F^b_{ij} = \frac{1}{2}\delta_{li}\delta_{jm} - \frac{1}{2N}\delta_{ij}\delta_{lm}. $$

This completeness relation is closely tied to the second (quadratic) Casimir operator $c_2(F)$, and it's an expression we will use when discussing the relation between knot polynomials and Chern-Simons theory.
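As a final sanity check (an `einsum` sketch I'm adding; it is not part of the derivation), the identity can be verified numerically for $N=2$ with $F_a = \sigma_a/2$:

```python
import numpy as np

N = 2
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
F = np.array([s / 2 for s in (sx, sy, sz)])   # shape (3, N, N)

# left-hand side: sum_b F^b_{ml} F^b_{ij}, as a rank-4 array indexed (m, l, i, j)
lhs = np.einsum('bml,bij->mlij', F, F)

# right-hand side: (1/2) delta_{li} delta_{jm} - (1/2N) delta_{ij} delta_{lm}
d = np.eye(N)
rhs = (0.5 * np.einsum('li,jm->mlij', d, d)
       - (1 / (2 * N)) * np.einsum('ij,lm->mlij', d, d))

assert np.allclose(lhs, rhs)
```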